1. DuckDB Memory
The DuckDB memory database can be thought of as a normalized source of truth. The memory module is the primary way PyRIT keeps track of requests and responses to targets, along with their scores. Most of this happens automatically: every prompt target writes its requests and responses to memory for later retrieval, and every scorer writes its results to memory when scoring.
The schema is defined in memory_models.py and can be viewed programmatically as follows:
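from pyrit.memory import DuckDBMemory

memory = DuckDBMemory()
# Print every table in the memory database with its columns and types.
memory.print_schema()
# Release the database engine once it is no longer needed.
memory.dispose_engine()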
C:\Users\Roman\.conda\envs\release-test-v0.5.0\Lib\site-packages\duckdb_engine\__init__.py:565: SAWarning: Did not recognize type 'list' of column 'embedding'
columns = self._get_columns_info(rows, domains, enums, schema) # type: ignore[attr-defined]
C:\Users\Roman\.conda\envs\release-test-v0.5.0\Lib\site-packages\duckdb_engine\__init__.py:180: DuckDBEngineWarning: duckdb-engine doesn't yet support reflection on indices
warnings.warn(
Schema for EmbeddingData:
Column id (UUID)
Column embedding (NULL)
Column embedding_type_name (VARCHAR)
Schema for PromptMemoryEntries:
Column id (UUID)
Column role (VARCHAR)
Column conversation_id (VARCHAR)
Column sequence (INTEGER)
Column timestamp (TIMESTAMP)
Column labels (VARCHAR)
Column prompt_metadata (VARCHAR)
Column converter_identifiers (VARCHAR)
Column prompt_target_identifier (VARCHAR)
Column orchestrator_identifier (VARCHAR)
Column response_error (VARCHAR)
Column original_value_data_type (VARCHAR)
Column original_value (VARCHAR)
Column original_value_sha256 (VARCHAR)
Column converted_value_data_type (VARCHAR)
Column converted_value (VARCHAR)
Column converted_value_sha256 (VARCHAR)
Column original_prompt_id (UUID)
Schema for ScoreEntries:
Column id (UUID)
Column score_value (VARCHAR)
Column score_value_description (VARCHAR)
Column score_type (VARCHAR)
Column score_category (VARCHAR)
Column score_rationale (VARCHAR)
Column score_metadata (VARCHAR)
Column scorer_class_identifier (VARCHAR)
Column prompt_request_response_id (UUID)
Column timestamp (TIMESTAMP)
Column task (VARCHAR)
Schema for SeedPromptEntries:
Column id (UUID)
Column value (VARCHAR)
Column data_type (VARCHAR)
Column name (VARCHAR)
Column dataset_name (VARCHAR)
Column harm_categories (VARCHAR)
Column description (VARCHAR)
Column authors (VARCHAR)
Column groups (VARCHAR)
Column source (VARCHAR)
Column date_added (TIMESTAMP)
Column added_by (VARCHAR)
Column prompt_metadata (VARCHAR)
Column parameters (VARCHAR)
Column prompt_group_id (UUID)
Column sequence (INTEGER)
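The listing above suggests how the tables relate: entries in PromptMemoryEntries appear to be grouped by conversation_id and ordered by sequence, and ScoreEntries appears to point back at a prompt entry through prompt_request_response_id. The following is a minimal sketch of that reading, not PyRIT's own API. It uses Python's built-in sqlite3 module as a stand-in for DuckDB, copies only a handful of the columns, and assumes the link between the two tables, which the schema output does not state explicitly. The helper names (add_prompt, add_score, conversation_with_scores) and all inserted values are hypothetical.

```python
import sqlite3
import uuid

# Stand-in for the real store: an in-memory SQLite database holding a
# trimmed-down copy of two tables from the schema listing above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PromptMemoryEntries (
        id TEXT PRIMARY KEY,
        role TEXT,
        conversation_id TEXT,
        sequence INTEGER,
        converted_value TEXT
    );
    CREATE TABLE ScoreEntries (
        id TEXT PRIMARY KEY,
        score_value TEXT,
        score_type TEXT,
        prompt_request_response_id TEXT
    );
    """
)


def add_prompt(conversation_id, sequence, role, text):
    """Insert one conversation turn and return its generated id (hypothetical helper)."""
    entry_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO PromptMemoryEntries VALUES (?, ?, ?, ?, ?)",
        (entry_id, role, conversation_id, sequence, text),
    )
    return entry_id


def add_score(prompt_id, score_value, score_type):
    """Attach a score to a prompt entry (assumed link via prompt_request_response_id)."""
    conn.execute(
        "INSERT INTO ScoreEntries VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), score_value, score_type, prompt_id),
    )


def conversation_with_scores(conversation_id):
    """Return the turns of one conversation in order, with a score where one exists."""
    return conn.execute(
        """
        SELECT p.sequence, p.role, p.converted_value, s.score_value
        FROM PromptMemoryEntries AS p
        LEFT JOIN ScoreEntries AS s
            ON s.prompt_request_response_id = p.id
        WHERE p.conversation_id = ?
        ORDER BY p.sequence
        """,
        (conversation_id,),
    ).fetchall()


# Illustrative data only: one request, one response, one score on the response.
conversation_id = str(uuid.uuid4())
add_prompt(conversation_id, 0, "user", "example request")
response_id = add_prompt(conversation_id, 1, "assistant", "example response")
add_score(response_id, "False", "example_type")

rows = conversation_with_scores(conversation_id)
for row in rows:
    print(row)
```

The LEFT JOIN keeps turns that have no score, so the request still appears with an empty score column. Keeping scores in their own table rather than as columns on the prompt entry is what lets several scorers record results against the same response.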