9. Exporting Data Example#
This notebook shows different ways to export data from memory. This first example exports all conversations from local DuckDB memory with their respective score values in a JSON format. The data can currently be exported both as JSON file or a CSV file that will be saved in your results folder within PyRIT. The CSV export is commented out below. In this example, all conversations are exported, but by using other export functions from memory_interface
, we can export by specific labels and other methods.
from uuid import uuid4
from pyrit.common import DUCK_DB, initialize_pyrit
from pyrit.common.path import DB_DATA_PATH
from pyrit.memory import CentralMemory
from pyrit.models import PromptRequestPiece, PromptRequestResponse
initialize_pyrit(memory_db_type=DUCK_DB)
conversation_id = str(uuid4())
print(conversation_id)
message_list = [
PromptRequestPiece(
role="user", original_value="Hi, chat bot! This is my initial prompt.", conversation_id=conversation_id
),
PromptRequestPiece(
role="assistant", original_value="Nice to meet you! This is my response.", conversation_id=conversation_id
),
PromptRequestPiece(
role="user",
original_value="Wonderful! This is my second prompt to the chat bot!",
conversation_id=conversation_id,
),
]
duckdb_memory = CentralMemory.get_memory_instance()
duckdb_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[0]]))
duckdb_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[1]]))
duckdb_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[2]]))
entries = duckdb_memory.get_conversation(conversation_id=conversation_id)
for entry in entries:
print(entry)
# Define file path for export
json_file_path = DB_DATA_PATH / "conversation_and_scores_json_example.json"
# csv_file_path = DB_DATA_PATH / "conversation_and_scores_csv_example.csv"
# Export the data to a JSON file
conversation_with_scores = duckdb_memory.export_conversations(file_path=json_file_path, export_type="json")
print(f"Exported conversation with scores to JSON: {json_file_path}")
# Export the data to a CSV file
# conversation_with_scores = duckdb_memory.export_conversations(file_path=csv_file_path, export_type="csv")
# print(f"Exported conversation with scores to CSV: {csv_file_path}")
# Cleanup memory resources
duckdb_memory.dispose_engine()
f8ceb37b-ebd5-4efd-83b8-f9e579801db7
None: user: Hi, chat bot! This is my initial prompt.
None: assistant: Nice to meet you! This is my response.
None: user: Wonderful! This is my second prompt to the chat bot!
Exported conversation with scores to JSON: C:\Users\nichikan\source\repos\PyRIT-internal\PyRIT\dbdata\conversation_and_scores_json_example.json
You can also use the exported JSON or CSV files to import the data as a NumPy DataFrame. This can be useful for various data manipulation and analysis tasks.
import pandas as pd # type: ignore
df = pd.read_json(json_file_path)
df.head(1)
id | role | conversation_id | sequence | timestamp | labels | prompt_metadata | converter_identifiers | prompt_target_identifier | orchestrator_identifier | ... | original_value_data_type | original_value | original_value_sha256 | converted_value_data_type | converted_value | converted_value_sha256 | response_error | originator | original_prompt_id | scores | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 598e333a-ff30-403c-bda7-53b18e06e173 | user | f8ceb37b-ebd5-4efd-83b8-f9e579801db7 | 0 | 2025-01-07 15:11:26.206293 | NaN | NaN | NaN | NaN | NaN | ... | text | Hi, chat bot! This is my initial prompt. | NaN | text | Hi, chat bot! This is my initial prompt. | NaN | none | undefined | 598e333a-ff30-403c-bda7-53b18e06e173 | [] |
1 rows × 21 columns
Next, we can export data from our Azure SQL database. In this example, we export the data by conversation_id
and to a CSV file.
from pyrit.common import AZURE_SQL
initialize_pyrit(memory_db_type=AZURE_SQL)
azure_memory = CentralMemory.get_memory_instance()
conversation_id = str(uuid4())
message_list = [
PromptRequestPiece(
role="user", original_value="Hi, chat bot! This is my initial prompt.", conversation_id=conversation_id
),
PromptRequestPiece(
role="assistant", original_value="Nice to meet you! This is my response.", conversation_id=conversation_id
),
PromptRequestPiece(
role="user",
original_value="Wonderful! This is my second prompt to the chat bot!",
conversation_id=conversation_id,
),
]
azure_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[0]]))
azure_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[1]]))
azure_memory.add_request_response_to_memory(request=PromptRequestResponse([message_list[2]]))
entries = azure_memory.get_conversation(conversation_id=conversation_id)
for entry in entries:
print(entry)
# Define file path for export
# json_file_path = DB_DATA_PATH / "conversation_and_scores_json_example.json"
csv_file_path = DB_DATA_PATH / "conversation_and_scores_csv_example.csv"
# Export the data to a JSON file
# conversation_with_scores = azure_memory.export_conversations(conversation_id=conversation_id, file_path=json_file_path, export_type="json")
# print(f"Exported conversation with scores to JSON: {json_file_path}")
# Export the data to a CSV file
conversation_with_scores = azure_memory.export_conversations(
conversation_id=conversation_id, file_path=json_file_path, export_type="csv"
)
print(f"Exported conversation with scores to CSV: {csv_file_path}")
# Cleanup memory resources
azure_memory.dispose_engine()
None: user: Hi, chat bot! This is my initial prompt.
None: assistant: Nice to meet you! This is my response.
None: user: Wonderful! This is my second prompt to the chat bot!
Exported conversation with scores to CSV: C:\Users\nichikan\source\repos\PyRIT-internal\PyRIT\dbdata\conversation_and_scores_csv_example.csv