API Reference
pyrit.analytics
Handles analytics operations on conversation data, such as finding similar chat messages based on conversation history or embedding similarity. |
pyrit.auth
Azure CLI Authentication. |
|
A utility class for Azure Storage authentication, providing methods to generate SAS tokens using user delegation keys. |
pyrit.auxiliary_attacks
pyrit.chat_message_normalizer
This class enables you to apply the chat template stored in a Hugging Face tokenizer to a list of chat messages. |
pyrit.common
pyrit.datasets
Fetch DecodingTrust examples and create a SeedPromptDataset. |
|
Fetch examples from a specified source with caching support. |
|
Fetch HarmBench examples and create a SeedPromptDataset. |
|
Fetch many-shot jailbreaking examples from a specified source. |
|
Fetch SecLists AI LLM Bias Testing examples from a specified source and create a SeedPromptDataset. |
|
Fetch XSTest examples and create a SeedPromptDataset. |
|
Fetch PKU-SafeRLHF examples and create a SeedPromptDataset. |
|
Fetch AdvBench examples and create a SeedPromptDataset. |
|
Fetch WMDP examples and create a QuestionAnsweringDataset. |
|
Fetch the Forbidden Questions dataset and return it as a SeedPromptDataset. |
|
Fetch TDC23-RedTeaming examples and create a SeedPromptDataset. |
pyrit.embedding
pyrit.exceptions
Exception class for bad client requests. |
|
Exception class for empty response errors. |
|
Exception class for blocked content errors. |
|
Exception class for missing prompt placeholder errors. |
|
A decorator to apply retry logic with exponential backoff to a function. |
|
A decorator to apply retry logic with exponential backoff to a function. |
|
A decorator to apply retry logic. |
|
Exception class for authentication errors. |
|
Checks if the response message is in JSON format and removes Markdown formatting if present. |
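The retry decorators listed above are described only as applying "retry logic with exponential backoff". As a rough illustration of that general technique (not PyRIT's implementation), the sketch below uses a hypothetical decorator named `retry_with_backoff`. The parameter names, defaults, and the injectable `sleep` argument are all assumptions made for this example.

```python
import functools
import time


def retry_with_backoff(max_attempts=3, base_delay=0.1, retry_on=(Exception,), sleep=time.sleep):
    """Hypothetical helper: retry a function, doubling the wait after each failure."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    # Out of attempts: let the last exception propagate.
                    if attempt == max_attempts - 1:
                        raise
                    # Wait base_delay, then 2x, 4x, ... before the next try.
                    sleep(base_delay * 2 ** attempt)

        return wrapper

    return decorator
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets a caller substitute a stub in tests instead of actually waiting.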
pyrit.memory
A class to manage conversation memory using Azure SQL Server as the backend database. |
|
Provides a centralized memory instance across the framework. |
|
A class to manage conversation memory using DuckDB as the backend database. |
|
Represents the embedding data associated with conversation entries in the database. |
|
Represents a conversation memory that stores chat messages. |
|
The MemoryEmbedding class is responsible for encoding the memory embeddings. |
|
Handles the export of data to various formats, currently supporting only JSON format. |
|
Represents the prompt data. |
pyrit.models
Built-in mutable sequence. |
|
Implementation of StorageIO for Azure Blob Storage. |
|
Represents a dataset of chat messages. |
|
alias of |
|
Constructs a response entry from a request. |
|
Abstract base class for data type normalizers. |
|
Implementation of StorageIO for local disk storage. |
|
Groups prompt request pieces from the same conversation into PromptRequestResponses. |
|
Represents a prompt request piece. |
|
alias of |
|
alias of |
|
Represents a response to a prompt request. |
|
Represents a dataset for question answering. |
|
Represents a question model. |
|
Represents a choice for a question. |
|
alias of |
|
Represents a seed prompt with various attributes and metadata. |
|
SeedPromptDataset class helps to read a list of seed prompts, which can be loaded from a YAML file. |
|
A group of prompts that need to be sent together. |
|
Abstract interface for storage systems (local disk, Azure Storage Account, etc.). |
|
Score is an object that validates all the fields. |
pyrit.orchestrator
The CrescendoOrchestrator class represents an orchestrator that executes the Crescendo attack. |
|
This orchestrator implements the Flip Attack method found here: https://arxiv.org/html/2410.02832v1. |
|
The result of a multi-turn attack. |
|
The MultiTurnOrchestrator is an interface that coordinates attacks and conversations between an adversarial_chat target and an objective_target. |
|
This orchestrator implements the Prompt Automatic Iterative Refinement (PAIR) algorithm. |
|
This orchestrator takes a set of prompts, converts them using the list of PromptConverters, sends them to a target, and scores the responses with scorers (if provided). |
|
The RedTeamingOrchestrator class orchestrates a multi-turn red teaming attack on a target system. |
|
This orchestrator scores prompts in a parallelizable and convenient way. |
|
Creates an orchestrator that executes a skeleton key jailbreak. |
|
pyrit.prompt_converter
Adds a string to an image and wraps the text into multiple lines if necessary. |
|
Adds a string to an image and wraps the text into multiple lines if necessary. |
|
Converts a string to ASCII art. |
|
Converter to encode a prompt using the Atbash cipher. |
|
The AudioFrequencyConverter takes an audio file and shifts its frequency; by default, it shifts the frequency above the range of human hearing (20 kHz). |
|
The AzureSpeechAudioTextConverter takes a .wav file and transcribes it into text. |
|
The AzureSpeechTextToAudio takes a prompt and generates a wave file. |
|
Converter to encode a prompt using the Caesar cipher. |
|
The CodeChameleon Converter uses a combination of personal encryption and decryption functions, code nesting, as well as a set of instructions for the response to bypass LLM safeguards. |
|
Fuzzer converter that uses multiple prompt templates to generate new prompts. |
|
Allows review of each prompt sent to a target before sending it. |
|
Converts a string to a leetspeak version. |
|
A PromptConverter that generates malicious questions using an LLM via an existing PromptTarget (like Azure OpenAI). |
|
A PromptConverter that converts natural language instructions into symbolic mathematics problems using an LLM via an existing PromptTarget (like Azure OpenAI or other supported backends). |
|
Converter to encode prompts using Morse code. |
|
Converter to rephrase prompts using a variety of persuasion techniques. |
|
A prompt converter is responsible for converting prompts into a different representation. |
|
Converts a text string to a QR code image. |
|
This converter takes a prompt and randomly capitalizes it by a percentage of the total characters. |
|
Repeat a specified token a specified number of times in addition to a given prompt. |
|
Converts a string by replacing a chosen phrase with a new phrase of choice. |
|
A PromptConverter that applies substitutions to words in the prompt to test adversarial textual robustness by replacing characters with visually similar ones. |
|
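Several converters above are described as encoding a prompt with a classical cipher. To make the Atbash and Caesar descriptions concrete, here is a minimal standalone sketch of the two ciphers. The function names `atbash` and `caesar` are hypothetical, only ASCII letters are transformed, and nothing here is taken from PyRIT's converter classes, which may treat digits or wrap the output differently.

```python
import string

LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase


def atbash(text):
    """Map each letter to its mirror in the alphabet (a<->z, b<->y, ...)."""
    table = str.maketrans(LOWER + UPPER, LOWER[::-1] + UPPER[::-1])
    return text.translate(table)


def caesar(text, offset):
    """Shift each letter forward by `offset` positions, wrapping around at z."""
    k = offset % 26
    shifted = LOWER[k:] + LOWER[:k] + UPPER[k:] + UPPER[:k]
    return text.translate(str.maketrans(LOWER + UPPER, shifted))


print(atbash("Hello"))       # Svool
print(caesar("abc xyz", 3))  # def abc
```

Atbash is its own inverse, so applying `atbash` twice returns the original text; a Caesar shift is undone by calling `caesar` with the negated offset.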
pyrit.prompt_normalizer
Represents the configuration for a prompt response converter. |
|
pyrit.prompt_target
The AzureBlobStorageTarget takes prompts, saves the prompts to a file, and stores them as a blob in a provided storage account container. |
|
HTTP_Target is for endpoints that do not have an API and instead require HTTP request(s) to send a prompt. |
|
The HuggingFaceChatTarget interacts with HuggingFace models, specifically for conducting red teaming activities. |
|
The HuggingFaceEndpointTarget interacts with HuggingFace models hosted on cloud endpoints. |
|
A decorator to enforce rate limit of the target through setting requests per minute. |
|
The Dalle3Target takes a prompt and generates images. This class initializes a DALL-E image target. |
|
This class facilitates multimodal (image and text) input and text output generation. |
|
PromptShield is an endpoint which detects the presence of a jailbreak. |
|
The TextTarget takes prompts, adds them to memory, and writes them to io, which is sys.stdout by default. |
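One entry above describes a decorator that enforces a rate limit by setting requests per minute. The sketch below shows one simple way such a limit can work in general: spacing calls at least `60 / rpm` seconds apart. The name `throttle`, the synchronous style, and the injectable `clock` and `sleep` arguments are assumptions for illustration and are not drawn from PyRIT's source.

```python
import functools
import time


def throttle(rpm, clock=time.monotonic, sleep=time.sleep):
    """Hypothetical decorator: allow at most `rpm` calls per minute by spacing them out."""
    min_interval = 60.0 / rpm

    def decorator(func):
        last_call = None  # time of the previous call, if any

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal last_call
            if last_call is not None:
                # Sleep for whatever remains of the minimum interval.
                remaining = min_interval - (clock() - last_call)
                if remaining > 0:
                    sleep(remaining)
            last_call = clock()
            return func(*args, **kwargs)

        return wrapper

    return decorator
```

With `rpm=30`, for example, consecutive calls are held at least two seconds apart. The first call is never delayed.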
pyrit.score
A scorer that applies a threshold to a float scale score to make it a true/false score. |
|
Creates scores from manual human input and adds them to the database. |
|
Returns true if an attack or jailbreak has been detected by Prompt Shield. |
|
Abstract base class for scorers. |
|
A class that represents a self-ask score for text classification and scoring. |
|
A class that represents a "self-ask" score for text scoring on a Likert scale. |
|
A self-ask scorer that detects a refusal. |
|
A class that represents a "self-ask" score for text scoring for a customizable numeric scale. |
|
A class that represents a self-ask true/false scorer. |
|
Scorer that checks if a given substring is present in the text. |
|
A scorer that inverts a true/false score. |
|
A class that represents a true/false question. |
|
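The first scorer in this module is described as applying a threshold to a float scale score to produce a true/false score. The idea can be sketched in a few lines, assuming scores lie in the range 0.0 to 1.0 and that a score equal to the threshold counts as true. Both are assumptions of this example, and `apply_threshold` is a hypothetical helper rather than a PyRIT function.

```python
def apply_threshold(score, threshold=0.5):
    """Turn a float-scale score in [0.0, 1.0] into a true/false result."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    # Assumption: the boundary is inclusive, so score == threshold yields True.
    return score >= threshold


print(apply_threshold(0.8))                 # True
print(apply_threshold(0.3))                 # False
print(apply_threshold(0.7, threshold=0.9))  # False
```

Raising the threshold makes the resulting true/false score stricter, which is useful when only high-confidence float scores should count as a positive result.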