API Reference#

pyrit.analytics#

ConversationAnalytics

Handles analytics operations on conversation data, such as finding similar chat messages based on conversation history or embedding similarity.

pyrit.auth#

This module contains authentication functionality for a variety of services.

Authenticator

Abstract base class for authenticators.

AzureAuth

Azure CLI Authentication.

AzureStorageAuth

A utility class for Azure Storage authentication, providing methods to generate SAS tokens using user delegation keys.

pyrit.auxiliary_attacks#

pyrit.chat_message_normalizer#

This module contains the functionality to normalize chat messages into compatible formats for targets.

ChatMessageNormalizer

ChatMessageNop

A no-op chat message normalizer that does not modify the input messages.

GenericSystemSquash

ChatMessageNormalizerChatML

A chat message normalizer that converts a list of chat messages to a ChatML string.

ChatMessageNormalizerTokenizerTemplate

This class enables you to apply the chat template stored in a Hugging Face tokenizer to a list of chat messages.

pyrit.cli#

This module provides command-line interface functionalities for PyRIT. The CLI module is currently experimental.

pyrit.common#

This module contains common utilities for PyRIT.

combine_dict

Combines two dictionaries containing string keys and values into one.

combine_list

Combines two lists containing string keys, keeping only unique values.

convert_local_image_to_data_url

Converts a local image file to a data URL encoded in base64.

display_image_response

Displays response images if running in notebook environment.

download_chunk

Download a chunk of the file with a specified byte range.

download_file

Download a file in multiple segments (splits) using byte-range requests.

download_files

Download multiple files with parallel downloads and segmented downloading.

download_specific_files

Downloads specific files from a Hugging Face model repository.

get_available_files

Fetches available files for a model from the Hugging Face repository.

get_httpx_client

Get the httpx client for making requests.

get_non_required_value

Gets a non-required value from an environment variable or a passed value, prefering the passed value.

get_random_indices

Generate a list of random indices based on the specified proportion of a given size.

get_required_value

Gets a required value from an environment variable or a passed value, prefering the passed value

initialize_pyrit

Initializes PyRIT with the provided memory instance and loads environment files.

is_in_ipython_session

Determines if the code is running in an IPython session.

make_request_and_raise_if_error_async

Make a request and raise an exception if it fails.

print_chat_messages_with_color

Print chat messages with color to console.

Singleton

A metaclass for creating singleton classes.

YamlLoadable

Abstract base class for objects that can be loaded from YAML files.

pyrit.datasets#

fetch_adv_bench_dataset

Retrieve AdvBench examples enhanced with categories from a collaborative and human-centered harms taxonomy.

fetch_aya_redteaming_dataset

Fetch examples from the Aya Red-teaming dataset with optional filtering and create a SeedPromptDataset.

fetch_babelscape_alert_dataset

Fetch the Babelscape/ALERT dataset and create a SeedPromptDataset.

fetch_ccp_sensitive_prompts_dataset

Fetch CCP-sensitive-prompts examples and create a SeedPromptDataset.

fetch_darkbench_dataset

Fetch DarkBench examples and create a SeedPromptDataset.

fetch_decoding_trust_stereotypes_dataset

Fetch DecodingTrust Stereotypes examples and create a SeedPromptDataset.

fetch_equitymedqa_dataset_unique_values

Fetches the EquityMedQA dataset from Hugging Face and returns a SeedPromptDataset.

fetch_examples

Fetch examples from a specified source with caching support.

fetch_forbidden_questions_dataset

Fetch Forbidden question dataset and return it as a SeedPromptDataset

fetch_harmbench_dataset

Fetch HarmBench examples and create a SeedPromptDataset.

fetch_librAI_do_not_answer_dataset

Fetch the LibrAI 'Do Not Answer' dataset and return it as a SeedPromptDataset.

fetch_llm_latent_adversarial_training_harmful_dataset

fetch_many_shot_jailbreaking_dataset

Fetch many-shot jailbreaking dataset from a specified source.

fetch_mlcommons_ailuminate_demo_dataset

Fetch examples from AILuminate v1.0 DEMO Prompt Set and create a SeedPromptDataset.

fetch_multilingual_vulnerability_dataset

Fetch multilingual vulnerability examples from "A Framework to Assess Multilingual Vulnerabilities of LLMs" and create a SeedPromptDataset.

fetch_pku_safe_rlhf_dataset

Fetch PKU-SafeRLHF examples and create a SeedPromptDataset.

fetch_seclists_bias_testing_dataset

Fetch SecLists AI LLM Bias Testing examples from a specified source and create a SeedPromptDataset.

fetch_sosbench_dataset

Fetch SOSBench dataset and create a SeedPromptDataset.

fetch_tdc23_redteaming_dataset

Fetch TDC23-RedTeaming examples and create a SeedPromptDataset.

fetch_wmdp_dataset

Fetch WMDP examples and create a QuestionAnsweringDataset.

fetch_xstest_dataset

Fetch XSTest examples and create a SeedPromptDataset.

pyrit.embedding#

pyrit.exceptions#

BadRequestException

Exception class for bad client requests.

EmptyResponseException

Exception class for empty response errors.

handle_bad_request_exception

InvalidJsonException

Exception class for blocked content errors.

MissingPromptPlaceholderException

Exception class for missing prompt placeholder errors.

PyritException

pyrit_json_retry

A decorator to apply retry logic with exponential backoff to a function.

pyrit_target_retry

A decorator to apply retry logic with exponential backoff to a function.

pyrit_placeholder_retry

A decorator to apply retry logic.

RateLimitException

Exception class for authentication errors.

remove_markdown_json

Checks if the response message is in JSON format and removes Markdown formatting if present.

pyrit.memory#

AzureSQLMemory

A class to manage conversation memory using Azure SQL Server as the backend database.

CentralMemory

Provides a centralized memory instance across the framework.

DuckDBMemory

A class to manage conversation memory using DuckDB as the backend database.

EmbeddingDataEntry

Represents the embedding data associated with conversation entries in the database.

MemoryInterface

Abstract interface for conversation memory storage systems.

MemoryEmbedding

The MemoryEmbedding class is responsible for encoding the memory embeddings.

MemoryExporter

Handles the export of data to various formats, currently supporting only JSON format.

PromptMemoryEntry

Represents the prompt data.

pyrit.models#

ALLOWED_CHAT_MESSAGE_ROLES

Built-in mutable sequence.

AudioPathDataTypeSerializer

AzureBlobStorageIO

Implementation of StorageIO for Azure Blob Storage.

ChatMessage

ChatMessagesDataset

Represents a dataset of chat messages.

ChatMessageRole

alias of Literal['system', 'user', 'assistant']

ChatMessageListDictContent

construct_response_from_request

Constructs a response entry from a request.

DataTypeSerializer

Abstract base class for data type normalizers.

data_serializer_factory

Factory method to create a DataTypeSerializer instance.

DiskStorageIO

Implementation of StorageIO for local disk storage.

EmbeddingData

EmbeddingResponse

EmbeddingSupport

EmbeddingUsageInformation

ErrorDataTypeSerializer

group_conversation_request_pieces_by_sequence

Groups prompt request pieces from the same conversation into PromptRequestResponses.

Identifier

ImagePathDataTypeSerializer

PromptRequestPiece

Represents a piece of a prompt request to a target.

PromptResponse

PromptResponseError

alias of Literal['blocked', 'none', 'processing', 'empty', 'unknown']

PromptDataType

alias of Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error']

PromptRequestResponse

Represents a response to a prompt request.

QuestionAnsweringDataset

Represents a dataset for question answering.

QuestionAnsweringEntry

Represents a question model.

QuestionChoice

Represents a choice for a question.

Score

ScoreType

alias of Literal['true_false', 'float_scale']

SeedPrompt

Represents a seed prompt with various attributes and metadata.

SeedPromptDataset

SeedPromptDataset manages seed prompts plus optional top-level defaults.

SeedPromptGroup

A group of prompts that need to be sent together, along with an objective.

StorageIO

Abstract interface for storage systems (local disk, Azure Storage Account, etc.).

TextDataTypeSerializer

UnvalidatedScore

Score is an object that validates all the fields.

pyrit.orchestrator#

CrescendoOrchestrator

The CrescendoOrchestrator class represents an orchestrator that executes the Crescendo attack.

FlipAttackOrchestrator

This orchestrator implements the Flip Attack method found here: https://arxiv.org/html/2410.02832v1.

FuzzerOrchestrator

An orchestrator that explores a variety of jailbreak options via fuzzing.

MultiTurnOrchestrator

The MultiTurnOrchestrator is an interface that coordinates attacks and conversations between a adversarial_chat target and an objective_target.

Orchestrator

OrchestratorResult

The result of an orchestrator.

PAIROrchestrator

This orchestrator implements the Prompt Automatic Iterative Refinement (PAIR) algorithm

PromptSendingOrchestrator

QuestionAnsweringBenchmarkOrchestrator

Question Answering Benchmark Orchestrator for processing multiple choice questions.

RedTeamingOrchestrator

ScoringOrchestrator

This orchestrator scores prompts in a parallelizable and convenient way.

SkeletonKeyOrchestrator

Creates an orchestrator that executes a skeleton key jailbreak.

TreeOfAttacksWithPruningOrchestrator

TreeOfAttacksWithPruningOrchestrator follows the TAP alogrithm to attack a chat target.

XPIAManualProcessingOrchestrator

XPIAOrchestrator

XPIATestOrchestrator

pyrit.prompt_converter#

AddImageTextConverter

Adds a string to an image and wraps the text into multiple lines if necessary.

AddImageVideoConverter

Adds an image to a video at a specified position.

AddTextImageConverter

Adds a string to an image and wraps the text into multiple lines if necessary.

AnsiAttackConverter

Generates prompts with ANSI codes to evaluate LLM behavior and system risks.

AsciiArtConverter

Uses the art package to convert text into ASCII art.

AsciiSmugglerConverter

Implements encoding and decoding using Unicode Tags.

AtbashConverter

Encodes text using the Atbash cipher.

AudioFrequencyConverter

Shifts the frequency of an audio file by a specified value.

AzureSpeechAudioToTextConverter

Transcribes a .wav audio file into text using Azure AI Speech service.

AzureSpeechTextToAudioConverter

Generates a wave file from a text prompt using Azure AI Speech service.

Base64Converter

Converter that encodes text to base64 format.

BinaryConverter

Transforms input text into its binary representation with configurable bits per character (8, 16, or 32).

CaesarConverter

Encodes text using the Caesar cipher with a specified offset.

CharacterSpaceConverter

Spaces out the input prompt and removes specified punctuations.

CharSwapConverter

Applies character swapping to words in the prompt to test adversarial textual robustness.

CodeChameleonConverter

Encrypts user prompt, adds stringified decrypt function in markdown and instructions.

ColloquialWordswapConverter

Converts text into colloquial Singaporean context.

ConverterResult

The result of a prompt conversion, containing the converted output and its type.

DenylistConverter

Replaces forbidden words or phrases in a prompt with synonyms using an LLM.

DiacriticConverter

Applies diacritics to specified characters in a string.

EmojiConverter

Converts English text to randomly chosen circle or square character emojis.

FlipConverter

Flips the input text prompt.

FuzzerCrossOverConverter

Uses multiple prompt templates to generate new prompts.

FuzzerExpandConverter

Generates versions of a prompt with new, prepended sentences.

FuzzerRephraseConverter

Generates versions of a prompt with rephrased sentences.

FuzzerShortenConverter

Generates versions of a prompt with shortened sentences.

FuzzerSimilarConverter

Generates versions of a prompt with similar sentences.

HumanInTheLoopConverter

Allows review of each prompt sent to a target before sending it.

InsertPunctuationConverter

Inserts punctuation into a prompt to test robustness.

LeetspeakConverter

Converts a string to a leetspeak version.

LLMGenericTextConverter

Represents a generic LLM converter that expects text to be transformed (e.g. no JSON parsing or format).

MaliciousQuestionGeneratorConverter

Generates malicious questions using an LLM.

MathPromptConverter

Converts natural language instructions into symbolic mathematics problems using an LLM.

MorseConverter

Encodes prompts using morse code.

NoiseConverter

Injects noise errors into a conversation using an LLM.

PDFConverter

Converts a text prompt into a PDF file.

PersuasionConverter

Rephrases prompts using a variety of persuasion techniques.

PromptConverter

Base class for converters that transform prompts into a different representation or format.

QRCodeConverter

Converts a text string to a QR code image.

RandomCapitalLettersConverter

Takes a prompt and randomly capitalizes it by a percentage of the total characters.

RepeatTokenConverter

Repeats a specified token a specified number of times in addition to a given prompt.

ROT13Converter

Encodes prompts using the ROT13 cipher.

SearchReplaceConverter

Converts a string by replacing chosen phrase with a new phrase of choice.

SneakyBitsSmugglerConverter

Encodes and decodes text using a bit-level approach.

StringJoinConverter

Converts text by joining its characters with the specified join value.

SuffixAppendConverter

Appends a specified suffix to the prompt.

SuperscriptConverter

Converts text to superscript.

TemplateSegmentConverter

Uses a template to randomly split a prompt into segments defined by the template.

TenseConverter

Converts a conversation to a different tense using an LLM.

TextJailbreakConverter

Uses a jailbreak template to create a prompt.

TextToHexConverter

Converts text to a hexadecimal encoded utf-8 string.

ToneConverter

Converts a conversation to a different tone using an LLM.

ToxicSentenceGeneratorConverter

Generates toxic sentence starters using an LLM.

TranslationConverter

Translates prompts into different languages using an LLM.

UnicodeConfusableConverter

Applies substitutions to words in the prompt to test adversarial textual robustness by replacing characters with visually similar ones.

UnicodeReplacementConverter

Converts a prompt to its unicode representation.

UnicodeSubstitutionConverter

Encodes the prompt using any unicode starting point.

UrlConverter

Converts a prompt to a URL-encoded string.

VariationConverter

Generates variations of the input prompts using the converter target.

VariationSelectorSmugglerConverter

Encodes and decodes text using Unicode Variation Selectors.

ZalgoConverter

Converts text into cursed Zalgo text using combining Unicode marks.

ZeroWidthConverter

Injects zero-width spaces between characters in the provided text to bypass content safety mechanisms.

pyrit.prompt_normalizer#

PromptNormalizer

PromptConverterConfiguration

Represents the configuration for a prompt response converter.

NormalizerRequest

Represents a single request sent to normalizer.

pyrit.prompt_target#

AzureBlobStorageTarget

The AzureBlobStorageTarget takes prompts, saves the prompts to a file, and stores them as a blob in a provided storage account container.

AzureMLChatTarget

CrucibleTarget

GandalfLevel

GandalfTarget

HTTPTarget

HTTP_Target is for endpoints that do not have an API and instead require HTTP request(s) to send a prompt

HuggingFaceChatTarget

The HuggingFaceChatTarget interacts with HuggingFace models, specifically for conducting red teaming activities.

HuggingFaceEndpointTarget

The HuggingFaceEndpointTarget interacts with HuggingFace models hosted on cloud endpoints.

limit_requests_per_minute

A decorator to enforce rate limit of the target through setting requests per minute.

OpenAICompletionTarget

OpenAIDALLETarget

OpenAI DALL-E Target for generating images from text prompts.

OpenAIChatTarget

This class facilitates multimodal (image and text) input and text output generation

OpenAIResponseTarget

This class enables communication with endpoints that support the OpenAI Response API.

OpenAITTSTarget

OpenAITarget

PromptChatTarget

A prompt chat target is a target where you can explicitly set the conversation history using memory.

PromptShieldTarget

PromptShield is an endpoint which detects the presence of a jailbreak.

PromptTarget

TextTarget

The TextTarget takes prompts, adds them to memory and writes them to io which is sys.stdout by default

pyrit.score#

AzureContentFilterScorer

ContentClassifierPaths

CompositeScorer

A scorer that aggregates other true_false scorers using a specified aggregation function.

FloatScaleThresholdScorer

A scorer that applies a threshold to a float scale score to make it a true/false score.

GandalfScorer

HumanInTheLoopScorer

Create scores from manual human input and adds them to the database.

HumanInTheLoopScorerGradio

Create scores from manual human input using Gradio and adds them to the database.

LikertScalePaths

LookBackScorer

Create a score from analyzing the entire conversation and adds them to the database.

MarkdownInjectionScorer

PromptShieldScorer

Returns true if an attack or jailbreak has been detected by Prompt Shield.

QuestionAnswerScorer

A class that represents a question answering scorer.

Scorer

Abstract base class for scorers.

ScoreAggregator

alias of Callable[[Iterable[Score]], ScoreAggregatorResult]

SelfAskCategoryScorer

A class that represents a self-ask score for text classification and scoring.

SelfAskLikertScorer

A class that represents a "self-ask" score for text scoring for a likert scale.

SelfAskRefusalScorer

A self-ask scorer that detects refusal in AI responses.

SelfAskScaleScorer

A class that represents a "self-ask" score for text scoring for a customizable numeric scale.

SelfAskTrueFalseScorer

A class that represents a self-ask true/false for scoring.

SelfAskQuestionAnswerScorer

A class that represents a self-ask question answering scorer.

SubStringScorer

Scorer that checks if a given substring is present in the text.

TrueFalseInverterScorer

A scorer that inverts a true false score.

TrueFalseQuestion

A class that represents a true/false question.

TrueFalseQuestionPaths