pyrit.orchestrator.ContextComplianceOrchestrator

class ContextComplianceOrchestrator(**kwargs)

Bases: PromptSendingOrchestrator

Warning

ContextComplianceOrchestrator is deprecated and will be removed in v0.12.0; use pyrit.executor.attack.ContextComplianceAttack instead.

This orchestrator implements a context compliance attack that attempts to bypass safety measures by rephrasing the objective into a more benign context. It uses an adversarial chat target to:

1. Rephrase the objective as a more benign question
2. Generate a response to the benign question
3. Rephrase the original objective as a follow-up question

This creates a context that makes it harder for the target to detect the true intent.
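The three steps above can be sketched without PyRIT as follows. This is only an illustration under stated assumptions: `Turn`, `build_context`, and `fake_adversarial` are hypothetical names, the prompt wording is invented, and the way the pieces are arranged into turns (answer and follow-up question in one assistant turn, followed by the affirmative response) is an assumption, not the library's confirmed behavior.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


def build_context(
    objective: str,
    adversarial: Callable[[str], str],
    affirmative_response: str = "yes.",
) -> list[Turn]:
    """Assemble a conversation from the three rephrasing steps (hypothetical layout)."""
    # Step 1: rephrase the objective as a more benign question.
    benign_question = adversarial(f"Rephrase as a benign question: {objective}")
    # Step 2: generate a response to the benign question.
    benign_answer = adversarial(f"Answer: {benign_question}")
    # Step 3: rephrase the original objective as a follow-up question.
    follow_up = adversarial(f"Rephrase as a follow-up question: {objective}")
    # Assumption: the answer and the follow-up question share one assistant
    # turn, and the affirmative response is the user's reply to it.
    return [
        Turn("user", benign_question),
        Turn("assistant", f"{benign_answer}\n\n{follow_up}"),
        Turn("user", affirmative_response),
    ]


def fake_adversarial(prompt: str) -> str:
    # Deterministic stand-in for an adversarial chat model.
    return f"<{prompt}>"


context = build_context("placeholder objective", fake_adversarial)
for turn in context:
    print(f"{turn.role}: {turn.content}")
```

In the real orchestrator, the adversarial_chat target plays the role of `fake_adversarial`, and the resulting context is what the objective_target sees.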

__init__(objective_target: PromptChatTarget, adversarial_chat: PromptChatTarget, affirmative_response: str = 'yes.', context_description_instructions_path: Path | None = None, request_converter_configurations: list[PromptConverterConfiguration] | None = None, response_converter_configurations: list[PromptConverterConfiguration] | None = None, objective_scorer: Scorer | None = None, auxiliary_scorers: list[Scorer] | None = None, batch_size: int = 10, retries_on_objective_failure: int = 0, verbose: bool = False) → None

Parameters:

objective_target (PromptChatTarget) – The target for sending prompts.
adversarial_chat (PromptChatTarget) – The target used to rephrase objectives into benign contexts.
affirmative_response (str, Optional) – The affirmative response to be used in the conversation history. Defaults to 'yes.'.
context_description_instructions_path (pathlib.Path, Optional) – Path to the context description instructions YAML file.
request_converter_configurations (list[PromptConverterConfiguration], Optional) – List of prompt converter configurations to apply to requests.
response_converter_configurations (list[PromptConverterConfiguration], Optional) – List of prompt converter configurations to apply to responses.
objective_scorer (Scorer, Optional) – Scorer to use for evaluating if the objective was achieved.
auxiliary_scorers (list[Scorer], Optional) – List of additional scorers to use for each prompt request response.
batch_size (int, Optional) – The maximum batch size for sending prompts. Defaults to 10. Note: If a maximum number of requests per minute is configured on the objective_target, set this to 1 to ensure proper rate limit management.
retries_on_objective_failure (int, Optional) – Number of retries to attempt if objective fails. Defaults to 0.
verbose (bool, Optional) – Whether to log debug information. Defaults to False.
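One way to read retries_on_objective_failure is that an attack is attempted once, then repeated up to that many additional times while the objective is not achieved. The sketch below illustrates that reading with the standard library only; `run_with_retries`, `flaky_attempt`, and `always_fail` are hypothetical helpers, and the exact retry behavior inside PyRIT is an assumption here.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_retries(
    attempt: Callable[[], Awaitable[bool]],
    retries_on_objective_failure: int = 0,
) -> tuple[bool, int]:
    """Return (objective_achieved, attempts_made)."""
    attempts = 0
    # One initial attempt plus the configured number of retries.
    for _ in range(1 + retries_on_objective_failure):
        attempts += 1
        if await attempt():
            return True, attempts
    return False, attempts


# Stand-in attempt that fails twice and then succeeds.
outcomes = iter([False, False, True])


async def flaky_attempt() -> bool:
    return next(outcomes)


# Stand-in attempt that never achieves the objective.
async def always_fail() -> bool:
    return False


with_retries = asyncio.run(run_with_retries(flaky_attempt, retries_on_objective_failure=2))
without_retries = asyncio.run(run_with_retries(always_fail))
print(with_retries, without_retries)
```

With the default of 0, a failed objective is reported after a single attempt.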

Methods

`__init__`(objective_target, adversarial_chat)
`dispose_db_engine`()	Dispose database engine to release database connections and resources.
`get_identifier`()
`get_memory`()	Retrieves the memory associated with this orchestrator.
`get_score_memory`()	Retrieves the scores of the PromptRequestPieces associated with this orchestrator.
`run_attack_async`(*, objective[, memory_labels])	Runs the attack.
`run_attacks_async`(*, objectives[, memory_labels])	Runs multiple attacks in parallel using batch_size.
`set_skip_criteria`(*, skip_criteria[, ...])	Sets the skip criteria for the orchestrator.

async run_attack_async(*, objective: str, memory_labels: dict[str, str] | None = None) → OrchestratorResult

Runs the attack.

Parameters:

objective (str) – The objective of the attack.
seed_prompt (SeedPromptGroup, Optional) – The seed prompt group to start the conversation. By default the objective is used. This parameter does not appear in the signature above.
prepended_conversation (list[PromptRequestResponse], Optional) – The conversation to prepend to the attack. Sent to the objective target. This parameter does not appear in the signature above.
memory_labels (dict[str, str], Optional) – The memory labels to use for the attack.

async run_attacks_async(*, objectives: list[str], memory_labels: dict[str, str] | None = None) → list[OrchestratorResult]

Runs multiple attacks in parallel using batch_size.

Parameters:

objectives (list[str]) – List of objectives for the attacks.
seed_prompts (list[SeedPromptGroup], Optional) – List of seed prompt groups to start the conversations. If not provided, each objective will be used as its own seed prompt. This parameter does not appear in the signature above.
prepended_conversation (list[PromptRequestResponse], Optional) – The conversation to prepend to each attack. This parameter does not appear in the signature above.
memory_labels (dict[str, str], Optional) – The memory labels to use for the attacks.

Returns:

List of results from each attack.

Return type:

list[OrchestratorResult]
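Running attacks "in parallel using batch_size" can be pictured as splitting the objectives into chunks of at most batch_size and awaiting each chunk concurrently. The following is a minimal standard-library sketch of that pattern, not PyRIT's implementation; `run_in_batches` and `fake_attack` are hypothetical names, and the string results stand in for OrchestratorResult objects.

```python
import asyncio
from typing import Awaitable, Callable


async def run_in_batches(
    objectives: list[str],
    run_one: Callable[[str], Awaitable[str]],
    batch_size: int = 10,
) -> list[str]:
    """Run one coroutine per objective, at most batch_size at a time."""
    results: list[str] = []
    for start in range(0, len(objectives), batch_size):
        batch = objectives[start:start + batch_size]
        # gather() runs the batch concurrently and keeps input order.
        results.extend(await asyncio.gather(*(run_one(o) for o in batch)))
    return results


async def fake_attack(objective: str) -> str:
    # Stand-in for a single attack; yields control once, then returns.
    await asyncio.sleep(0)
    return f"result for {objective}"


results = asyncio.run(run_in_batches(["a", "b", "c"], fake_attack, batch_size=2))
print(results)
```

Setting batch_size to 1 in this sketch makes the attacks run strictly one after another, which is why that value suits a rate-limited target.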
