pyrit.executor.attack.ContextComplianceAttack
- class ContextComplianceAttack(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None)
Bases: PromptSendingAttack
Implementation of the context compliance attack strategy.
This attack attempts to bypass safety measures by rephrasing the objective into a more benign context. It uses an adversarial chat target to:
1. Rephrase the objective as a more benign question
2. Generate a response to the benign question
3. Rephrase the original objective as a follow-up question
The resulting context makes it harder for the target to detect the true intent.
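The three steps above can be sketched without PyRIT itself. The block below is a minimal illustration, not the library's implementation: `Message`, `build_context_compliance_conversation`, `echo_adversarial_chat`, the instruction strings, and the exact turn layout (benign question, then answer plus follow-up offer, then the affirmative reply) are all assumptions made for the example. The adversarial chat target is replaced by a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    """Hypothetical stand-in for one conversation turn."""
    role: str  # "user" or "assistant"
    text: str


def build_context_compliance_conversation(
    objective: str,
    adversarial_chat: Callable[[str], str],
    affirmative_response: str = "yes.",
) -> List[Message]:
    """Assemble the seeded conversation from the three rephrasing steps.

    The instruction strings passed to `adversarial_chat` are placeholders;
    the real attack reads its instructions from a YAML file.
    """
    # Step 1: rephrase the objective as a more benign question.
    benign_question = adversarial_chat(f"Rephrase as a benign question: {objective}")
    # Step 2: generate a response to the benign question.
    benign_answer = adversarial_chat(f"Answer briefly: {benign_question}")
    # Step 3: rephrase the original objective as a follow-up question.
    follow_up = adversarial_chat(f"Rephrase as a follow-up question: {objective}")

    # Assumed layout: the assistant turn carries the answer and the follow-up
    # offer, and the final user turn simply accepts it.
    return [
        Message("user", benign_question),
        Message("assistant", f"{benign_answer}\n\n{follow_up}"),
        Message("user", affirmative_response),
    ]


def echo_adversarial_chat(prompt: str) -> str:
    """Stub adversarial target: returns the instruction name in brackets."""
    return "[" + prompt.split(":", 1)[0] + "]"


conversation = build_context_compliance_conversation(
    "example objective", echo_adversarial_chat
)
for turn in conversation:
    print(f"{turn.role}: {turn.text}")
```

Keeping the adversarial target behind a plain callable makes the turn structure easy to inspect in isolation; in real use that role is played by the chat target supplied through `attack_adversarial_config`.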
- __init__(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None) -> None
Initialize the context compliance attack strategy.
- Parameters:
objective_target (PromptChatTarget) – The target system to attack. Must be a PromptChatTarget.
attack_adversarial_config (AttackAdversarialConfig) – Configuration for the adversarial component, including the adversarial chat target used for rephrasing.
attack_converter_config (Optional[AttackConverterConfig]) – Configuration for attack converters, including request and response converters.
attack_scoring_config (Optional[AttackScoringConfig]) – Configuration for attack scoring.
prompt_normalizer (Optional[PromptNormalizer]) – The prompt normalizer to use for sending prompts.
max_attempts_on_failure (int) – Maximum number of retries after a failed attempt. Defaults to 0.
context_description_instructions_path (Optional[Path]) – Path to the context description instructions YAML file. If not provided, uses the default path.
affirmative_response (Optional[str]) – The affirmative response to be used in the conversation history. If not provided, defaults to "yes.".
- Raises:
ValueError – If the context description instructions file is invalid.
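Two of the parameters above have defaults whose effect is easy to misread, so a small sketch may help. It assumes that `max_attempts_on_failure=0` means a single attempt with no retries, and that an omitted `affirmative_response` falls back to "yes."; the helper names `resolve_affirmative_response` and `send_with_retries` are hypothetical and do not exist in PyRIT.

```python
from typing import Callable, Optional

# Default stated in the parameter description above.
DEFAULT_AFFIRMATIVE_RESPONSE = "yes."


def resolve_affirmative_response(value: Optional[str]) -> str:
    """Return the caller's value, or the documented default when omitted."""
    return value if value is not None else DEFAULT_AFFIRMATIVE_RESPONSE


def send_with_retries(
    send: Callable[[], Optional[str]],
    max_attempts_on_failure: int = 0,
) -> Optional[str]:
    """Call `send` once, plus up to `max_attempts_on_failure` retries.

    `send` returns None to signal failure. Assumption: the retry budget is
    counted on top of the first attempt, so 0 means exactly one call.
    """
    if max_attempts_on_failure < 0:
        raise ValueError("max_attempts_on_failure must be non-negative")
    for _ in range(max_attempts_on_failure + 1):
        response = send()
        if response is not None:
            return response
    return None
```

Under this reading, `max_attempts_on_failure=2` allows at most three calls in total.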
Methods
__init__(*, objective_target, ...[, ...]) – Initialize the context compliance attack strategy.
execute_async(**kwargs) – Execute the attack strategy asynchronously with the provided parameters.
execute_with_context_async(*, context) – Execute strategy with complete lifecycle management.
get_identifier()
Attributes