pyrit.executor.attack.ContextComplianceAttack

class ContextComplianceAttack(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None)

Bases: PromptSendingAttack

Implementation of the context compliance attack strategy.

This attack attempts to bypass safety measures by rephrasing the objective into a more benign context. It uses an adversarial chat target to:

  1. Rephrase the objective as a more benign question
  2. Generate a response to the benign question
  3. Rephrase the original objective as a follow-up question

This creates a context that makes it harder for the target to detect the true intent.
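The three steps above can be sketched without any PyRIT dependency. The block below is an illustration only, not the library's implementation: `Message`, `build_context_compliance_conversation`, the instruction strings passed to the adversarial callable, and the exact way the turns are laid out are all assumptions made for this example. The adversarial chat target is modeled as a plain function from an instruction string to a reply.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    """Hypothetical stand-in for one conversation turn (not a PyRIT type)."""
    role: str
    text: str


def build_context_compliance_conversation(
    objective: str,
    adversarial: Callable[[str], str],
    affirmative_response: str = "yes.",
) -> List[Message]:
    """Assemble the conversation described by the three steps (assumed layout)."""
    # Step 1: rephrase the objective as a more benign question.
    benign_question = adversarial(f"Rephrase as a benign question: {objective}")
    # Step 2: generate a response to the benign question.
    benign_answer = adversarial(f"Answer this question: {benign_question}")
    # Step 3: rephrase the original objective as a follow-up question.
    follow_up = adversarial(f"Rephrase as a follow-up question: {objective}")

    # Assumed layout: the benign exchange forms the history, the assistant
    # turn ends with the follow-up question, and the affirmative response
    # is the last user turn.
    return [
        Message("user", benign_question),
        Message("assistant", f"{benign_answer}\n\n{follow_up}"),
        Message("user", affirmative_response),
    ]


if __name__ == "__main__":
    # A stub replaces the adversarial chat target so the sketch runs offline.
    def echo_stub(instruction: str) -> str:
        return f"[model output for: {instruction}]"

    for message in build_context_compliance_conversation("<objective>", echo_stub):
        print(f"{message.role}: {message.text}")
```

In the real class, the adversarial target comes from attack_adversarial_config and the final turn defaults to DEFAULT_AFFIRMATIVE_RESPONSE; how the turns are actually assembled is best confirmed in the linked source.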

__init__(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None) → None

Initialize the context compliance attack strategy.

Parameters:
  • objective_target (PromptChatTarget) – The target system to attack. Must be a PromptChatTarget.

  • attack_adversarial_config (AttackAdversarialConfig) – Configuration for the adversarial component, including the adversarial chat target used for rephrasing.

  • attack_converter_config (Optional[AttackConverterConfig]) – Configuration for attack converters, including request and response converters.

  • attack_scoring_config (Optional[AttackScoringConfig]) – Configuration for attack scoring.

  • prompt_normalizer (Optional[PromptNormalizer]) – The prompt normalizer to use for sending prompts.

  • max_attempts_on_failure (int) – Maximum number of attempts to retry on failure.

  • context_description_instructions_path (Optional[Path]) – Path to the context description instructions YAML file. If not provided, DEFAULT_CONTEXT_DESCRIPTION_PATH is used.

  • affirmative_response (Optional[str]) – The affirmative response to be used in the conversation history. If not provided, DEFAULT_AFFIRMATIVE_RESPONSE ('yes.') is used.

Raises:

ValueError – If the context description instructions file is invalid.

Methods

__init__(*, objective_target, ...[, ...])

Initialize the context compliance attack strategy.

execute_async(**kwargs)

Execute the attack strategy asynchronously with the provided parameters.

execute_with_context_async(*, context)

Execute strategy with complete lifecycle management.

get_identifier()

Attributes

DEFAULT_AFFIRMATIVE_RESPONSE: str = 'yes.'
DEFAULT_CONTEXT_DESCRIPTION_PATH: Path = PosixPath('/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/site-packages/pyrit/datasets/executors/context_compliance/context_description.yaml')