pyrit.executor.attack.ContextComplianceAttack

pyrit.executor.attack.ContextComplianceAttack#

class ContextComplianceAttack(*, objective_target: PromptChatTarget = REQUIRED_VALUE, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None)[source]#

Bases: PromptSendingAttack

Implementation of the context compliance attack strategy.

This attack attempts to bypass safety measures by rephrasing the objective into a more benign context. It uses an adversarial chat target to: 1. Rephrase the objective as a more benign question 2. Generate a response to the benign question 3. Rephrase the original objective as a follow-up question

This creates a context that makes it harder for the target to detect the true intent.

__init__(*, objective_target: PromptChatTarget = REQUIRED_VALUE, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0, context_description_instructions_path: Path | None = None, affirmative_response: str | None = None) → None[source]#

Initialize the context compliance attack strategy.

Parameters:

objective_target (PromptChatTarget) – The target system to attack. Must be a PromptChatTarget.
attack_adversarial_config (AttackAdversarialConfig) – Configuration for the adversarial component, including the adversarial chat target used for rephrasing.
attack_converter_config (Optional[AttackConverterConfig]) – Configuration for attack converters, including request and response converters.
attack_scoring_config (Optional[AttackScoringConfig]) – Configuration for attack scoring.
prompt_normalizer (Optional[PromptNormalizer]) – The prompt normalizer to use for sending prompts.
max_attempts_on_failure (int) – Maximum number of attempts to retry on failure.
context_description_instructions_path (Optional[Path]) – Path to the context description instructions YAML file. If not provided, uses the default path.
affirmative_response (Optional[str]) – The affirmative response to be used in the conversation history. If not provided, uses the default “yes.”.

Raises:

ValueError – If the context description instructions file is invalid.

Methods

`__init__`(*[, objective_target, ...])	Initialize the context compliance attack strategy.
`execute_async`(**kwargs)	Execute the attack strategy asynchronously with the provided parameters.
`execute_with_context_async`(*, context)	Execute strategy with complete lifecycle management.
`get_attack_scoring_config`()	Get the attack scoring configuration used by this strategy.
`get_identifier`()	Get a serializable identifier for the strategy instance.
`get_objective_target`()	Get the objective target for this attack strategy.

Attributes

`DEFAULT_AFFIRMATIVE_RESPONSE`
`DEFAULT_CONTEXT_DESCRIPTION_PATH`
`params_type`	Get the parameters type for this attack strategy.

DEFAULT_AFFIRMATIVE_RESPONSE: str = 'yes.'#

DEFAULT_CONTEXT_DESCRIPTION_PATH: Path = PosixPath('/home/runner/work/PyRIT/PyRIT/pyrit/datasets/executors/context_compliance/context_description.yaml')#

pyrit.executor.attack.ContextComplianceAttack

Contents

pyrit.executor.attack.ContextComplianceAttack#