pyrit.executor.attack.CrescendoAttack
- class CrescendoAttack(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_backtracks: int = 10, max_turns: int = 10)
Bases: MultiTurnAttackStrategy[CrescendoAttackContext, CrescendoAttackResult]
Implementation of the Crescendo attack strategy.
The Crescendo Attack is a multi-turn strategy that progressively guides the model to generate harmful content through small, benign steps. It leverages the model’s recency bias, pattern-following tendency, and trust in self-generated text.
The attack flow consists of:
1. Generating progressively harmful prompts using an adversarial chat model.
2. Sending prompts to the target and evaluating responses for refusal.
3. Backtracking when the target refuses to respond.
4. Scoring responses to determine if the objective has been achieved.
5. Continuing until the objective is met or maximum turns/backtracks are reached.
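The control flow of the steps above can be sketched with plain Python callables. This is an illustrative sketch only, not PyRIT's implementation: the function run_crescendo_loop and its four callable parameters (next_prompt, send, is_refusal, objective_met) are hypothetical stand-ins for the adversarial chat, the target, the refusal check, and the objective scorer. It also assumes that a backtracked exchange is discarded without consuming a turn, and that a refusal counts as a normal turn once the backtrack budget is spent; the library may handle both cases differently.

```python
from typing import Callable, Dict, List, Tuple

Exchange = Tuple[str, str]  # (prompt, response)


def run_crescendo_loop(
    next_prompt: Callable[[List[Exchange]], str],  # hypothetical: adversarial chat
    send: Callable[[str], str],                    # hypothetical: target under test
    is_refusal: Callable[[str], bool],             # hypothetical: refusal check
    objective_met: Callable[[str], bool],          # hypothetical: objective scorer
    max_turns: int = 10,
    max_backtracks: int = 10,
) -> Dict[str, object]:
    """Turn/backtrack bookkeeping only; no PyRIT classes are used."""
    history: List[Exchange] = []
    turns = 0
    backtracks = 0
    achieved = False

    while turns < max_turns and not achieved:
        prompt = next_prompt(history)   # step 1: generate the next prompt
        response = send(prompt)         # step 2: send it to the target

        # Step 3 (assumption): on refusal, drop the exchange and retry
        # without consuming a turn, while backtracks remain.
        if is_refusal(response) and backtracks < max_backtracks:
            backtracks += 1
            continue

        history.append((prompt, response))
        turns += 1
        achieved = objective_met(response)  # step 4: score the response

    # Step 5: the loop ends when the objective is met or the turn limit is hit.
    return {
        "achieved": achieved,
        "turns": turns,
        "backtracks": backtracks,
        "history": history,
    }
```

Because refused exchanges never enter history in this sketch, next_prompt sees the same history on a retry; a real adversarial component would typically also be told that the previous attempt was refused.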
You can learn more about the Crescendo attack at: https://crescendo-the-multiturn-jailbreak.github.io/
- __init__(*, objective_target: PromptChatTarget, attack_adversarial_config: AttackAdversarialConfig, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_backtracks: int = 10, max_turns: int = 10) -> None
Initialize the Crescendo attack strategy.
- Parameters:
objective_target (PromptChatTarget) – The target system to attack. Must be a PromptChatTarget.
attack_adversarial_config (AttackAdversarialConfig) – Configuration for the adversarial component, including the adversarial chat target and optional system prompt path.
attack_converter_config (Optional[AttackConverterConfig]) – Configuration for attack converters, including request and response converters.
attack_scoring_config (Optional[AttackScoringConfig]) – Configuration for scoring responses.
prompt_normalizer (Optional[PromptNormalizer]) – Normalizer for prompts.
max_backtracks (int) – Maximum number of backtracks allowed.
max_turns (int) – Maximum number of turns allowed.
Methods
__init__(*, objective_target, ...[, ...]) – Initialize the Crescendo attack strategy.
execute_async(**kwargs) – Execute the attack strategy asynchronously with the provided parameters.
execute_with_context_async(*, context) – Execute strategy with complete lifecycle management.
get_identifier()
Attributes