pyrit.executor.attack.FlipAttack
- class FlipAttack(objective_target: PromptChatTarget, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0)
Bases: PromptSendingAttack
This attack implements the FlipAttack method described in https://arxiv.org/html/2410.02832v1.
In essence, it prepends a system prompt to the conversation that instructs the target to flip each word in the prompt.
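The word-flipping transformation mentioned above can be illustrated in plain Python. This is a minimal sketch of the idea only, not PyRIT's implementation: the helper names `flip_each_word` and `flip_whole_prompt` are hypothetical, and the exact flipping mode used by the attack is an assumption here.

```python
def flip_each_word(prompt: str) -> str:
    """Reverse the characters of each whitespace-separated word, keeping word order.

    Hypothetical helper for illustration; not part of the PyRIT API.
    """
    return " ".join(word[::-1] for word in prompt.split())


def flip_whole_prompt(prompt: str) -> str:
    """Reverse the characters of the entire prompt (an alternative flipping mode).

    Hypothetical helper for illustration; not part of the PyRIT API.
    """
    return prompt[::-1]


if __name__ == "__main__":
    text = "hello world"
    print(flip_each_word(text))     # olleh dlrow
    print(flip_whole_prompt(text))  # dlrow olleh
```

Both transformations undo themselves for single-spaced text: applying the same flip a second time recovers the original prompt, which is why the target can be asked to read the flipped text back.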
- __init__(objective_target: PromptChatTarget, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, max_attempts_on_failure: int = 0) → None
- Parameters:
objective_target (PromptChatTarget) – The target system to attack.
attack_converter_config (AttackConverterConfig, Optional) – Configuration for the prompt converters.
attack_scoring_config (AttackScoringConfig, Optional) – Configuration for scoring components.
prompt_normalizer (PromptNormalizer, Optional) – Normalizer for handling prompts.
max_attempts_on_failure (int, Optional) – Maximum number of attempts to retry on failure.
Methods
__init__(objective_target[, ...])
execute_async(**kwargs) – Execute the attack strategy asynchronously with the provided parameters.
execute_with_context_async(*, context) – Execute strategy with complete lifecycle management.
get_identifier()