pyrit.executor.attack.SkeletonKeyAttack
- class SkeletonKeyAttack(*, objective_target: PromptTarget, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, skeleton_key_prompt: str | None = None, max_attempts_on_failure: int = 0)
Bases: PromptSendingAttack
Implementation of the skeleton key jailbreak attack strategy.
This attack sends an initial skeleton key prompt to the target, and then follows up with a separate attack prompt. If successful, the first prompt makes the target comply even with malicious follow-up prompts.
The attack flow consists of:
1. Sending a skeleton key prompt to bypass the target’s safety mechanisms.
2. Sending the actual objective prompt to the primed target.
3. Evaluating the response using configured scorers to determine success.
Learn more about the attack in the Microsoft Security blog post: https://www.microsoft.com/en-us/security/blog/2024/06/26/mitigating-skeleton-key-a-new-type-of-generative-ai-jailbreak-technique/
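The three-step flow described above can be sketched without PyRIT itself. The snippet below is a minimal illustration only, not the library's implementation: `EchoTarget`, `run_two_turn_flow`, and the lambda scorer are hypothetical stand-ins for a `PromptTarget`, the attack logic, and a configured scorer, and the prompt strings are placeholders.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class EchoTarget:
    """Hypothetical stand-in for a PromptTarget; it only records what it receives."""

    received: list = field(default_factory=list)

    async def send(self, prompt: str) -> str:
        self.received.append(prompt)
        return f"response to turn {len(self.received)}"


async def run_two_turn_flow(target, skeleton_key_prompt, objective, scorer):
    # Step 1: send the priming (skeleton key) prompt first.
    await target.send(skeleton_key_prompt)
    # Step 2: send the objective prompt to the same, now primed, target.
    response = await target.send(objective)
    # Step 3: let the scorer decide whether the objective response counts as success.
    return response, scorer(response)


target = EchoTarget()
response, success = asyncio.run(
    run_two_turn_flow(
        target,
        skeleton_key_prompt="<skeleton key prompt>",
        objective="<objective prompt>",
        scorer=lambda text: "turn 2" in text,  # toy scorer, an assumption for the demo
    )
)
print(target.received, response, success)
```

Only the response to the second prompt is scored in this sketch; the reply to the priming prompt is discarded.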
- __init__(*, objective_target: PromptTarget, attack_converter_config: AttackConverterConfig | None = None, attack_scoring_config: AttackScoringConfig | None = None, prompt_normalizer: PromptNormalizer | None = None, skeleton_key_prompt: str | None = None, max_attempts_on_failure: int = 0) -> None
Initialize the skeleton key attack strategy.
- Parameters:
objective_target (PromptTarget) – The target system to attack.
attack_converter_config (Optional[AttackConverterConfig]) – Configuration for prompt converters.
attack_scoring_config (Optional[AttackScoringConfig]) – Configuration for scoring components.
prompt_normalizer (Optional[PromptNormalizer]) – Normalizer for handling prompts.
skeleton_key_prompt (Optional[str]) – The skeleton key prompt to use. If not provided, uses the default skeleton key prompt.
max_attempts_on_failure (int) – Maximum number of attempts to retry on failure.
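The two defaulted parameters, `skeleton_key_prompt` and `max_attempts_on_failure`, can be read as in the sketch below. It rests on two assumptions that the parameter descriptions do not confirm: that `None` (rather than an empty string) triggers the default prompt, and that `max_attempts_on_failure` counts retries on top of one initial attempt. The helper names and the placeholder default are hypothetical and are not part of PyRIT.

```python
from typing import Callable, Optional

# Placeholder only; this is not PyRIT's actual default prompt.
DEFAULT_SKELETON_KEY_PROMPT = "<default skeleton key prompt>"


def resolve_prompt(skeleton_key_prompt: Optional[str] = None) -> str:
    # Assumption: only None means "not provided".
    if skeleton_key_prompt is None:
        return DEFAULT_SKELETON_KEY_PROMPT
    return skeleton_key_prompt


def attempt_with_retries(run_once: Callable[[], bool], max_attempts_on_failure: int = 0) -> int:
    """Return how many attempts were made (assumed: 1 initial try plus up to N retries)."""
    attempts = 0
    for _ in range(1 + max_attempts_on_failure):
        attempts += 1
        if run_once():
            break  # stop as soon as an attempt succeeds
    return attempts


print(resolve_prompt())
print(attempt_with_retries(lambda: False, max_attempts_on_failure=2))
```

Under these assumptions, the default of 0 means a failed attempt is not retried.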
Methods
__init__(*, objective_target[, ...]) – Initialize the skeleton key attack strategy.
execute_async(**kwargs) – Execute the attack strategy asynchronously with the provided parameters.
execute_with_context_async(*, context) – Execute strategy with complete lifecycle management.
get_identifier()
Attributes