pyrit.orchestrator.SkeletonKeyOrchestrator
- class SkeletonKeyOrchestrator(*, skeleton_key_prompt: str | None = None, prompt_target: PromptTarget, prompt_converters: list[PromptConverter] | None = None, batch_size: int = 10, verbose: bool = False)
Bases: Orchestrator
Creates an orchestrator that executes a skeleton key jailbreak.
The orchestrator sends an initial skeleton key prompt to the target, and then follows up with a separate attack prompt. If successful, the first prompt makes the target comply even with malicious follow-up prompts. In our experiments, using two separate prompts was significantly more effective than using a single combined prompt.
Learn more about the attack at the link below: https://www.microsoft.com/en-us/security/blog/2024/06/26/mitigating-skeleton-key-a-new-type-of-generative-ai-jailbreak-technique/
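The two-turn flow described above can be sketched without PyRIT itself. In the sketch below, `RecordingTarget` and `send_key_then_prompt` are hypothetical stand-ins, not part of the library, and the assumption that both turns share one conversation is ours. It only illustrates the ordering: skeleton key first, attack prompt second, with the second reply returned.

```python
import asyncio
import uuid
from dataclasses import dataclass, field


@dataclass
class RecordingTarget:
    """Hypothetical stand-in for a prompt target; it only records what it receives."""

    conversations: dict[str, list[str]] = field(default_factory=dict)

    async def send(self, *, conversation_id: str, text: str) -> str:
        # Append the message to its conversation so turn order can be inspected.
        self.conversations.setdefault(conversation_id, []).append(text)
        return f"reply to turn {len(self.conversations[conversation_id])}"


async def send_key_then_prompt(
    target: RecordingTarget, *, skeleton_key: str, prompt: str
) -> str:
    """Send the skeleton key first, then the attack prompt, in one conversation."""
    conversation_id = str(uuid.uuid4())  # assumption: both turns share this id
    # Turn 1: the skeleton key. Its reply is not needed for the follow-up.
    await target.send(conversation_id=conversation_id, text=skeleton_key)
    # Turn 2: the separate attack prompt; this reply is the one returned.
    return await target.send(conversation_id=conversation_id, text=prompt)


target = RecordingTarget()
reply = asyncio.run(
    send_key_then_prompt(
        target, skeleton_key="<skeleton key text>", prompt="<attack prompt>"
    )
)
```

Keeping the two messages as separate turns, rather than concatenating them, mirrors the observation above that two prompts worked better than one combined prompt.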
- __init__(*, skeleton_key_prompt: str | None = None, prompt_target: PromptTarget, prompt_converters: list[PromptConverter] | None = None, batch_size: int = 10, verbose: bool = False) -> None
- Parameters:
skeleton_key_prompt (str, Optional) – The skeleton key prompt sent to the target. Defaults to the contents of skeleton_key.prompt.
prompt_target (PromptTarget) – The target for sending prompts.
prompt_converters (list[PromptConverter], Optional) – List of prompt converters. These are stacked in the order they are provided. E.g. the output of converter1 is the input of converter2.
batch_size (int, Optional) – The maximum batch size for sending prompts. Defaults to 10. Note: If a maximum number of requests per minute is configured on the prompt_target, set this to 1 to ensure proper rate limit management.
verbose (bool, Optional) – If set to True, verbose output will be enabled. Defaults to False.
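As a rough illustration of two of the parameters above, the sketch below shows what "stacked" converters and a maximum batch size mean in practice. It does not use PyRIT: `Converter`, `apply_converters`, `make_batches`, `send_all`, and `fake_send` are hypothetical names introduced here, and the library's actual internals may differ.

```python
import asyncio
from typing import Callable

Converter = Callable[[str], str]  # hypothetical stand-in for a PromptConverter


def apply_converters(text: str, converters: list[Converter]) -> str:
    """Apply converters in the order given: each output feeds the next converter."""
    for convert in converters:
        text = convert(text)
    return text


def make_batches(prompts: list[str], batch_size: int) -> list[list[str]]:
    """Split prompts into consecutive chunks of at most batch_size items."""
    return [prompts[i : i + batch_size] for i in range(0, len(prompts), batch_size)]


async def send_all(prompts: list[str], *, batch_size: int = 10) -> list[str]:
    """Send prompts batch by batch; prompts within a batch run concurrently."""

    async def fake_send(prompt: str) -> str:  # placeholder for a real target call
        await asyncio.sleep(0)
        return f"response:{prompt}"

    responses: list[str] = []
    for batch in make_batches(prompts, batch_size):
        # gather preserves input order, so responses line up with prompts.
        responses.extend(await asyncio.gather(*(fake_send(p) for p in batch)))
    return responses


# Uppercase first, then reverse: the second converter sees the first one's output.
stacked = apply_converters("hello", [str.upper, lambda s: s[::-1]])
batches = make_batches(["a", "b", "c"], batch_size=2)
# With batch_size=1, only one request is in flight at a time.
responses = asyncio.run(send_all(["a", "b", "c"], batch_size=1))
```

Under this reading, a batch size of 1 serializes the requests, which is why it pairs naturally with a per-minute rate limit on the target.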
Methods
__init__(*[, skeleton_key_prompt, ...])
dispose_db_engine() – Dispose database engine to release database connections and resources.
get_identifier()
get_memory() – Retrieves the memory associated with this orchestrator.
get_score_memory() – Retrieves the scores of the PromptRequestPieces associated with this orchestrator.
print_conversation() – Prints all the conversations that have occurred with the prompt target.
send_skeleton_key_with_prompt_async(*, prompt) – Sends a skeleton key, followed by the attack prompt, to the target.
Sends a skeleton key and prompt to the target for each prompt in a list of prompts.
- print_conversation() -> None
Prints all the conversations that have occurred with the prompt target.
- async send_skeleton_key_with_prompt_async(*, prompt: str) -> PromptRequestResponse
Sends a skeleton key, followed by the attack prompt, to the target.
- Parameters:
prompt (str) – The attack prompt to send after the skeleton key.
- Returns:
The response from the prompt target.
- Return type: