pyrit.orchestrator.SkeletonKeyOrchestrator#

class SkeletonKeyOrchestrator(*, skeleton_key_prompt: str | None = None, prompt_target: PromptTarget, prompt_converters: list[PromptConverter] | None = None, batch_size: int = 10, verbose: bool = False)[source]#

Bases: Orchestrator

Creates an orchestrator that executes a skeleton key jailbreak.

The orchestrator sends an initial skeleton key prompt to the target and then follows up with a separate attack prompt. If successful, the first prompt makes the target comply even with malicious follow-up prompts. In our experiments, using two separate prompts was significantly more effective than using a single combined prompt.

Learn more about the attack at the link below: https://www.microsoft.com/en-us/security/blog/2024/06/26/mitigating-skeleton-key-a-new-type-of-generative-ai-jailbreak-technique/
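The two-step flow described above can be sketched without PyRIT installed. This is a minimal illustration, not the library's implementation: `EchoTarget`, its `send` method, and the placeholder prompt strings are hypothetical stand-ins, and the use of a single conversation ID for both turns is an assumption.

```python
import asyncio
import uuid


class EchoTarget:
    """Hypothetical stand-in for a PromptTarget; records each turn per conversation."""

    def __init__(self):
        self.conversations = {}

    async def send(self, conversation_id, text):
        # Append the turn to its conversation and return a canned reply.
        turns = self.conversations.setdefault(conversation_id, [])
        turns.append(text)
        return f"reply to turn {len(turns)}"


async def send_skeleton_key_with_prompt(target, skeleton_key_prompt, prompt):
    # Assumption: both turns share one conversation so the target sees them in order.
    conversation_id = str(uuid.uuid4())
    # Turn 1: the skeleton key prompt on its own.
    await target.send(conversation_id, skeleton_key_prompt)
    # Turn 2: the attack prompt, sent as a separate message.
    return await target.send(conversation_id, prompt)


target = EchoTarget()
response = asyncio.run(
    send_skeleton_key_with_prompt(target, "<skeleton key prompt>", "<attack prompt>")
)
turns = next(iter(target.conversations.values()))
```

Only the response to the second turn is returned here, which mirrors the single PromptRequestResponse return type documented for send_skeleton_key_with_prompt_async.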

__init__(*, skeleton_key_prompt: str | None = None, prompt_target: PromptTarget, prompt_converters: list[PromptConverter] | None = None, batch_size: int = 10, verbose: bool = False) None[source]#
Parameters:
  • skeleton_key_prompt (str, Optional) – The skeleton key prompt sent to the target. Defaults to skeleton_key.prompt.

  • prompt_target (PromptTarget) – The target for sending prompts.

  • prompt_converters (list[PromptConverter], Optional) – List of prompt converters. These are stacked in the order they are provided. E.g. the output of converter1 is the input of converter2.

  • batch_size (int, Optional) – The maximum batch size for sending prompts. Defaults to 10. Note: If max requests per minute is set on the prompt_target, set this to 1 to ensure proper rate limit management.

  • verbose (bool, Optional) – If set to True, verbose output will be enabled. Defaults to False.
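The stacking order of prompt_converters can be illustrated with plain functions. This is a sketch under stated assumptions: `reverse_text`, `add_suffix`, and `apply_converters` are hypothetical helpers, and plain synchronous functions stand in for PromptConverter objects, whose interface is not shown in this section.

```python
from functools import reduce


def reverse_text(text: str) -> str:
    """Hypothetical converter: reverse the characters."""
    return text[::-1]


def add_suffix(text: str) -> str:
    """Hypothetical converter: append a marker."""
    return text + "!"


def apply_converters(text, converters):
    # Each converter receives the output of the previous one, in list order.
    return reduce(lambda current, convert: convert(current), converters, text)


first_order = apply_converters("abc", [reverse_text, add_suffix])
second_order = apply_converters("abc", [add_suffix, reverse_text])
```

The two orderings produce different strings ("cba!" and "!cba"), so the order in which converters are listed matters.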

Methods

__init__(*[, skeleton_key_prompt, ...])

dispose_db_engine()

Dispose database engine to release database connections and resources.

get_identifier()

get_memory()

Retrieves the memory associated with this orchestrator.

get_score_memory()

Retrieves the scores of the PromptRequestPieces associated with this orchestrator.

print_conversation()

Prints all the conversations that have occurred with the prompt target.

send_skeleton_key_with_prompt_async(*, prompt)

Sends a skeleton key, followed by the attack prompt, to the target.

send_skeleton_key_with_prompts_async(*, ...)

Sends a skeleton key and prompt to the target for each prompt in a list of prompts.

print_conversation() None[source]#

Prints all the conversations that have occurred with the prompt target.

async send_skeleton_key_with_prompt_async(*, prompt: str) PromptRequestResponse[source]#

Sends a skeleton key, followed by the attack prompt, to the target.

Parameters:
  • prompt (str) – The prompt to be sent.

  • prompt_type (PromptDataType, Optional) – The type of the prompt (e.g., “text”). Defaults to “text”.

Returns:

The response from the prompt target.

Return type:

PromptRequestResponse

async send_skeleton_key_with_prompts_async(*, prompt_list: list[str]) list[PromptRequestResponse][source]#

Sends a skeleton key and prompt to the target for each prompt in a list of prompts.

Parameters:
  • prompt_list (list[str]) – The list of prompts to be sent.

  • prompt_type (PromptDataType, Optional) – The type of the prompts (e.g., “text”). Defaults to “text”.

Returns:

The responses from the prompt target.

Return type:

list[PromptRequestResponse]
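How batch_size might bound concurrency when a list of prompts is sent can be sketched as follows. This is an illustrative assumption about the batching behavior, not the library's code: `send_in_batches`, `fake_send`, and the `stats` counter are hypothetical names.

```python
import asyncio

# Tracks how many sends are in flight at once (for illustration only).
stats = {"in_flight": 0, "peak": 0}


async def fake_send(prompt):
    """Hypothetical stand-in for the skeleton key + prompt exchange with a target."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0)  # yield control so sends in the same batch overlap
    stats["in_flight"] -= 1
    return f"response to {prompt}"


async def send_in_batches(prompts, send_one, batch_size=10):
    responses = []
    # Process the prompt list in slices of at most batch_size.
    for start in range(0, len(prompts), batch_size):
        batch = prompts[start:start + batch_size]
        # Sends within a batch run concurrently; gather preserves input order.
        responses.extend(await asyncio.gather(*(send_one(p) for p in batch)))
    return responses


prompts = [f"prompt {i}" for i in range(5)]
responses = asyncio.run(send_in_batches(prompts, fake_send, batch_size=2))
```

With batch_size set to 1, the sends in this sketch become strictly sequential, which is consistent with the rate-limit note on the batch_size parameter.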