pyrit.executor.attack.generate_simulated_conversation_async

async generate_simulated_conversation_async(*, objective: str, adversarial_chat: PromptChatTarget, objective_scorer: TrueFalseScorer, num_turns: int = 3, starting_sequence: int = 0, adversarial_chat_system_prompt_path: str | Path, simulated_target_system_prompt_path: str | Path | None = None, next_message_system_prompt_path: str | Path | None = None, attack_converter_config: AttackConverterConfig | None = None, memory_labels: dict[str, str] | None = None) → list[SeedPrompt]

Generate a simulated conversation between an adversarial chat and a target.

This utility runs a RedTeamingAttack with score_last_turn_only=True against a simulated target (the same LLM as adversarial_chat, optionally configured with a system prompt). The resulting conversation is returned as a list of SeedPrompts that can be merged with other SeedPrompts in a SeedGroup for use as prepended_conversation and next_message.

Use cases:

- Creating role-play scenarios dynamically (e.g., movie script, video game)
- Establishing conversational context before attacking a real target
- Generating multi-turn jailbreak setups without hardcoded responses
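A minimal call sketch, assuming PyRIT is installed, its memory has already been initialized, and the caller holds a configured PromptChatTarget and TrueFalseScorer. The wrapper name build_simulated_context and its parameters are hypothetical; only the keyword arguments passed to generate_simulated_conversation_async come from the signature above.

```python
from pathlib import Path


async def build_simulated_context(objective, adversarial_chat, objective_scorer, prompt_path: Path):
    """Hypothetical wrapper: generate a 3-turn simulated conversation.

    adversarial_chat is assumed to be a PromptChatTarget and objective_scorer
    a TrueFalseScorer, both constructed by the caller.
    """
    # Imported lazily so this module can be loaded without PyRIT installed.
    from pyrit.executor.attack import generate_simulated_conversation_async

    return await generate_simulated_conversation_async(
        objective=objective,
        adversarial_chat=adversarial_chat,
        objective_scorer=objective_scorer,
        num_turns=3,
        adversarial_chat_system_prompt_path=prompt_path,
    )
```

The returned list of SeedPrompts could then be merged into a SeedGroup as described above; how that merge is done is outside the scope of this sketch.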

Parameters:

objective – The objective for the adversarial chat to work toward.
adversarial_chat – The adversarial LLM that generates attack prompts. This same LLM is also used as the simulated target.
objective_scorer – Scorer to evaluate the final turn.
num_turns – Number of conversation turns to generate. Defaults to 3.
starting_sequence – The starting sequence number for the generated SeedPrompts. Each message gets an incrementing sequence number. Defaults to 0.
adversarial_chat_system_prompt_path – Path to the system prompt for the adversarial chat.
simulated_target_system_prompt_path – Path to the system prompt for the simulated target. If None, no system prompt is used for the simulated target.
next_message_system_prompt_path – Optional path to a system prompt for generating a final user message. If provided, after the simulated conversation, a single LLM call generates a user message that attempts to get the target to fulfill the objective in their next response. The prompt template receives objective and conversation_so_far parameters.
attack_converter_config – Converter configuration for the attack. Defaults to None.
memory_labels – Labels to associate with the conversation in memory. Defaults to None.
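To illustrate how a next-message prompt template might consume its objective and conversation_so_far parameters, here is a small sketch using only the standard library. The template text, the $placeholder syntax, and the helper names are assumptions made for illustration; the actual prompt file format and rendering used by PyRIT may differ.

```python
from string import Template

# Hypothetical template text. The real system prompt file and its placeholder
# syntax are defined by PyRIT and may look different.
NEXT_MESSAGE_TEMPLATE = Template(
    "Objective: $objective\n"
    "Conversation so far:\n"
    "$conversation_so_far\n"
    "Write the next user message."
)


def format_conversation(messages):
    """Flatten (role, text) pairs into one string, one line per message."""
    return "\n".join(f"{role}: {text}" for role, text in messages)


def render_next_message_prompt(objective, messages):
    """Fill the two documented parameters into the hypothetical template."""
    return NEXT_MESSAGE_TEMPLATE.substitute(
        objective=objective,
        conversation_so_far=format_conversation(messages),
    )
```

For example, render_next_message_prompt("describe the plot twist", [("user", "Hi"), ("assistant", "Hello")]) yields a prompt that contains both the objective and the two flattened messages.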

Returns:

List of SeedPrompts representing the generated conversation, with sequence numbers starting from starting_sequence and incrementing by 1 for each message. User messages have role="user"; assistant messages have role="assistant". If next_message_system_prompt_path is provided, the last message is a user message generated to elicit fulfillment of the objective.
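The numbering and role rules can be sketched without PyRIT. The SimulatedMessage dataclass below is a hypothetical stand-in for SeedPrompt that carries only the fields discussed here, and the sketch assumes each turn consists of one user message followed by one assistant message.

```python
from dataclasses import dataclass


@dataclass
class SimulatedMessage:
    """Hypothetical stand-in for a SeedPrompt (value, role, sequence only)."""
    value: str
    role: str
    sequence: int


def number_messages(turns, starting_sequence=0, next_message=None):
    """Assign roles and consecutive sequence numbers to a conversation.

    turns is a list of (user_text, assistant_text) pairs. If next_message is
    given, it is appended as a final user message.
    """
    flat = []
    for user_text, assistant_text in turns:
        flat.append(("user", user_text))
        flat.append(("assistant", assistant_text))
    if next_message is not None:
        flat.append(("user", next_message))
    # Each message gets starting_sequence plus its position in the list.
    return [
        SimulatedMessage(value=text, role=role, sequence=starting_sequence + i)
        for i, (role, text) in enumerate(flat)
    ]
```

A non-zero starting_sequence is useful when the generated messages are appended after other prompts that already occupy the lower sequence numbers.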

Raises:

ValueError – If num_turns is not a positive integer.
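One way such a check could look is sketched below. The helper name validate_num_turns and the error message are hypothetical, and the exact condition PyRIT applies (for instance, how it treats booleans or floats) is not specified above.

```python
def validate_num_turns(num_turns):
    """Return num_turns unchanged, or raise ValueError if it is not a positive int."""
    # bool is a subclass of int in Python, so it is rejected explicitly here.
    is_int = isinstance(num_turns, int) and not isinstance(num_turns, bool)
    if not is_int or num_turns <= 0:
        raise ValueError(f"num_turns must be a positive integer, got {num_turns!r}")
    return num_turns
```

Callers that compute num_turns dynamically may want to run a check like this before invoking the function, so that the failure surfaces before any LLM call is made.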
