pyrit.scenario.airt.ContentHarms
- class ContentHarms(*, adversarial_chat: PromptChatTarget | None = None, objective_scorer: TrueFalseScorer | None = None, scenario_result_id: str | None = None, objectives_by_harm: Dict[str, Sequence[SeedGroup]] | None = None)
Bases: Scenario
Content Harms Scenario implementation for PyRIT.
This scenario contains various harm-based checks that you can run to get a quick idea about model behavior with respect to certain harm categories.
- __init__(*, adversarial_chat: PromptChatTarget | None = None, objective_scorer: TrueFalseScorer | None = None, scenario_result_id: str | None = None, objectives_by_harm: Dict[str, Sequence[SeedGroup]] | None = None)
Initialize the Content Harms Scenario.
- Parameters:
adversarial_chat (Optional[PromptChatTarget]) – Adversarial chat target, which is also used for the scoring defaults. If not provided, a default OpenAI target is created using environment variables.
objective_scorer (Optional[TrueFalseScorer]) – Scorer to evaluate attack success. If not provided, a default composite scorer is created using the Azure Content Filter and SelfAsk Refusal scorers.
seed_dataset_prefix (Optional[str]) – Prefix of the dataset used to retrieve the objectives, that is, the appropriate seed groups from CentralMemory. If not provided, defaults to “content_harm”. This parameter is documented here but does not appear in the signature above.
scenario_result_id (Optional[str]) – Optional ID of an existing scenario result to resume.
objectives_by_harm (Optional[Dict[str, Sequence[SeedGroup]]]) – A dictionary mapping harm strategies to their corresponding SeedGroups. If not provided, default seed groups will be loaded from datasets.
Methods
__init__(*[, adversarial_chat, ...]) – Initialize the Content Harms Scenario.
default_dataset_config() – Return the default dataset configuration for this scenario.
get_default_strategy() – Get the default strategy used when no strategies are specified.
get_strategy_class() – Get the strategy enum class for this scenario.
initialize_async(*[, objective_target, ...]) – Initialize the scenario by populating self._atomic_attacks and creating the ScenarioResult.
run_async() – Execute all atomic attacks in the scenario sequentially.
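The two asynchronous methods listed above suggest a two-phase lifecycle: populate the atomic attacks first, then execute them in order. The following is a minimal sketch of that pattern only. `SketchScenario`, its string-based "attacks", the string `objective_target`, and the harm names `harm_a` and `harm_b` are hypothetical stand-ins for illustration; none of this is PyRIT code, and the real method signatures may take further arguments.

```python
import asyncio
from typing import List, Optional


class SketchScenario:
    """Hypothetical stand-in mimicking the initialize-then-run lifecycle."""

    def __init__(self, *, scenario_result_id: Optional[str] = None) -> None:
        self._scenario_result_id = scenario_result_id
        self._atomic_attacks: List[str] = []

    @property
    def atomic_attack_count(self) -> int:
        # Number of atomic attacks prepared so far.
        return len(self._atomic_attacks)

    async def initialize_async(self, *, objective_target: str) -> None:
        # Phase 1: populate the list of atomic attacks for the given target.
        # The harm names here are placeholders, not real PyRIT categories.
        self._atomic_attacks = [
            f"{harm}->{objective_target}" for harm in ("harm_a", "harm_b")
        ]

    async def run_async(self) -> List[str]:
        # Phase 2: execute the atomic attacks sequentially, preserving order.
        results = []
        for attack in self._atomic_attacks:
            await asyncio.sleep(0)  # placeholder for the real attack execution
            results.append(f"ran {attack}")
        return results


async def main() -> List[str]:
    scenario = SketchScenario()
    await scenario.initialize_async(objective_target="my_target")
    return await scenario.run_async()


results = asyncio.run(main())
```

In this sketch, `atomic_attack_count` is zero until `initialize_async` has run, which is why initialization comes before execution.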
Attributes
atomic_attack_count – Get the number of atomic attacks in this scenario.
name – Get the name of the scenario.
- classmethod default_dataset_config() → DatasetConfiguration
Return the default dataset configuration for this scenario.
- Returns:
Configuration with all content harm datasets.
- Return type:
DatasetConfiguration
- classmethod get_default_strategy() → ScenarioStrategy
Get the default strategy used when no strategies are specified.
- Returns:
ContentHarmsStrategy.ALL
- Return type:
ScenarioStrategy
- classmethod get_strategy_class() → Type[ScenarioStrategy]
Get the strategy enum class for this scenario.
- Returns:
The ContentHarmsStrategy enum class.
- Return type:
Type[ScenarioStrategy]
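Because `get_strategy_class()` and `get_default_strategy()` are classmethods, a caller can inspect the available strategies without constructing a scenario. The sketch below illustrates that pattern with stand-in classes. `SketchContentHarms`, the empty `ScenarioStrategy` base, and the members `HARM_A` and `HARM_B` are hypothetical; only the `ALL` default and the enum-returning classmethods come from the descriptions above.

```python
from enum import Enum
from typing import Type


class ScenarioStrategy(Enum):
    """Stand-in for the strategy base class; the real one lives in PyRIT."""


class ContentHarmsStrategy(ScenarioStrategy):
    ALL = "all"
    HARM_A = "harm_a"  # hypothetical member, for illustration only
    HARM_B = "harm_b"  # hypothetical member, for illustration only


class SketchContentHarms:
    """Stand-in scenario exposing the two strategy classmethods."""

    @classmethod
    def get_strategy_class(cls) -> Type[ScenarioStrategy]:
        # The enum class that lists every strategy the scenario supports.
        return ContentHarmsStrategy

    @classmethod
    def get_default_strategy(cls) -> ScenarioStrategy:
        # Used when the caller does not specify any strategies.
        return ContentHarmsStrategy.ALL


# No instance is needed: both calls go through the class itself.
strategy_class = SketchContentHarms.get_strategy_class()
default = SketchContentHarms.get_default_strategy()
available = [member.name for member in strategy_class]
```

Here `default` is a member of `strategy_class`, so code that validates user-selected strategies can check membership against the same enum.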