pyrit.scenario.airt.ContentHarms#

class ContentHarms(*, adversarial_chat: PromptChatTarget | None = None, objective_scorer: TrueFalseScorer | None = None, scenario_result_id: str | None = None, objectives_by_harm: Dict[str, Sequence[SeedGroup]] | None = None)[source]#

Bases: Scenario

Content Harms Scenario implementation for PyRIT.

This scenario contains harm-based checks that you can run to quickly assess model behavior with respect to specific harm categories.

__init__(*, adversarial_chat: PromptChatTarget | None = None, objective_scorer: TrueFalseScorer | None = None, scenario_result_id: str | None = None, objectives_by_harm: Dict[str, Sequence[SeedGroup]] | None = None)[source]#

Initialize the Content Harms Scenario.

Parameters:
  • adversarial_chat (Optional[PromptChatTarget]) – Adversarial chat target, also used when constructing the default scorers. If not provided, a default OpenAI target will be created using environment variables.

  • objective_scorer (Optional[TrueFalseScorer]) – Scorer used to evaluate attack success. If not provided, a default composite scorer is created using the Azure Content Filter and SelfAsk Refusal scorers.

  • scenario_result_id (Optional[str]) – Optional ID of an existing scenario result to resume.

  • objectives_by_harm (Optional[Dict[str, Sequence[SeedGroup]]]) – A dictionary mapping harm strategies to their corresponding SeedGroups. If not provided, default seed groups will be loaded from datasets.
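A minimal usage sketch based on the methods documented below (`initialize_async` and `run_async`). The `OpenAIChatTarget` import and the surrounding initialization pattern are assumptions about the wider PyRIT API, not part of this page; adjust to your environment:

```python
import asyncio

# Assumed import paths; verify against your PyRIT installation.
from pyrit.prompt_target import OpenAIChatTarget
from pyrit.scenario.airt import ContentHarms


async def main() -> None:
    # Omitted arguments fall back to defaults created from
    # environment variables (adversarial chat, scorer, datasets).
    scenario = ContentHarms()

    # Populate self._atomic_attacks and create the ScenarioResult,
    # pointing the scenario at the model under test.
    await scenario.initialize_async(objective_target=OpenAIChatTarget())

    # Execute all atomic attacks in the scenario sequentially.
    result = await scenario.run_async()
    print(result)


asyncio.run(main())
```

Because `objectives_by_harm` is omitted here, the default seed groups are loaded from the bundled datasets, per the parameter description above.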

Methods

__init__(*[, adversarial_chat, ...])

Initialize the Content Harms Scenario.

default_dataset_config()

Return the default dataset configuration for this scenario.

get_default_strategy()

Get the default strategy used when no strategies are specified.

get_strategy_class()

Get the strategy enum class for this scenario.

initialize_async(*[, objective_target, ...])

Initialize the scenario by populating self._atomic_attacks and creating the ScenarioResult.

run_async()

Execute all atomic attacks in the scenario sequentially.

Attributes

atomic_attack_count

Get the number of atomic attacks in this scenario.

name

Get the name of the scenario.

version

Version of the scenario.

classmethod default_dataset_config() DatasetConfiguration[source]#

Return the default dataset configuration for this scenario.

Returns:

Configuration with all content harm datasets.

Return type:

DatasetConfiguration

classmethod get_default_strategy() ScenarioStrategy[source]#

Get the default strategy used when no strategies are specified.

Returns:

ContentHarmsStrategy.ALL

Return type:

ScenarioStrategy

classmethod get_strategy_class() Type[ScenarioStrategy][source]#

Get the strategy enum class for this scenario.

Returns:

The ContentHarmsStrategy enum class.

Return type:

Type[ScenarioStrategy]

version: int = 1#