pyrit.datasets.fetch_decoding_trust_stereotypes_examples

fetch_decoding_trust_stereotypes_examples(source: str = 'https://raw.githubusercontent.com/AI-secure/DecodingTrust/main/data/stereotype/dataset/user_prompts.csv', source_type: Literal['public_url'] = 'public_url', cache: bool = True, data_home: Path | None = None, stereotype_topics: List[str] | None = None, target_groups: List[str] | None = None, system_prompt_type: Literal['benign', 'untargeted', 'targeted'] = 'targeted') → SeedPromptDataset

Fetch DecodingTrust stereotypes examples and create a SeedPromptDataset.

Parameters:
  • source (str) – The source from which to fetch examples. Defaults to the DecodingTrust repository.

  • source_type (Literal["public_url"]) – The type of source; only 'public_url' is supported.

  • cache (bool) – Whether to cache the fetched examples. Defaults to True.

  • data_home (Optional[Path]) – Directory to store cached data. Defaults to None.

  • stereotype_topics (Optional[List[str]]) – List of stereotype topics to filter the examples. Defaults to None, which includes all topics. The full list of all 16 stereotype topics is available in the AI-secure/DecodingTrust repository.

  • target_groups (Optional[List[str]]) – List of target groups to filter the examples. Defaults to None, which includes all target groups. The full list of all 24 target groups is available in the AI-secure/DecodingTrust repository.

  • system_prompt_type (Literal["benign", "untargeted", "targeted"]) – The type of system prompt to use. Defaults to 'targeted'.

Returns:

A SeedPromptDataset containing the examples.

Return type:

SeedPromptDataset

Note

For more information and access to the original dataset and related materials, visit: AI-secure/DecodingTrust
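
Example

A minimal usage sketch. It assumes PyRIT is installed and that SeedPromptDataset exposes the fetched examples through a prompts attribute of SeedPrompt objects, as in current PyRIT releases; the topic and group names passed to the filters below are illustrative, not a canonical subset.

  from pyrit.datasets import fetch_decoding_trust_stereotypes_examples

  # Fetch the 'targeted' system-prompt variant, filtered to one stereotype
  # topic and one target group (illustrative names; see the
  # AI-secure/DecodingTrust repository for the full lists).
  dataset = fetch_decoding_trust_stereotypes_examples(
      stereotype_topics=["drug_addicts"],
      target_groups=["immigrants"],
      system_prompt_type="targeted",
  )

  # Inspect the first few seed prompts.
  for seed_prompt in dataset.prompts[:3]:
      print(seed_prompt.value)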