pyrit.datasets.fetch_harmbench_multimodal_dataset_async
- async fetch_harmbench_multimodal_dataset_async(*, source: str = 'https://raw.githubusercontent.com/centerforaisafety/HarmBench/c0423b9/data/behavior_datasets/harmbench_behaviors_multimodal_all.csv', source_type: Literal['public_url', 'file'] = 'public_url', cache: bool = True, data_home: Path | None = None, categories: List[SemanticCategory] | None = None) → SeedPromptDataset
Fetch HarmBench multimodal examples and create a SeedPromptDataset.
The HarmBench multimodal dataset contains 110 harmful behaviors. Each example consists of an image (“image_path”) and a behavior string referencing the image (“text”). The text and image prompts that belong to the same example are linked using the same prompt_group_id. You can extract the grouped prompts using the group_seed_prompts_by_prompt_group_id method.
Note: The first call may be slow as images need to be downloaded from the remote repository. Subsequent calls will be faster since images are cached locally and won’t need to be re-downloaded.
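To illustrate how a shared prompt_group_id ties a text prompt to its image prompt, here is a minimal sketch of the grouping idea. It is a hand-rolled helper, not the library's group_seed_prompts_by_prompt_group_id method, and it runs on stand-in objects rather than real SeedPrompt instances. The helper name group_by_prompt_group_id and the data_type and value attributes on the stand-ins are assumptions made for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def group_by_prompt_group_id(prompts: Iterable[Any]) -> Dict[Any, List[Any]]:
    """Bucket prompts by their prompt_group_id attribute, keeping input order.

    Hypothetical helper: it only assumes each item exposes `prompt_group_id`.
    """
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for prompt in prompts:
        groups[prompt.prompt_group_id].append(prompt)
    return dict(groups)


# Stand-in prompts (not real SeedPrompt objects): two belong to one example,
# one belongs to another.
demo_prompts = [
    SimpleNamespace(prompt_group_id="g1", data_type="text", value="behavior text"),
    SimpleNamespace(prompt_group_id="g1", data_type="image_path", value="example.png"),
    SimpleNamespace(prompt_group_id="g2", data_type="text", value="another behavior"),
]

pairs = group_by_prompt_group_id(demo_prompts)
for group_id, members in pairs.items():
    print(group_id, [m.data_type for m in members])
```

With a fetched dataset, the same helper could be applied to the dataset's list of prompts, assuming those objects expose prompt_group_id as described above; the library's own grouping method remains the documented route.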
- Parameters:
source (str) – The source from which to fetch examples. Defaults to the HarmBench repository.
source_type (Literal["public_url", "file"]) – The type of source. Defaults to ‘public_url’.
cache (bool) – Whether to cache the fetched examples. Defaults to True.
data_home (Optional[Path]) – Directory to store cached data. Defaults to None.
categories (Optional[List[SemanticCategory]]) – List of semantic categories to filter examples. If None, all categories are included (default).
- Returns:
A SeedPromptDataset containing the multimodal examples.
- Return type:
SeedPromptDataset
- Raises:
ValueError – If any of the specified categories are invalid.
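The parameters and the ValueError behavior described above can be exercised with a small wrapper. The sketch below is one possible pattern, not part of PyRIT: fetch_or_none is a hypothetical helper that takes the fetch coroutine as an argument, passes the documented keyword arguments, and returns None when the category filter is rejected. The stand-in _fake_fetch only mimics the documented keyword-only signature so the example runs without network access or PyRIT installed.

```python
import asyncio
from pathlib import Path
from typing import Any, Awaitable, Callable, Optional, Sequence


async def fetch_or_none(
    fetch: Callable[..., Awaitable[Any]],
    *,
    categories: Optional[Sequence[Any]] = None,
    data_home: Optional[Path] = None,
) -> Any:
    """Await `fetch` with the documented keyword arguments.

    Hypothetical helper: returns None instead of propagating the ValueError
    that is documented for invalid categories.
    """
    try:
        return await fetch(categories=categories, cache=True, data_home=data_home)
    except ValueError as exc:
        print(f"Category filter rejected: {exc}")
        return None


# Stand-in for the real fetcher (assumption: it only mirrors the keyword-only
# parameters listed above and raises ValueError for an unknown category).
async def _fake_fetch(*, categories=None, cache=True, data_home=None):
    if categories == ["not-a-category"]:
        raise ValueError("invalid category")
    return "dataset"


ok = asyncio.run(fetch_or_none(_fake_fetch))
rejected = asyncio.run(fetch_or_none(_fake_fetch, categories=["not-a-category"]))
print(ok, rejected)

# With PyRIT installed, the real function would be passed in instead:
# from pyrit.datasets import fetch_harmbench_multimodal_dataset_async
# dataset = asyncio.run(fetch_or_none(fetch_harmbench_multimodal_dataset_async))
```

Passing the fetcher as an argument keeps the wrapper testable offline; calling fetch_harmbench_multimodal_dataset_async directly with await is equally valid inside an existing async context.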
Note
For more information related to the HarmBench project and the original dataset, visit: https://www.harmbench.org/
Paper: https://arxiv.org/abs/2402.04249
- Authors:
Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks