pyrit.datasets.fetch_jbb_behaviors_dataset


fetch_jbb_behaviors_dataset(source: str = 'JailbreakBench/JBB-Behaviors', data_home: str | None = None) → SeedPromptDataset

Fetch the JailbreakBench JBB-Behaviors dataset from HuggingFace and create a SeedPromptDataset.

This dataset contains harmful behaviors for jailbreaking evaluation, as described in the paper: https://arxiv.org/abs/2404.01318

Parameters:
  • source (str) – The HuggingFace dataset identifier. Defaults to “JailbreakBench/JBB-Behaviors”.

  • data_home (str, optional) – The directory in which to cache the dataset. If None, the default cache location is used.

Returns:

A SeedPromptDataset containing the JBB behaviors with harm_categories set.

Return type:

SeedPromptDataset

Raises:

Exception – If the dataset cannot be loaded from HuggingFace or cannot be processed into a SeedPromptDataset.

Note

Content Warning: This dataset contains prompts aimed at provoking harmful responses and may contain offensive content. Users should check with their legal department before using these prompts against production LLMs.

For more information and access to the original dataset and related materials, visit: https://arxiv.org/abs/2404.01318
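
Example

The following is a minimal sketch of how the function might be called and its result inspected; it is not taken from the PyRIT sources. It assumes that the returned SeedPromptDataset exposes its entries through a `prompts` attribute (the attribute name may differ between PyRIT versions) and that each entry carries the `harm_categories` list described under Returns. The helpers `fetch_or_none`, `count_harm_categories`, and `demo` are hypothetical names introduced only for this illustration.

```python
from collections import Counter


def fetch_or_none(fetch, **kwargs):
    """Call a dataset fetcher and return None instead of raising.

    The documented function raises a plain Exception when the dataset
    cannot be loaded or processed, so that is what is caught here.
    """
    try:
        return fetch(**kwargs)
    except Exception as exc:
        print(f"Could not load dataset: {exc}")
        return None


def count_harm_categories(dataset):
    """Count how many prompts fall under each harm category.

    Assumption: `dataset.prompts` is an iterable of objects that each have
    a `harm_categories` attribute (a list of strings, or None).
    """
    counts = Counter()
    for prompt in dataset.prompts:
        for category in getattr(prompt, "harm_categories", None) or []:
            counts[category] += 1
    return counts


def demo():
    """Fetch the real dataset (requires PyRIT and network access)."""
    # Imported here so the helpers above stay usable without PyRIT installed.
    from pyrit.datasets import fetch_jbb_behaviors_dataset

    # Uses the default source and the default cache location.
    dataset = fetch_or_none(fetch_jbb_behaviors_dataset)
    if dataset is not None:
        for category, n in count_harm_categories(dataset).most_common():
            print(f"{category}: {n}")
```

Calling `demo()` downloads the dataset on first use. To keep the cached copy in a specific directory, `data_home` can be passed through as a keyword, for example `fetch_or_none(fetch_jbb_behaviors_dataset, data_home="./jbb_cache")`, where the path is only a placeholder.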