Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

1. Loading Built-in Datasets

PyRIT includes many built-in datasets to help you get started with AI red teaming. While PyRIT aims to be unopinionated about what constitutes harmful content, it provides easy mechanisms to use datasets—whether built-in, community-contributed, or your own custom datasets.

Important Note: Datasets are best managed through PyRIT memory, where data is normalized and can be queried efficiently. However, this guide demonstrates how to load datasets directly as a starting point, and these can easily be imported into the database later.

The following command lists all built-in datasets available in PyRIT. Some datasets are stored locally, while others are fetched remotely from sources like HuggingFace.

Many of these datasets come from published research, including Aegis Ghosh et al., 2025, ALERT Tedeschi et al., 2024, BeaverTails Ji et al., 2023, CBT-Bench Zhang et al., 2024, DarkBench Apart Research, 2025, Do Anything Now Shen et al., 2023, Do-Not-Answer Wang et al., 2023, EquityMedQA Pfohl et al., 2024, HarmBench Mazeika et al., 2024, HarmfulQA Chu et al., 2023, JailbreakBench Chao et al., 2024, LLM-LAT Sheshadri et al., 2024, MedSafetyBench Han et al., 2024, Multilingual Alignment Prism Aakanksha et al., 2024, Multilingual Vulnerabilities Tang et al., 2025, OR-Bench Cui et al., 2024, PKU-SafeRLHF Ji et al., 2024, SALAD-Bench Li et al., 2024, SimpleSafetyTests Vidgen et al., 2023, SORRY-Bench Xie et al., 2024, SOSBench Jiang et al., 2025, TDC23 Mazeika et al., 2023, ToxicChat Lin et al., 2023, VLSU Palaskar et al., 2025, XSTest Röttger et al., 2023, AILuminate Vidgen et al., 2024, Transphobia Awareness Scheuerman et al., 2025, Red Team Social Bias Taylor, 2024, and PromptIntel Roccia, 2024. Some datasets also originate from tools like garak Derczynski et al., 2024 and AdvBench Zou et al., 2023.

from pyrit.datasets import SeedDatasetProvider
from pyrit.memory import CentralMemory
from pyrit.setup.initialization import IN_MEMORY, initialize_pyrit_async

await SeedDatasetProvider.get_all_dataset_names_async()

Loading Specific Datasets

You can retrieve all built-in datasets using SeedDatasetProvider.fetch_datasets_async(), or fetch specific ones by providing dataset names. This returns a list of SeedDataset objects containing the seeds.

# type: ignore
datasets = await SeedDatasetProvider.fetch_datasets_async(dataset_names=["airt_illegal", "airt_malware"])

for dataset in datasets:
    for seed in dataset.seeds:
        print(seed.value)

Adding Datasets to Memory

While loading datasets directly is useful for quick exploration, storing them in PyRIT memory provides significant advantages for managing and querying your test data. Memory allows you to:

  • Query seeds by harm category, data type, or custom metadata

  • Track provenance and versions

  • Share datasets across team members (when using Azure SQL)

  • Avoid duplicate entries

The following example demonstrates adding datasets to memory. For comprehensive details on memory capabilities, see the memory documentation and seed database guide.

await initialize_pyrit_async(memory_db_type=IN_MEMORY)  # type: ignore

memory = CentralMemory().get_memory_instance()
# type: ignore
await memory.add_seed_datasets_to_memory_async(datasets=datasets, added_by="pyrit")

# Memory has flexible querying capabilities
memory.get_seeds(harm_categories=["illegal"], seed_type="objective")
References
  1. Ghosh, S., Varshney, P., Sreedhar, M. N., Padmakumar, A., Rebedea, T., Varghese, J. R., & Parisien, C. (2025). Aegis 2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails. arXiv Preprint arXiv:2501.09004. https://arxiv.org/abs/2501.09004
  2. Tedeschi, S., Friedrich, F., Schramowski, P., Kersting, K., Navigli, R., Nguyen, H., & Li, B. (2024). ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming. arXiv Preprint arXiv:2404.08676. https://arxiv.org/abs/2404.08676
  3. Ji, J., Liu, M., Dai, J., Pan, X., Zhang, C., Bian, C., Zhang, C., Sun, R., Wang, Y., & Yang, Y. (2023). BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. arXiv Preprint arXiv:2307.04657. https://arxiv.org/abs/2307.04657
  4. Zhang, M., Yang, X., Zhang, X., Labrum, T., Chiu, J. C., Eack, S. M., Fang, F., Wang, W. Y., & Chen, Z. Z. (2024). CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy. arXiv Preprint arXiv:2410.13218. https://arxiv.org/abs/2410.13218
  5. Apart Research. (2025). DarkBench: A Comprehensive Benchmark for Dark Design Patterns in Large Language Models. https://darkbench.ai/
  6. Shen, X., Chen, Z., Backes, M., Shen, Y., & Zhang, Y. (2023). “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models. arXiv Preprint arXiv:2308.03825. https://arxiv.org/abs/2308.03825
  7. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. arXiv Preprint arXiv:2308.13387. https://arxiv.org/abs/2308.13387
  8. Pfohl, S. R., Cole-Lewis, H., Sayres, R., Neal, D., Asiedu, M., Dieng, A., Tomasev, N., Rashid, Q. M., Azizi, S., Rostamzadeh, N., McCoy, L. G., Celi, L. A., Liu, Y., Schaekermann, M., Walton, A., Parrish, A., Nagpal, C., Singh, P., Dewitt, A., … Singhal, K. (2024). A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models. Nature Medicine. 10.1038/s41591-024-03258-2
  9. Mazeika, M., Phan, L., Yin, X., Zou, A., Wang, Z., Mu, N., Sakhaee, E., Li, N., Basart, S., Li, B., Forsyth, D., & Hendrycks, D. (2024). HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal. arXiv Preprint arXiv:2402.04249. https://arxiv.org/abs/2402.04249
  10. Chu, J., Yang, Z., Li, M., Leng, Y., Lin, C., Shen, C., Backes, M., Shen, Y., & Zhang, Y. (2023). HarmfulQA: A Benchmark for Robustly Evaluating Jailbreaks in Alignment Testing. arXiv Preprint arXiv:2310.18469. https://arxiv.org/abs/2310.18469
  11. Chao, P., Debenedetti, E., Robey, A., Andriushchenko, M., Croce, F., Sehwag, V., Dobriban, E., Flammarion, N., Pappas, G. J., Tramer, F., Hassani, H., & Wong, E. (2024). JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models. arXiv Preprint arXiv:2404.01318. https://arxiv.org/abs/2404.01318
  12. Sheshadri, A., Ewart, A., Guo, P., Lynch, A., Wu, C., Hebbar, V., Sleight, H., Stickland, A. C., Perez, E., Hadfield-Menell, D., & Casper, S. (2024). Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs. arXiv Preprint arXiv:2407.15549. https://arxiv.org/abs/2407.15549
  13. Han, T., Kumar, A., Agarwal, C., & Lakkaraju, H. (2024). MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models. arXiv Preprint arXiv:2403.03744. https://arxiv.org/abs/2403.03744
  14. Aakanksha, Ahmadian, A., Ermis, B., Goldfarb-Tarrant, S., Kreutzer, J., Fadaee, M., & Hooker, S. (2024). The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm. arXiv Preprint arXiv:2406.18682. https://arxiv.org/abs/2406.18682
  15. Tang, L., Bogahawatta, N., Ginige, Y., Xu, J., Sun, S., Ranathunga, S., & Seneviratne, S. (2025). A Framework to Assess Multilingual Vulnerabilities of LLMs. arXiv Preprint arXiv:2503.13081. https://arxiv.org/abs/2503.13081