PyRIT includes many built-in datasets to help you get started with AI red teaming. While PyRIT aims to be unopinionated about what constitutes harmful content, it provides easy mechanisms to use datasets—whether built-in, community-contributed, or your own custom datasets.
Important Note: Datasets are best managed through PyRIT memory, where data is normalized and can be queried efficiently. However, this guide demonstrates how to load datasets directly as a starting point, and these can easily be imported into the database later.
The following command lists all built-in datasets available in PyRIT. Some datasets are stored locally, while others are fetched remotely from sources like HuggingFace.
Many of these datasets come from published research, including Aegis Ghosh et al., 2025, ALERT Tedeschi et al., 2024, BeaverTails Ji et al., 2023, CBT-Bench Zhang et al., 2024, DarkBench Apart Research, 2025, Do Anything Now Shen et al., 2023, Do-Not-Answer Wang et al., 2023, EquityMedQA Pfohl et al., 2024, HarmBench Mazeika et al., 2024, HarmfulQA Chu et al., 2023, JailbreakBench Chao et al., 2024, LLM-LAT Sheshadri et al., 2024, MedSafetyBench Han et al., 2024, Multilingual Alignment Prism Aakanksha et al., 2024, Multilingual Vulnerabilities Tang et al., 2025, OR-Bench Cui et al., 2024, PKU-SafeRLHF Ji et al., 2024, SALAD-Bench Li et al., 2024, SimpleSafetyTests Vidgen et al., 2023, SORRY-Bench Xie et al., 2024, SOSBench Jiang et al., 2025, TDC23 Mazeika et al., 2023, ToxicChat Lin et al., 2023, VLSU Palaskar et al., 2025, XSTest Röttger et al., 2023, AILuminate Vidgen et al., 2024, Transphobia Awareness Scheuerman et al., 2025, Red Team Social Bias Taylor, 2024, and PromptIntel Roccia, 2024. Some datasets also originate from tools like garak Derczynski et al., 2024 and AdvBench Zou et al., 2023.
from pyrit.datasets import SeedDatasetProvider
from pyrit.memory import CentralMemory
from pyrit.setup.initialization import IN_MEMORY, initialize_pyrit_async
await SeedDatasetProvider.get_all_dataset_names_async()Loading Specific Datasets¶
You can retrieve all built-in datasets using SeedDatasetProvider.fetch_datasets_async(), or fetch specific ones by providing dataset names. This returns a list of SeedDataset objects containing the seeds.
# type: ignore
datasets = await SeedDatasetProvider.fetch_datasets_async(dataset_names=["airt_illegal", "airt_malware"])
for dataset in datasets:
for seed in dataset.seeds:
print(seed.value)Adding Datasets to Memory¶
While loading datasets directly is useful for quick exploration, storing them in PyRIT memory provides significant advantages for managing and querying your test data. Memory allows you to:
Query seeds by harm category, data type, or custom metadata
Track provenance and versions
Share datasets across team members (when using Azure SQL)
Avoid duplicate entries
The following example demonstrates adding datasets to memory. For comprehensive details on memory capabilities, see the memory documentation and seed database guide.
await initialize_pyrit_async(memory_db_type=IN_MEMORY) # type: ignore
memory = CentralMemory().get_memory_instance()
# type: ignore
await memory.add_seed_datasets_to_memory_async(datasets=datasets, added_by="pyrit")
# Memory has flexible querying capabilities
memory.get_seeds(harm_categories=["illegal"], seed_type="objective")- Ghosh, S., Varshney, P., Sreedhar, M. N., Padmakumar, A., Rebedea, T., Varghese, J. R., & Parisien, C. (2025). Aegis 2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails. arXiv Preprint arXiv:2501.09004. https://arxiv.org/abs/2501.09004
- Tedeschi, S., Friedrich, F., Schramowski, P., Kersting, K., Navigli, R., Nguyen, H., & Li, B. (2024). ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming. arXiv Preprint arXiv:2404.08676. https://arxiv.org/abs/2404.08676
- Ji, J., Liu, M., Dai, J., Pan, X., Zhang, C., Bian, C., Zhang, C., Sun, R., Wang, Y., & Yang, Y. (2023). BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. arXiv Preprint arXiv:2307.04657. https://arxiv.org/abs/2307.04657
- Zhang, M., Yang, X., Zhang, X., Labrum, T., Chiu, J. C., Eack, S. M., Fang, F., Wang, W. Y., & Chen, Z. Z. (2024). CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy. arXiv Preprint arXiv:2410.13218. https://arxiv.org/abs/2410.13218
- Apart Research. (2025). DarkBench: A Comprehensive Benchmark for Dark Design Patterns in Large Language Models. https://darkbench.ai/
- Shen, X., Chen, Z., Backes, M., Shen, Y., & Zhang, Y. (2023). “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models. arXiv Preprint arXiv:2308.03825. https://arxiv.org/abs/2308.03825
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. arXiv Preprint arXiv:2308.13387. https://arxiv.org/abs/2308.13387
- Pfohl, S. R., Cole-Lewis, H., Sayres, R., Neal, D., Asiedu, M., Dieng, A., Tomasev, N., Rashid, Q. M., Azizi, S., Rostamzadeh, N., McCoy, L. G., Celi, L. A., Liu, Y., Schaekermann, M., Walton, A., Parrish, A., Nagpal, C., Singh, P., Dewitt, A., … Singhal, K. (2024). A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models. Nature Medicine. 10.1038/s41591-024-03258-2
- Mazeika, M., Phan, L., Yin, X., Zou, A., Wang, Z., Mu, N., Sakhaee, E., Li, N., Basart, S., Li, B., Forsyth, D., & Hendrycks, D. (2024). HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal. arXiv Preprint arXiv:2402.04249. https://arxiv.org/abs/2402.04249
- Chu, J., Yang, Z., Li, M., Leng, Y., Lin, C., Shen, C., Backes, M., Shen, Y., & Zhang, Y. (2023). HarmfulQA: A Benchmark for Robustly Evaluating Jailbreaks in Alignment Testing. arXiv Preprint arXiv:2310.18469. https://arxiv.org/abs/2310.18469
- Chao, P., Debenedetti, E., Robey, A., Andriushchenko, M., Croce, F., Sehwag, V., Dobriban, E., Flammarion, N., Pappas, G. J., Tramer, F., Hassani, H., & Wong, E. (2024). JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models. arXiv Preprint arXiv:2404.01318. https://arxiv.org/abs/2404.01318
- Sheshadri, A., Ewart, A., Guo, P., Lynch, A., Wu, C., Hebbar, V., Sleight, H., Stickland, A. C., Perez, E., Hadfield-Menell, D., & Casper, S. (2024). Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs. arXiv Preprint arXiv:2407.15549. https://arxiv.org/abs/2407.15549
- Han, T., Kumar, A., Agarwal, C., & Lakkaraju, H. (2024). MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models. arXiv Preprint arXiv:2403.03744. https://arxiv.org/abs/2403.03744
- Aakanksha, Ahmadian, A., Ermis, B., Goldfarb-Tarrant, S., Kreutzer, J., Fadaee, M., & Hooker, S. (2024). The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm. arXiv Preprint arXiv:2406.18682. https://arxiv.org/abs/2406.18682
- Tang, L., Bogahawatta, N., Ginige, Y., Xu, J., Sun, S., Ranathunga, S., & Seneviratne, S. (2025). A Framework to Assess Multilingual Vulnerabilities of LLMs. arXiv Preprint arXiv:2503.13081. https://arxiv.org/abs/2503.13081