pyrit.datasets.fetch_medsafetybench_dataset

fetch_medsafetybench_dataset(subset_name: Literal['train', 'test', 'generated', 'all'] = 'all', cache: bool = True, data_home: Path | None = None, output_csv_path: str | None = None) → SeedPromptDataset

Fetch MedSafetyBench examples, merged into a single dataset, and return them as a SeedPromptDataset.

Parameters:
  • subset_name (Literal) – Which subset to fetch: one of “train”, “test”, “generated”, or “all”. Defaults to “all”.

  • cache (bool) – Whether to cache the data locally.

  • data_home (Optional[Path]) – Optional path to override default cache location.

  • output_csv_path (Optional[str]) – Path at which to save the combined CSV. If None, a default file name is used.

Returns:

A dataset of prompts from MedSafetyBench.

Return type:

SeedPromptDataset

Note

For more information and access to the original dataset and related materials, see the AI4LIFE-GROUP/med-safety-bench repository. The dataset is based on the research published at https://proceedings.neurips.cc/paper_files/paper/2024/hash/3ac952d0264ef7a505393868a70a46b6-Abstract-Datasets_and_Benchmarks_Track.html. Authors: Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju.
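
Example

The parameters documented above can be exercised roughly as follows. This is a minimal sketch, not part of PyRIT: `build_fetch_kwargs` and `load_medsafetybench` are hypothetical helpers introduced only for illustration. The up-front check of `subset_name` is an assumption about what is convenient for callers and does not describe how the library itself handles invalid values. Only the import path and the keyword names are taken from the signature on this page.

```python
from pathlib import Path
from typing import Any, Dict, Optional

# Subset names listed in the documented signature.
VALID_SUBSETS = ("train", "test", "generated", "all")


def build_fetch_kwargs(
    subset_name: str = "all",
    cache: bool = True,
    data_home: Optional[Path] = None,
    output_csv_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and mirror the documented defaults."""
    if subset_name not in VALID_SUBSETS:
        raise ValueError(
            f"subset_name must be one of {VALID_SUBSETS}, got {subset_name!r}"
        )
    return {
        "subset_name": subset_name,
        "cache": cache,
        "data_home": data_home,
        "output_csv_path": output_csv_path,
    }


def load_medsafetybench(subset_name: str = "all", **options: Any):
    """Hypothetical wrapper around the documented fetch function."""
    # Imported lazily so this module can be loaded without PyRIT installed.
    from pyrit.datasets import fetch_medsafetybench_dataset

    kwargs = build_fetch_kwargs(subset_name, **options)
    return fetch_medsafetybench_dataset(**kwargs)


# Usage (requires PyRIT; fetching the data presumably needs network access):
#   dataset = load_medsafetybench("test", cache=False)
#   print(type(dataset).__name__)
```

Separating argument validation from the fetch keeps the check testable without the library installed; calling `fetch_medsafetybench_dataset` directly with the same keyword arguments is equally valid.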