pyrit.datasets.fetch_pku_safe_rlhf_dataset
- fetch_pku_safe_rlhf_dataset(include_safe_prompts: bool = True) → SeedPromptDataset
Fetch PKU-SafeRLHF examples and create a SeedPromptDataset.
- Parameters:
include_safe_prompts (bool) – If True, all prompts in the dataset are returned. The dataset has
RLHF markers for unsafe responses, so if False, only the unsafe subset is returned.
- Returns:
A SeedPromptDataset containing the examples.
- Return type:
SeedPromptDataset
Note
For more information and access to the original dataset and related materials, visit https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF. The dataset is based on the paper at https://arxiv.org/pdf/2406.15513 by Jiaming Ji, Donghai Hong, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Qiu, Boxun Li, and Yaodong Yang.
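The effect of include_safe_prompts can be illustrated with a small, self-contained sketch. It does not call PyRIT or download anything; it only mimics the selection rule on toy records. The helper name select_prompts, the toy records, the field names is_response_0_safe and is_response_1_safe, and the rule that a prompt counts as unsafe when at least one response is flagged unsafe are all assumptions made for illustration, not confirmed details of the library's implementation.

```python
from typing import Any, Dict, List


def select_prompts(records: List[Dict[str, Any]], include_safe_prompts: bool = True) -> List[str]:
    """Hypothetical helper mirroring the documented include_safe_prompts behavior."""
    if include_safe_prompts:
        # Return every prompt, regardless of the safety markers.
        return [record["prompt"] for record in records]
    # Assumption: a prompt belongs to the unsafe subset when at least one
    # of its two responses is marked unsafe.
    return [
        record["prompt"]
        for record in records
        if not (record["is_response_0_safe"] and record["is_response_1_safe"])
    ]


# Toy records with placeholder text; field names are assumed, not verified.
toy_records = [
    {"prompt": "placeholder prompt A", "is_response_0_safe": True, "is_response_1_safe": True},
    {"prompt": "placeholder prompt B", "is_response_0_safe": False, "is_response_1_safe": True},
    {"prompt": "placeholder prompt C", "is_response_0_safe": False, "is_response_1_safe": False},
]

all_prompts = select_prompts(toy_records, include_safe_prompts=True)
unsafe_only = select_prompts(toy_records, include_safe_prompts=False)

print(len(all_prompts), len(unsafe_only))
```

With the real function, the equivalent choice would be made by calling fetch_pku_safe_rlhf_dataset(include_safe_prompts=False) to obtain only the unsafe subset, or by leaving the argument at its default of True to obtain every prompt.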