pyrit.models.SeedDataset#
- class SeedDataset(*, seeds: Sequence[Dict[str, Any]] | Sequence[Seed] | None = None, data_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'binary_path', 'url', 'reasoning', 'error', 'function_call', 'tool_call', 'function_call_output'] | None = 'text', name: str | None = None, dataset_name: str | None = None, harm_categories: Sequence[str] | None = None, description: str | None = None, authors: Sequence[str] | None = None, groups: Sequence[str] | None = None, source: str | None = None, date_added: datetime | None = None, added_by: str | None = None, seed_type: Literal['prompt', 'objective', 'simulated_conversation'] | None = None, is_objective: bool = False)[source]#
Bases: YamlLoadable
SeedDataset manages seed prompts plus optional top-level defaults. Prompts are stored as a Sequence[Seed], so references to prompt properties are straightforward (e.g. ds.seeds[0].value).
- __init__(*, seeds: Sequence[Dict[str, Any]] | Sequence[Seed] | None = None, data_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'binary_path', 'url', 'reasoning', 'error', 'function_call', 'tool_call', 'function_call_output'] | None = 'text', name: str | None = None, dataset_name: str | None = None, harm_categories: Sequence[str] | None = None, description: str | None = None, authors: Sequence[str] | None = None, groups: Sequence[str] | None = None, source: str | None = None, date_added: datetime | None = None, added_by: str | None = None, seed_type: Literal['prompt', 'objective', 'simulated_conversation'] | None = None, is_objective: bool = False)[source]#
Initialize the dataset. Typically, you’ll call from_dict or from_yaml_file so that top-level defaults are merged into each seed. If you’re passing seeds directly, they can be either a list of Seed objects or seed dictionaries (which then get converted to Seed objects).
- Parameters:
seeds – List of seed dictionaries or Seed objects.
data_type – Default data type for seeds.
name – Name of the dataset.
dataset_name – Dataset name for categorization.
harm_categories – List of harm categories.
description – Description of the dataset.
authors – List of authors.
groups – List of groups.
source – Source of the dataset.
date_added – Date when the dataset was added.
added_by – User who added the dataset.
seed_type – The type of seeds in this dataset ("prompt", "objective", or "simulated_conversation").
is_objective – Deprecated in 0.13.0. Use seed_type="objective" instead.
Methods
__init__(*[, seeds, data_type, name, ...])Initialize the dataset.
from_dict(data)Builds a SeedDataset by merging top-level defaults into each item in 'seeds'.
from_yaml_file(file)Create a new object from a YAML file.
get_random_values(*, number[, harm_categories])Extracts and returns a list of random prompt values from the dataset.
get_values(*[, first, last, harm_categories])Extracts and returns a list of prompt values from the dataset.
group_seed_prompts_by_prompt_group_id(seeds)Groups the given list of Seeds by their prompt_group_id and creates SeedGroup or SeedAttackGroup instances.
render_template_value(**kwargs)Renders self.value as a template, applying provided parameters in kwargs.
Attributes
Returns the seeds grouped by their prompt_group_id.
- classmethod from_dict(data: Dict[str, Any]) SeedDataset[source]#
Builds a SeedDataset by merging top-level defaults into each item in ‘seeds’.
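The merge step can be pictured with plain dictionaries. The following is a minimal sketch, not the library's implementation: `merge_defaults` is a hypothetical helper, and it assumes that a key set on an individual seed overrides the top-level default of the same name.

```python
from typing import Any, Dict, List


def merge_defaults(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Copy every top-level key except 'seeds' into each seed dictionary."""
    defaults = {key: value for key, value in data.items() if key != "seeds"}
    # Assumption: values set on the seed itself win over top-level defaults.
    return [{**defaults, **seed} for seed in data.get("seeds", [])]


example = {
    "dataset_name": "demo",
    "data_type": "text",
    "seeds": [
        {"value": "first prompt"},
        {"value": "https://example.com", "data_type": "url"},
    ],
}
merged = merge_defaults(example)
```

Here the first seed inherits both defaults, while the second keeps its own `data_type` and inherits only `dataset_name`.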
- get_random_values(*, number: Annotated[int, Gt(gt=0)], harm_categories: Sequence[str] | None = None) Sequence[str][source]#
Extracts and returns a list of random prompt values from the dataset.
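Random selection of this kind can be sketched with the standard library. `pick_random_values` below is a hypothetical stand-in that works on a plain list of strings; clamping `number` to the number of available values is an assumption, not documented behavior.

```python
import random
from typing import List, Sequence


def pick_random_values(values: Sequence[str], *, number: int) -> List[str]:
    """Return up to `number` distinct values chosen at random."""
    if number <= 0:
        raise ValueError("number must be greater than 0")
    # Assumption: asking for more values than exist returns all of them.
    return random.sample(list(values), min(number, len(values)))


pool = ["a", "b", "c", "d"]
picked = pick_random_values(pool, number=2)
everything = pick_random_values(pool, number=10)
```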
- get_values(*, first: Annotated[int, Gt(gt=0)] | None = None, last: Annotated[int, Gt(gt=0)] | None = None, harm_categories: Sequence[str] | None = None) Sequence[str][source]#
Extracts and returns a list of prompt values from the dataset. By default, returns all of them.
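- Parameters:
first – If given, include the values of the first N seeds.
last – If given, include the values of the last N seeds.
harm_categories – If given, include only seeds tagged with these harm categories.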
- Returns:
A list of prompt values.
- Return type:
Sequence[str]
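One way to read the `first`, `last`, and `harm_categories` arguments is sketched below. `ValueSeed` and `select_values` are hypothetical names, and two behaviors are assumptions: a seed matches when it shares at least one harm category with the filter, and overlapping `first`/`last` ranges return every value once rather than duplicating any.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class ValueSeed:
    """Hypothetical stand-in for a seed with a value and harm categories."""
    value: str
    harm_categories: List[str] = field(default_factory=list)


def select_values(
    seeds: Sequence[ValueSeed],
    *,
    first: Optional[int] = None,
    last: Optional[int] = None,
    harm_categories: Optional[Sequence[str]] = None,
) -> List[str]:
    # Assumption: a seed matches if it shares any category with the filter.
    values = [
        seed.value
        for seed in seeds
        if not harm_categories or set(harm_categories) & set(seed.harm_categories)
    ]
    if first is None and last is None:
        return values  # default: everything that passed the filter
    if first is not None and last is not None and first + last >= len(values):
        return values  # ranges overlap, so return each value once
    head = values[:first] if first is not None else []
    tail = values[-last:] if last is not None else []
    return head + tail


demo_seeds = [
    ValueSeed("a", ["x"]),
    ValueSeed("b"),
    ValueSeed("c", ["x", "y"]),
    ValueSeed("d"),
    ValueSeed("e", ["y"]),
]
all_values = select_values(demo_seeds)
ends = select_values(demo_seeds, first=1, last=1)
only_x = select_values(demo_seeds, harm_categories=["x"])
```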
- static group_seed_prompts_by_prompt_group_id(seeds: Sequence[Seed]) Sequence[SeedGroup][source]#
Groups the given list of Seeds by their prompt_group_id and creates SeedGroup or SeedAttackGroup instances.
For each group, this method first attempts to create a SeedAttackGroup (which has attack-specific properties like objective). If validation fails, it falls back to a basic SeedGroup.
- Parameters:
seeds – A list of Seed objects.
- Returns:
A list of SeedGroup or SeedAttackGroup objects, with seeds grouped by prompt_group_id. Each group will be ordered by the sequence number of the seeds, if available.
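The grouping and ordering described above can be sketched as follows. `GroupedSeed` and `group_by_prompt_group_id` are hypothetical; the sketch returns plain lists rather than SeedGroup objects, does not model the SeedAttackGroup fallback, and assumes that a seed without a sequence number sorts as 0.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class GroupedSeed:
    """Hypothetical stand-in for a seed with a group id and sequence."""
    value: str
    prompt_group_id: str
    sequence: Optional[int] = None


def group_by_prompt_group_id(seeds: Sequence[GroupedSeed]) -> List[List[GroupedSeed]]:
    buckets: Dict[str, List[GroupedSeed]] = defaultdict(list)
    for seed in seeds:
        buckets[seed.prompt_group_id].append(seed)
    # Assumption: a missing sequence number is treated as 0. sorted() is
    # stable, so seeds with equal sequence keep their input order.
    return [
        sorted(group, key=lambda s: s.sequence if s.sequence is not None else 0)
        for group in buckets.values()
    ]


demo = [
    GroupedSeed("second turn", "g1", sequence=1),
    GroupedSeed("standalone", "g2"),
    GroupedSeed("first turn", "g1", sequence=0),
]
groups = group_by_prompt_group_id(demo)
group_values = [[seed.value for seed in group] for group in groups]
```

Groups come back in the order their ids first appear in the input, because dictionaries preserve insertion order.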
- property objectives: Sequence[SeedObjective]#
- property prompts: Sequence[SeedPrompt]#
- render_template_value(**kwargs: object) None[source]#
Renders self.value as a template, applying provided parameters in kwargs.
- Parameters:
kwargs – Key-value pairs to replace in the SeedDataset value.
- Returns:
None
- Raises:
ValueError – If parameters are missing or invalid in the template.
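The render-or-raise contract can be illustrated with the standard library. This page does not state which template engine or placeholder syntax the library uses, so the sketch below substitutes `string.Template` and its `$name` syntax purely as a stand-in; `render_value` is a hypothetical helper that returns the rendered text instead of updating an object in place.

```python
from string import Template


def render_value(template_text: str, **kwargs: object) -> str:
    """Fill `$name` placeholders, raising ValueError if one is missing."""
    try:
        return Template(template_text).substitute(**kwargs)
    except KeyError as exc:
        # string.Template reports a missing parameter as KeyError; convert it
        # to match the ValueError contract described above.
        raise ValueError(f"Missing template parameter: {exc}") from exc


rendered = render_value("Tell me about $topic", topic="cats")

try:
    render_value("Tell me about $topic")
    missing_raised = False
except ValueError:
    missing_raised = True
```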