pyrit.score.Scorer#
- class Scorer(*, validator: ScorerPromptValidator)[source]#
Bases: ABC
Abstract base class for scorers.
- __init__(*, validator: ScorerPromptValidator)[source]#
Methods

__init__(*, validator)
get_identifier() - Returns an identifier dictionary for the scorer.
get_scorer_metrics(dataset_name[, metrics_type]) - Returns evaluation statistics for the scorer using the dataset_name of the human-labeled dataset that this scorer was run against.
scale_value_float(value, min_value, max_value) - Scales a value from 0 to 1 based on the given min and max values.
score_async(request_response, *[, ...]) - Score the request_response, add the results to the database and return a list of Score objects.
score_image_async(image_path, *[, objective]) - Scores the given image using the chat target.
score_image_batch_async(*, image_paths[, ...])
score_prompts_batch_async(*, request_responses[, ...]) - Score multiple prompts in batches using the provided objectives.
score_response_async(*, response[, ...]) - Score a response using an objective scorer and optional auxiliary scorers.
score_response_multiple_scorers_async(*, response, scorers[, ...]) - Score a response using multiple scorers in parallel.
score_text_async(text, *[, objective]) - Scores the given text based on the task using the chat target.
validate_return_scores(scores) - Validates the scores returned by the scorer.
Attributes
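A minimal usage sketch of the most common entry points. It assumes `scorer` is an already constructed concrete Scorer implementation and that the code runs in an async context (for example, a notebook cell or an `async def` function); the objective string is hypothetical, and the Score fields printed are assumptions.

```python
# Minimal usage sketch (assumes `scorer` is a concrete Scorer implementation
# constructed elsewhere, and that this runs inside an async context).
scores = await scorer.score_text_async(
    "Some model output to evaluate",
    objective="Hypothetical objective the attacker was pursuing",
)
for score in scores:
    print(score.score_value, score.score_rationale)  # Score fields assumed

# The identifier dictionary is useful for logging which scorer produced a score.
print(scorer.get_identifier())
```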
- get_identifier()[source]#
Returns an identifier dictionary for the scorer.
- Returns:
The identifier dictionary.
- Return type:
dict
- get_scorer_metrics(dataset_name: str, metrics_type: MetricsType | None = None)[source]#
Returns evaluation statistics for the scorer using the dataset_name of the human-labeled dataset that this scorer was run against. If you did not evaluate the scorer against your own human-labeled dataset, you can use this method to retrieve metrics based on a pre-existing dataset name, which is often a ‘harm_category’ or an abbreviated version of the ‘objective’. For example, to retrieve metrics for the ‘hate_speech’ harm, pass ‘hate_speech’ as the dataset_name.
The existing metrics can be found in the ‘dataset/score/scorer_evals’ directory within either the ‘harm’ or ‘objective’ subdirectory.
- Parameters:
dataset_name (str) – The name of the dataset on which the scorer evaluation was run. This is used to inform the name of the metrics file to read in the scorer_evals directory.
metrics_type (MetricsType, optional) – The type of metrics to retrieve, either HARM or OBJECTIVE. If not provided, it will default to OBJECTIVE for true/false scorers and HARM for all other scorers.
- Returns:
A ScorerMetrics object containing the saved evaluation statistics for the scorer.
- Return type:
ScorerMetrics
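A minimal sketch of retrieving pre-existing metrics, assuming `scorer` is a concrete Scorer instance; the import path for MetricsType is an assumption, and 'hate_speech' is the example dataset name mentioned above.

```python
from pyrit.score import MetricsType  # import path assumed

# Retrieve saved evaluation statistics for the 'hate_speech' harm dataset.
metrics = scorer.get_scorer_metrics(dataset_name="hate_speech")

# Explicitly request HARM-style metrics instead of relying on the default.
harm_metrics = scorer.get_scorer_metrics(
    dataset_name="hate_speech",
    metrics_type=MetricsType.HARM,
)
print(harm_metrics)
```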
- scale_value_float(value: float, min_value: float, max_value: float) float [source]#
Scales a value from 0 to 1 based on the given min and max values. For example, 3 stars on a 1-to-5 star scale maps to 0.5.
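The scaling appears to be ordinary min-max normalization; a standalone sketch of the equivalent arithmetic (the function below is illustrative, not part of the API):

```python
def min_max_scale(value: float, min_value: float, max_value: float) -> float:
    """Illustrative equivalent of scale_value_float: map [min_value, max_value] onto [0, 1]."""
    return (value - min_value) / (max_value - min_value)

# 3 stars on a 1-to-5 star scale maps to 0.5, matching the docstring example.
assert min_max_scale(3, 1, 5) == 0.5
```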
- async score_async(request_response: PromptRequestResponse, *, objective: str | None = None, role_filter: Literal['system', 'user', 'assistant', 'tool', 'developer'] | None = None, skip_on_error_result: bool = False, infer_objective_from_request: bool = False) list[Score] [source]#
Score the request_response, add the results to the database and return a list of Score objects.
- Parameters:
request_response (PromptRequestResponse) – The request response to be scored.
objective (str, optional) – The objective based on which the request_response should be scored (the original attacker model’s objective).
role_filter (Optional[ChatMessageRole]) – If provided, only score pieces with this role. Defaults to None (no filtering).
skip_on_error_result (bool) – If True, skip scoring pieces that have errors. Defaults to False.
infer_objective_from_request (bool) – If True and objective is empty, attempt to infer the objective from the request. Defaults to False.
- Returns:
A list of Score objects representing the results.
- Return type:
list[Score]
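A minimal sketch, assuming `scorer` is a concrete Scorer, `response` is a PromptRequestResponse obtained elsewhere (for example, from a prompt target), and the code runs in an async context; the objective string is hypothetical.

```python
# Score a single response against a hypothetical objective.
scores = await scorer.score_async(
    response,
    objective="Get the model to reveal a fictional secret",  # hypothetical
    skip_on_error_result=True,
)
for score in scores:
    print(score.score_value, score.score_rationale)  # Score fields assumed
```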
- async score_image_async(image_path: str, *, objective: str | None = None) list[Score] [source]#
Scores the given image using the chat target.
- async score_image_batch_async(*, image_paths: Sequence[str], objectives: Sequence[str] | None = None, batch_size: int = 10) list[Score] [source]#
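A minimal sketch of the image-scoring variants, assuming `scorer` is a concrete, image-capable Scorer, the file paths are hypothetical, and the code runs in an async context.

```python
# Score a single image against a hypothetical objective.
image_scores = await scorer.score_image_async(
    "images/example.png",  # hypothetical path
    objective="Determine whether the image depicts prohibited content",
)

# Score several images in batches of 5, with one objective per image.
batch_scores = await scorer.score_image_batch_async(
    image_paths=["images/a.png", "images/b.png"],  # hypothetical paths
    objectives=[
        "Determine whether the image depicts prohibited content",
        "Determine whether the image contains personal data",
    ],
    batch_size=5,
)
```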
- async score_prompts_batch_async(*, request_responses: Sequence[PromptRequestResponse], objectives: Sequence[str] | None = None, batch_size: int = 10, role_filter: Literal['system', 'user', 'assistant', 'tool', 'developer'] | None = None, skip_on_error_result: bool = False, infer_objective_from_request: bool = False) list[Score] [source]#
Score multiple prompts in batches using the provided objectives.
- Parameters:
request_responses (Sequence[PromptRequestResponse]) – The request responses to be scored.
objectives (Sequence[str]) – The objectives/tasks based on which the prompts should be scored. Must have the same length as request_responses.
batch_size (int) – The maximum batch size for processing prompts. Defaults to 10.
role_filter (Optional[ChatMessageRole]) – If provided, only score pieces with this role. Defaults to None (no filtering).
skip_on_error_result (bool) – If True, skip scoring pieces that have errors. Defaults to False.
infer_objective_from_request (bool) – If True and objective is empty, attempt to infer the objective from the request. Defaults to False.
- Returns:
A flattened list of Score objects from all scored prompts.
- Return type:
list[Score]
- Raises:
ValueError – If objectives is empty or if the number of objectives doesn’t match the number of request_responses.
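A minimal sketch, assuming `scorer` is a concrete Scorer and `responses` is a sequence of PromptRequestResponse objects collected elsewhere; the objectives are hypothetical and must line up one-to-one with the responses.

```python
# Score a batch of responses, filtering to assistant pieces and skipping errors.
all_scores = await scorer.score_prompts_batch_async(
    request_responses=responses,
    objectives=[
        "Objective for the first response",   # hypothetical
        "Objective for the second response",  # hypothetical
    ],
    batch_size=10,
    role_filter="assistant",
    skip_on_error_result=True,
)
print(f"Collected {len(all_scores)} scores")
```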
- async static score_response_async(*, response: PromptRequestResponse, objective_scorer: Scorer | None = None, auxiliary_scorers: List[Scorer] | None = None, role_filter: Literal['system', 'user', 'assistant', 'tool', 'developer'] = 'assistant', objective: str | None = None, skip_on_error_result: bool = True) Dict[str, List[Score]] [source]#
Score a response using an objective scorer and optional auxiliary scorers.
- Parameters:
response (PromptRequestResponse) – Response containing the pieces to score.
objective_scorer (Optional[Scorer]) – The main scorer used to determine success.
auxiliary_scorers (Optional[List[Scorer]]) – List of auxiliary scorers to apply.
role_filter (ChatMessageRole) – Only score pieces with this role (default: "assistant").
objective (Optional[str]) – Task/objective for scoring context.
skip_on_error_result (bool) – If True, skip scoring pieces that have errors (default: True).
- Returns:
Dictionary with keys auxiliary_scores and objective_scores, containing lists of scores from each type of scorer.
- Return type:
Dict[str, List[Score]]
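A minimal sketch, assuming `objective_scorer`, `aux_scorer_a`, `aux_scorer_b`, and `response` already exist and the code runs in an async context; the dictionary keys follow the description above.

```python
from pyrit.score import Scorer

# Combine one objective scorer with auxiliary scorers on a single response.
result = await Scorer.score_response_async(
    response=response,
    objective_scorer=objective_scorer,
    auxiliary_scorers=[aux_scorer_a, aux_scorer_b],
    objective="Hypothetical attack objective",
)
objective_scores = result["objective_scores"]
auxiliary_scores = result["auxiliary_scores"]
```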
- async static score_response_multiple_scorers_async(*, response: PromptRequestResponse, scorers: List[Scorer], role_filter: Literal['system', 'user', 'assistant', 'tool', 'developer'] = 'assistant', objective: str | None = None, skip_on_error_result: bool = True) List[Score] [source]#
Score a response using multiple scorers in parallel.
This method applies each scorer to the first scorable response piece (filtered by role and error), and returns all scores. This is typically used for auxiliary scoring where all results are needed.
- Parameters:
response (PromptRequestResponse) – The response containing pieces to score.
scorers (List[Scorer]) – List of scorers to apply.
role_filter (ChatMessageRole) – Only score pieces with this role (default: “assistant”).
objective (Optional[str]) – Optional objective description for scoring context.
skip_on_error_result (bool) – If True, skip scoring pieces that have errors (default: True).
- Returns:
All scores from all scorers.
- Return type:
List[Score]
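A minimal sketch, assuming `response` and the auxiliary scorer instances already exist and the code runs in an async context.

```python
from pyrit.score import Scorer

# Run several auxiliary scorers in parallel over one response.
aux_scores = await Scorer.score_response_multiple_scorers_async(
    response=response,
    scorers=[aux_scorer_a, aux_scorer_b],
    objective="Hypothetical attack objective",
)
for score in aux_scores:
    print(score.score_value)  # Score field assumed
```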