pyrit.score.Scorer#
- class Scorer[source]#
Bases: ABC
Abstract base class for scorers.
- __init__()#
Methods
__init__()
get_identifier()
Returns an identifier dictionary for the scorer.
get_scorer_metrics(dataset_name[, metrics_type])
Returns evaluation statistics for the scorer using the dataset_name of the human-labeled dataset that this scorer was run against.
scale_value_float(value, min_value, max_value)
Scales a value from 0 to 1 based on the given min and max values.
score_async(request_response, *[, task])
Score the request_response, add the results to the database and return a list of Score objects.
score_image_async(image_path, *[, task])
Scores the given image using the chat target.
score_prompts_with_tasks_batch_async(*, request_responses, tasks[, batch_size])
Scores a batch of request pieces against the corresponding tasks.
score_response_async(*, response, scorers[, ...])
Score a response using multiple scorers in parallel.
score_response_select_first_success_async(*, response, scorers[, ...])
Score response pieces sequentially until finding a successful score.
score_response_with_objective_async(*, response[, ...])
Score a response using both auxiliary and objective scorers.
score_responses_inferring_tasks_batch_async(*, request_responses[, batch_size])
Scores a batch of responses (ignores non-assistant messages).
score_text_async(text, *[, task])
Scores the given text based on the task using the chat target.
score_text_batch_async(*, texts[, tasks, batch_size])
Scores a batch of texts, optionally against corresponding tasks.
validate(request_response, *[, task])
Validates the request_response piece to score.
Attributes
- get_identifier()[source]#
Returns an identifier dictionary for the scorer.
- Returns:
The identifier dictionary.
- Return type:
dict
- get_scorer_metrics(dataset_name: str, metrics_type: MetricsType | None = None)[source]#
Returns evaluation statistics for the scorer using the dataset_name of the human-labeled dataset that this scorer was run against. If you did not evaluate the scorer against your own human-labeled dataset, you can use this method to retrieve metrics for a pre-existing dataset name, which is often a 'harm_category' or an abbreviated version of the 'objective'. For example, to retrieve metrics for the 'hate_speech' harm, you would pass 'hate_speech' as the dataset_name.
The existing metrics can be found in the ‘dataset/score/scorer_evals’ directory within either the ‘harm’ or ‘objective’ subdirectory.
- Parameters:
dataset_name (str) – The name of the dataset on which the scorer evaluation was run. This is used to inform the name of the metrics file to read in the scorer_evals directory.
metrics_type (MetricsType, optional) – The type of metrics to retrieve, either HARM or OBJECTIVE. If not provided, it will default to OBJECTIVE for true/false scorers and HARM for all other scorers.
- Returns:
A ScorerMetrics object containing the saved evaluation statistics for the scorer.
- Return type:
ScorerMetrics
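A minimal usage sketch, assuming `scorer` is an already-constructed concrete Scorer and that a 'hate_speech' evaluation exists under scorer_evals; the MetricsType import path is an assumption, not part of this page:

```python
from pyrit.score import MetricsType  # import path is an assumption

# Retrieve pre-existing harm metrics for the "hate_speech" dataset.
metrics = scorer.get_scorer_metrics("hate_speech", metrics_type=MetricsType.HARM)
print(metrics)  # ScorerMetrics object with the saved evaluation statistics
```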
- scale_value_float(value: float, min_value: float, max_value: float) float [source]#
Scales a value from 0 to 1 based on the given min and max values. For example, 3 stars on a 1-to-5 star scale scales to 0.5.
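A quick sketch of the documented star-rating example, assuming simple linear min-max scaling (which is what the example implies) and that `scorer` is a concrete Scorer instance:

```python
# 3 stars on a 1-5 scale -> (3 - 1) / (5 - 1) = 0.5
normalized = scorer.scale_value_float(value=3.0, min_value=1.0, max_value=5.0)
assert normalized == 0.5
```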
- abstract async score_async(request_response: PromptRequestPiece, *, task: str | None = None) list[Score] [source]#
Score the request_response, add the results to the database and return a list of Score objects.
- Parameters:
request_response (PromptRequestPiece) – The request response to be scored.
task (str) – The task against which the text should be scored (the original attacker model's objective).
- Returns:
A list of Score objects representing the results.
- Return type:
list[Score]
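A usage sketch, run inside an async context; the PromptRequestPiece import path and constructor arguments are assumptions about the wider PyRIT API:

```python
from pyrit.models import PromptRequestPiece  # import path is an assumption

# Score one assistant response piece against the attacker's objective.
piece = PromptRequestPiece(role="assistant", original_value="...model output...")
scores = await scorer.score_async(piece, task="Convince the model to reveal its system prompt")
for score in scores:
    print(score.get_value())
```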
- async score_image_async(image_path: str, *, task: str | None = None) list[Score] [source]#
Scores the given image using the chat target.
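A brief sketch (async context); it assumes the scorer's chat target can consume images and that the hypothetical local path exists:

```python
image_scores = await scorer.score_image_async(
    "results/generated_image.png",  # hypothetical local path
    task="Generate violent imagery",
)
```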
- async score_prompts_with_tasks_batch_async(*, request_responses: Sequence[PromptRequestPiece], tasks: Sequence[str], batch_size: int = 10) list[Score] [source]#
- async static score_response_async(*, response: PromptRequestResponse, scorers: List[Scorer], role_filter: Literal['system', 'user', 'assistant'] = 'assistant', task: str | None = None, skip_on_error: bool = True) List[Score] [source]#
Score a response using multiple scorers in parallel.
This method runs all scorers on all filtered response pieces concurrently for maximum performance. Typically used for auxiliary scoring, where all results are collected (e.g. for metrics) rather than used to determine attack success.
- Parameters:
response – PromptRequestResponse containing pieces to score
scorers – List of scorers to apply
role_filter – Only score pieces with this role (default: “assistant”)
task – Optional task description for scoring context
skip_on_error – If True, skip scoring pieces that have errors (default: True)
- Returns:
List of all scores from all scorers
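A sketch of parallel auxiliary scoring inside an async context; `response` is a PromptRequestResponse built elsewhere, and `refusal_scorer` / `toxicity_scorer` stand in for any concrete Scorer instances:

```python
from pyrit.score import Scorer

all_scores = await Scorer.score_response_async(
    response=response,
    scorers=[refusal_scorer, toxicity_scorer],  # hypothetical concrete scorers
    role_filter="assistant",
    task="the original attack objective",
)
print(f"Collected {len(all_scores)} scores across all scorers")
```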
- async static score_response_select_first_success_async(*, response: PromptRequestResponse, scorers: List[Scorer], role_filter: Literal['system', 'user', 'assistant'] = 'assistant', task: str | None = None, skip_on_error: bool = True) Score | None [source]#
Score response pieces sequentially until finding a successful score.
This method processes filtered response pieces one by one. For each piece, it runs all scorers in parallel, then checks the results for a successful score (where score.get_value() is truthy). If no successful score is found, it returns the first score as a failure indicator.
- Parameters:
response – PromptRequestResponse containing pieces to score
scorers – List of scorers to use for evaluation
role_filter – Only score pieces with this role (default: “assistant”)
task – Optional task description for scoring context
skip_on_error – If True, skip scoring pieces that have errors (default: True)
- Returns:
The first successful score, or the first score if no successful score was found, or None if no scores were produced
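A sketch of the sequential variant (async context, reusing the same hypothetical `response` and scorer objects as in the earlier sketches):

```python
first_success = await Scorer.score_response_select_first_success_async(
    response=response,
    scorers=[objective_scorer],  # hypothetical concrete scorer
)
if first_success is not None and first_success.get_value():
    print("Objective achieved")
```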
- async static score_response_with_objective_async(*, response: PromptRequestResponse, auxiliary_scorers: List[Scorer] | None = None, objective_scorers: List[Scorer] | None = None, role_filter: Literal['system', 'user', 'assistant'] = 'assistant', task: str | None = None, skip_on_error: bool = True) Dict[str, List[Score]] [source]#
Score a response using both auxiliary and objective scorers.
This method runs auxiliary scorers for collecting metrics and objective scorers for determining success. All scorers are run asynchronously for performance.
- Parameters:
response (PromptRequestResponse) – Response containing pieces to score
auxiliary_scorers (Optional[List[Scorer]]) – List of auxiliary scorers to apply
objective_scorers (Optional[List[Scorer]]) – List of objective scorers to apply
role_filter (ChatMessageRole) – Only score pieces with this role (default: assistant)
task (Optional[str]) – Optional task description for scoring context
skip_on_error (bool) – If True, skip scoring pieces that have errors (default: True)
- Returns:
- Dictionary with keys auxiliary_scores and objective_scores, containing lists of scores from each type of scorer.
- Return type:
Dict[str, List[Score]]
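A combined-scoring sketch (async context, hypothetical scorer instances); the dictionary keys are the documented auxiliary_scores and objective_scores:

```python
results = await Scorer.score_response_with_objective_async(
    response=response,
    auxiliary_scorers=[toxicity_scorer],   # collected for metrics only
    objective_scorers=[objective_scorer],  # used to determine success
    task="the original attack objective",
)
auxiliary_scores = results["auxiliary_scores"]
objective_scores = results["objective_scores"]
```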
- async score_responses_inferring_tasks_batch_async(*, request_responses: Sequence[PromptRequestPiece], batch_size: int = 10) list[Score] [source]#
Scores a batch of responses (ignores non-assistant messages).
Where possible, the most recent request is sent as the task for each response; if the task cannot be inferred (e.g. the request is not text), None is sent instead.
For more control, use score_prompts_with_tasks_batch_async.
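A batch-scoring sketch (async context); `response_pieces` is a hypothetical Sequence[PromptRequestPiece] retrieved elsewhere, e.g. from memory:

```python
batch_scores = await scorer.score_responses_inferring_tasks_batch_async(
    request_responses=response_pieces,
    batch_size=10,
)
```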
- async score_text_async(text: str, *, task: str | None = None) list[Score] [source]#
Scores the given text based on the task using the chat target.
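A minimal text-scoring sketch (async context, `scorer` being any concrete Scorer instance):

```python
text_scores = await scorer.score_text_async(
    "I cannot help with that request.",
    task="Get step-by-step instructions for picking a lock",
)
```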
- async score_text_batch_async(*, texts: Sequence[str], tasks: Sequence[str] | None = None, batch_size: int = 10) list[Score] [source]#
- abstract validate(request_response: PromptRequestPiece, *, task: str | None = None)[source]#
Validates the request_response piece to score, since some scorers may require specific PromptRequestPiece types or values.
- Parameters:
request_response (PromptRequestPiece) – The request response to be validated.
task (str) – The task against which the text should be scored (the original attacker model's objective).
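Because score_async and validate are the abstract members, a custom scorer only needs to implement those two. The sketch below is illustrative only: the Score constructor arguments and the converted_value / converted_value_data_type / id attributes of PromptRequestPiece are assumptions about the wider PyRIT models, not part of this page.

```python
from pyrit.models import PromptRequestPiece, Score  # import paths are assumptions
from pyrit.score import Scorer


class KeywordScorer(Scorer):
    """Hypothetical true/false scorer: did a keyword appear in the response text?"""

    def __init__(self, keyword: str) -> None:
        super().__init__()
        self._keyword = keyword.lower()

    async def score_async(self, request_response: PromptRequestPiece, *, task: str | None = None) -> list[Score]:
        self.validate(request_response, task=task)
        found = self._keyword in request_response.converted_value.lower()  # attribute name is an assumption
        score = Score(  # constructor arguments are assumptions
            score_value=str(found),
            score_value_description=f"True if '{self._keyword}' appears in the response.",
            score_type="true_false",
            score_category="keyword",
            score_rationale="Simple substring check.",
            score_metadata=None,
            prompt_request_response_id=request_response.id,
            scorer_class_identifier=self.get_identifier(),
        )
        return [score]

    def validate(self, request_response: PromptRequestPiece, *, task: str | None = None) -> None:
        if request_response.converted_value_data_type != "text":  # attribute name is an assumption
            raise ValueError("KeywordScorer only supports text request pieces.")
```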