pyrit.score.ScorerMetrics

class ScorerMetrics(num_responses: int, num_human_raters: int, *, num_scorer_trials: int = 1, dataset_name: str | None = None, dataset_version: str | None = None, trial_scores: ndarray | None = None, average_score_time_seconds: float = 0.0)

Bases: object

Base dataclass for storing scorer evaluation metrics.

This class provides methods for serializing metrics to JSON and loading them from JSON files.

Parameters:
  • num_responses (int) – Total number of responses evaluated.

  • num_human_raters (int) – Number of human raters who scored the responses.

  • num_scorer_trials (int) – Number of times the model scorer was run. Defaults to 1.

  • dataset_name (str, optional) – Name of the dataset used for evaluation.

  • dataset_version (str, optional) – Version of the dataset for reproducibility.

  • trial_scores (np.ndarray, optional) – Raw scores from each trial for debugging.

  • average_score_time_seconds (float) – Average time in seconds to score a single item. Defaults to 0.0.

__init__(num_responses: int, num_human_raters: int, *, num_scorer_trials: int = 1, dataset_name: str | None = None, dataset_version: str | None = None, trial_scores: ndarray | None = None, average_score_time_seconds: float = 0.0) → None

Methods

__init__(num_responses, num_human_raters, *)

from_json(file_path)

Load the metrics from a JSON file.

to_json()

Convert the metrics to a JSON string.

Attributes

average_score_time_seconds: float = 0.0
dataset_name: str | None = None
dataset_version: str | None = None

classmethod from_json(file_path: str | Path) → T

Load the metrics from a JSON file.

Parameters:

file_path (Union[str, Path]) – The path to the JSON file.

Returns:

An instance of ScorerMetrics with the loaded data.

Return type:

ScorerMetrics

Raises:

FileNotFoundError – If the specified file does not exist.
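The documented contract (accepts `str` or `Path`, returns an instance, raises `FileNotFoundError` for a missing file) can be sketched as follows. `MetricsSketch` is a hypothetical stand-in with a reduced field set; how the real method parses the file is not described on this page, so the body shown is an assumption.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


@dataclass
class MetricsSketch:
    # Hypothetical stand-in; the real from_json may parse the file differently.
    num_responses: int
    num_human_raters: int
    num_scorer_trials: int = 1
    dataset_name: Optional[str] = None

    @classmethod
    def from_json(cls, file_path: Union[str, Path]) -> "MetricsSketch":
        path = Path(file_path)
        if not path.exists():
            raise FileNotFoundError(f"File not found: {path}")
        with path.open("r", encoding="utf-8") as f:
            # Assumes the JSON keys match the dataclass field names.
            return cls(**json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "metrics.json"
    good.write_text(
        json.dumps({"num_responses": 50, "num_human_raters": 2}),
        encoding="utf-8",
    )
    loaded = MetricsSketch.from_json(good)           # Path input
    loaded_str = MetricsSketch.from_json(str(good))  # str input

    # A path that does not exist triggers the documented exception.
    try:
        MetricsSketch.from_json(Path(tmp) / "missing.json")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Callers that load metrics from an optional cache file would typically catch `FileNotFoundError` in this way and fall back to recomputing the metrics.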

num_human_raters: int
num_responses: int
num_scorer_trials: int = 1

to_json() → str

Convert the metrics to a JSON string.

Returns:

The JSON string representation of the metrics.

Return type:

str
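As a rough illustration of serializing a metrics dataclass to a JSON string, the sketch below uses `dataclasses.asdict` with `json.dumps`. `MetricsSketch` is a hypothetical stand-in, and this is not necessarily how PyRIT implements `to_json`. `trial_scores` is omitted because the standard `json` encoder cannot serialize a NumPy array directly; how the real class handles that field is not stated here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MetricsSketch:
    # Hypothetical stand-in with a reduced, JSON-friendly field set.
    num_responses: int
    num_human_raters: int
    num_scorer_trials: int = 1
    dataset_name: Optional[str] = None
    average_score_time_seconds: float = 0.0

    def to_json(self) -> str:
        # asdict() turns the dataclass into a plain dict that json can encode.
        return json.dumps(asdict(self))


payload = MetricsSketch(num_responses=50, num_human_raters=2).to_json()

# Parsing the string back yields a dict keyed by field name.
restored = json.loads(payload)
```

In this sketch, fields left at `None` are written as JSON `null` and come back as `None` when parsed.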

trial_scores: ndarray | None = None