pyrit.score.HarmScorerMetrics

class HarmScorerMetrics(mean_absolute_error: float, mae_standard_error: float, t_statistic: float, p_value: float, krippendorff_alpha_combined: float, krippendorff_alpha_humans: float | None = None, krippendorff_alpha_model: float | None = None)

Bases: ScorerMetrics

Metrics for evaluating a harm scorer against a HumanLabeledDataset.

Parameters:
  • mean_absolute_error (float) – The mean absolute error between the model scores and the gold (human-labeled) scores.

  • mae_standard_error (float) – The standard error of the mean absolute error. This can be used to calculate a confidence interval for the mean absolute error.

  • t_statistic (float) – The t-statistic for the one-sample t-test comparing model scores to human scores with a null hypothesis that the mean difference is 0. A high positive t-statistic (along with a low p-value) indicates that the model scores are typically higher than the human scores.

  • p_value (float) – The p-value for the one-sample t-test above. It is the probability of obtaining a difference in means at least as extreme as the observed difference, assuming the null hypothesis is true.

  • krippendorff_alpha_combined (float) – Krippendorff’s alpha for the reliability data, which includes both human and model scores. This measures the agreement among all the human raters and model scoring trials. It ranges from -1.0 to 1.0, where 1.0 indicates perfect agreement, 0.0 indicates agreement no better than chance, and negative values indicate systematic disagreement.

  • krippendorff_alpha_humans (float, Optional) – Krippendorff’s alpha for human scores, if there are multiple human raters. This measures the agreement between human raters.

  • krippendorff_alpha_model (float, Optional) – Krippendorff’s alpha for model scores, if there are multiple model scoring trials. This measures the agreement between model scoring trials.
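
To make the first three parameters concrete, the following is a minimal sketch of how such values could be derived from paired scores. It is not PyRIT's implementation: the score lists are made-up example data, the use of the sample standard deviation is an assumption, and the 95% interval uses a normal approximation (the 1.96 multiplier) rather than a t-distribution.

```python
import math
import statistics

# Hypothetical example data (not from PyRIT): harm scores in [0, 1]
# for five responses, one human (gold) score and one model score each.
human_scores = [0.0, 0.25, 0.5, 0.75, 1.0]
model_scores = [0.25, 0.25, 0.75, 0.5, 1.0]

# Signed differences (model minus human) and their absolute values.
diffs = [m - h for m, h in zip(model_scores, human_scores)]
abs_errors = [abs(d) for d in diffs]
n = len(diffs)

# Mean absolute error and its standard error.
# Assumption: sample standard deviation (n - 1 in the denominator).
mean_absolute_error = statistics.mean(abs_errors)
mae_standard_error = statistics.stdev(abs_errors) / math.sqrt(n)

# Approximate 95% confidence interval for the MAE (normal approximation).
ci_low = mean_absolute_error - 1.96 * mae_standard_error
ci_high = mean_absolute_error + 1.96 * mae_standard_error

# One-sample t-statistic on the signed differences, with the null
# hypothesis that the mean difference is 0. A positive value means the
# model scores are higher than the human scores on average.
t_statistic = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

print(f"MAE = {mean_absolute_error:.3f} +/- {1.96 * mae_standard_error:.3f}")
print(f"t = {t_statistic:.3f}")
```

The p-value would then come from the t-distribution with n - 1 degrees of freedom, which the standard library does not provide; a statistics package would be needed for that step.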

__init__(mean_absolute_error: float, mae_standard_error: float, t_statistic: float, p_value: float, krippendorff_alpha_combined: float, krippendorff_alpha_humans: float | None = None, krippendorff_alpha_model: float | None = None) -> None

Methods

__init__(mean_absolute_error, ...[, ...])

from_json(file_path)

Load the metrics from a JSON file.

to_json()

Convert the metrics to a JSON string.
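
The round trip these two methods describe can be illustrated with a stand-in class. This is only a sketch: `HarmMetricsSketch` is a hypothetical dataclass that mirrors the documented field names, it is not the PyRIT class, and the metric values are invented. It assumes that `to_json` returns a JSON string and that `from_json` reads a file path, as the summaries above state; how PyRIT implements either is not shown here.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class HarmMetricsSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    mean_absolute_error: float
    mae_standard_error: float
    t_statistic: float
    p_value: float
    krippendorff_alpha_combined: float
    krippendorff_alpha_humans: Optional[float] = None
    krippendorff_alpha_model: Optional[float] = None

    def to_json(self) -> str:
        # Serialize every field, including the optional ones (as null).
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, file_path: str) -> "HarmMetricsSketch":
        # Read the JSON object from disk and rebuild the instance.
        with open(file_path, encoding="utf-8") as f:
            return cls(**json.load(f))


# Illustrative values only.
metrics = HarmMetricsSketch(
    mean_absolute_error=0.15,
    mae_standard_error=0.06,
    t_statistic=0.53,
    p_value=0.62,
    krippendorff_alpha_combined=0.8,
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metrics.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(metrics.to_json())
    restored = HarmMetricsSketch.from_json(path)
```

Because the optional alphas default to None, they serialize as JSON null and come back as None when no multi-rater or multi-trial data was available.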

Attributes

krippendorff_alpha_combined: float
krippendorff_alpha_humans: float | None = None
krippendorff_alpha_model: float | None = None
mae_standard_error: float
mean_absolute_error: float
p_value: float
t_statistic: float