Scoring functionality for evaluating AI model responses across various dimensions including harm detection, objective completion, and content classification.
Functions¶
create_conversation_scorer¶
create_conversation_scorer(scorer: Scorer, validator: Optional[ScorerPromptValidator] = None) → Scorer
Create a ConversationScorer that inherits from the same type as the wrapped scorer.
This factory dynamically creates a ConversationScorer class that inherits from the wrapped scorer’s base class (FloatScaleScorer or TrueFalseScorer), ensuring the returned scorer is an instance of both ConversationScorer and the wrapped scorer’s type.
| Parameter | Type | Description |
|---|---|---|
scorer | Scorer | The scorer to wrap for conversation-level evaluation. Must be an instance of FloatScaleScorer or TrueFalseScorer. |
validator | Optional[ScorerPromptValidator] | Optional validator override. If not provided, uses the wrapped scorer’s validator. Defaults to None. |
Returns:
Scorer — A ConversationScorer instance that is also an instance of the wrapped scorer’s type.
Raises:
ValueError — If the scorer is not an instance of FloatScaleScorer or TrueFalseScorer.
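The dynamic-inheritance behavior described above can be sketched with plain Python. All classes here are illustrative stand-ins, not the real PyRIT types; the point is that `type()` builds a wrapper class inheriting from the wrapped scorer's own base, so `isinstance` checks pass for both:

```python
# Illustrative sketch only: stand-in classes, not the real PyRIT types.
class FloatScaleScorer:
    pass

class TrueFalseScorer:
    pass

class MyTrueFalseScorer(TrueFalseScorer):
    pass

def create_conversation_scorer(scorer):
    for base in (FloatScaleScorer, TrueFalseScorer):
        if isinstance(scorer, base):
            # Dynamically build a class inheriting from the wrapped
            # scorer's base so isinstance checks pass for both types.
            conversation_cls = type("ConversationScorer", (base,), {})
            wrapper = conversation_cls()
            wrapper.wrapped = scorer
            return wrapper
    raise ValueError("scorer must be a FloatScaleScorer or TrueFalseScorer")

wrapper = create_conversation_scorer(MyTrueFalseScorer())
print(isinstance(wrapper, TrueFalseScorer))  # True
```

Passing anything that is not an instance of one of the two bases raises ValueError, as documented.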
get_all_harm_metrics¶
get_all_harm_metrics(harm_category: str) → list[ScorerMetricsWithIdentity[HarmScorerMetrics]]
Load all harm scorer metrics for a specific harm category.
Returns a list of ScorerMetricsWithIdentity[HarmScorerMetrics] objects that wrap
the scorer’s identity information and its performance metrics, enabling clean attribute
access like entry.metrics.mean_absolute_error or entry.metrics.harm_category.
| Parameter | Type | Description |
|---|---|---|
harm_category | str | The harm category to load metrics for (e.g., “hate_speech”, “violence”). |
Returns:
list[ScorerMetricsWithIdentity[HarmScorerMetrics]] — A list of metrics with scorer identity. Access metrics via entry.metrics.mean_absolute_error, entry.metrics.harm_category, etc., and scorer info via entry.scorer_identifier.class_name, etc.
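The `ScorerMetricsWithIdentity` access pattern amounts to a thin wrapper pairing identity with metrics. A hypothetical dataclass sketch (field names mirror the documented access pattern, not the real class definitions):

```python
from dataclasses import dataclass

# Hypothetical stand-ins mirroring the documented access pattern;
# field names come from the text above, not the real classes.
@dataclass
class ScorerIdentifier:
    class_name: str

@dataclass
class HarmMetrics:
    mean_absolute_error: float
    harm_category: str

@dataclass
class MetricsWithIdentity:
    scorer_identifier: ScorerIdentifier
    metrics: HarmMetrics

entry = MetricsWithIdentity(
    scorer_identifier=ScorerIdentifier(class_name="ExampleScorer"),
    metrics=HarmMetrics(mean_absolute_error=0.12, harm_category="hate_speech"),
)
print(entry.metrics.mean_absolute_error)   # 0.12
print(entry.scorer_identifier.class_name)  # ExampleScorer
```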
get_all_objective_metrics¶
get_all_objective_metrics(file_path: Optional[Path] = None) → list[ScorerMetricsWithIdentity[ObjectiveScorerMetrics]]
Load all objective scorer metrics with full scorer identity for comparison.
Returns a list of ScorerMetricsWithIdentity[ObjectiveScorerMetrics] objects that wrap
the scorer’s identity information and its performance metrics, enabling clean attribute
access like entry.metrics.accuracy or entry.metrics.f1_score.
| Parameter | Type | Description |
|---|---|---|
file_path | Optional[Path] | Path to a specific JSONL file to load. If not provided, uses the default path SCORER_EVALS_PATH / “objective” / “objective_achieved_metrics.jsonl”. Defaults to None. |
Returns:
list[ScorerMetricsWithIdentity[ObjectiveScorerMetrics]] — A list of metrics with scorer identity. Access metrics via entry.metrics.accuracy, entry.metrics.f1_score, etc., and scorer info via entry.scorer_identifier.class_name, etc.
AudioFloatScaleScorer¶
Bases: FloatScaleScorer
A scorer that processes audio files by transcribing them and scoring the transcript.
The AudioFloatScaleScorer transcribes audio to text using Azure Speech-to-Text, then scores the transcript using a FloatScaleScorer.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
text_capable_scorer | FloatScaleScorer | A FloatScaleScorer capable of processing text. This scorer will be used to evaluate the transcribed audio content. |
validator | Optional[ScorerPromptValidator] | Validator for the scorer. Defaults to None, which uses the audio_path data type validator. |
use_entra_auth | Optional[bool] | Whether to use Entra ID authentication for Azure Speech. Defaults to None, which is treated as True. |
AudioTrueFalseScorer¶
Bases: TrueFalseScorer
A scorer that processes audio files by transcribing them and scoring the transcript.
The AudioTrueFalseScorer transcribes audio to text using Azure Speech-to-Text, then scores the transcript using a TrueFalseScorer.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
text_capable_scorer | TrueFalseScorer | A TrueFalseScorer capable of processing text. This scorer will be used to evaluate the transcribed audio content. |
validator | Optional[ScorerPromptValidator] | Validator for the scorer. Defaults to None, which uses the audio_path data type validator. |
use_entra_auth | Optional[bool] | Whether to use Entra ID authentication for Azure Speech. Defaults to None, which is treated as True. |
AzureContentFilterScorer¶
Bases: FloatScaleScorer
A scorer that uses Azure Content Safety API to evaluate text and images for harmful content.
This scorer analyzes content across multiple harm categories (hate, self-harm, sexual, violence) and returns a score for each category in the range [0, 1], where higher scores indicate more severe content. Supports both text and image inputs.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
endpoint | Optional[str] | The Azure Content Safety endpoint. Defaults to None. |
api_key | `Optional[str | Callable[[], str]]` | The API key for the Azure Content Safety resource, or a callable that returns one. Defaults to None. |
harm_categories | Optional[list[TextCategory]] | The harm categories you want to query for as defined in azure.ai.contentsafety.models.TextCategory. If not provided, defaults to all categories. Defaults to None. |
validator | Optional[ScorerPromptValidator] | Custom validator for the scorer. Defaults to None. |
Methods:
evaluate_async¶
evaluate_async(file_mapping: Optional[ScorerEvalDatasetFiles] = None, num_scorer_trials: int = 3, update_registry_behavior: RegistryUpdateBehavior = None, max_concurrency: int = 10) → Optional[ScorerMetrics]
Evaluate this scorer against human-labeled datasets.
AzureContentFilterScorer requires exactly one harm category to be configured for evaluation. This ensures each score corresponds to exactly one category in the ground truth dataset.
| Parameter | Type | Description |
|---|---|---|
file_mapping | Optional[ScorerEvalDatasetFiles] | Optional ScorerEvalDatasetFiles configuration. If not provided, uses the mapping based on the configured harm category. Defaults to None. |
num_scorer_trials | int | Number of times to score each response. Defaults to 3. |
update_registry_behavior | RegistryUpdateBehavior | Controls how existing registry entries are handled. SKIP_IF_EXISTS: check the registry for existing results and return cached metrics if found. ALWAYS_UPDATE: always run evaluation and overwrite any existing registry entry. NEVER_UPDATE: always run evaluation but never write to the registry (for debugging). Defaults to None, which is treated as SKIP_IF_EXISTS. |
max_concurrency | int | Maximum concurrent scoring requests. Defaults to 10. |
Returns:
Optional[ScorerMetrics] — The evaluation metrics, or None if no datasets are found.
Raises:
ValueError — If more than one harm category is configured.
BatchScorer¶
A utility class for scoring prompts in batches in a parallelizable and convenient way.
This class provides functionality to score existing prompts stored in memory without any target interaction, making it a pure scoring utility.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
batch_size | int | The maximum batch size for sending prompts. Defaults to 10. Note: if using a scorer that takes a prompt target and providing max requests per minute on the target, set this to 1 to ensure proper rate limit management. |
Methods:
score_responses_by_filters_async¶
score_responses_by_filters_async(scorer: Scorer, attack_id: Optional[str | uuid.UUID] = None, conversation_id: Optional[str | uuid.UUID] = None, prompt_ids: Optional[list[str] | list[uuid.UUID]] = None, labels: Optional[dict[str, str]] = None, sent_after: Optional[datetime] = None, sent_before: Optional[datetime] = None, original_values: Optional[list[str]] = None, converted_values: Optional[list[str]] = None, data_type: Optional[str] = None, not_data_type: Optional[str] = None, converted_value_sha256: Optional[list[str]] = None, objective: str = '') → list[Score]
Score the responses that match the specified filters.
| Parameter | Type | Description |
|---|---|---|
scorer | Scorer | The Scorer object to use for scoring. |
attack_id | `Optional[str | uuid.UUID]` | The attack ID to filter by. Defaults to None. |
conversation_id | `Optional[str | uuid.UUID]` | The conversation ID to filter by. Defaults to None. |
prompt_ids | `Optional[list[str] | list[uuid.UUID]]` | A list of prompt IDs to filter by. Defaults to None. |
labels | Optional[dict[str, str]] | A dictionary of labels to filter by. Defaults to None. |
sent_after | Optional[datetime] | Filter for prompts sent after this datetime. Defaults to None. |
sent_before | Optional[datetime] | Filter for prompts sent before this datetime. Defaults to None. |
original_values | Optional[list[str]] | A list of original values to filter by. Defaults to None. |
converted_values | Optional[list[str]] | A list of converted values to filter by. Defaults to None. |
data_type | Optional[str] | The data type to filter by. Defaults to None. |
not_data_type | Optional[str] | The data type to exclude. Defaults to None. |
converted_value_sha256 | Optional[list[str]] | A list of SHA256 hashes of converted values to filter by. Defaults to None. |
objective | str | A task used to give the scorer more context on what exactly to score. A task might be the request prompt text or the original attack model’s objective. Note: the same task is applied to all matched prompts. Defaults to ''. |
Returns:
list[Score] — A list of Score objects for responses that match the specified filters.
Raises:
ValueError— If no entries match the provided filters.
ConsoleScorerPrinter¶
Bases: ScorerPrinter
Console printer for scorer information with enhanced formatting.
This printer formats scorer details for console display with optional color coding, proper indentation, and visual hierarchy. Colors can be disabled for consoles that don’t support ANSI characters.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
indent_size | int | Number of spaces for indentation. Must be non-negative. Defaults to 2. |
enable_colors | bool | Whether to enable ANSI color output. When False, all output is plain text without colors. Defaults to True. |
Methods:
print_harm_scorer¶
print_harm_scorer(scorer_identifier: ComponentIdentifier, harm_category: str) → None
Print harm scorer information including type, nested scorers, and evaluation metrics.
This method displays:
Scorer type and identity information
Nested sub-scorers (for composite scorers)
Harm evaluation metrics (MAE, Krippendorff alpha) from the registry
| Parameter | Type | Description |
|---|---|---|
scorer_identifier | ComponentIdentifier | The scorer identifier to print information for. |
harm_category | str | The harm category for looking up metrics (e.g., “hate_speech”, “violence”). |
print_objective_scorer¶
print_objective_scorer(scorer_identifier: ComponentIdentifier) → None
Print objective scorer information including type, nested scorers, and evaluation metrics.
This method displays:
Scorer type and identity information
Nested sub-scorers (for composite scorers)
Objective evaluation metrics (accuracy, precision, recall, F1) from the registry
| Parameter | Type | Description |
|---|---|---|
scorer_identifier | ComponentIdentifier | The scorer identifier to print information for. |
ContentClassifierPaths¶
Bases: enum.Enum
Paths to content classifier YAML files.
ConversationScorer¶
Bases: Scorer, ABC
Scorer that evaluates entire conversation history rather than individual messages.
This scorer wraps another scorer (FloatScaleScorer or TrueFalseScorer) and evaluates the full conversation context. Useful for multi-turn conversations where context matters (e.g., psychosocial harms that emerge over time or persuasion/deception over many messages).
The ConversationScorer dynamically inherits from the same base class as the wrapped scorer, ensuring proper type compatibility.
Note: This class cannot be instantiated directly. Use create_conversation_scorer() factory instead.
Methods:
validate_return_scores¶
validate_return_scores(scores: list[Score]) → None
Validate scores by delegating to the wrapped scorer’s validation.
| Parameter | Type | Description |
|---|---|---|
scores | list[Score] | The scores to validate. |
DecodingScorer¶
Bases: TrueFalseScorer
Scorer that checks if the request values are in the output using a text matching strategy.
This scorer checks if any of the user request values (original_value, converted_value, or metadata decoded_text) match the response converted_value using the configured text matching strategy.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
text_matcher | Optional[TextMatching] | The text matching strategy to use. Defaults to None, which uses ExactTextMatching with case_sensitive=False. |
categories | Optional[list[str]] | Optional list of categories for the score. Defaults to None. |
aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
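The matching behavior described above (case-insensitive exact matching, OR-aggregated across request values) might look roughly like this, sketched with plain functions rather than the actual PyRIT classes:

```python
def exact_match(request_value: str, response_value: str, case_sensitive: bool = False) -> bool:
    """Exact string equality, optionally case-insensitive (an assumed reading
    of ExactTextMatching with case_sensitive=False)."""
    if not case_sensitive:
        return request_value.lower() == response_value.lower()
    return request_value == response_value

def decoding_score(request_values, response_value) -> bool:
    # OR aggregation: True if any request value matches the response.
    return any(exact_match(v, response_value) for v in request_values)

print(decoding_score(["Secret Phrase", "other"], "secret phrase"))  # True
print(decoding_score(["missing"], "something else"))                # False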
FloatScaleScoreAggregator¶
Namespace for float scale score aggregators that return a single aggregated score.
All aggregators return a list containing one ScoreAggregatorResult that combines all input scores together, preserving all categories.
FloatScaleScorer¶
Bases: Scorer
Base class for scorers that return floating-point scores in the range [0, 1].
This scorer evaluates prompt responses and returns numeric scores indicating the degree to which a response exhibits certain characteristics. Each piece in a request response is scored independently, returning one score per piece.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
validator | ScorerPromptValidator | A validator object used to validate scores. |
Methods:
get_scorer_metrics¶
get_scorer_metrics() → Optional[HarmScorerMetrics]
Get evaluation metrics for this scorer from the configured evaluation result file.
Returns:
Optional[HarmScorerMetrics] — The metrics for this scorer, or None if not found or not configured.
validate_return_scores¶
validate_return_scores(scores: list[Score]) → None
Validate that the returned scores are within the valid range [0, 1].
Raises:
ValueError — If any score is not between 0 and 1.
FloatScaleScorerAllCategories¶
Namespace for float scale score aggregators that combine all categories.
These aggregators ignore category boundaries and aggregate all scores together, returning a single ScoreAggregatorResult with all categories combined.
FloatScaleScorerByCategory¶
Namespace for float scale score aggregators that group by category.
These aggregators return multiple ScoreAggregatorResult objects (one per category). Useful for scorers like AzureContentFilterScorer that return multiple scores per item.
FloatScaleThresholdScorer¶
Bases: TrueFalseScorer
A scorer that applies a threshold to a float scale score to make it a true/false score.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
scorer | FloatScaleScorer | The underlying float scale scorer to use. |
threshold | float | The threshold value between 0 and 1. Scores >= threshold are True, otherwise False. |
float_scale_aggregator | FloatScaleAggregatorFunc | The aggregator function to use for combining multiple float scale scores. Defaults to FloatScaleScoreAggregator.MAX. |
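The threshold conversion can be sketched as a pure function: aggregate the float scores (MAX by default), then compare against the threshold. The names here are illustrative, not the library's:

```python
def threshold_score(float_scores, threshold, aggregator=max):
    """Aggregate float-scale scores in [0, 1], then apply the threshold."""
    return aggregator(float_scores) >= threshold

print(threshold_score([0.2, 0.7, 0.4], threshold=0.5))  # True  (max 0.7 >= 0.5)
print(threshold_score([0.2, 0.3], threshold=0.5))       # False (max 0.3 <  0.5)
```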
GandalfScorer¶
Bases: TrueFalseScorer
A scorer for evaluating responses in Gandalf challenges.
This scorer checks whether a password or secret has been successfully extracted from a Gandalf-style challenge by submitting the response to the Gandalf API and checking if the password is correct. Returns True if the password was successfully extracted.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
level | GandalfLevel | The Gandalf challenge level to score against. |
chat_target | PromptChatTarget | The chat target used for password extraction. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None, which uses the text data type validator. |
score_aggregator | TrueFalseAggregatorFunc | Aggregator for combining scores. Defaults to TrueFalseScoreAggregator.OR. |
HarmHumanLabeledEntry¶
Bases: HumanLabeledEntry
A class that represents a human-labeled dataset entry for a specific harm category. This class includes the Messages and a list of human scores, which are floats between 0.0 and 1.0 inclusive, representing the degree of harm severity where 0.0 is minimal and 1.0 is maximal. The harm category is a string that represents the type of harm (e.g., “hate_speech”, “misinformation”, etc.).
HarmScorerEvaluator¶
Bases: ScorerEvaluator
A class that evaluates a harm scorer against HumanLabeledDatasets of type HARM.
HarmScorerMetrics¶
Bases: ScorerMetrics
Metrics for evaluating a harm scorer against a HumanLabeledDataset.
Methods:
get_harm_definition¶
get_harm_definition() → Optional[HarmDefinition]
Load and return the HarmDefinition object for this metrics instance.
Loads the harm definition YAML file specified in harm_definition and returns it as a HarmDefinition object. The result is cached after the first load.
Returns:
Optional[HarmDefinition] — The loaded harm definition object, or None if harm_definition is not set.
Raises:
FileNotFoundError — If the harm definition file does not exist.
ValueError — If the harm definition file is invalid.
HumanInTheLoopScorerGradio¶
Bases: TrueFalseScorer
Creates scores from manual human input using Gradio and adds them to the database.
In the future this will not be a TrueFalseScorer. However, it is all that is supported currently.
Deprecated: This Gradio-based scorer is deprecated and will be removed in v0.13.0. Use the React-based GUI instead.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
open_browser | bool | If True, the scorer opens the Gradio interface in a browser instead of in PyWebview. Defaults to False. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | Aggregator for combining scores. Defaults to TrueFalseScoreAggregator.OR. |
Methods:
retrieve_score¶
retrieve_score(request_prompt: MessagePiece, objective: Optional[str] = None) → list[Score]
Retrieve a score from the human evaluator through the RPC server.
| Parameter | Type | Description |
|---|---|---|
request_prompt | MessagePiece | The message piece to be scored. |
objective | Optional[str] | The objective to evaluate against. Defaults to None. |
Returns:
list[Score] — A list containing a single Score object from the human evaluator.
HumanLabeledDataset¶
A class that represents a human-labeled dataset, including the entries and each of their corresponding human scores. This dataset is used to evaluate PyRIT scorer performance via the ScorerEvaluator class. HumanLabeledDatasets can be constructed from a CSV file.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
name | str | The name of the human-labeled dataset. For datasets of uniform type, this is often the harm category (e.g. hate_speech) or objective. It will be used in the naming of metrics (JSON) and model scores (CSV) files when evaluation is run on this dataset. |
entries | List[HumanLabeledEntry] | A list of entries in the dataset. |
metrics_type | MetricsType | The type of the human-labeled dataset, either HARM or OBJECTIVE. |
version | str | The version of the human-labeled dataset. |
harm_definition | str | Path to the harm definition YAML file for HARM datasets. Defaults to None. |
harm_definition_version | str | Version of the harm definition YAML file. Used to ensure the human labels match the scoring criteria version. Defaults to None. |
Methods:
from_csv¶
from_csv(csv_path: Union[str, Path], metrics_type: MetricsType, dataset_name: Optional[str] = None, version: Optional[str] = None, harm_definition: Optional[str] = None, harm_definition_version: Optional[str] = None) → HumanLabeledDataset
Load a human-labeled dataset from a CSV file with standard column names.
Expected CSV format:
‘assistant_response’: The assistant’s response text
‘human_score’: Human-assigned label (can have multiple columns for multiple raters)
‘objective’: For OBJECTIVE datasets, the objective being evaluated
‘data_type’: Optional data type (defaults to ‘text’ if not present)
You can optionally include a # comment line at the top of the CSV file to specify the dataset version and harm definition path. The format is:
For harm datasets: # dataset_version=x.y, harm_definition=path/to/definition.yaml, harm_definition_version=x.y
For objective datasets: # dataset_version=x.y
| Parameter | Type | Description |
|---|---|---|
csv_path | Union[str, Path] | The path to the CSV file. |
metrics_type | MetricsType | The type of the human-labeled dataset, either HARM or OBJECTIVE. |
dataset_name | Optional[str] | The name of the dataset. If not provided, it will be inferred from the CSV file name. Defaults to None. |
version | Optional[str] | The version of the dataset. If not provided here, it will be inferred from the CSV file if a dataset_version comment line is present. Defaults to None. |
harm_definition | Optional[str] | Path to the harm definition YAML file. If not provided here, it will be inferred from the CSV file if a harm_definition comment is present. Defaults to None. |
harm_definition_version | Optional[str] | Version of the harm definition YAML file. If not provided here, it will be inferred from the CSV file if a harm_definition_version comment is present. Defaults to None. |
Returns:
HumanLabeledDataset — The human-labeled dataset object.
Raises:
FileNotFoundError — If the CSV file does not exist.
ValueError — If version is not provided and not found in the CSV file.
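Parsing the optional `#` metadata comment line described above could be done along these lines (a sketch under the documented `key=value, key=value` format, not the library's actual parser):

```python
def parse_metadata_comment(line: str) -> dict:
    """Parse a leading '#' metadata comment into key/value pairs,
    e.g. '# dataset_version=1.2, harm_definition=defs/hate_speech.yaml'."""
    if not line.lstrip().startswith("#"):
        return {}
    fields = {}
    for part in line.lstrip().lstrip("#").split(","):
        key, _, value = part.strip().partition("=")
        if key and value:
            fields[key.strip()] = value.strip()
    return fields

meta = parse_metadata_comment(
    "# dataset_version=1.2, harm_definition=defs/hate_speech.yaml, harm_definition_version=2.0"
)
print(meta["dataset_version"])  # 1.2
print(meta["harm_definition"])  # defs/hate_speech.yaml
```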
get_harm_definition¶
get_harm_definition() → Optional[HarmDefinition]
Load and return the HarmDefinition object for this dataset.
For HARM datasets, this loads the harm definition YAML file specified in harm_definition and returns it as a HarmDefinition object. The result is cached after the first load.
Returns:
Optional[HarmDefinition] — The loaded harm definition object, or None if this is not a HARM dataset or harm_definition is not set.
Raises:
FileNotFoundError — If the harm definition file does not exist.
ValueError — If the harm definition file is invalid.
validate¶
validate() → None
Validate that the dataset is internally consistent.
Checks that all entries match the dataset’s metrics_type and, for HARM datasets, that all entries have the same harm_category, that harm_definition is specified, and that the harm definition file exists and is loadable.
Raises:
ValueError — If entries don’t match metrics_type, harm categories are inconsistent, or harm_definition is missing for HARM datasets.
FileNotFoundError — If the harm definition file does not exist.
HumanLabeledEntry¶
A class that represents an entry in a dataset of assistant responses that have been scored by humans. It is used to evaluate PyRIT scorer performance as measured by degree of alignment with human labels. This class includes the Messages and a list of human-assigned scores, which are floats between 0.0 and 1.0 inclusive (representing degree of severity) for harm datasets, and booleans for objective datasets.
InsecureCodeScorer¶
Bases: FloatScaleScorer
A scorer that uses an LLM to evaluate code snippets for potential security vulnerabilities. Configuration is loaded from a YAML file for dynamic prompts and instructions.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The target to use for scoring code security. |
system_prompt_path | Optional[Union[str, Path]] | Path to the YAML file containing the system prompt. Defaults to None, which uses the default insecure code scoring prompt. |
validator | Optional[ScorerPromptValidator] | Custom validator for the scorer. Defaults to None. |
LikertScaleEvalFiles¶
Configuration for evaluating a Likert scale scorer on a set of dataset files.
LikertScalePaths¶
Bases: enum.Enum
Enum containing Likert scale configurations including YAML paths and evaluation file mappings.
Each enum value is a tuple of (yaml_path, evaluation_files) where:
yaml_path: Path to the YAML file containing the Likert scale definition
evaluation_files: Optional LikertScaleEvalFiles for scorer evaluation, or None if no dataset exists
MarkdownInjectionScorer¶
Bases: TrueFalseScorer
A scorer that detects markdown injection attempts in text responses.
This scorer checks for the presence of markdown syntax patterns that could be used for injection attacks, such as links, images, or other markdown constructs that might be exploited. Returns True if markdown injection is detected.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
MetricsType¶
Bases: Enum
Enum representing the type of metrics when evaluating scorers on human-labeled datasets.
ObjectiveHumanLabeledEntry¶
Bases: HumanLabeledEntry
A class that represents a human-labeled dataset entry for a specific objective. This class includes the Messages and a list of human scores, which are booleans indicating whether the response/conversation meets the objective (e.g., 0 for not meeting the objective, 1 for meeting the objective). The objective is a string that represents the objective (e.g., "how to make a Molotov cocktail?").
ObjectiveScorerEvaluator¶
Bases: ScorerEvaluator
A class that evaluates an objective scorer against HumanLabeledDatasets of type OBJECTIVE.
ObjectiveScorerMetrics¶
Bases: ScorerMetrics
Metrics for evaluating an objective scorer against a HumanLabeledDataset.
PlagiarismMetric¶
Bases: Enum
Enum representing different plagiarism detection metrics.
PlagiarismScorer¶
Bases: FloatScaleScorer
A scorer that measures plagiarism by computing word-level similarity between the AI response and a reference text.
This scorer implements three similarity metrics:
Word-level longest common subsequence (LCS)
Word-level Levenshtein similarity
Word-level n-gram Jaccard similarity
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
reference_text | str | The reference text to compare against. |
metric | PlagiarismMetric | The plagiarism detection metric to use. Defaults to PlagiarismMetric.LCS. |
n | int | The n-gram size for n-gram similarity. Defaults to 5. |
validator | Optional[ScorerPromptValidator] | Custom validator for the scorer. Defaults to None. |
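The word-level LCS metric listed above can be illustrated with a short dynamic-programming sketch. The normalization choice here (LCS length divided by the reference word count) is an assumption, not necessarily what the scorer uses:

```python
def lcs_similarity(response: str, reference: str) -> float:
    """Word-level longest-common-subsequence similarity in [0, 1]."""
    a, b = response.split(), reference.split()
    # Classic LCS dynamic program over word sequences.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, wa in enumerate(a):
        for j, wb in enumerate(b):
            dp[i + 1][j + 1] = (
                dp[i][j] + 1 if wa == wb else max(dp[i][j + 1], dp[i + 1][j])
            )
    return dp[len(a)][len(b)] / len(b) if b else 0.0

print(lcs_similarity("the quick brown fox", "the quick red fox"))  # 0.75
```

Levenshtein and n-gram Jaccard variants follow the same word-level pattern with a different core comparison.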
PromptShieldScorer¶
Bases: TrueFalseScorer
Returns true if an attack or jailbreak has been detected by Prompt Shield.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
prompt_shield_target | PromptShieldTarget | The Prompt Shield target to use for scoring. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
QuestionAnswerScorer¶
Bases: TrueFalseScorer
A class that represents a question answering scorer.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
correct_answer_matching_patterns | list[str] | A list of patterns to check for in the response. If any pattern is found in the response, the score will be True. These patterns should be format strings that will be formatted with the correct answer metadata. Defaults to CORRECT_ANSWER_MATCHING_PATTERNS. |
category | Optional[list[str]] | Optional list of categories for the score. Defaults to None. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
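The pattern-matching behavior described above amounts to formatting each pattern with the correct answer and checking for it in the response. A sketch with hypothetical patterns and a hypothetical format key (the real CORRECT_ANSWER_MATCHING_PATTERNS and metadata keys may differ):

```python
def question_answer_score(response: str, correct_answer: str, patterns) -> bool:
    # True if any formatted pattern is found in the response.
    return any(p.format(correct_answer=correct_answer) in response for p in patterns)

# Hypothetical patterns for illustration only.
patterns = ["answer is {correct_answer}", "{correct_answer}"]
print(question_answer_score("The answer is 42.", "42", patterns))  # True
print(question_answer_score("I am not sure.", "42", patterns))     # False
```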
RefusalScorerPaths¶
Bases: enum.Enum
Paths to refusal scorer system prompt YAML files.
Each enum value represents a different refusal detection strategy:
DEFAULT: Standard refusal detection that works with or without an explicit objective. If an objective is provided, evaluates refusal against it; if not, evaluates against the implied objective. Safe completions (including partial information, redirections, asking questions, or excessive caveats) are NOT considered refusals.
STRICT: Strict refusal detection that treats “safe completions” as refusals. Works best when an explicit objective is provided.
RegistryUpdateBehavior¶
Bases: Enum
Enum representing how the evaluation registry should be updated.
Scorer¶
Bases: Identifiable, abc.ABC
Abstract base class for scorers.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
validator | ScorerPromptValidator | Validator for message pieces and scorer configuration. |
Methods:
evaluate_async¶
evaluate_async(file_mapping: Optional[ScorerEvalDatasetFiles] = None, num_scorer_trials: int = 3, update_registry_behavior: RegistryUpdateBehavior = None, max_concurrency: int = 10) → Optional[ScorerMetrics]
Evaluate this scorer against human-labeled datasets.
Uses file mapping to determine which datasets to evaluate and how to aggregate results.
| Parameter | Type | Description |
|---|---|---|
file_mapping | Optional[ScorerEvalDatasetFiles] | Optional ScorerEvalDatasetFiles configuration. If not provided, uses the scorer’s configured evaluation_file_mapping. Maps input file patterns to an output result file. Defaults to None. |
num_scorer_trials | int | Number of times to score each response (for measuring variance). Defaults to 3. |
update_registry_behavior | RegistryUpdateBehavior | Controls how existing registry entries are handled. SKIP_IF_EXISTS: check the registry for existing results and return cached metrics if found. ALWAYS_UPDATE: always run evaluation and overwrite any existing registry entry. NEVER_UPDATE: always run evaluation but never write to the registry (for debugging). Defaults to None, which is treated as SKIP_IF_EXISTS. |
max_concurrency | int | Maximum number of concurrent scoring requests. Defaults to 10. |
Returns:
Optional[ScorerMetrics]— The evaluation metrics, or None if no datasets found.
Raises:
ValueError— If no file_mapping is provided and no evaluation_file_mapping is configured.
get_identifier¶
get_identifier() → ComponentIdentifier
Get the scorer’s identifier with eval_hash always attached.
Overrides the base Identifiable.get_identifier() so that
to_dict() always emits the eval_hash key.
Returns:
ComponentIdentifier — The identity with eval_hash set.
get_scorer_metrics¶
get_scorer_metrics() → Optional[ScorerMetrics]
Get evaluation metrics for this scorer from the configured evaluation result file.
Looks up metrics by this scorer’s identity hash in the JSONL result file. The result file may contain entries for multiple scorer configurations.
Subclasses must implement this to return the appropriate metrics type:
TrueFalseScorer subclasses should return ObjectiveScorerMetrics
FloatScaleScorer subclasses should return HarmScorerMetrics
Returns:
Optional[ScorerMetrics] — The metrics for this scorer, or None if not found or not configured.
scale_value_float¶
scale_value_float(value: float, min_value: float, max_value: float) → floatScales a value to the range 0 to 1 based on the given min and max values, e.g., 3 stars on a 1-to-5 scale maps to 0.5.
| Parameter | Type | Description |
|---|---|---|
value | float | The value to be scaled. |
min_value | float | The minimum value of the range. |
max_value | float | The maximum value of the range. |
Returns:
float— The scaled value.
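The linear scaling described above can be sketched as follows. This is an illustrative re-implementation based on the documented signature, not the library's source; the handling of a degenerate range (min equal to max) is an assumption.

```python
def scale_value_float(value: float, min_value: float, max_value: float) -> float:
    """Linearly scale `value` from [min_value, max_value] to [0, 1]."""
    if max_value == min_value:
        return 0.0  # assumption: a degenerate range maps to 0
    return (value - min_value) / (max_value - min_value)

print(scale_value_float(3, 1, 5))  # 3 stars on a 1-to-5 scale -> 0.5
```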
score_async¶
score_async(message: Message, objective: Optional[str] = None, role_filter: Optional[ChatMessageRole] = None, skip_on_error_result: bool = False, infer_objective_from_request: bool = False) → list[Score]Score the message, add the results to the database, and return a list of Score objects.
| Parameter | Type | Description |
|---|---|---|
message | Message | The message to be scored. |
objective | Optional[str] | The task or objective based on which the message should be scored. Defaults to None. |
role_filter | Optional[ChatMessageRole] | Only score messages with this exact stored role. Use “assistant” to score only real assistant responses, or “simulated_assistant” to score only simulated responses. Defaults to None (no filtering). |
skip_on_error_result | bool | If True, skip scoring if the message contains an error. Defaults to False. |
infer_objective_from_request | bool | If True, infer the objective from the message’s previous request when objective is not provided. Defaults to False. |
Returns:
list[Score]— A list of Score objects representing the results.
Raises:
PyritException— If scoring raises a PyRIT exception (re-raised with enhanced context).RuntimeError— If scoring raises a non-PyRIT exception (wrapped with scorer context).
score_image_async¶
score_image_async(image_path: str, objective: Optional[str] = None) → list[Score]Score the given image using the chat target.
| Parameter | Type | Description |
|---|---|---|
image_path | str | The path to the image file to be scored. |
objective | Optional[str] | The objective based on which the image should be scored. Defaults to None. |
Returns:
list[Score]— A list of Score objects representing the results.
score_image_batch_async¶
score_image_batch_async(image_paths: Sequence[str], objectives: Optional[Sequence[str]] = None, batch_size: int = 10) → list[Score]Score a batch of images asynchronously.
| Parameter | Type | Description |
|---|---|---|
image_paths | Sequence[str] | Sequence of paths to image files to be scored. |
objectives | Optional[Sequence[str]] | Optional sequence of objectives corresponding to each image. If provided, must match the length of image_paths. Defaults to None. |
batch_size | int | Maximum number of images to score concurrently. Defaults to 10. |
Returns:
list[Score]— A list of Score objects representing the scoring results for all images.
Raises:
ValueError— If the number of objectives does not match the number of image_paths.
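The batching and length-validation behavior documented above can be sketched in plain Python. This is a hypothetical helper illustrating the documented contract (objectives must match image_paths, images are processed in groups of batch_size), not the library's implementation.

```python
from typing import Optional, Sequence

def make_batches(image_paths: Sequence[str],
                 objectives: Optional[Sequence[str]] = None,
                 batch_size: int = 10):
    """Pair each image with its objective and split the pairs into batches."""
    if objectives is not None and len(objectives) != len(image_paths):
        raise ValueError("Number of objectives must match number of image_paths.")
    objs = objectives if objectives is not None else [None] * len(image_paths)
    pairs = list(zip(image_paths, objs))
    # Slice into chunks of at most batch_size items each
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]
```

With 25 images and the default batch_size of 10, this yields three batches of sizes 10, 10, and 5.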
score_prompts_batch_async¶
score_prompts_batch_async(messages: Sequence[Message], objectives: Optional[Sequence[str]] = None, batch_size: int = 10, role_filter: Optional[ChatMessageRole] = None, skip_on_error_result: bool = False, infer_objective_from_request: bool = False) → list[Score]Score multiple prompts in batches using the provided objectives.
| Parameter | Type | Description |
|---|---|---|
messages | Sequence[Message] | The messages to be scored. |
objectives | Optional[Sequence[str]] | The objectives/tasks based on which the prompts should be scored. If provided, must have the same length as messages. Defaults to None. |
batch_size | int | The maximum batch size for processing prompts. Defaults to 10. |
role_filter | Optional[ChatMessageRole] | If provided, only score pieces with this role. Defaults to None (no filtering). |
skip_on_error_result | bool | If True, skip scoring pieces that have errors. Defaults to False. |
infer_objective_from_request | bool | If True and objective is empty, attempt to infer the objective from the request. Defaults to False. |
Returns:
list[Score]— A flattened list of Score objects from all scored prompts.
Raises:
ValueError— If objectives is empty or if the number of objectives doesn’t match the number of messages.
score_response_async¶
score_response_async(response: Message, objective_scorer: Optional[Scorer] = None, auxiliary_scorers: Optional[list[Scorer]] = None, role_filter: ChatMessageRole = 'assistant', objective: Optional[str] = None, skip_on_error_result: bool = True) → dict[str, list[Score]]Score a response using an objective scorer and optional auxiliary scorers.
| Parameter | Type | Description |
|---|---|---|
response | Message | Response containing pieces to score. |
objective_scorer | Optional[Scorer] | The main scorer to determine success. Defaults to None. |
auxiliary_scorers | Optional[List[Scorer]] | List of auxiliary scorers to apply. Defaults to None. |
role_filter | ChatMessageRole | Only score pieces with this exact stored role. Defaults to 'assistant' (real responses only, not simulated). |
objective | Optional[str] | Task/objective for scoring context. Defaults to None. |
skip_on_error_result | bool | If True, skip scoring pieces that have errors. Defaults to True. |
Returns:
dict[str, list[Score]]— Dictionary with keys auxiliary_scores and objective_scores containing lists of scores from each type of scorer.
Raises:
ValueError— If response is not provided.
score_response_multiple_scorers_async¶
score_response_multiple_scorers_async(response: Message, scorers: list[Scorer], role_filter: ChatMessageRole = 'assistant', objective: Optional[str] = None, skip_on_error_result: bool = True) → list[Score]Score a response using multiple scorers in parallel.
This method applies each scorer to the first scorable response piece (filtered by role and error), and returns all scores. This is typically used for auxiliary scoring where all results are needed.
| Parameter | Type | Description |
|---|---|---|
response | Message | The response containing pieces to score. |
scorers | List[Scorer] | List of scorers to apply. |
role_filter | ChatMessageRole | Only score pieces with this exact stored role. Defaults to 'assistant' (real responses only, not simulated). |
objective | Optional[str] | Optional objective description for scoring context. Defaults to None. |
skip_on_error_result | bool | If True, skip scoring pieces that have errors. Defaults to True. |
Returns:
list[Score]— All scores from all scorers.
score_text_async¶
score_text_async(text: str, objective: Optional[str] = None) → list[Score]Scores the given text based on the task using the chat target.
| Parameter | Type | Description |
|---|---|---|
text | str | The text to be scored. |
objective | Optional[str] | The task based on which the text should be scored. Defaults to None. |
Returns:
list[Score]— A list of Score objects representing the results.
validate_return_scores¶
validate_return_scores(scores: list[Score]) → NoneValidate the scores returned by the scorer; some scorers require specific Score types or values.
| Parameter | Type | Description |
|---|---|---|
scores | list[Score] | The scores to be validated. |
ScorerEvalDatasetFiles¶
Configuration for evaluating a scorer on a set of dataset files.
Maps input dataset files (via glob patterns) to an output result file. Multiple files matching the patterns will be concatenated before evaluation.
ScorerEvaluator¶
Bases: abc.ABC
A class that evaluates an LLM scorer against HumanLabeledDatasets, calculating appropriate metrics and saving them to a file.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
scorer | Scorer | The scorer to evaluate. |
Methods:
evaluate_dataset_async¶
evaluate_dataset_async(labeled_dataset: HumanLabeledDataset, num_scorer_trials: int = 1, max_concurrency: int = 10) → ScorerMetricsRun the evaluation for the scorer/policy combination on the passed-in HumanLabeledDataset.
This method performs pure computation without side effects (no file writing). It can be called directly with an in-memory HumanLabeledDataset for experiments that don’t use file-based datasets (e.g., iterative rubric tuning with custom splits).
| Parameter | Type | Description |
|---|---|---|
labeled_dataset | HumanLabeledDataset | The HumanLabeledDataset to evaluate the scorer against. |
num_scorer_trials | int | The number of trials to run the scorer on all responses. Defaults to 1. |
max_concurrency | int | Maximum number of concurrent scoring requests. Defaults to 10. |
Returns:
ScorerMetrics— The metrics for the scorer. This will be either HarmScorerMetrics or ObjectiveScorerMetrics depending on the type of the HumanLabeledDataset (HARM or OBJECTIVE).
Raises:
ValueError— If the labeled_dataset is invalid.
from_scorer¶
from_scorer(scorer: Scorer, metrics_type: Optional[MetricsType] = None) → ScorerEvaluatorCreate a ScorerEvaluator based on the type of scoring.
| Parameter | Type | Description |
|---|---|---|
scorer | Scorer | The scorer to evaluate. |
metrics_type | Optional[MetricsType] | The type of scoring, either HARM or OBJECTIVE. If not provided, defaults to OBJECTIVE for true/false scorers and HARM for all other scorers. Defaults to None. |
Returns:
ScorerEvaluator— An instance of HarmScorerEvaluator or ObjectiveScorerEvaluator.
run_evaluation_async¶
run_evaluation_async(dataset_files: ScorerEvalDatasetFiles, num_scorer_trials: int = 3, update_registry_behavior: RegistryUpdateBehavior = RegistryUpdateBehavior.SKIP_IF_EXISTS, max_concurrency: int = 10) → Optional[ScorerMetrics]Evaluate scorer using dataset files configuration.
The update_registry_behavior parameter controls how existing registry entries are handled:
SKIP_IF_EXISTS (default): Check registry for existing results matching scorer config, dataset version, and num_scorer_trials. If found, return cached metrics. If not found, run evaluation and write to registry.
ALWAYS_UPDATE: Always run evaluation and overwrite any existing registry entry.
NEVER_UPDATE: Always run evaluation but never write to registry (for debugging).
| Parameter | Type | Description |
|---|---|---|
dataset_files | ScorerEvalDatasetFiles | ScorerEvalDatasetFiles configuration specifying glob patterns for input files and a result file name. |
num_scorer_trials | int | Number of scoring trials per response. Defaults to 3. |
update_registry_behavior | RegistryUpdateBehavior | Controls how existing registry entries are handled. Defaults to RegistryUpdateBehavior.SKIP_IF_EXISTS. |
max_concurrency | int | Maximum number of concurrent scoring requests. Defaults to 10. |
Returns:
Optional[ScorerMetrics]— ScorerMetrics if evaluation completed, None if no files found.
Raises:
ValueError— If harm_category is not specified for harm scorer evaluations.
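The three registry behaviors described above amount to a small decision tree. The sketch below is a hypothetical re-statement of that documented control flow (the enum member names mirror the documentation; `resolve`, `cached_metrics`, `run_evaluation`, and `write_to_registry` are illustrative names, not library API).

```python
from enum import Enum, auto

class RegistryUpdateBehavior(Enum):
    SKIP_IF_EXISTS = auto()   # return cached metrics when a matching entry exists
    ALWAYS_UPDATE = auto()    # always re-evaluate and overwrite the registry entry
    NEVER_UPDATE = auto()     # always re-evaluate but never write (debugging)

def resolve(behavior, cached_metrics, run_evaluation, write_to_registry):
    """Sketch of the documented registry handling for run_evaluation_async."""
    if behavior is RegistryUpdateBehavior.SKIP_IF_EXISTS and cached_metrics is not None:
        return cached_metrics  # cache hit: skip evaluation entirely
    metrics = run_evaluation()
    if behavior is not RegistryUpdateBehavior.NEVER_UPDATE:
        write_to_registry(metrics)  # SKIP_IF_EXISTS (on a miss) and ALWAYS_UPDATE both write
    return metrics
```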
ScorerMetrics¶
Base dataclass for storing scorer evaluation metrics.
This class provides methods for serializing metrics to JSON and loading them from JSON files.
Methods:
from_json¶
from_json(file_path: Union[str, Path]) → TLoad the metrics from a JSON file.
| Parameter | Type | Description |
|---|---|---|
file_path | Union[str, Path] | The path to the JSON file. |
Returns:
T— An instance of ScorerMetrics with the loaded data.
Raises:
FileNotFoundError— If the specified file does not exist.
to_json¶
to_json() → strConvert the metrics to a JSON string.
Returns:
str— The JSON string representation of the metrics.
ScorerMetricsWithIdentity¶
Bases: Generic[M]
Wrapper that combines scorer metrics with the scorer’s identity information.
This class provides a clean interface for working with evaluation results, allowing access to both the scorer configuration and its performance metrics.
Generic over the metrics type M, so:
ScorerMetricsWithIdentity[ObjectiveScorerMetrics] has metrics: ObjectiveScorerMetrics
ScorerMetricsWithIdentity[HarmScorerMetrics] has metrics: HarmScorerMetrics
ScorerPrinter¶
Bases: ABC
Abstract base class for printing scorer information.
This interface defines the contract for printing scorer details including type information, nested sub-scorers, and evaluation metrics from the registry. Implementations can render output to console, logs, files, or other outputs.
Methods:
print_harm_scorer¶
print_harm_scorer(scorer_identifier: ComponentIdentifier, harm_category: str) → NonePrint harm scorer information including type, nested scorers, and evaluation metrics.
This method displays:
Scorer type and identity information
Nested sub-scorers (for composite scorers)
Harm evaluation metrics (MAE, Krippendorff alpha) from the registry
| Parameter | Type | Description |
|---|---|---|
scorer_identifier | ComponentIdentifier | The scorer identifier to print information for. |
harm_category | str | The harm category for looking up metrics (e.g., “hate_speech”, “violence”). |
print_objective_scorer¶
print_objective_scorer(scorer_identifier: ComponentIdentifier) → NonePrint objective scorer information including type, nested scorers, and evaluation metrics.
This method displays:
Scorer type and identity information
Nested sub-scorers (for composite scorers)
Objective evaluation metrics (accuracy, precision, recall, F1) from the registry
| Parameter | Type | Description |
|---|---|---|
scorer_identifier | ComponentIdentifier | The scorer identifier to print information for. |
ScorerPromptValidator¶
Validates message pieces and scorer configurations.
This class provides validation for scorer inputs, ensuring that message pieces meet required criteria such as data types, roles, and metadata requirements.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
supported_data_types | Optional[Sequence[PromptDataType]] | Data types that the scorer supports. Defaults to None, meaning all data types are supported. |
required_metadata | Optional[Sequence[str]] | Metadata keys that must be present in message pieces. Defaults to None, meaning no metadata is required. |
supported_roles | Optional[Sequence[ChatMessageRole]] | Message roles that the scorer supports. Defaults to None, meaning all roles are supported. |
max_pieces_in_response | Optional[int] | Maximum number of pieces allowed in a response. Defaults to None (no limit). |
max_text_length | Optional[int] | Maximum character length for text data type pieces. Defaults to None (no limit). |
enforce_all_pieces_valid | Optional[bool] | Whether all pieces must be valid, or just at least one. Defaults to False. |
raise_on_no_valid_pieces | Optional[bool] | Whether to raise ValueError when no pieces are valid. Defaults to False, allowing scorers to handle empty results gracefully (e.g., returning False for blocked responses); set to True to raise an exception instead. |
is_objective_required | bool | Whether an objective must be provided for scoring. Defaults to False. |
Methods:
is_message_piece_supported¶
is_message_piece_supported(message_piece: MessagePiece) → boolCheck if a message piece is supported by this validator.
| Parameter | Type | Description |
|---|---|---|
message_piece | MessagePiece | The message piece to check. |
Returns:
bool— True if the message piece meets all validation criteria, False otherwise.
validate¶
validate(message: Message, objective: str | None) → NoneValidate a message and objective against configured requirements.
| Parameter | Type | Description |
|---|---|---|
message | Message | The message to validate. |
objective | Optional[str] | The objective to validate, if one is required by the validator. |
Raises:
ValueError— If validation fails due to unsupported pieces, exceeding max pieces, or missing objective.
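The per-piece criteria listed in the constructor table can be sketched as a single predicate. This is an illustrative stand-in that checks plain dicts rather than MessagePiece objects; the field names ("data_type", "role", "metadata", "value") are assumptions for the sketch.

```python
def is_piece_supported(piece: dict,
                       supported_data_types=None,
                       required_metadata=(),
                       supported_roles=None,
                       max_text_length=None) -> bool:
    """Sketch of the documented validation criteria for a single message piece."""
    if supported_data_types is not None and piece["data_type"] not in supported_data_types:
        return False  # unsupported data type
    if supported_roles is not None and piece["role"] not in supported_roles:
        return False  # unsupported role
    if any(key not in piece.get("metadata", {}) for key in required_metadata):
        return False  # missing a required metadata key
    if (max_text_length is not None and piece["data_type"] == "text"
            and len(piece.get("value", "")) > max_text_length):
        return False  # text too long
    return True
```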
SelfAskCategoryScorer¶
Bases: TrueFalseScorer
A class that represents a self-ask score for text classification and scoring. Given a classifier file, it scores according to these categories and returns the category the MessagePiece fits best.
There is also a false category that is used if the MessagePiece does not fit any of the categories.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target to interact with. |
content_classifier_path | Union[str, Path] | The path to the classifier YAML file. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
SelfAskGeneralFloatScaleScorer¶
Bases: FloatScaleScorer
A general-purpose self-ask float-scale scorer that uses a chat target and a configurable system prompt and prompt format. The final score is normalized to [0, 1].
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target used to score. |
system_prompt_format_string | str | System prompt template with placeholders for objective, prompt, and message_piece. |
prompt_format_string | Optional[str] | User prompt template with the same placeholders. Defaults to None. |
category | Optional[str] | Category for the score. Defaults to None. |
min_value | int | Minimum of the model’s native scale. Defaults to 0. |
max_value | int | Maximum of the model’s native scale. Defaults to 100. |
validator | Optional[ScorerPromptValidator] | Custom validator. If omitted, a default validator will be used requiring text input and an objective. Defaults to None. |
score_value_output_key | str | JSON key for the score value. Defaults to 'score_value'. |
rationale_output_key | str | JSON key for the rationale. Defaults to 'rationale'. |
description_output_key | str | JSON key for the description. Defaults to 'description'. |
metadata_output_key | str | JSON key for the metadata. Defaults to 'metadata'. |
category_output_key | str | JSON key for the category. Defaults to 'category'. |
SelfAskGeneralTrueFalseScorer¶
Bases: TrueFalseScorer
A general-purpose self-ask True/False scorer that uses a chat target and a configurable system prompt and prompt format.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target used to score. |
system_prompt_format_string | str | System prompt template with placeholders for objective, task (alias of objective), prompt, and message_piece. |
prompt_format_string | Optional[str] | User prompt template with the same placeholders. Defaults to None. |
category | Optional[str] | Category for the score. Defaults to None. |
validator | Optional[ScorerPromptValidator] | Custom validator. If omitted, a default validator will be used requiring text input and an objective. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | Aggregator for combining scores. Defaults to TrueFalseScoreAggregator.OR. |
score_value_output_key | str | JSON key for the score value. Defaults to 'score_value'. |
rationale_output_key | str | JSON key for the rationale. Defaults to 'rationale'. |
description_output_key | str | JSON key for the description. Defaults to 'description'. |
metadata_output_key | str | JSON key for the metadata. Defaults to 'metadata'. |
category_output_key | str | JSON key for the category. Defaults to 'category'. |
SelfAskLikertScorer¶
Bases: FloatScaleScorer
A class that represents a “self-ask” score for text scoring based on a Likert scale. A Likert scale consists of ranked, ordered categories and is often on a 5 or 7 point basis, but you can configure a scale with any set of non-negative integer score values and descriptions by providing a custom YAML file.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target to use for scoring. |
likert_scale | LikertScalePaths | The Likert scale configuration to use for scoring. |
validator | Optional[ScorerPromptValidator] | Custom validator for the scorer. Defaults to None. |
SelfAskQuestionAnswerScorer¶
Bases: SelfAskTrueFalseScorer
A class that represents a self-ask question answering scorer.
Usually, QuestionAnswerScorer should be used, but this scorer is useful when choices are not sent to the objective target or when you need more flexibility in determining whether the questions were answered correctly.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target to use for the scorer. |
true_false_question_path | Optional[pathlib.Path] | The path to the true/false question file. Defaults to None, which uses the default question_answering.yaml file. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
SelfAskRefusalScorer¶
Bases: TrueFalseScorer
A self-ask scorer that detects refusal in AI responses.
This scorer uses a language model to determine whether a response contains a refusal to answer or comply with the given prompt. It’s useful for evaluating whether AI systems are appropriately refusing harmful requests.
The scorer supports two modes via RefusalScorerPaths:
DEFAULT: Works with or without an explicit objective. Safe completions are NOT considered refusals.
STRICT: Treats safe completions as refusals. Works best with an explicit objective.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The endpoint that will be used to score the prompt. |
refusal_system_prompt_path | Union[RefusalScorerPaths, Path, str] | The path to the system prompt to use for refusal detection. Can be a RefusalScorerPaths enum value, a Path, or a string path. Defaults to RefusalScorerPaths.DEFAULT. |
prompt_format_string | Optional[str] | The format string for the prompt, with placeholders {objective} for the conversation objective and {response} for the response to evaluate. Defaults to None, which uses “conversation_objective: {objective}\nresponse_to_evaluate_input: {response}”. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
SelfAskScaleScorer¶
Bases: FloatScaleScorer
A class that represents a “self-ask” score for text scoring for a customizable numeric scale.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target to use for scoring. |
scale_arguments_path | Optional[Union[Path, str]] | Path to the YAML file containing scale definitions. Defaults to TREE_OF_ATTACKS_SCALE if not provided. Defaults to None. |
system_prompt_path | Optional[Union[Path, str]] | Path to the YAML file containing the system prompt. Defaults to GENERAL_SYSTEM_PROMPT if not provided. Defaults to None. |
validator | Optional[ScorerPromptValidator] | Custom validator for the scorer. Defaults to None. |
SelfAskTrueFalseScorer¶
Bases: TrueFalseScorer
A class that represents a self-ask true/false for scoring.
Given written descriptions of “true” and “false” (passed as a file or a TrueFalseQuestion), it returns the value that matches either description most closely.
If no descriptions are provided, it defaults to the TASK_ACHIEVED scorer.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
chat_target | PromptChatTarget | The chat target to interact with. |
true_false_question_path | Optional[Union[str, Path]] | The path to the true/false question file. Defaults to None. |
true_false_question | Optional[TrueFalseQuestion] | The true/false question object. Defaults to None. |
true_false_system_prompt_path | Optional[Union[str, Path]] | The path to the system prompt file. Defaults to None. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
SubStringScorer¶
Bases: TrueFalseScorer
Scorer that checks if a given substring is present in the text.
This scorer performs substring matching using a configurable text matching strategy. Supports both exact substring matching and approximate matching.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
substring | str | The substring to search for in the text. |
text_matcher | Optional[TextMatching] | The text matching strategy to use. Defaults to None, which uses ExactTextMatching with case_sensitive=False. |
categories | Optional[list[str]] | Optional list of categories for the score. Defaults to None. |
aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. |
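The default matching behavior described for SubStringScorer (exact substring matching, case-insensitive) can be sketched as a one-liner. This is an illustrative predicate, not the library's TextMatching implementation.

```python
def substring_matches(text: str, substring: str, case_sensitive: bool = False) -> bool:
    """Sketch of exact substring matching; case-insensitive by default, as documented."""
    if not case_sensitive:
        return substring.lower() in text.lower()
    return substring in text
```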
TrueFalseAggregatorFunc¶
TrueFalseCompositeScorer¶
Bases: TrueFalseScorer
Composite true/false scorer that aggregates results from other true/false scorers.
This scorer invokes a collection of constituent TrueFalseScorer instances and
reduces their single-score outputs into one final true/false score using the supplied
aggregation function (e.g., TrueFalseScoreAggregator.AND, TrueFalseScoreAggregator.OR,
TrueFalseScoreAggregator.MAJORITY).
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
aggregator | TrueFalseAggregatorFunc | Aggregation function to combine child scores (e.g., TrueFalseScoreAggregator.AND, TrueFalseScoreAggregator.OR, TrueFalseScoreAggregator.MAJORITY). |
scorers | List[TrueFalseScorer] | The constituent true/false scorers to invoke. |
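The three named aggregation modes reduce a list of child true/false results to one boolean. A minimal sketch of that reduction, using plain booleans and a string mode selector rather than the library's aggregator objects:

```python
def aggregate(values: list[bool], mode: str) -> bool:
    """Sketch of AND/OR/MAJORITY aggregation over child true/false scores."""
    if mode == "AND":
        return all(values)          # every child scorer must return True
    if mode == "OR":
        return any(values)          # any single True suffices
    if mode == "MAJORITY":
        return sum(values) > len(values) / 2  # strict majority of True votes
    raise ValueError(f"Unknown mode: {mode}")
```

Note that with this definition an even split (e.g., one True and one False) is not a majority, so MAJORITY returns False.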
TrueFalseInverterScorer¶
Bases: TrueFalseScorer
A scorer that inverts a true/false score.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
scorer | TrueFalseScorer | The underlying true/false scorer whose results will be inverted. |
validator | Optional[ScorerPromptValidator] | Custom validator. Defaults to None. Note: this parameter is present for signature compatibility but is not used. |
TrueFalseQuestion¶
A class that represents a true/false question.
This is sent to an LLM and can be used as an alternative to a yaml file from TrueFalseQuestionPaths.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
true_description | str | Description of what constitutes a “true” response. |
false_description | str | Description of what constitutes a “false” response. Defaults to an empty string, in which case a generic description is used. |
category | str | The category of the question. Defaults to an empty string. |
metadata | str | Additional metadata for context. Defaults to an empty string. |
TrueFalseQuestionPaths¶
Bases: enum.Enum
Paths to true/false question YAML files.
TrueFalseScoreAggregator¶
Namespace for true/false score aggregators that return a single aggregated score.
All aggregators return a list containing one ScoreAggregatorResult that combines all input scores together, preserving all categories.
TrueFalseScorer¶
Bases: Scorer
Base class for scorers that return true/false binary scores.
This scorer evaluates prompt responses and returns a single boolean score indicating whether the response meets a specific criterion. Multiple pieces in a request response are aggregated using a TrueFalseAggregatorFunc function (default: TrueFalseScoreAggregator.OR).
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
validator | ScorerPromptValidator | Custom validator. |
score_aggregator | TrueFalseAggregatorFunc | The aggregator function to use. Defaults to TrueFalseScoreAggregator.OR. |
Methods:
get_scorer_metrics¶
get_scorer_metrics() → Optional[ObjectiveScorerMetrics]Get evaluation metrics for this scorer from the configured evaluation result file.
Returns:
Optional[ObjectiveScorerMetrics]— The metrics for this scorer, or None if not found or not configured.
validate_return_scores¶
validate_return_scores(scores: list[Score]) → NoneValidate the scores returned by the scorer.
| Parameter | Type | Description |
|---|---|---|
scores | list[Score] | The scores to be validated. |
Raises:
ValueError— If the number of scores is not exactly one.ValueError— If the score value is not “true” or “false”.
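The two documented checks (exactly one score, value of "true" or "false") can be sketched directly. For illustration, scores are plain strings here rather than Score objects with a score_value attribute.

```python
def validate_true_false_scores(scores: list[str]) -> None:
    """Sketch of TrueFalseScorer.validate_return_scores as documented."""
    if len(scores) != 1:
        raise ValueError("TrueFalseScorer must return exactly one score.")
    if scores[0] not in ("true", "false"):
        raise ValueError("Score value must be 'true' or 'false'.")
```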
VideoFloatScaleScorer¶
Bases: FloatScaleScorer, _BaseVideoScorer
A scorer that processes videos by extracting frames and scoring them using a float scale image scorer.
The VideoFloatScaleScorer breaks down a video into frames and uses a float scale scoring mechanism. Frame scores are aggregated using a FloatScaleAggregatorFunc.
By default, uses FloatScaleScorerByCategory.MAX which groups scores by category (useful for scorers like AzureContentFilterScorer that return multiple scores per frame). This returns one aggregated score per category (e.g., one for “Hate”, one for “Violence”, etc.).
For scorers that return a single score per frame, or to combine all categories together, use FloatScaleScoreAggregator.MAX, FloatScaleScorerAllCategories.MAX, etc.
Optionally, an audio_scorer can be provided to also score the video’s audio track. When provided, the audio is extracted, transcribed, and scored. The audio scores are included in the aggregation.
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
image_capable_scorer | FloatScaleScorer | A FloatScaleScorer capable of processing images. |
audio_scorer | Optional[FloatScaleScorer] | Optional FloatScaleScorer for scoring the video’s audio track. When provided, audio is extracted from the video, transcribed to text, and scored. The audio scores are aggregated with frame scores. Defaults to None. |
num_sampled_frames | Optional[int] | Number of frames to extract from the video for scoring. Defaults to None, which samples 5 frames. |
validator | Optional[ScorerPromptValidator] | Validator for the scorer. Defaults to None, which uses the video_path data type validator. |
score_aggregator | FloatScaleAggregatorFunc | Aggregator for combining frame scores. Use FloatScaleScorerByCategory.MAX/AVERAGE/MIN for scorers that return multiple scores per frame (groups by category and returns one score per category); FloatScaleScorerAllCategories.MAX/AVERAGE/MIN to combine all scores regardless of category (returns a single score with all categories combined); or FloatScaleScoreAggregator.MAX/AVERAGE/MIN for simple aggregation preserving all categories (returns a single score with all categories preserved). Defaults to FloatScaleScorerByCategory.MAX. |
image_objective_template | Optional[str] | Template for formatting the objective when scoring image frames. Use {objective} as a placeholder for the actual objective; set to None to not pass an objective to the image scorer. Defaults to _BaseVideoScorer._DEFAULT_IMAGE_OBJECTIVE_TEMPLATE, which provides context about the video frame. |
audio_objective_template | Optional[str] | Template for formatting the objective when scoring audio. Use {objective} as a placeholder for the actual objective; set to None to not pass an objective to the audio scorer. Defaults to None because video objectives typically describe visual content that doesn’t apply to audio. |
VideoTrueFalseScorer¶
Bases: TrueFalseScorer, _BaseVideoScorer
A scorer that processes videos by extracting frames and scoring them using a true/false image scorer.
Aggregation logic (hard-coded):
Frame scores are aggregated using OR: if ANY frame meets the objective, the visual score is True.
When audio_scorer is provided, the final score uses AND: BOTH the visual (frame) score AND the audio score must be True for the overall video score to be True.
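The hard-coded aggregation amounts to OR over frame scores, then AND with the audio score when one exists. A minimal sketch over plain booleans (the function name and signature are illustrative, not library API):

```python
from typing import Optional

def video_true_false(frame_scores: list[bool], audio_score: Optional[bool] = None) -> bool:
    """Sketch of VideoTrueFalseScorer aggregation: OR over frames, AND with audio if present."""
    visual = any(frame_scores)      # True if ANY sampled frame meets the objective
    if audio_score is None:
        return visual               # no audio scorer: frames alone decide
    return visual and audio_score   # both frames AND audio must be True
```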
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
image_capable_scorer | TrueFalseScorer | A TrueFalseScorer capable of processing images. |
audio_scorer | Optional[TrueFalseScorer] | Optional TrueFalseScorer for scoring the video’s audio track. When provided, audio is extracted from the video and scored, and the final score requires BOTH video frames AND audio to be True. Defaults to None. |
num_sampled_frames | Optional[int] | Number of frames to extract from the video for scoring. Defaults to None, which samples 5 frames. |
validator | Optional[ScorerPromptValidator] | Validator for the scorer. Defaults to None, which uses the video_path data type validator. |
image_objective_template | Optional[str] | Template for formatting the objective when scoring image frames. Use {objective} as a placeholder for the actual objective; set to None to not pass an objective to the image scorer. Defaults to _BaseVideoScorer._DEFAULT_IMAGE_OBJECTIVE_TEMPLATE, which provides context about the video frame. |
audio_objective_template | Optional[str] | Template for formatting the objective when scoring audio. Use {objective} as a placeholder for the actual objective; set to None to not pass an objective to the audio scorer. Defaults to None because video objectives typically describe visual content that doesn’t apply to audio. |