pyrit.score.SelfAskGeneralScorer

pyrit.score.SelfAskGeneralScorer#

class SelfAskGeneralScorer(chat_target: PromptChatTarget, system_prompt_format_string: str | None = None, prompt_format_string: str | None = None, scorer_type: Literal['true_false', 'float_scale'] = 'float_scale', score_value_output_key: str = 'score_value', rationale_output_key: str = 'rationale', description_output_key: str = 'description', metadata_output_key: str = 'metadata', category_output_key: str = 'category', category: list | None = None, labels: list | None = None, min_value: int = 0, max_value: int = 100)[source]#

Bases: Scorer

A general scorer that uses a chat target to score a prompt request piece.

It can be configured to use different scoring types (e.g., true/false, float scale) It can also format the prompt using a system-level prompt and a format string.

Parameters:

chat_target (PromptChatTarget) – The chat target to use for scoring.
system_prompt (str) – The system-level prompt that guides the behavior of the target LLM. Defaults to None. This can be a string or a format string template with placeholders for task and request_response. The system prompt or prompt or request_response needs to specify a JSON output format for the response
prompt_format_string (str) – A format string template for the prompt. Defaults to None. This is a string that can be formatted with task, request_response, and prompt
scorer_type (str) – The type of scorer (e.g., “true_false”, “float_scale”). Defaults to float_scale.
score_value_output_key (str) – The key in the JSON response that contains the score value. Defaults to “score_value”.
rationale_output_key (str) – The key in the JSON response that contains the rationale. Defaults to “rationale”.
description_output_key (str) – The key in the JSON response that contains the description. Defaults to “description”.
metadata_output_key (str) – The key in the JSON response that contains the metadata. Defaults to “metadata”.
category_output_key (str) – The key in the JSON response that contains the category. Defaults to “category”.
category (list) – A list of categories for the score. Defaults to None.
labels (list) – A list of labels for the score. Defaults to None.
min_value (int) – The minimum value for float scale scoring. Defaults to 0.
max_value (int) – The maximum value for float scale scoring. Defaults to 100.

__init__(chat_target: PromptChatTarget, system_prompt_format_string: str | None = None, prompt_format_string: str | None = None, scorer_type: Literal['true_false', 'float_scale'] = 'float_scale', score_value_output_key: str = 'score_value', rationale_output_key: str = 'rationale', description_output_key: str = 'description', metadata_output_key: str = 'metadata', category_output_key: str = 'category', category: list | None = None, labels: list | None = None, min_value: int = 0, max_value: int = 100) → None[source]#

Methods

`__init__`(chat_target[, ...])
`get_identifier`()	Returns an identifier dictionary for the scorer.
`get_scorer_metrics`(dataset_name[, metrics_type])	Returns evaluation statistics for the scorer using the dataset_name of the human labeled dataset that this scorer was run against.
`scale_value_float`(value, min_value, max_value)	Scales a value from 0 to 1 based on the given min and max values.
`score_async`(request_response, *[, task])	Score the request_response, add the results to the database and return a list of Score objects.
`score_image_async`(image_path, *[, task])	Scores the given image using the chat target.
`score_prompts_with_tasks_batch_async`(*, ...)
`score_response_async`(*, response, scorers[, ...])	Score a response using multiple scorers in parallel.
`score_response_select_first_success_async`(*, ...)	Score response pieces sequentially until finding a successful score.
`score_response_with_objective_async`(*, response)	Score a response using both auxiliary and objective scorers.
`score_responses_inferring_tasks_batch_async`(*, ...)	Scores a batch of responses (ignores non-assistant messages).
`score_text_async`(text, *[, task])	Scores the given text based on the task using the chat target.
`score_text_batch_async`(*, texts[, tasks, ...])
`validate`(request_response, *[, task])	Validates the request_response piece to score.

Attributes

scorer_type

scorer_type: ScoreType#

validate(request_response: PromptRequestPiece, *, task: str | None = None) → None[source]#

Validates the request_response piece to score. Because some scorers may require specific PromptRequestPiece types or values.

Parameters:

request_response (PromptRequestPiece) – The request response to be validated.
task (str) – The task based on which the text should be scored (the original attacker model’s objective).

pyrit.score.SelfAskGeneralScorer

Contents

pyrit.score.SelfAskGeneralScorer#