pyrit.score.SelfAskCategoryScorer

class SelfAskCategoryScorer(chat_target: PromptChatTarget, content_classifier: Path)

Bases: Scorer

A self-ask scorer for text classification. Given a classifier file that defines a set of categories, it scores a PromptRequestPiece against those categories and returns the category the piece fits best.

A separate false category is used when the PromptRequestPiece does not fit any of the categories.
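
The fallback rule above can be pictured with a small stand-in. This is a sketch of the decision rule only, not PyRIT's implementation: the function `resolve_category`, the category names, and the `"no_match"` label are hypothetical, and in the real scorer the chosen category comes from the chat target's reply.

```python
from typing import Optional, Sequence, Tuple


def resolve_category(
    chosen: Optional[str],
    categories: Sequence[str],
    false_category: str,
) -> Tuple[str, bool]:
    """Return (score_category, score_value) for a model-chosen label.

    Hypothetical helper: a label that matches a known category scores True;
    anything else falls back to the false category with a False value.
    """
    if chosen is not None and chosen in categories:
        return chosen, True
    return false_category, False


# Illustrative labels only; real ones would come from the classifier file.
categories = ["harassment", "violence"]
print(resolve_category("violence", categories, "no_match"))
print(resolve_category("weather", categories, "no_match"))
```

The second call shows the fallback path: a label outside the known categories yields the false category and a false value.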

__init__(chat_target: PromptChatTarget, content_classifier: Path) → None

Initializes a new instance of the SelfAskCategoryScorer class.

Parameters:

chat_target (PromptChatTarget) – The chat target to interact with.
content_classifier (Path) – The path to the classifier file.

Methods

`__init__`(chat_target, content_classifier)	Initializes a new instance of the SelfAskCategoryScorer class.
`get_identifier`()	Returns an identifier dictionary for the scorer.
`get_scorer_metrics`(dataset_name[, metrics_type])	Returns evaluation statistics for the scorer using the dataset_name of the human-labeled dataset that this scorer was run against.
`scale_value_float`(value, min_value, max_value)	Scales a value from 0 to 1 based on the given min and max values.
`score_async`(request_response, *[, task])	Scores the given request_response using the chat target and adds score to memory.
`score_image_async`(image_path, *[, task])	Scores the given image using the chat target.
`score_prompts_with_tasks_batch_async`(*, ...)
`score_response_async`(*, response, scorers[, ...])	Score a response using multiple scorers in parallel.
`score_response_select_first_success_async`(*, ...)	Score response pieces sequentially until finding a successful score.
`score_response_with_objective_async`(*, response)	Score a response using both auxiliary and objective scorers.
`score_responses_inferring_tasks_batch_async`(*, ...)	Scores a batch of responses (ignores non-assistant messages).
`score_text_async`(text, *[, task])	Scores the given text based on the task using the chat target.
`score_text_batch_async`(*, texts[, tasks, ...])
`validate`(request_response, *[, task])	Validates the request_response piece to score.
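
The summary of `scale_value_float` reads like ordinary min-max normalization. The sketch below shows that arithmetic under that assumption; `min_max_scale` is a hypothetical stand-in rather than PyRIT's code, and returning 0.0 for an empty range is a choice made here, not something this page states.

```python
def min_max_scale(value: float, min_value: float, max_value: float) -> float:
    """Map value from the range [min_value, max_value] onto [0, 1]."""
    if max_value == min_value:
        # Assumption: an empty range has no meaningful position, so use 0.0.
        return 0.0
    return (value - min_value) / (max_value - min_value)


print(min_max_scale(3, 1, 5))  # 0.5
```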

Attributes

scorer_type

async score_async(request_response: PromptRequestPiece, *, task: str | None = None) → list[Score]

Scores the given request_response using the chat target and adds score to memory.

Parameters:

request_response (PromptRequestPiece) – The prompt request piece to score.
task (str | None) – The task against which the text should be scored (the original attacker model’s objective). Currently not supported for this scorer.

Returns:

The scored request_response. The category that best fits the response is used for score_category. The score_value is True unless no category fits, in which case the score_value is False and the _false_category is used.

Return type:

list[Score]
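
A minimal usage sketch, assuming an already configured PromptChatTarget and an existing PromptRequestPiece are passed in. The helper name `score_piece` is hypothetical, and reading `score_category` and `score_value` as attributes of each Score is an assumption based on the field names used in the return description above.

```python
from pathlib import Path


async def score_piece(chat_target, classifier_path: Path, piece):
    """Score one PromptRequestPiece and return (category, value) pairs."""
    # Imported lazily so this sketch can be loaded without PyRIT installed.
    from pyrit.score import SelfAskCategoryScorer

    scorer = SelfAskCategoryScorer(
        chat_target=chat_target,
        content_classifier=classifier_path,
    )
    # task is left at its default because this scorer does not support it.
    scores = await scorer.score_async(piece)
    # Assumption: each Score exposes score_category and score_value.
    return [(s.score_category, s.score_value) for s in scores]
```

From synchronous code, the coroutine could be driven with `asyncio.run(score_piece(target, path, piece))`, where `target`, `path`, and `piece` are placeholders for objects you have already created.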

scorer_type: ScoreType

validate(request_response: PromptRequestPiece, *, task: str | None = None)

Validates the request_response piece to score, since some scorers may require specific PromptRequestPiece types or values.

Parameters:

request_response (PromptRequestPiece) – The request response to be validated.
task (str | None) – The task against which the text should be scored (the original attacker model’s objective).
