pyrit.score.HumanInTheLoopScorer#
- class HumanInTheLoopScorer(*, scorer: Scorer | None = None, re_scorers: list[Scorer] | None = None)[source]#
Bases: Scorer
Creates scores from manual human input and adds them to the database.
- Parameters:
scorer (Scorer) – Optional scorer used to produce an initial score that the human reviewer can then accept, edit, or re-score.
re_scorers (list[Scorer]) – Optional scorers the reviewer can choose from to re-score a response.
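The sketch below shows one way to use the scorer end to end; it is illustrative only and assumes PyRIT's central memory/database has already been initialized, that PromptRequestPiece is importable from pyrit.models, and that the response text and task strings are placeholders.

```python
import asyncio

from pyrit.models import PromptRequestPiece
from pyrit.score import HumanInTheLoopScorer

# Assumes PyRIT's central memory/database is already initialized elsewhere;
# the scores collected here are written to that database.
scorer = HumanInTheLoopScorer()  # no pre-scorer: every score is entered manually

# Hypothetical response piece to review; constructor arguments follow common PyRIT usage.
response_piece = PromptRequestPiece(
    role="assistant",
    original_value="Model output that a human should score.",
)


async def main() -> None:
    # Prompts the human reviewer and returns the stored Score objects.
    scores = await scorer.score_async(response_piece, task="the original attack objective")
    for score in scores:
        print(score)


asyncio.run(main())
```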
Methods
- __init__(*[, scorer, re_scorers])
- edit_score(existing_score, original_prompt, ...) – Edit an existing score.
- get_identifier() – Returns an identifier dictionary for the scorer.
- get_modified_value(original_prompt, ...[, ...]) – Get the modified value for the score.
- get_scorer_metrics(dataset_name[, metrics_type]) – Returns evaluation statistics for the scorer using the dataset_name of the human labeled dataset that this scorer was run against.
- import_scores_from_csv(csv_file_path)
- rescore(request_response, *[, task])
- scale_value_float(value, min_value, max_value) – Scales a value from 0 to 1 based on the given min and max values.
- score_async(request_response, *[, task]) – Score the request_response, add the results to the database and return a list of Score objects.
- score_image_async(image_path, *[, task]) – Scores the given image using the chat target.
- score_prompt_manually(request_response, *[, ...]) – Manually score the prompt.
- score_prompts_with_tasks_batch_async(*, ...)
- score_response_async(*, response, scorers[, ...]) – Score a response using multiple scorers in parallel.
- score_response_select_first_success_async(*, ...) – Score response pieces sequentially until finding a successful score.
- score_response_with_objective_async(*, response) – Score a response using both auxiliary and objective scorers.
- score_responses_inferring_tasks_batch_async(*, ...) – Scores a batch of responses (ignores non-assistant messages).
- score_text_async(text, *[, task]) – Scores the given text based on the task using the chat target.
- score_text_batch_async(*, texts[, tasks, ...])
- validate(request_response, *[, task]) – Validates the request_response piece to score.
Attributes
- scorer_type
- edit_score(existing_score: Score, original_prompt: str, request_response: PromptRequestPiece, task: str | None = None) → Score [source]#
Edit an existing score.
- Parameters:
existing_score (Score) – The existing score to edit.
original_prompt (str) – The original prompt.
request_response (PromptRequestPiece) – The request response to score.
task (str) – The task based on which the text should be scored (the original attacker model’s objective).
- Returns:
The new score after all changes.
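A hedged sketch of editing a previously stored score; existing_score and response_piece are assumed to come from an earlier score_async call or from memory, and the prompt/task strings are placeholders.

```python
# existing_score: a Score produced earlier for response_piece (e.g. via score_async).
# The reviewer is prompted for changes and a new Score reflecting them is returned.
new_score = scorer.edit_score(
    existing_score=existing_score,
    original_prompt="the original attack prompt",
    request_response=response_piece,
    task="the original attack objective",
)
```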
- get_modified_value(original_prompt: str, score_value: str, field_name: str, extra_value_description: str = '') → str [source]#
Get the modified value for the score.
- Parameters:
original_prompt (str) – The original prompt.
score_value (str) – The current value of the score field.
field_name (str) – The name of the score field that may be modified.
extra_value_description (str) – Optional extra description of the field shown to the reviewer.
- Returns:
The value after modification or the original value if the user does not want to change it.
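For illustration, a single call that asks the reviewer whether to keep or change one field of a score; the field name and descriptions shown are assumptions, not values mandated by the API.

```python
# Returns the reviewer's edited value, or score_value unchanged if they keep it.
value = scorer.get_modified_value(
    original_prompt="the original attack prompt",
    score_value="true",
    field_name="score_value",
    extra_value_description="true/false indicating whether the objective was met",
)
```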
- async rescore(request_response: PromptRequestPiece, *, task: str | None = None) → list[Score] [source]#
- score_prompt_manually(request_response: PromptRequestPiece, *, task: str | None = None) → list[Score] [source]#
Manually score the prompt.
- Parameters:
request_response (PromptRequestPiece) – The prompt request piece to score.
task (str) – The task based on which the text should be scored (the original attacker model’s objective).
- Returns:
A list of scores.
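A short sketch of collecting a purely manual score, bypassing any configured pre-scorer; response_piece is the same hypothetical piece used in the class-level example above.

```python
# Every score field is collected directly from the human reviewer.
manual_scores = scorer.score_prompt_manually(response_piece, task="the original attack objective")
```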
- scorer_type: ScoreType#
- validate(request_response: PromptRequestPiece, *, task: str | None = None)[source]#
Validates the request_response piece to score, because some scorers may require specific PromptRequestPiece types or values.
- Parameters:
request_response (PromptRequestPiece) – The request response to be validated.
task (str) – The task based on which the text should be scored (the original attacker model’s objective).