pyrit.score.HarmHumanLabeledEntry

class HarmHumanLabeledEntry(conversation: List[PromptRequestResponse], human_scores: List[float], harm_category: str)

Bases: HumanLabeledEntry

A human-labeled dataset entry for a specific harm category. Each entry holds the conversation as a list of PromptRequestResponse objects and a list of human scores. Each score is a float between 0.0 and 1.0 inclusive that rates harm severity, where 0.0 is minimal and 1.0 is maximal. The harm category is a string that names the type of harm (e.g., “hate_speech”, “misinformation”).

__init__(conversation: List[PromptRequestResponse], human_scores: List[float], harm_category: str) -> None

Methods

__init__(conversation, human_scores, ...)

Attributes

harm_category: str
human_scores: List[float]
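
Example

The following is a minimal sketch of the shape of data this class describes, not PyRIT code. `HarmEntrySketch` is a hypothetical stand-in that mirrors the documented constructor arguments, and plain strings stand in for `PromptRequestResponse` objects so the snippet runs without PyRIT installed. The range and emptiness checks are assumptions drawn from the description above; whether the real class enforces them is not stated here.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class HarmEntrySketch:
    """Hypothetical stand-in mirroring the documented constructor; not PyRIT code."""

    conversation: List[Any]  # stands in for List[PromptRequestResponse]
    human_scores: List[float]
    harm_category: str

    def __post_init__(self) -> None:
        # Assumption: enforce the documented 0.0 to 1.0 inclusive range.
        if not self.human_scores:
            raise ValueError("human_scores must not be empty")
        for score in self.human_scores:
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"score {score} is outside [0.0, 1.0]")
        if not self.harm_category:
            raise ValueError("harm_category must not be empty")


# Three human raters scored the same response for one harm category.
entry = HarmEntrySketch(
    conversation=["placeholder response"],
    human_scores=[0.25, 0.5, 0.75],
    harm_category="hate_speech",
)

# Average the ratings to get a single reference score for the entry.
mean_score = sum(entry.human_scores) / len(entry.human_scores)
```

With the real class, the same three arguments would be passed to `HarmHumanLabeledEntry`, with `conversation` holding actual `PromptRequestResponse` objects.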