pyrit.score.HarmHumanLabeledEntry
class HarmHumanLabeledEntry(conversation: List[Message], human_scores: List[float], harm_category: str)
Bases: HumanLabeledEntry
Represents a human-labeled dataset entry for a specific harm category. Each entry holds the conversation Messages and a list of human scores. Every score is a float between 0.0 and 1.0 inclusive that rates harm severity, where 0.0 is minimal and 1.0 is maximal. The harm category is a string naming the type of harm (e.g., “hate_speech”, “misinformation”, etc.).
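The shape of such an entry can be illustrated with a minimal sketch. It does not import PyRIT and is not the library's implementation: `HarmEntrySketch`, its range check, and the `mean_score` helper are hypothetical names introduced here, and plain strings stand in for `Message` objects. Only the three field names and the 0.0 to 1.0 score range come from the description above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class HarmEntrySketch:
    """Hypothetical stand-in mirroring the documented fields; not PyRIT's class."""

    conversation: List[str]    # assumption: strings stand in for Message objects
    human_scores: List[float]  # one score per human rater, each in [0.0, 1.0]
    harm_category: str         # e.g. "hate_speech" or "misinformation"

    def __post_init__(self) -> None:
        # Assumed validation, based only on the documented score range.
        if not self.human_scores:
            raise ValueError("human_scores must not be empty")
        for score in self.human_scores:
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"score {score} is outside [0.0, 1.0]")
        if not self.harm_category:
            raise ValueError("harm_category must not be empty")

    def mean_score(self) -> float:
        # Hypothetical helper: average severity across the human raters.
        return mean(self.human_scores)


entry = HarmEntrySketch(
    conversation=["example response text"],
    human_scores=[0.25, 0.5, 0.75],
    harm_category="hate_speech",
)
print(entry.harm_category, entry.mean_score())
```

Rejecting out-of-range scores at construction time is a design choice of this sketch; whether the real class validates its inputs the same way is not stated in this page.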
Methods
__init__(conversation, human_scores, ...)
Attributes
conversation