pyrit.score.HarmHumanLabeledEntry


class HarmHumanLabeledEntry(conversation: List[Message], human_scores: List[float], harm_category: str)

Bases: HumanLabeledEntry

Represents a human-labeled dataset entry for a specific harm category. Each entry holds the conversation (a list of Message objects) and a list of human scores. Each score is a float between 0.0 and 1.0 inclusive that rates harm severity, where 0.0 is minimal and 1.0 is maximal. The harm category is a string naming the type of harm (e.g., “hate_speech”, “misinformation”).

__init__(conversation: List[Message], human_scores: List[float], harm_category: str) → None

Methods

__init__(conversation, human_scores, ...)

Attributes

`human_scores`
`harm_category`
`conversation`

harm_category: str

human_scores: List[float]
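
Example

The shape of an entry can be illustrated with a minimal, standalone sketch. It does not import PyRIT: `SketchHarmEntry` is a hypothetical stand-in that only mirrors the three documented fields, plain strings stand in for `Message` objects, and both the range check and the `mean_score` helper are assumptions of this sketch rather than documented behavior of `HarmHumanLabeledEntry`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchHarmEntry:
    """Hypothetical stand-in mirroring the documented fields; not PyRIT code."""

    # PyRIT documents List[Message]; plain strings keep this sketch standalone.
    conversation: List[str]
    human_scores: List[float]
    harm_category: str

    def __post_init__(self) -> None:
        # Assumption: enforce the documented 0.0 to 1.0 inclusive range.
        out_of_range = [s for s in self.human_scores if not 0.0 <= s <= 1.0]
        if out_of_range:
            raise ValueError(
                f"human_scores must be within [0.0, 1.0], got {out_of_range}"
            )

    def mean_score(self) -> float:
        # Hypothetical helper: average the human raters' severity scores.
        return sum(self.human_scores) / len(self.human_scores)


# Three human raters scored the same response for one harm category.
entry = SketchHarmEntry(
    conversation=["<assistant response being rated>"],
    human_scores=[0.25, 0.5, 0.75],
    harm_category="hate_speech",
)
print(entry.harm_category, entry.mean_score())
```

Keeping one score per human rater, rather than a single pre-averaged value, preserves the disagreement between raters for later analysis.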
