Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

pyrit.message_normalizer

Functionality to normalize messages into compatible formats for targets.

ChatMessageNormalizer

Bases: MessageListNormalizer[ChatMessage], MessageStringNormalizer

Normalizer that converts a list of Messages to a list of ChatMessages.

This normalizer handles both single-part and multipart messages:

Constructor Parameters:

ParameterTypeDescription
use_developer_roleboolIf True, translates “system” role to “developer” role. Defaults to False.
system_message_behaviorSystemMessageBehaviorHow to handle system messages. Defaults to “keep”. Defaults to 'keep'.

Methods:

normalize_async

normalize_async(messages: list[Message]) → list[ChatMessage]

Convert a list of Messages to a list of ChatMessages.

For single-piece text messages, content is a string. For multi-piece or non-text messages, content is a list of content dicts.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

Raises:

normalize_string_async

normalize_string_async(messages: list[Message]) → str

Convert a list of Messages to a JSON string representation.

This serializes the list of ChatMessages to JSON format.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

ConversationContextNormalizer

Bases: MessageStringNormalizer

Normalizer that formats conversation history as turn-based text.

This is the standard format used by attacks like Crescendo and TAP for including conversation context in adversarial chat prompts. The output format is:

Turn 1:
User: <content>
Assistant: <content>

Turn 2:
User: <content>
...

Methods:

normalize_string_async

normalize_string_async(messages: list[Message]) → str

Normalize a list of messages into a turn-based context string.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

Raises:

GenericSystemSquashNormalizer

Bases: MessageListNormalizer[Message]

Normalizer that combines the first system message with the first user message using generic instruction tags.

Methods:

normalize_async

normalize_async(messages: list[Message]) → list[Message]

Return messages with the first system message combined into the first user message.

The format uses generic instruction tags:

Instructions

{system_content}

{user_content}

ParameterTypeDescription
messageslist[Message]The list of messages to normalize.

Returns:

Raises:

MessageListNormalizer

Bases: abc.ABC, Generic[T]

Abstract base class for normalizers that return a list of items.

Subclasses specify the type T (e.g., Message, ChatMessage) that the list contains. T must implement the DictConvertible protocol (have a to_dict() method).

Methods:

normalize_async

normalize_async(messages: list[Message]) → list[T]

Normalize the list of messages into a list of items.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

normalize_to_dicts_async

normalize_to_dicts_async(messages: list[Message]) → list[dict[str, Any]]

Normalize the list of messages into a list of dictionaries.

This method uses normalize_async and calls to_dict() on each item.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

MessageStringNormalizer

Bases: abc.ABC

Abstract base class for normalizers that return a string representation.

Use this for formatting messages into text for non-chat targets or context strings.

Methods:

normalize_string_async

normalize_string_async(messages: list[Message]) → str

Normalize the list of messages into a string representation.

ParameterTypeDescription
messageslist[Message]The list of Message objects to normalize.

Returns:

TokenizerTemplateNormalizer

Bases: MessageStringNormalizer

Enable application of the chat template stored in a Hugging Face tokenizer to a list of messages. For more details, see https://huggingface.co/docs/transformers/main/en/chat_templating.

Constructor Parameters:

ParameterTypeDescription
tokenizerPreTrainedTokenizerBaseA Hugging Face tokenizer with a chat template.
system_message_behaviorTokenizerSystemBehaviorHow to handle system messages. Options: - “keep”: Keep system messages as-is (default) - “squash”: Merge system into first user message - “ignore”: Drop system messages entirely - “developer”: Change system role to developer role Defaults to 'keep'.

Methods:

from_model

from_model(model_name_or_alias: str, token: Optional[str] = None, system_message_behavior: Optional[TokenizerSystemBehavior] = None) → TokenizerTemplateNormalizer

Create a normalizer from a model name or alias.

This factory method simplifies creating a normalizer by handling tokenizer loading automatically. Use aliases for common models or provide a full HuggingFace model path.

ParameterTypeDescription
model_name_or_aliasstrEither a full HuggingFace model name or an alias (e.g., ‘chatml’, ‘phi3’, ‘llama3’). See MODEL_ALIASES for available aliases.
tokenOptional[str]Optional HuggingFace token for gated models. If not provided, falls back to HUGGINGFACE_TOKEN environment variable. Defaults to None.
system_message_behaviorOptional[TokenizerSystemBehavior]Override how to handle system messages. If not provided, uses the model’s default config. Defaults to None.

Returns:

Raises:

normalize_string_async

normalize_string_async(messages: list[Message]) → str

Apply the chat template stored in the tokenizer to a list of messages.

Handles system messages based on the configured system_message_behavior:

ParameterTypeDescription
messageslist[Message]A list of Message objects.

Returns: