pyrit.prompt_converter.VariationSelectorSmugglerConverter
- class VariationSelectorSmugglerConverter(action: Literal['encode', 'decode'] = 'encode', base_char_utf8: str | None = None, embed_in_base: bool = True)
Bases: SmugglerConverter
Encodes and decodes text using Unicode Variation Selectors.
- Each UTF-8 byte is mapped as follows:
Bytes 0x00-0x0F are mapped to U+FE00-U+FE0F.
Bytes 0x10-0xFF are mapped to U+E0100-U+E01EF.
If embed_in_base is True, the payload is concatenated with a base character (default: 😊); otherwise, a space separator is inserted.
- Replicates functionality detailed in:
Extension: In addition to embedding into a base character, we also support appending invisible variation selectors directly to visible text—enabling mixed visible and hidden content within a single string.
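The byte-to-selector mapping described above can be sketched in a few lines. This is a minimal standalone illustration, not PyRIT's implementation; the helper names byte_to_selector and selector_to_byte are hypothetical and do not exist in the library.

```python
from typing import Optional


def byte_to_selector(b: int) -> str:
    """Map one byte (0-255) to a variation selector using the documented ranges."""
    if not 0 <= b <= 0xFF:
        raise ValueError("byte out of range")
    if b < 0x10:
        return chr(0xFE00 + b)            # 0x00-0x0F -> U+FE00..U+FE0F
    return chr(0xE0100 + (b - 0x10))      # 0x10-0xFF -> U+E0100..U+E01EF


def selector_to_byte(ch: str) -> Optional[int]:
    """Inverse mapping; returns None for characters that are not selectors."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - 0xE0100 + 0x10
    return None


# Round trip: text -> UTF-8 bytes -> selectors -> bytes -> text
payload = "".join(byte_to_selector(b) for b in "hi".encode("utf-8"))
recovered = bytes(selector_to_byte(ch) for ch in payload).decode("utf-8")
```

Both ranges together cover exactly 256 code points (16 + 240), so every byte value has a unique selector and the mapping is reversible.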
- __init__(action: Literal['encode', 'decode'] = 'encode', base_char_utf8: str | None = None, embed_in_base: bool = True)
Initializes the converter with options for encoding/decoding.
- Parameters:
action (Literal["encode", "decode"]) – The action to perform.
base_char_utf8 (Optional[str]) – Base character for variation_selector_smuggler mode (default: 😊).
embed_in_base (bool) – If True, the hidden payload is embedded directly into the base character. If False, a visible separator (space) is inserted between the base and payload. Default is True.
- Raises:
ValueError – If an unsupported action or encoding_mode is provided.
Methods
__init__([action, base_char_utf8, embed_in_base]) – Initializes the converter with options for encoding/decoding.
convert_async(*, prompt[, input_type]) – Converts the given prompt by either encoding or decoding it based on the specified action.
convert_tokens_async(*, prompt[, ...]) – Converts substrings within a prompt that are enclosed by specified start and end tokens.
decode_message(message) – Decodes a message encoded using Unicode variation selectors.
decode_visible_hidden(combined) – Extracts the visible text and decodes the hidden text from a combined string.
encode_message(message) – Encodes the message using Unicode variation selectors.
encode_visible_hidden(visible, hidden) – Combines visible text with hidden text by encoding the hidden text using variation_selector_smuggler mode.
get_identifier() – Returns an identifier dictionary for the converter.
input_supported(input_type) – Checks if the input type is supported by the converter.
output_supported(output_type) – Checks if the output type is supported by the converter.
Attributes
supported_input_types – Returns a list of supported input types for the converter.
supported_output_types – Returns a list of supported output types for the converter.
- decode_message(message: str) → str
Decodes a message encoded using Unicode variation selectors. The decoder scans the string for variation selectors, ignoring any visible separator.
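A decoder that scans for selectors and skips everything else might look like the sketch below. It is an illustration under assumptions: decode_selectors is a hypothetical name, and replacing invalid UTF-8 sequences instead of raising is a choice made here, not documented behavior.

```python
def decode_selectors(message: str) -> str:
    """Collect bytes from variation selectors; ignore all other characters."""
    data = bytearray()
    for ch in message:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            data.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            data.append(cp - 0xE0100 + 0x10)
        # Base character, space separator, and visible text fall through here.
    # Assumption: invalid byte sequences are replaced rather than raising.
    return data.decode("utf-8", errors="replace")


# Base character, a space separator, then the selectors for b"ok"
sample = "\U0001F60A " + "".join(chr(0xE0100 + b - 0x10) for b in b"ok")
decoded = decode_selectors(sample)
```

One consequence of scanning the whole string: a U+FE0F emoji presentation selector that happens to occur in the visible text would be read as byte 0x0F by a decoder written this way.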
- decode_visible_hidden(combined)
Extracts the visible text and decodes the hidden text from a combined string.
It searches for the first occurrence of the base character (self.utf8_base_char) and treats everything from that point on as the hidden payload.
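The split at the first base character can be sketched as follows. The function name split_visible_hidden is hypothetical, and returning an empty hidden string when no base character is present is an assumption; the library may handle that case differently.

```python
from typing import Tuple

BASE_CHAR = "\U0001F60A"  # 😊, the documented default base character


def split_visible_hidden(combined: str, base_char: str = BASE_CHAR) -> Tuple[str, str]:
    """Return (visible, hidden) by splitting at the first base character."""
    index = combined.find(base_char)
    if index == -1:
        # Assumption: no base character means no hidden payload.
        return combined, ""
    visible = combined[:index]
    data = bytearray()
    for ch in combined[index:]:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            data.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            data.append(cp - 0xE0100 + 0x10)
    return visible, data.decode("utf-8", errors="replace")


hidden_selectors = "".join(chr(0xE0100 + b - 0x10) for b in b"hi")
visible, hidden = split_visible_hidden("Hello " + BASE_CHAR + hidden_selectors)
```

Because the split uses the first occurrence, visible text that itself contains the base character would be cut short in this sketch.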
- encode_message(message: str) → Tuple[str, str]
Encodes the message using Unicode variation selectors.
- The message is converted to UTF-8 bytes, and each byte is mapped to a variation selector:
0x00-0x0F => U+FE00 to U+FE0F.
0x10-0xFF => U+E0100 to U+E01EF.
If embed_in_base is True, the payload is embedded directly into the base character; otherwise, a visible separator (a space) is inserted between the base and payload.
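The effect of embed_in_base on the output can be illustrated with a standalone sketch. The function encode_payload is hypothetical and returns only the encoded string; the documented method returns a tuple whose contents are not described here.

```python
def encode_payload(message: str, base_char: str = "\U0001F60A",
                   embed_in_base: bool = True) -> str:
    """Build base character + (optional space) + variation-selector payload."""
    selectors = "".join(
        chr(0xFE00 + b) if b < 0x10 else chr(0xE0100 + b - 0x10)
        for b in message.encode("utf-8")
    )
    separator = "" if embed_in_base else " "
    return base_char + separator + selectors


embedded = encode_payload("A")                      # 😊 followed directly by the payload
spaced = encode_payload("A", embed_in_base=False)   # 😊, a space, then the payload
```

In both cases the selectors are typically rendered as invisible; the only visible difference is the space after the base character.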
- encode_visible_hidden(visible, hidden)
Combines visible text with hidden text by encoding the hidden text using variation_selector_smuggler mode.
The hidden payload is generated as a composite using the current embedding setting and then appended to the visible text.
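Appending the composite to visible text might look like the sketch below. It assumes plain concatenation with no extra separator between the visible text and the composite, which the description above does not specify; combine_visible_hidden is a hypothetical name.

```python
def combine_visible_hidden(visible: str, hidden: str,
                           base_char: str = "\U0001F60A",
                           embed_in_base: bool = True) -> str:
    """Encode `hidden` as a base-character composite and append it to `visible`."""
    selectors = "".join(
        chr(0xFE00 + b) if b < 0x10 else chr(0xE0100 + b - 0x10)
        for b in hidden.encode("utf-8")
    )
    composite = base_char + ("" if embed_in_base else " ") + selectors
    # Assumption: the composite is appended directly, with nothing in between.
    return visible + composite


combined = combine_visible_hidden("Status: fine", "secret")
```

The result reads as "Status: fine😊" in most renderers while carrying six additional code points, one per UTF-8 byte of "secret".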