pyrit.prompt_converter.AzureSpeechTextToAudioConverter

class AzureSpeechTextToAudioConverter(azure_speech_region: str | None = None, azure_speech_key: str | None = None, synthesis_language: str = 'en_US', synthesis_voice_name: str = 'en-US-AvaNeural', output_format: Literal['wav', 'mp3'] = 'wav')

Bases: PromptConverter

Generates an audio file (WAV or MP3) from a text prompt using the Azure AI Speech service.

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech

__init__(azure_speech_region: str | None = None, azure_speech_key: str | None = None, synthesis_language: str = 'en_US', synthesis_voice_name: str = 'en-US-AvaNeural', output_format: Literal['wav', 'mp3'] = 'wav') → None

Initializes the converter with Azure Speech service credentials, synthesis language, and voice name.

Parameters:

azure_speech_region (str, Optional) – The name of the Azure region.
azure_speech_key (str, Optional) – The API key for accessing the service.
synthesis_language (str) – Synthesis voice language.
synthesis_voice_name (str) – Synthesis voice name. For the supported synthesis languages and voices, see https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support
output_format (str) – Output audio format, either wav or mp3.
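
A minimal sketch of passing the parameters listed above to the constructor. The helper names `converter_kwargs` and `build_converter` are hypothetical and not part of PyRIT. The sketch assumes the `pyrit` package and the Azure Speech SDK are installed; the import is deferred so that the argument-checking part runs without them.

```python
from typing import Literal, Optional, get_args

# Mirrors the output_format choices shown in the signature.
AudioFormat = Literal["wav", "mp3"]


def converter_kwargs(
    region: Optional[str] = None,
    key: Optional[str] = None,
    language: str = "en_US",
    voice: str = "en-US-AvaNeural",
    output_format: str = "wav",
) -> dict:
    """Collect constructor arguments, rejecting formats the signature does not list."""
    if output_format not in get_args(AudioFormat):
        raise ValueError(f"output_format must be one of {get_args(AudioFormat)}")
    return {
        "azure_speech_region": region,
        "azure_speech_key": key,
        "synthesis_language": language,
        "synthesis_voice_name": voice,
        "output_format": output_format,
    }


def build_converter(**overrides):
    """Hypothetical helper: instantiate the converter (requires pyrit)."""
    # Deferred import so this module loads even when pyrit is absent.
    from pyrit.prompt_converter import AzureSpeechTextToAudioConverter

    return AzureSpeechTextToAudioConverter(**converter_kwargs(**overrides))
```

For example, `build_converter(region="eastus", key="<your key>", output_format="mp3")` would request MP3 output with the default voice.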

Methods

`__init__`([azure_speech_region, ...])	Initializes the converter with Azure Speech service credentials, synthesis language, and voice name.
`convert_async`(*, prompt[, input_type])	Converts the given text prompt into its audio representation.
`convert_tokens_async`(*, prompt[, ...])	Converts substrings within a prompt that are enclosed by specified start and end tokens.
`get_identifier`()	Returns an identifier dictionary for the converter.
`input_supported`(input_type)	Checks if the input type is supported by the converter.
`output_supported`(output_type)	Checks if the output type is supported by the converter.

Attributes

`AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE`	The API key for accessing the service.
`AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE`	The name of the Azure region.
`AzureSpeachAudioFormat`	Supported audio formats for output.
`supported_input_types`	Returns a list of supported input types for the converter.
`supported_output_types`	Returns a list of supported output types for the converter.

AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_KEY'

The API key for accessing the service.

AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_REGION'

The name of the Azure region.
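
The two attributes above name environment variables. The sketch below checks whether they are set before a converter is created. It assumes the converter falls back to these variables when `azure_speech_region` or `azure_speech_key` is `None`, which this page implies but does not state; `missing_speech_settings` is a hypothetical helper, not a PyRIT function.

```python
import os
from typing import List, Mapping, Optional

# Values copied from the class attributes documented above.
AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE = "AZURE_SPEECH_KEY"
AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE = "AZURE_SPEECH_REGION"


def missing_speech_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of the required variables that are unset or empty."""
    env = os.environ if env is None else env
    required = (
        AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE,
        AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE,
    )
    return [name for name in required if not env.get(name)]
```

Calling `missing_speech_settings()` with no argument inspects the current process environment; an empty list means both settings are present.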

AzureSpeachAudioFormat

Supported audio formats for output.

alias of Literal['wav', 'mp3']

async convert_async(*, prompt: str, input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error'] = 'text') → ConverterResult

Converts the given text prompt into its audio representation.

Parameters:

prompt (str) – The text prompt to be converted into audio.
input_type (PromptDataType) – The type of input data.

Returns:

The result containing the audio file path.

Return type:

ConverterResult

Raises:

ModuleNotFoundError – If the azure.cognitiveservices.speech module is not installed.
RuntimeError – If there is an error during the speech synthesis process.
ValueError – If the input type is not supported or if the prompt is empty.

input_supported(input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error']) → bool

Checks if the input type is supported by the converter.

Parameters:

input_type (PromptDataType) – The input type to check.

Returns:

True if the input type is supported, False otherwise.

Return type:

bool

output_supported(output_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error']) → bool

Checks if the output type is supported by the converter.

Parameters:

output_type (PromptDataType) – The output type to check.

Returns:

True if the output type is supported, False otherwise.

Return type:

bool
