pyrit.prompt_converter.AzureSpeechTextToAudioConverter

class AzureSpeechTextToAudioConverter(azure_speech_region: str | None = None, azure_speech_key: str | None = None, synthesis_language: str = 'en_US', synthesis_voice_name: str = 'en-US-AvaNeural', output_format: Literal['wav', 'mp3'] = 'wav')

Bases: PromptConverter

Generates an audio file (WAV or MP3) from a text prompt using the Azure AI Speech service.

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech

__init__(azure_speech_region: str | None = None, azure_speech_key: str | None = None, synthesis_language: str = 'en_US', synthesis_voice_name: str = 'en-US-AvaNeural', output_format: Literal['wav', 'mp3'] = 'wav') → None

Initializes the converter with Azure Speech service credentials, synthesis language, voice name, and output format.

Parameters:
  • azure_speech_region (str, Optional) – The name of the Azure region.

  • azure_speech_key (str, Optional) – The API key for accessing the service.

  • synthesis_language (str) – Synthesis voice language.

  • synthesis_voice_name (str) – Synthesis voice name. For the supported synthesis languages and voices, see https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support

  • filename (str) – File name to be generated. Please include either .wav or .mp3.

  • output_format (str) – Output audio format, either wav or mp3. Must match the file extension.
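The parameters above can be exercised with a short sketch. It assumes that an omitted region or key is looked up in the environment variables named by the class constants documented on this page; this page does not spell out that fallback, so resolve_setting is a hypothetical helper that makes the lookup explicit rather than a description of the library's internals. The constructor call uses only the documented signature.

```python
import os
from typing import Optional

# Environment variable names as documented on this page.
AZURE_SPEECH_REGION_ENV = "AZURE_SPEECH_REGION"
AZURE_SPEECH_KEY_ENV = "AZURE_SPEECH_KEY"


def resolve_setting(explicit: Optional[str], env_var: str) -> str:
    """Hypothetical helper: prefer the explicit value, else read env_var."""
    value = explicit if explicit else os.environ.get(env_var)
    if not value:
        raise ValueError(f"Provide a value or set {env_var}")
    return value


def build_converter(region: Optional[str] = None, key: Optional[str] = None):
    """Construct the converter using the constructor signature shown above."""
    # Imported lazily so this sketch can be loaded without PyRIT installed.
    from pyrit.prompt_converter import AzureSpeechTextToAudioConverter

    return AzureSpeechTextToAudioConverter(
        azure_speech_region=resolve_setting(region, AZURE_SPEECH_REGION_ENV),
        azure_speech_key=resolve_setting(key, AZURE_SPEECH_KEY_ENV),
        synthesis_language="en_US",
        synthesis_voice_name="en-US-AvaNeural",
        output_format="mp3",
    )
```

Calling build_converter() with no arguments then relies entirely on the two environment variables, while build_converter(region="eastus", key="...") bypasses them.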

Methods

__init__([azure_speech_region, ...])

Initializes the converter with Azure Speech service credentials, synthesis language, voice name, and output format.

convert_async(*, prompt[, input_type])

Converts the given text prompt into its audio representation.

convert_tokens_async(*, prompt[, ...])

Converts substrings within a prompt that are enclosed by specified start and end tokens.

get_identifier()

Returns an identifier dictionary for the converter.

input_supported(input_type)

Checks if the input type is supported by the converter.

output_supported(output_type)

Checks if the output type is supported by the converter.

Attributes

AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE

Name of the environment variable that holds the API key for accessing the service.

AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE

Name of the environment variable that holds the name of the Azure region.

AzureSpeachAudioFormat

Supported audio formats for output.

supported_input_types

Returns a list of supported input types for the converter.

supported_output_types

Returns a list of supported output types for the converter.

AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_KEY'

Name of the environment variable that holds the API key for accessing the service.

AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_REGION'

Name of the environment variable that holds the name of the Azure region.

AzureSpeachAudioFormat

Supported audio formats for output.

alias of Literal['wav', 'mp3']
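Because the alias is a Literal type, the permitted values can be read back at runtime with typing.get_args. The sketch below uses a local stand-in named AudioFormat instead of importing the class attribute, and check_output_format is a hypothetical helper; neither is part of the library.

```python
from typing import Literal, get_args

# Local stand-in for the documented alias Literal['wav', 'mp3'].
AudioFormat = Literal["wav", "mp3"]


def check_output_format(fmt: str) -> str:
    """Hypothetical helper: reject anything other than the documented formats."""
    allowed = get_args(AudioFormat)  # the literal values, in declaration order
    if fmt not in allowed:
        raise ValueError(f"output_format must be one of {allowed}, got {fmt!r}")
    return fmt
```

Validating user input this way before constructing the converter surfaces a bad format early, independent of how the library itself reacts to it.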

async convert_async(*, prompt: str, input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'error'] = 'text') → ConverterResult

Converts the given text prompt into its audio representation.

Parameters:
  • prompt (str) – The text prompt to be converted into audio.

  • input_type (PromptDataType) – The type of input data.

Returns:

The result containing the audio file path.

Return type:

ConverterResult

Raises:
  • ModuleNotFoundError – If the azure.cognitiveservices.speech module is not installed.

  • RuntimeError – If there is an error during the speech synthesis process.

  • ValueError – If the input type is not supported or if the prompt is empty.
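A minimal sketch of calling convert_async from application code follows. It assumes the returned ConverterResult exposes the audio file path through an output_text attribute; this page only states that the result contains the path, so treat that attribute name as an assumption. text_to_audio_path is a hypothetical wrapper that accepts any object with a compatible convert_async method.

```python
import asyncio


async def text_to_audio_path(converter, text: str) -> str:
    """Hypothetical wrapper: convert text and return the generated file path."""
    # The converter documents a ValueError for empty prompts; fail early here too.
    if not text.strip():
        raise ValueError("prompt must not be empty")
    result = await converter.convert_async(prompt=text, input_type="text")
    # Assumption: the audio file path is stored on result.output_text.
    return result.output_text
```

With a configured converter instance, asyncio.run(text_to_audio_path(converter, "Hello")) would run the conversion. Callers may also want to catch RuntimeError, which is documented above for synthesis failures.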

input_supported(input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'error']) → bool

Checks if the input type is supported by the converter.

Parameters:

input_type (PromptDataType) – The input type to check.

Returns:

True if the input type is supported, False otherwise.

Return type:

bool

output_supported(output_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'error']) → bool

Checks if the output type is supported by the converter.

Parameters:

output_type (PromptDataType) – The output type to check.

Returns:

True if the output type is supported, False otherwise.

Return type:

bool
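The two checks can be combined to confirm that a converter fits a text-to-audio step before it is used. The sketch assumes this converter accepts 'text' and produces 'audio_path', which is consistent with the convert_async description but not stated explicitly on this page. supports_text_to_audio is a hypothetical helper.

```python
def supports_text_to_audio(converter) -> bool:
    """Hypothetical helper: True if the converter maps text to an audio file path."""
    return bool(
        converter.input_supported("text")
        and converter.output_supported("audio_path")
    )
```

The helper only relies on the two documented methods, so it works with any converter that exposes them.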