pyrit.prompt_converter.AzureSpeechAudioToTextConverter
- class AzureSpeechAudioToTextConverter(azure_speech_region: str | None = None, azure_speech_key: str | None = None, recognition_language: str = 'en-US')
Bases: PromptConverter
Transcribes a .wav audio file into text using Azure AI Speech service.
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-to-text
- __init__(azure_speech_region: str | None = None, azure_speech_key: str | None = None, recognition_language: str = 'en-US') -> None
Initializes the converter with Azure Speech service credentials and recognition language.
- Parameters:
azure_speech_region (str, Optional) – The name of the Azure region.
azure_speech_key (str, Optional) – The API key for accessing the service.
recognition_language (str) – The language to recognize, given as a locale code. Defaults to “en-US”. For the list of supported languages, see https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support
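A minimal construction sketch. It assumes the region and key can be read from the AZURE_SPEECH_REGION and AZURE_SPEECH_KEY environment variables named by the class constants; whether the constructor itself falls back to those variables when passed None is an assumption, not something stated above. The helper names `speech_kwargs_from_env` and `build_converter` are hypothetical, and the PyRIT import is deferred so the first helper can be exercised without the package installed.

```python
import os
from typing import Mapping, Optional


def speech_kwargs_from_env(
    env: Mapping[str, str],
    recognition_language: str = "en-US",
) -> dict:
    """Collect constructor keyword arguments from an environment mapping.

    Variables that are not set are passed through as None, which matches
    the documented default of both credential parameters.
    """
    region: Optional[str] = env.get("AZURE_SPEECH_REGION")
    key: Optional[str] = env.get("AZURE_SPEECH_KEY")
    return {
        "azure_speech_region": region,
        "azure_speech_key": key,
        "recognition_language": recognition_language,
    }


def build_converter(recognition_language: str = "en-US"):
    """Create the converter (hypothetical helper; requires PyRIT)."""
    # Imported lazily so this module still loads when PyRIT is absent.
    from pyrit.prompt_converter import AzureSpeechAudioToTextConverter

    return AzureSpeechAudioToTextConverter(
        **speech_kwargs_from_env(os.environ, recognition_language)
    )
```

Calling `build_converter("fr-FR")` would then request French (France) recognition, provided that locale appears in the supported-languages list linked above.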
Methods
__init__([azure_speech_region, ...]) – Initializes the converter with Azure Speech service credentials and recognition language.
convert_async(*, prompt[, input_type]) – Converts the given audio file into its text representation.
convert_tokens_async(*, prompt[, ...]) – Converts substrings within a prompt that are enclosed by specified start and end tokens.
get_identifier() – Returns an identifier dictionary for the converter.
input_supported(input_type) – Checks if the input type is supported by the converter.
output_supported(output_type) – Checks if the output type is supported by the converter.
recognize_audio(audio_bytes) – Recognizes audio file and returns transcribed text.
stop_cb(evt, recognizer) – Callback function that stops continuous recognition upon receiving an event 'evt'.
transcript_cb(evt, transcript) – Callback function that appends transcribed text upon receiving a "recognized" event.
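Attributes
AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE – Name of the environment variable that holds the API key for accessing the service.
AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE – Name of the environment variable that holds the name of the Azure region.
supported_input_types – Returns a list of supported input types for the converter.
supported_output_types – Returns a list of supported output types for the converter.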
- AZURE_SPEECH_KEY_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_KEY'
Name of the environment variable that holds the API key for accessing the service.
- AZURE_SPEECH_REGION_ENVIRONMENT_VARIABLE: str = 'AZURE_SPEECH_REGION'
Name of the environment variable that holds the name of the Azure region.
- async convert_async(*, prompt: str, input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error'] = 'audio_path') -> ConverterResult
Converts the given audio file into its text representation.
- Parameters:
prompt (str) – File path to the audio file to be transcribed.
input_type (PromptDataType) – The type of input data.
- Returns:
The result containing the transcribed text.
- Return type:
ConverterResult
- Raises:
ValueError – If the input type is not supported or if the provided file is not a .wav file.
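The call pattern described above can be sketched as follows. The helper names `transcribe_wav` and `transcribe_wav_blocking` are hypothetical, and the sketch assumes that ConverterResult exposes the transcription as an `output_text` attribute. The `.wav` pre-check is a deliberately conservative stand-in for the documented ValueError, not a copy of the library's own validation.

```python
import asyncio


async def transcribe_wav(converter, audio_path: str) -> str:
    """Transcribe a .wav file with any object offering convert_async."""
    # Conservative pre-check: reject anything that does not end in
    # lowercase ".wav" before contacting the service.
    if not audio_path.endswith(".wav"):
        raise ValueError(f"Expected a .wav file, got: {audio_path}")

    result = await converter.convert_async(
        prompt=audio_path,
        input_type="audio_path",
    )
    # Assumption: the result object carries the text in `output_text`.
    return result.output_text


def transcribe_wav_blocking(converter, audio_path: str) -> str:
    """Run the coroutine from synchronous code."""
    return asyncio.run(transcribe_wav(converter, audio_path))
```

With a configured AzureSpeechAudioToTextConverter instance, `transcribe_wav_blocking(converter, "sample.wav")` would return the transcript as a string. Depending on the PyRIT version, additional setup may be needed before converters can read files; that step is not shown here.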
- input_supported(input_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error']) -> bool
Checks if the input type is supported by the converter.
- Parameters:
input_type (PromptDataType) – The input type to check.
- Returns:
True if the input type is supported, False otherwise.
- Return type:
bool
- output_supported(output_type: Literal['text', 'image_path', 'audio_path', 'video_path', 'url', 'reasoning', 'error']) -> bool
Checks if the output type is supported by the converter.
- Parameters:
output_type (PromptDataType) – The output type to check.
- Returns:
True if the output type is supported, False otherwise.
- Return type:
bool
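The two checks above can be combined into a guard before a converter is placed in a pipeline. This is a small sketch with a hypothetical helper name, `supports_audio_to_text`. That this converter accepts 'audio_path' and produces 'text' is inferred from its description and is not listed explicitly on this page.

```python
def supports_audio_to_text(converter) -> bool:
    """Return True if the converter takes audio paths and emits text."""
    # Both methods are documented to return a plain bool.
    return converter.input_supported("audio_path") and converter.output_supported(
        "text"
    )
```

A caller could skip or replace a converter for which this returns False rather than waiting for convert_async to raise ValueError.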
- recognize_audio(audio_bytes: bytes) -> str
Recognizes speech in the given audio bytes and returns the transcribed text.
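To exercise this method without a file on disk, audio bytes can be built in memory. This is a rough sketch: `make_silent_wav` and `transcribe_bytes` are hypothetical helpers, and it assumes the method accepts the full contents of a .wav file (header included), as the file-based flow of convert_async suggests. Silent audio is only useful for checking the call path, not the transcription quality.

```python
import io
import wave


def make_silent_wav(seconds: float = 0.5, sample_rate: int = 16000) -> bytes:
    """Build a mono, 16-bit PCM .wav file of silence in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono
        wav_file.setsampwidth(2)       # 2 bytes per sample (16-bit)
        wav_file.setframerate(sample_rate)
        # Each frame is one 16-bit zero sample.
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


def transcribe_bytes(converter, audio_bytes: bytes) -> str:
    """Pass raw .wav bytes to the converter's synchronous method."""
    return converter.recognize_audio(audio_bytes)
```

With a configured converter, `transcribe_bytes(converter, make_silent_wav())` would send half a second of silence to the service. Real recordings can be loaded with `open(path, "rb").read()` instead.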