pyrit.prompt_target.RealtimeTarget
- class RealtimeTarget(*, voice: Literal['alloy', 'echo', 'shimmer'] | None = None, existing_convo: dict | None = None, **kwargs)
Bases: OpenAITarget
A prompt target for the Azure OpenAI Realtime API.
This class enables real-time audio communication with OpenAI models, supporting voice input and output with configurable voice options.
Read more at https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference and https://platform.openai.com/docs/guides/realtime-websocket
- __init__(*, voice: Literal['alloy', 'echo', 'shimmer'] | None = None, existing_convo: dict | None = None, **kwargs) -> None
Initialize the Realtime target with specified parameters.
- Parameters:
model_name (str, Optional) – The name of the model (or deployment name in Azure). If no value is provided, the OPENAI_REALTIME_MODEL environment variable will be used.
endpoint (str, Optional) – The target URL for the OpenAI service. Defaults to the OPENAI_REALTIME_ENDPOINT environment variable.
api_key (str | Callable[[], str], Optional) – The API key for accessing the OpenAI service, or a callable that returns an access token. For Azure endpoints with Entra authentication, pass a token provider from pyrit.auth (e.g., get_azure_openai_auth(endpoint)). Defaults to the OPENAI_REALTIME_API_KEY environment variable.
headers (str, Optional) – Headers of the endpoint (JSON).
max_requests_per_minute (int, Optional) – Number of requests the target can handle per minute before hitting a rate limit. The number of requests sent to the target will be capped at the value provided.
voice (literal str, Optional) – The voice to use. Defaults to None. The only voices supported by the Azure OpenAI Realtime API are “alloy”, “echo”, and “shimmer”.
existing_convo (dict[str, websockets.WebSocketClientProtocol], Optional) – Existing conversations.
**kwargs – Additional keyword arguments passed to the parent OpenAITarget class.
httpx_client_kwargs (dict, Optional) – Additional kwargs to be passed to the httpx.AsyncClient() constructor. For example, to specify a 3 minute timeout: httpx_client_kwargs={"timeout": 180}
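The parameter list above describes two resolution rules: a setting falls back to an environment variable when no value is passed, and api_key may be either a string or a zero-argument callable that returns a token. The following is a minimal sketch of those two rules using only the standard library. The helper names resolve_setting and resolve_api_key are hypothetical and are not part of PyRIT; how RealtimeTarget resolves these values internally is not shown in this reference.

```python
import os
from typing import Callable, Optional, Union

# Mirrors the documented api_key type: a static key or a token provider.
ApiKey = Union[str, Callable[[], str]]


def resolve_setting(explicit: Optional[str], env_var: str) -> str:
    """Return the explicit value if given, otherwise the environment variable.

    Hypothetical helper: illustrates the documented fallback only.
    """
    if explicit:
        return explicit
    value = os.environ.get(env_var)
    if not value:
        raise ValueError(f"Provide a value or set the {env_var} environment variable.")
    return value


def resolve_api_key(api_key: Optional[ApiKey], env_var: str = "OPENAI_REALTIME_API_KEY") -> str:
    """Accept a static key, a zero-argument token provider, or fall back to the environment.

    Hypothetical helper: a callable is invoked each time so it can return a fresh token.
    """
    if callable(api_key):
        return api_key()
    return resolve_setting(api_key, env_var)


if __name__ == "__main__":
    # Placeholder values, used only to exercise the helpers.
    os.environ["OPENAI_REALTIME_MODEL"] = "example-deployment"
    print(resolve_setting(None, "OPENAI_REALTIME_MODEL"))
    print(resolve_api_key(lambda: "token-from-provider"))
```

Calling the provider on each resolution, rather than caching its result, is one way to keep short-lived access tokens current; whether PyRIT does the same is an assumption.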
Methods
__init__(*[, voice, existing_convo]) – Initialize the Realtime target with specified parameters.
cleanup_conversation(conversation_id) – Disconnects from the Realtime API for a specific conversation.
Disconnects from the Realtime API connections.
connect(conversation_id) – Connect to Realtime API using AsyncOpenAI client and return the realtime connection.
dispose_db_engine() – Dispose database engine to release database connections and resources.
get_identifier() – Get an identifier dictionary for this prompt target.
is_json_response_supported() – Check if the target supports JSON as a response format.
is_response_format_json(message_piece) – Check if the response format is JSON and ensure the target supports it.
receive_events(conversation_id) – Continuously receive events from the OpenAI Realtime API connection.
save_audio(audio_bytes[, num_channels, ...]) – Save audio bytes to a WAV file.
send_audio_async(filename, conversation_id) – Send an audio message using OpenAI Realtime API client.
send_config(conversation_id) – Send the session configuration using OpenAI client.
send_prompt_async(**kwargs) – Send a normalized prompt async to the prompt target.
send_response_create(conversation_id) – Send response.create using OpenAI client.
send_text_async(text, conversation_id) – Send text prompt using OpenAI Realtime API client.
set_model_name(*, model_name) – Set the model name for this target.
set_system_prompt(*, system_prompt, ...[, ...]) – Set the system prompt for the prompt target.
Attributes
ADDITIONAL_REQUEST_HEADERS
A list of PromptConverters that are supported by the prompt target.
- async cleanup_conversation(conversation_id: str)
Disconnects from the Realtime API for a specific conversation.
- Parameters:
conversation_id (str) – The conversation ID to disconnect from.
- async connect(conversation_id: str)
Connect to Realtime API using AsyncOpenAI client and return the realtime connection.
- Returns:
The Realtime API connection.
- is_json_response_supported() -> bool
Check if the target supports JSON as a response format.
- Returns:
True if JSON response is supported, False otherwise.
- Return type:
bool
- async receive_events(conversation_id: str) -> RealtimeTargetResult
Continuously receive events from the OpenAI Realtime API connection.
Uses a robust “soft-finish” strategy to handle cases where response.done may not arrive. After receiving audio.done, waits for a grace period before soft-finishing if no response.done arrives.
- Parameters:
conversation_id (str) – Conversation ID
- Returns:
RealtimeTargetResult with audio data and transcripts
- Raises:
asyncio.TimeoutError – If waiting for events times out.
ConnectionError – If the connection is not valid.
RuntimeError – If the server returns an error.
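The “soft-finish” strategy described for receive_events can be sketched with a plain asyncio queue standing in for the Realtime connection. This is an illustration under stated assumptions, not PyRIT's implementation: the function name collect_until_done, the grace_period and idle_timeout parameters, and the dictionary event shape are all hypothetical, and the event type strings reuse the shorthand from the description above rather than exact API event names.

```python
import asyncio
from typing import List, Optional


async def collect_until_done(
    events: asyncio.Queue,
    grace_period: float = 0.2,
    idle_timeout: float = 1.0,
) -> List[str]:
    """Read events until response.done, or soft-finish after audio.done.

    Hypothetical sketch: `events` yields dicts with a "type" key.
    """
    seen: List[str] = []
    loop = asyncio.get_running_loop()
    soft_deadline: Optional[float] = None  # set once audio.done has arrived

    while True:
        if soft_deadline is None:
            timeout = idle_timeout
        else:
            # Only wait for what is left of the grace period.
            timeout = max(0.0, soft_deadline - loop.time())
        try:
            event = await asyncio.wait_for(events.get(), timeout=timeout)
        except asyncio.TimeoutError:
            if soft_deadline is not None:
                # Audio finished and response.done never came: soft finish.
                return seen
            raise  # no audio.done yet, so this is a genuine timeout

        event_type = event["type"]
        seen.append(event_type)
        if event_type == "response.done":
            return seen
        if event_type == "error":
            raise RuntimeError(f"Server returned an error: {event}")
        if event_type == "audio.done" and soft_deadline is None:
            soft_deadline = loop.time() + grace_period


async def _demo() -> List[str]:
    queue: asyncio.Queue = asyncio.Queue()
    for name in ("audio.delta", "audio.done"):  # response.done is never sent
        queue.put_nowait({"type": name})
    return await collect_until_done(queue, grace_period=0.05)


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

Tracking a deadline instead of restarting a fixed timer on every event keeps the grace period bounded even if unrelated events keep arriving after audio.done.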
- async save_audio(audio_bytes: bytes, num_channels: int = 1, sample_width: int = 2, sample_rate: int = 16000, output_filename: str | None = None) -> str
Save audio bytes to a WAV file.
- Parameters:
audio_bytes (bytes) – Audio bytes to save.
num_channels (int) – Number of audio channels. Defaults to 1 for the PCM16 format.
sample_width (int) – Sample width in bytes. Defaults to 2 for the PCM16 format.
sample_rate (int) – Sample rate in Hz. Defaults to 16000 Hz for the PCM16 format.
output_filename (str, Optional) – Output filename. If None, a UUID filename will be used.
- Returns:
The path to the saved audio file.
- Return type:
str
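To show what the save_audio parameters mean in practice, here is a minimal synchronous sketch that writes raw PCM16 bytes to a WAV file with the standard-library wave module, using the same defaults as the signature above. The function name write_pcm16_wav is hypothetical; the real method is async, and where it stores the file is not described in this reference.

```python
import uuid
import wave
from typing import Optional


def write_pcm16_wav(
    audio_bytes: bytes,
    num_channels: int = 1,
    sample_width: int = 2,
    sample_rate: int = 16000,
    output_filename: Optional[str] = None,
) -> str:
    """Write raw PCM bytes to a WAV file and return its path.

    Hypothetical helper mirroring the documented defaults.
    """
    # Fall back to a UUID-based name when no filename is given.
    path = output_filename or f"{uuid.uuid4()}.wav"
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(num_channels)  # 1 = mono
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)   # samples per second
        wav_file.writeframes(audio_bytes)
    return path
```

With these defaults, one second of audio is 16000 frames of 2 bytes each, so 32000 bytes of input.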
- async send_audio_async(filename: str, conversation_id: str) -> Tuple[str, RealtimeTargetResult]
Send an audio message using OpenAI Realtime API client.
- Parameters:
filename (str) – The path to the audio file to send.
conversation_id (str) – Conversation ID
- Returns:
Path to saved audio file and the RealtimeTargetResult
- Return type:
Tuple[str, RealtimeTargetResult]
- Raises:
Exception – If sending audio fails.
RuntimeError – If no audio is received from the server.
- async send_config(conversation_id: str)
Send the session configuration using OpenAI client.
- Parameters:
conversation_id (str) – Conversation ID
- async send_prompt_async(**kwargs)
Send a normalized prompt async to the prompt target.
- async send_response_create(conversation_id: str)
Send response.create using OpenAI client.
- Parameters:
conversation_id (str) – Conversation ID
- async send_text_async(text: str, conversation_id: str) -> Tuple[str, RealtimeTargetResult]
Send text prompt using OpenAI Realtime API client.
- Parameters:
text (str) – The prompt to send.
conversation_id (str) – Conversation ID
- Returns:
Path to saved audio file and the RealtimeTargetResult
- Return type:
Tuple[str, RealtimeTargetResult]
- Raises:
RuntimeError – If no audio is received from the server.