pyrit.prompt_target.RealtimeTarget#
- class RealtimeTarget(*, api_version: str = '2025-04-01-preview', system_prompt: str | None = None, voice: Literal['alloy', 'echo', 'shimmer'] | None = None, existing_convo: dict | None = None, **kwargs)[source]#
Bases:
OpenAITarget
- __init__(*, api_version: str = '2025-04-01-preview', system_prompt: str | None = None, voice: Literal['alloy', 'echo', 'shimmer'] | None = None, existing_convo: dict | None = None, **kwargs) None [source]#
RealtimeTarget class for Azure OpenAI Realtime API.
Read more at https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference and https://platform.openai.com/docs/guides/realtime-websocket
- Parameters:
model_name (str, Optional) – The name of the model.
endpoint (str, Optional) – The target URL for the OpenAI service.
api_key (str, Optional) – The API key for accessing the Azure OpenAI service. Defaults to the OPENAI_CHAT_KEY environment variable.
headers (str, Optional) – Headers of the endpoint (JSON).
use_aad_auth (bool, Optional) – When set to True, user authentication is used instead of API Key. DefaultAzureCredential is taken for https://cognitiveservices.azure.com/.default . Please run az login locally to leverage user AuthN.
api_version (str, Optional) – The version of the Azure OpenAI API. Defaults to “2024-06-01”.
max_requests_per_minute (int, Optional) – Number of requests the target can handle per minute before hitting a rate limit. The number of requests sent to the target will be capped at the value provided.
api_version – The version of the Azure OpenAI API. Defaults to “2024-10-01-preview”.
system_prompt (str, Optional) – The system prompt to use. Defaults to “You are a helpful AI assistant”.
voice (literal str, Optional) – The voice to use. Defaults to None. the only supported voices by the AzureOpenAI Realtime API are “alloy”, “echo”, and “shimmer”.
existing_convo (dict[str, websockets.WebSocketClientProtocol], Optional) – Existing conversations.
httpx_client_kwargs (dict, Optional) – Additional kwargs to be passed to the httpx.AsyncClient() constructor. For example, to specify a 3 minutes timeout: httpx_client_kwargs={“timeout”: 180}
Methods
__init__
(*[, api_version, system_prompt, ...])RealtimeTarget class for Azure OpenAI Realtime API.
cleanup_conversation
(conversation_id)Disconnects from the WebSocket server for a specific conversation
Disconnects from the WebSocket server to clean up, cleaning up all existing conversations.
connect
()Connects to Realtime API Target using websockets.
dispose_db_engine
()Dispose DuckDB database engine to release database connections and resources.
get_identifier
()Indicates that this target supports JSON response format.
is_response_format_json
(request_piece)Checks if the response format is JSON and ensures the target supports it.
receive_events
(conversation_id)Continuously receive events from the WebSocket server.
refresh_auth_headers
()Refresh the authentication headers.
save_audio
(audio_bytes[, num_channels, ...])Saves audio bytes to a WAV file.
send_audio_async
(filename, conversation_id)Send an audio message to the WebSocket server.
send_config
(conversation_id)Sends the session configuration to the WebSocket server.
send_event
(event, conversation_id)Sends an event to the WebSocket server.
send_prompt_async
(**kwargs)Sends a normalized prompt async to the prompt target.
send_response_create
(conversation_id)Sends response.create message to the WebSocket server.
send_text_async
(text, conversation_id)Sends text prompt to the WebSocket server.
set_system_prompt
(*, system_prompt, ...[, ...])Sets the system prompt for the prompt target.
Attributes
ADDITIONAL_REQUEST_HEADERS
- async cleanup_conversation(conversation_id: str)[source]#
Disconnects from the WebSocket server for a specific conversation
- Parameters:
conversation_id (str) – The conversation ID to disconnect from.
- async cleanup_target()[source]#
Disconnects from the WebSocket server to clean up, cleaning up all existing conversations.
- async connect()[source]#
Connects to Realtime API Target using websockets. Returns the WebSocket connection.
- is_json_response_supported() bool [source]#
Indicates that this target supports JSON response format.
- async receive_events(conversation_id: str) RealtimeTargetResult [source]#
Continuously receive events from the WebSocket server.
- Parameters:
conversation_id – conversation ID
- Returns:
Collection of conversation messages with audio data at index 0 (bytes) and transcript at index 1 (str) if available
- Return type:
- Raises:
ConnectionError – If WebSocket connection is not valid
ValueError – If received event doesn’t match expected structure
- async save_audio(audio_bytes: bytes, num_channels: int = 1, sample_width: int = 2, sample_rate: int = 16000, output_filename: str | None = None) str [source]#
Saves audio bytes to a WAV file.
- Parameters:
audio_bytes (bytes) – Audio bytes to save.
num_channels (int) – Number of audio channels. Defaults to 1 for the PCM16 format
sample_width (int) – Sample width in bytes. Defaults to 2 for the PCM16 format
sample_rate (int) – Sample rate in Hz. Defaults to 16000 Hz for the PCM16 format
output_filename (str) – Output filename. If None, a UUID filename will be used.
- Returns:
The path to the saved audio file.
- Return type:
- async send_audio_async(filename: str, conversation_id: str) Tuple[str, RealtimeTargetResult] [source]#
Send an audio message to the WebSocket server.
- Parameters:
filename (str) – The path to the audio file.
- async send_config(conversation_id: str)[source]#
Sends the session configuration to the WebSocket server.
- Parameters:
conversation_id (str) – Conversation ID
- async send_event(event: dict, conversation_id: str)[source]#
Sends an event to the WebSocket server.
- async send_prompt_async(**kwargs)#
Sends a normalized prompt async to the prompt target.
- async send_response_create(conversation_id: str)[source]#
Sends response.create message to the WebSocket server.
- Parameters:
conversation_id (str) – Conversation ID