Integration Modes¶
The RAI platform provides two primary integration modes, each designed to address specific customer requirements. Below is a detailed comparison of the AOAI Protocol Alignment Mode and the Standard Protocol Mode.
AOAI Protocol Alignment Mode¶
Customers using this mode interact directly with the AOAI protocol. They forward the user's request and AOAI's response to the RAI platform through a gRPC stream without requiring modifications. The RAI platform focuses exclusively on analyzing conversation content and interactions. API Reference
Example Workflow¶
Chat API Request Payload: Chat API Documentation
{
"model": "gpt-4o",
"messages": [
{
"role": "developer",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}
Request sent to RAI:
{
"aoaiRawBuffer": {
"apiName": "Chatcompletion",
"sourceType": "PROMPT",
"payload": "{\"model\":\"gpt-4o\",\"messages\":[{\"role\":\"developer\",\"content\":\"You are a helpful assistant.\"},{\"role\":\"user\",\"content\":\"Hello!\"}]}"
}
}
Completion Response from LLM Model:
{
"id": "chatcmpl-123456",
"object": "chat.completion",
"created": 1728933352,
"model": "gpt-4o-2024-08-06",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hi there! How can I assist you today?",
"refusal": null
}
}
]
}
Response sent to RAI:
{
"aoaiRawBuffer": {
"apiName": "Chatcompletion",
"sourceType": "Completion",
"payload": "{\"id\":\"chatcmpl-123456\",\"object\":\"chat.completion\",\"created\":1728933352,\"model\":\"gpt-4o-2024-08-06\",\"choices\":[{\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"Hi there! How can I assist you today?\",\"refusal\":null}}]}"
}
}
For streaming mode, individual chunks can also be sent:
{
"aoaiRawBuffer": {
"apiName": "Chatcompletion",
"sourceType": "Completion",
"payload": "{\"id\":\"chatcmpl123\",\"object\":\"chat.completion.chunk\",\"created\":1694268190,\"model\":\"gpt-4o-mini\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"Hi there!\"}}]}"
}
}
Benefits:
* **Simplicity**: Minimal processing required by the customer.
* **Efficiency**: Real-time streaming ensures low latency.
* **Bandwidth Optimization**: Optional exclusion of unnecessary details.
* **Compatibility**: Seamless integration for AOAI customers.
Standard Protocol Mode¶
Customers collect and structure user-AI interactions into a messages
payload following a specified protocol. The formatted payload is sent to the RAI platform’s standard API endpoint for processing. Streaming API Reference | HTTP API Reference
Example HTTP Payload¶
{
"RAIPolicy": "policy name",
"messages": [
{
"role": "SYSTEM",
"sourceType": "PROMPT",
"contents": [
{
"modality": "TEXT",
"text": "You are an AI assistant that helps people find information."
}
]
},
{
"role": "USER",
"sourceType": "PROMPT",
"contents": [
{
"modality": "TEXT",
"text": "I'm very concerned about the homeless in our city."
},
{
"modality": "IMAGE",
"imageBase64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/wcAAwAB/epH1FUAAAAASUVORK5CYII="
}
]
}
]
}
Example gRPC Payload¶
{
"buffer": {
"messages": [
{
"messageId": "0",
"sourceType": "PROMPT",
"role": "USER",
"contents": [
{
"contentIndex": 0,
"modality": "TEXT",
"text": "This is a sample text message for analysis."
},
{
"contentIndex": 1,
"modality": "IMAGE",
"imageBase64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/wcAAwAB/epH1FUAAAAASUVORK5CYII="
}
]
},
{
"messageId": "1",
"sourceType": "COMPLETION",
"role": "ASSISTANT",
"contents": [
{
"contentIndex": 0,
"modality": "TEXT",
"text": "Here is my response to your query."
}
]
}
]
}
}
Source Type¶
Defining sourceType
is crucial because both prompt and completion are inputs for RAI, making it difficult to determine the origin of a message. Using role
for this purpose is unreliable, as malicious users could manipulate the prompt by injecting historical "assistant" messages to deceive the LLM. The sourceType
provides a tamper-resistant mechanism to reliably distinguish between message origins, ensuring the integrity and accuracy of RAI operations.
Message ID and Content Index (gRPC)¶
To meet the requirements of robust content tracking, RAI must support two scenarios:
- Parallel Multi-Choice Streaming: Handling multiple choices being streamed simultaneously.
- Real-Time Multi-Turn Conversations: Supporting interactive, multi-turn exchanges in the Real-Time API.
Introducing messageId
and contentIndex
provides precise mechanisms for locating content to be appended, as well as for returning accurate watermarking and analysis results. Both messageId
and contentIndex
are scoped to the sourceType
for clarity and consistency.
Completion API and Chat API Scenarios¶
- The prompt is always collected as a whole payload, and the
messageId
andcontentIndex
are scoped bysourceType = Prompt
. - For Completion API scenario, all the prompt text will be concatenated into one single string. If not specified, the
messageId
will be assigned0
andcontentIndex
is0
. - For Chat API, if not specified, the
messageId
will be the string value of the index in themessages
array, andcontentIndex
will be the index of the content array in a message. - For completion, the index of a choice can be used as the
messageId
. The array index of content in one choice serves as thecontentIndex
.
Real-Time API Multi-Turn Conversations¶
- For user inputs,
item_id
can be used as themessageId
. - For model responses,
response_id
combined withitem_id
can define themessageId
. - Content within a message can use its natural indices as the
contentIndex
.
This approach ensures precise identification and handling of content across different scenarios.
Benefits¶
- Flexibility: Allows customers to integrate with the RAI platform while maintaining their custom workflows and data structures.
- Standardization: Ensures compatibility with diverse systems through a well-defined protocol.
- Customizability: Enables customers to enrich payloads with additional context or metadata to improve analysis.
Comparison of Integration Modes¶
Aspect | AOAI Protocol Alignment Mode | Standard Protocol Mode |
---|---|---|
Ease of Integration | High – No transformation | Moderate – Requires payload structuring |
Latency | Low – Real-time processing | Low – Real-time processing |
Flexibility | Low – Fixed protocol | High – Customizable payloads |
Target Audience | AOAI Service Team | All |