Integration Modes¶

The RAI platform provides two primary integration modes, each designed to address specific customer requirements. Below is a detailed comparison of the AOAI Protocol Alignment Mode and the Standard Protocol Mode.

AOAI Protocol Alignment Mode¶

Customers using this mode interact directly with the AOAI protocol. They forward the user's request and AOAI's response to the RAI platform through a gRPC stream without requiring modifications. The RAI platform focuses exclusively on analyzing conversation content and interactions. API Reference

Example Workflow¶

Chat API Request Payload: Chat API Documentation

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "developer",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Request sent to RAI:

{
  "aoaiRawBuffer": {
    "apiName": "Chatcompletion",
    "sourceType": "PROMPT",
    "payload": "{\"model\":\"gpt-4o\",\"messages\":[{\"role\":\"developer\",\"content\":\"You are a helpful assistant.\"},{\"role\":\"user\",\"content\":\"Hello!\"}]}"
  }
}

Completion Response from LLM Model:

{
  "id": "chatcmpl-123456",
  "object": "chat.completion",
  "created": 1728933352,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hi there! How can I assist you today?",
        "refusal": null
      }
    }
  ]
}

Response sent to RAI:

{
  "aoaiRawBuffer": {
    "apiName": "Chatcompletion",
    "sourceType": "Completion",
    "payload": "{\"id\":\"chatcmpl-123456\",\"object\":\"chat.completion\",\"created\":1728933352,\"model\":\"gpt-4o-2024-08-06\",\"choices\":[{\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"Hi there! How can I assist you today?\",\"refusal\":null}}]}"
  }
}

For streaming mode, individual chunks can also be sent:

{
  "aoaiRawBuffer": {
    "apiName": "Chatcompletion",
    "sourceType": "Completion",
    "payload": "{\"id\":\"chatcmpl123\",\"object\":\"chat.completion.chunk\",\"created\":1694268190,\"model\":\"gpt-4o-mini\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"Hi there!\"}}]}"
  }
}

Benefits:

* **Simplicity**: Minimal processing required by the customer.
* **Efficiency**: Real-time streaming ensures low latency.
* **Bandwidth Optimization**: Optional exclusion of unnecessary details.
* **Compatibility**: Seamless integration for AOAI customers.

Standard Protocol Mode¶

Customers collect and structure user-AI interactions into a messages payload following a specified protocol. The formatted payload is sent to the RAI platform’s standard API endpoint for processing. Streaming API Reference | HTTP API Reference

Example HTTP Payload¶

{
  "RAIPolicy": "policy name",
  "messages": [
    {
      "role": "SYSTEM",
      "sourceType": "PROMPT",
      "contents": [
        {
          "modality": "TEXT",
          "text": "You are an AI assistant that helps people find information."
        }
      ]
    },
    {
      "role": "USER",
      "sourceType": "PROMPT",
      "contents": [
        {
          "modality": "TEXT",
          "text": "I'm very concerned about the homeless in our city."
        },
        {
          "modality": "IMAGE",
          "imageBase64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/wcAAwAB/epH1FUAAAAASUVORK5CYII=" 
        }
      ]
    }
  ]
}

Example gRPC Payload¶

{
  "buffer": {
    "messages": [
      {
        "messageId": "0",
        "sourceType": "PROMPT",
        "role": "USER",
        "contents": [
          {
            "contentIndex": 0,
            "modality": "TEXT",
            "text": "This is a sample text message for analysis."
          },
          {
            "contentIndex": 1,
            "modality": "IMAGE",
            "imageBase64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/wcAAwAB/epH1FUAAAAASUVORK5CYII="
          }
        ]
      },
      {
        "messageId": "1",
        "sourceType": "COMPLETION",
        "role": "ASSISTANT",
        "contents": [
          {
            "contentIndex": 0,
            "modality": "TEXT",
            "text": "Here is my response to your query."
          }
        ]
      }
    ]
  }
}

Source Type¶

Defining sourceType is crucial because both prompt and completion are inputs for RAI, making it difficult to determine the origin of a message. Using role for this purpose is unreliable, as malicious users could manipulate the prompt by injecting historical "assistant" messages to deceive the LLM. The sourceType provides a tamper-resistant mechanism to reliably distinguish between message origins, ensuring the integrity and accuracy of RAI operations.

Message ID and Content Index (gRPC)¶

To meet the requirements of robust content tracking, RAI must support two scenarios:

Parallel Multi-Choice Streaming: Handling multiple choices being streamed simultaneously.
Real-Time Multi-Turn Conversations: Supporting interactive, multi-turn exchanges in the Real-Time API.

Introducing messageId and contentIndex provides precise mechanisms for locating content to be appended, as well as for returning accurate watermarking and analysis results. Both messageId and contentIndex are scoped to the sourceType for clarity and consistency.

Completion API and Chat API Scenarios¶

The prompt is always collected as a whole payload, and the messageId and contentIndex are scoped by sourceType = Prompt.
For Completion API scenario, all the prompt text will be concatenated into one single string. If not specified, the messageId will be assigned 0 and contentIndex is 0.
For Chat API, if not specified, the messageId will be the string value of the index in the messages array, and contentIndex will be the index of the content array in a message.
For completion, the index of a choice can be used as the messageId. The array index of content in one choice serves as the contentIndex.

Real-Time API Multi-Turn Conversations¶

For user inputs, item_id can be used as the messageId.
For model responses, response_id combined with item_id can define the messageId.
Content within a message can use its natural indices as the contentIndex.

This approach ensures precise identification and handling of content across different scenarios.

Benefits¶

Flexibility: Allows customers to integrate with the RAI platform while maintaining their custom workflows and data structures.
Standardization: Ensures compatibility with diverse systems through a well-defined protocol.
Customizability: Enables customers to enrich payloads with additional context or metadata to improve analysis.

Comparison of Integration Modes¶

Aspect	AOAI Protocol Alignment Mode	Standard Protocol Mode
Ease of Integration	High – No transformation	Moderate – Requires payload structuring
Latency	Low – Real-time processing	Low – Real-time processing
Flexibility	Low – Fixed protocol	High – Customizable payloads
Target Audience	AOAI Service Team	All