
RAI Policy Management API Reference

Responsible AI (RAI) Policy Management is a core feature of the RAI platform, enabling customers to define, customize, and enforce policies for analyzing harmful content. These policies streamline the process of identifying harmful content and managing its impact effectively, while offering flexibility to meet diverse customer requirements.

Core Concepts

1. Task

In the RAI platform, a task represents a specific type of harmful-content annotation (for example, identifying sexual content) applied to a defined content scope. A task can optionally include a blocking threshold that determines whether content should be blocked based on its severity.

2. Policy

A policy is an array of tasks defining the actions RAI services should perform. Each task within the policy specifies:

  • Harm Category: The type of harmful content (e.g., HATE, SELF_HARM).
  • Scope: Where and how the task applies (e.g., to all user prompts or assistant responses).
  • Thresholds and Blocking Criteria: Rules for triggering content blocking based on severity or detection criteria.

Prerequisites

Before using this API, create an Azure AI Content Safety resource. For instructions, see Create Azure AI Content Safety.

Authentication

Microsoft Entra ID (AAD token)

Step 1 - Get an access token. If you are using your own account for testing, you can obtain the token with the Azure CLI as shown below.

az account get-access-token --resource https://cognitiveservices.azure.com --query accessToken --output tsv

Step 2 - Assign the Cognitive Services User role to your account. In the Azure portal, navigate to your Content Safety or Azure AI Services resource, select Access control (IAM) in the left navigation, select + Add role assignment, choose the Cognitive Services User role, pick the member (your account) that should receive the role, then review and assign. The assignment may take a few minutes to take effect.

For more details, see Authenticate requests to Azure AI services.
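
If you prefer to acquire the token programmatically, the sketch below shows one option using the azure-identity Python package (an assumption for illustration; the package is not required by the API). The resulting bearer token is sent in the Authorization header of every request in the sections that follow.

from azure.identity import DefaultAzureCredential

# Acquire a Microsoft Entra ID token scoped to Cognitive Services.
# Assumes `pip install azure-identity` and that the signed-in identity has the
# Cognitive Services User role on the Content Safety resource.
credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")

# The bearer token is attached to every request shown in this reference.
auth_headers = {"Authorization": f"Bearer {token.token}"}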

API Reference

This document provides the API reference for managing RAI (Responsible AI) policies, including operations to create, update, retrieve, list, and delete policies.

The following parameter must be included in the URL:

Name Required Type Description
api-version Required String The API version to use. The current version is 2024-12-15-preview. Example: <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview

1. Create or Update a RAI Policy

Endpoint

PATCH <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview

Request Body

{
  "name": "MyPolicy10",
  "taskSettings": [
    {
      "settingId": "task_setting_01",
      "settingEnabled": true,
      "appliedFor": [
        {
          "role": "user",
          "source": "prompt"
        },
        {
          "role": "system",
          "source": "completion"
        }
      ],
      "kind": "harmCategory",
      "harmCategoryTaskSetting": {
        "harmCategory": "hate"
      },
      "blockingCriteria": {
        "enabled": true,
        "kind": "severity",
        "allowedSeverity": 0
      }
    }
  ]
}

Field Descriptions

Field Type Description
name String A unique identifier for the policy.
taskSettings Array An array of objects, where each object defines a specific task for detecting harmful content and its associated blocking criteria. Each task setting includes the following fields:
- settingId String A unique identifier for the task setting.
- settingEnabled Boolean Indicates whether the task setting is active.
- appliedFor Array Specifies the scope of the task, i.e., the roles and sources it applies to.
-- role String The role for which the task is applied. Possible values: USER, ASSISTANT, SYSTEM, TOOL, FUNCTION, ALL (all of the above).
-- source String The source of the content. Possible values: PROMPT (message from the user to the AI), COMPLETION (message from the AI to the user), ALL (all of the above).
- kind String The type of task. Currently, only "harmCategory" is supported for harmful content detection tasks.
- harmCategoryTaskSetting Object Specifies the harm category to monitor through its harmCategory field. Possible values: hate, selfHarm, sexual, violence, codeVulnerability, promptInjection, protectedMaterialCode, protectedMaterialText.
- blockingCriteria Object Criteria to determine whether the content should be blocked.
-- enabled Boolean Indicates if blocking criteria are active.
-- kind String The type of blocking criteria. Possible values: severity, riskLevel, isDetected.
-- allowedSeverity Integer The maximum allowed severity level; content with a higher severity is blocked.
-- allowedRiskLevel String The maximum allowed risk level for blocking.
-- isDetected Boolean Used when kind is isDetected; content is blocked when the harm category is detected.

Response

Status Code: 200 OK / 201 Created
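
For illustration, here is a minimal sketch of sending the request above with Python's requests library. The endpoint and token are placeholders you must supply, and the Content-Type handling is an assumption (some Azure PATCH endpoints expect application/merge-patch+json).

import requests

# Placeholder values -- substitute your resource endpoint and a valid AAD token.
endpoint = "https://<your-resource>.cognitiveservices.azure.com"
token = "<access-token-from-step-1>"

policy_name = "MyPolicy10"
url = f"{endpoint}/contentsafety/raiPolicies/{policy_name}"
params = {"api-version": "2024-12-15-preview"}
headers = {
    "Authorization": f"Bearer {token}",
    # Assumption: plain JSON is accepted; switch to "application/merge-patch+json"
    # if the service rejects the request.
    "Content-Type": "application/json",
}

body = {
    "name": policy_name,
    "taskSettings": [
        {
            "settingId": "task_setting_01",
            "settingEnabled": True,
            "appliedFor": [
                {"role": "user", "source": "prompt"},
                {"role": "system", "source": "completion"},
            ],
            "kind": "harmCategory",
            "harmCategoryTaskSetting": {"harmCategory": "hate"},
            "blockingCriteria": {"enabled": True, "kind": "severity", "allowedSeverity": 0},
        }
    ],
}

response = requests.patch(url, params=params, headers=headers, json=body)
print(response.status_code)  # 200 on update, 201 on create
print(response.json())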


2. Get a RAI Policy by raiPolicyName

Endpoint

GET <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview

Response Body

{
  "name": "MyPolicy10",
  "taskSettings": [
    {
      "settingId": "task_setting_01",
      "settingEnabled": true,
      "appliedFor": [
        {
          "role": "User",
          "source": "Prompt"
        },
        {
          "role": "System",
          "source": "Completion"
        }
      ],
      "kind": "HarmCategory",
      "harmCategoryTaskSetting": {
        "harmCategory": "Hate"
      },
      "blockingCriteria": {
        "enabled": true,
        "kind": "Severity",
        "allowedSeverity": 0
      }
    }
  ]
}

Field Descriptions

The response fields are the same as those described under Create or Update a RAI Policy; refer to that section for field details and a complete request example.
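
A minimal retrieval sketch, reusing the same placeholder endpoint and token as the create example:

import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
headers = {"Authorization": "Bearer <access-token>"}

# Fetch a single policy by name.
response = requests.get(
    f"{endpoint}/contentsafety/raiPolicies/MyPolicy10",
    params={"api-version": "2024-12-15-preview"},
    headers=headers,
)
response.raise_for_status()
policy = response.json()
print(policy["name"], len(policy["taskSettings"]))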


3. List All RAI Policies

Endpoint

GET <endpoint>/contentsafety/raiPolicies?api-version=2024-12-15-preview

Response Body

{
  "values": [
    {
      "name": "MyPolicy10",
      "taskSettings": [
        {
          "settingId": "task_setting_01",
          "settingEnabled": true,
          "appliedFor": [
            {
              "role": "User",
              "source": "Prompt"
            },
            {
              "role": "System",
              "source": "Completion"
            }
          ],
          "kind": "HarmCategory",
          "harmCategoryTaskSetting": {
            "harmCategory": "Hate"
          },
          "blockingCriteria": {
            "enabled": true,
            "kind": "Severity",
            "allowedSeverity": 0
          }
        }
      ]
    }
  ]
}

Field Descriptions

Each item in values has the same structure as a single policy; see Create or Update a RAI Policy for field details.
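
A minimal listing sketch under the same placeholder assumptions:

import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
headers = {"Authorization": "Bearer <access-token>"}

# List all policies on the resource and print their names.
response = requests.get(
    f"{endpoint}/contentsafety/raiPolicies",
    params={"api-version": "2024-12-15-preview"},
    headers=headers,
)
response.raise_for_status()
for policy in response.json().get("values", []):
    print(policy["name"])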

4. Delete a RAI Policy by raiPolicyName

Endpoint

DELETE <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview

Response

Status Code: 204 No Content
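
And a minimal deletion sketch, again with placeholder endpoint and token:

import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
headers = {"Authorization": "Bearer <access-token>"}

# Delete a policy by name; a successful call returns 204 with no body.
response = requests.delete(
    f"{endpoint}/contentsafety/raiPolicies/MyPolicy10",
    params={"api-version": "2024-12-15-preview"},
    headers=headers,
)
print(response.status_code)  # expected: 204 No Content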


This API reference covers all operations needed to manage the lifecycle of RAI policies: creating, updating, retrieving, listing, and deleting them.