RAI Policy Management API Reference¶
Responsible AI (RAI) Policy Management is a core feature of the RAI platform, enabling customers to define, customize, and enforce policies for analyzing harmful content. These policies streamline the process of identifying harmful content and managing its impact effectively, while offering flexibility to meet diverse customer requirements.
Core Concepts¶
1. Task¶
In the RAI platform, a task represents a specific type of harmful content annotation (e.g., identifying sexual content) applied to a defined content scope. Tasks can optionally include a blocking threshold to determine whether the content should be blocked based on its severity. More details
2. Policy¶
A policy is an array of tasks defining the actions RAI services should perform. Each task within the policy specifies:
- Harm Category: The type of harmful content (e.g., HATE, SELF_HARM).
- Scope: Where and how the task applies (e.g., to all user prompts or assistant responses).
- Thresholds and Blocking Criteria: Rules for triggering content blocking based on severity or detection criteria.
Prerequisites¶
An Azure AI Content Safety (or Azure AI Services) resource is required. For more information, see Create Azure AI Content Safety.
Authentication¶
Microsoft Entra ID (AAD token)¶
Step 1 - Get the access token. If you are using your own account for testing, the token can be obtained with the Azure CLI as shown below.
az account get-access-token --resource https://cognitiveservices.azure.com --query accessToken --output tsv
Step 2 - Assign the Cognitive Services User role to your account. In the Azure portal, navigate to your Content Safety or Azure AI Services resource and select Access control (IAM) in the left navigation. Select + Add role assignment, choose the Cognitive Services User role, select the member you want to assign the role to, then review and assign. It might take a few minutes for the assignment to take effect.
For more details, see Authenticate requests to Azure AI services.
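As an illustrative sketch only (it assumes the azure-identity package is installed; the endpoint value is a placeholder for your own resource), the same token can also be acquired in code and attached as a bearer header:

```python
from azure.identity import DefaultAzureCredential

# Acquire a Microsoft Entra ID token for the Cognitive Services scope,
# equivalent to the `az account get-access-token` command above.
credential = DefaultAzureCredential()
access_token = credential.get_token("https://cognitiveservices.azure.com/.default")

# Placeholder endpoint of your Content Safety / Azure AI Services resource.
endpoint = "https://<your-resource>.cognitiveservices.azure.com"

# Every RAI policy request below sends this header for authentication.
headers = {"Authorization": f"Bearer {access_token.token}"}
```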
API Reference¶
This document provides the API reference for managing RAI (Responsible AI) policies, including operations to create, update, retrieve, list, and delete policies.
The following field must be included in the URL:

Name | Required | Description | Type |
---|---|---|---|
API Version | Required | The API version to use for the request. The current version is 2024-12-15-preview. Example: <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview | String |
1. Create or Update a RAI Policy¶
Endpoint¶
PATCH <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview
Request Body¶
{
  "name": "MyPolicy10",
  "taskSettings": [
    {
      "settingId": "task_setting_01",
      "settingEnabled": true,
      "appliedFor": [
        {
          "role": "user",
          "source": "prompt"
        },
        {
          "role": "system",
          "source": "completion"
        }
      ],
      "kind": "harmCategory",
      "harmCategoryTaskSetting": {
        "harmCategory": "hate"
      },
      "blockingCriteria": {
        "enabled": true,
        "kind": "severity",
        "allowedSeverity": 0
      }
    }
  ]
}
Field Descriptions¶
Field | Type | Description |
---|---|---|
name | String | A unique identifier for the policy. |
taskSettings | Array | An array of objects, where each object defines a specific task for detecting harmful content and its associated blocking criteria. Each task setting includes the following fields: |
- settingId | String | A unique identifier for the task setting. |
- settingEnabled | Boolean | Indicates whether the task setting is active. |
- appliedFor | Array | An array of objects specifying the scope of the task: the roles and sources it applies to. |
-- role | String | The role for which the task is applied. Possible values: USER, ASSISTANT, SYSTEM, TOOL, FUNCTION, ALL (all the above). |
-- source | String | The source of the content. Possible values: PROMPT (Message from user to AI), COMPLETION (Message from AI to user), ALL (all the above). |
- kind | String | The type of task. Currently, only "harmCategory" is supported for harmful content detection tasks. See the harm category documentation for more details. |
- harmCategoryTaskSetting | Object | An object whose harmCategory field specifies the harm category to monitor. Possible values: hate, selfHarm, sexual, violence, codeVulnerability, promptInjection, protectedMaterialCode, protectedMaterialText |
- blockingCriteria | Object | Criteria to determine whether the content should be blocked. |
-- enabled | Boolean | Indicates if blocking criteria are active. |
-- kind | String | The type of blocking criteria. Possible values: severity, riskLevel, isDetected. |
-- allowedSeverity | Integer | The maximum allowed severity level for blocking; applies when kind is severity. |
-- allowedRiskLevel | String | The maximum allowed risk level for blocking; applies when kind is riskLevel. |
-- isDetected | Boolean | Indicates whether content should be blocked when the harm is detected; applies when kind is isDetected. |
Response¶
Status Code: 200 OK / 201 Created
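As a minimal, non-authoritative sketch of calling this endpoint with Python's requests library (the endpoint value and access token are placeholders; the policy body mirrors the request example above):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
token = "<access-token>"  # e.g. the output of `az account get-access-token` above
policy_name = "MyPolicy10"

policy = {
    "name": policy_name,
    "taskSettings": [
        {
            "settingId": "task_setting_01",
            "settingEnabled": True,
            "appliedFor": [{"role": "user", "source": "prompt"}],
            "kind": "harmCategory",
            "harmCategoryTaskSetting": {"harmCategory": "hate"},
            "blockingCriteria": {"enabled": True, "kind": "severity", "allowedSeverity": 0},
        }
    ],
}

# PATCH creates the policy if it does not exist and updates it otherwise.
resp = requests.patch(
    f"{endpoint}/contentsafety/raiPolicies/{policy_name}",
    params={"api-version": "2024-12-15-preview"},
    headers={"Authorization": f"Bearer {token}"},
    json=policy,  # sent as application/json; adjust the Content-Type if the service expects a merge-patch body
)
print(resp.status_code)  # 200 OK or 201 Created on success
```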
2. Get a RAI Policy by raiPolicyName¶
Endpoint¶
GET <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview
Response Body¶
{
  "name": "MyPolicy10",
  "taskSettings": [
    {
      "settingId": "task_setting_01",
      "settingEnabled": true,
      "appliedFor": [
        {
          "role": "User",
          "source": "Prompt"
        },
        {
          "role": "System",
          "source": "Completion"
        }
      ],
      "kind": "HarmCategory",
      "harmCategoryTaskSetting": {
        "harmCategory": "Hate"
      },
      "blockingCriteria": {
        "enabled": true,
        "kind": "Severity",
        "allowedSeverity": 0
      }
    }
  ]
}
Field Descriptions¶
The response fields are identical to those described in the Create or Update a RAI Policy section above; refer there for the full field descriptions and a complete example of how these fields are used in the request body.
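A corresponding sketch for retrieving a single policy with Python's requests library (the endpoint and token are placeholders, as above):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
token = "<access-token>"

resp = requests.get(
    f"{endpoint}/contentsafety/raiPolicies/MyPolicy10",
    params={"api-version": "2024-12-15-preview"},
    headers={"Authorization": f"Bearer {token}"},
)
policy = resp.json()
print(policy["name"], len(policy["taskSettings"]))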
3. List All RAI Policies¶
Endpoint¶
GET <endpoint>/contentsafety/raiPolicies?api-version=2024-12-15-preview
Response Body¶
{
  "values": [
    {
      "name": "MyPolicy10",
      "taskSettings": [
        {
          "settingId": "task_setting_01",
          "settingEnabled": true,
          "appliedFor": [
            {
              "role": "User",
              "source": "Prompt"
            },
            {
              "role": "System",
              "source": "Completion"
            }
          ],
          "kind": "HarmCategory",
          "harmCategoryTaskSetting": {
            "harmCategory": "Hate"
          },
          "blockingCriteria": {
            "enabled": true,
            "kind": "Severity",
            "allowedSeverity": 0
          }
        }
      ]
    }
  ]
}
Field Descriptions¶
Field | Type | Description |
---|---|---|
values | Array | The list of RAI policies. |

Each policy in values has the same structure and fields as described in the Create or Update a RAI Policy section above.
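A sketch of listing every policy and printing its name (the endpoint and token are placeholders, as above):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
token = "<access-token>"

resp = requests.get(
    f"{endpoint}/contentsafety/raiPolicies",
    params={"api-version": "2024-12-15-preview"},
    headers={"Authorization": f"Bearer {token}"},
)
# Each entry in "values" is a full policy object, as shown in the response above.
for policy in resp.json().get("values", []):
    print(policy["name"])
```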
4. Delete a RAI Policy by raiPolicyName¶
Endpoint¶
DELETE <endpoint>/contentsafety/raiPolicies/{raiPolicyName}?api-version=2024-12-15-preview
Response¶
Status Code: 204 No Content
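Finally, a sketch of deleting a policy; a 204 status code confirms the deletion (the endpoint and token are placeholders, as above):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
token = "<access-token>"

resp = requests.delete(
    f"{endpoint}/contentsafety/raiPolicies/MyPolicy10",
    params={"api-version": "2024-12-15-preview"},
    headers={"Authorization": f"Bearer {token}"},
)
print(resp.status_code)  # 204 No Content on success
```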
Together, these operations cover the full RAI policy lifecycle: creating, updating, retrieving, listing, and deleting policies.