🚀 Deployment Guide

Use this page as the canonical installation guide. Start with Basic Deployment for a simple environment, or Zero Trust Deployment when network isolation is required.

Note: You can change parameter values in main.parameters.json or set them with azd env set before running azd provision. This applies only to parameters that support environment variable substitution.

Underlying infrastructure: GPT-RAG provisions the Azure AI Landing Zone (AILZ) Bicep module as its infrastructure foundation. For the full list of parameters, opt-in features (IP allow-lists, BYO Private DNS / Log Analytics, hub-and-spoke integration, etc.), and the v2 migration path, see the AILZ parameterization reference and the v2-migration guide.

Prerequisites

Required Permissions:

Azure subscription with the Contributor and User Access Administrator roles
Agreement to Responsible AI terms for Azure AI Services

Required Tools:

Basic Deployment

Quick setup for demos without network isolation. In this mode, the workstation can run the full flow: provision, post-provision configuration, and service deployment.

azd init -t azure/gpt-rag
az login
azd auth login
azd env set NETWORK_ISOLATION false
azd provision
azd deploy

Add --tenant for az or --tenant-id for azd if you want a specific tenant.
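
As a minimal sketch of the tenant flags mentioned above, both sign-in commands can be wrapped in one shell function. The function name login_to_tenant and its single argument are illustrative only and are not part of GPT-RAG.

```shell
# Hypothetical helper: sign in to one specific tenant with both CLIs.
# The tenant ID is a value you supply; nothing here is defined by GPT-RAG.
login_to_tenant() {
  tenant_id="$1"
  if [ -z "$tenant_id" ]; then
    echo "usage: login_to_tenant <tenant-id>" >&2
    return 2
  fi
  # az uses --tenant, azd uses --tenant-id.
  az login --tenant "$tenant_id" &&
    azd auth login --tenant-id "$tenant_id"
}
```

Call it as login_to_tenant <tenant-id> before azd provision. The second sign-in runs only if the first one succeeds.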

Resource naming: Starting with GPT-RAG v3.1.0 and AI Landing Zone v2.2.0, fresh deployments name resources using the Cloud Adoption Framework pattern (for example cosmos-<hash>-<env>-<region>-001). No extra variables are required. If you need to keep the pre-v3.1.0 names, set RESOURCE_NAMING_MODE=legacy before azd provision. See the resource naming guide for details, override options, and a before/after table.

azd provision runs GPT-RAG preflight checks before Azure Resource Manager deployment starts. These checks validate the selected region, jumpbox VM SKU restrictions, provider/location support for AI Search, Cosmos DB, Container Apps, and AI Foundry/Cognitive Services, and Azure OpenAI model quota for the configured deployments. If model quota is insufficient, the hook fails early and suggests candidate regions when possible.

Some transient Azure capacity failures are not exposed by reliable pre-create APIs. For example, Cosmos DB can still fail later with regional high-demand ServiceUnavailable; the preflight reports this limitation explicitly. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only to bypass GPT-RAG regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.

For the basic flow, the postProvision hook runs locally after azd provision and configures GPT-RAG data-plane resources such as App Configuration and search setup. Then azd deploy deploys the frontend, orchestrator, and ingestion services.

Retrieval backend

GPT-RAG can retrieve grounding content directly from Azure AI Search or through a Foundry IQ knowledge base. Starting with GPT-RAG v3.0.2 and AI Landing Zone v2.1.2, new deployments use Foundry IQ by default through a native Azure Blob Knowledge Source. Existing deployments can stay on RETRIEVAL_BACKEND=ai_search until you explicitly migrate.

Use the grounding sources overview to understand the default Foundry IQ path, when to keep using Azure AI Search, and when to use the searchIndex pattern for custom GPT-RAG ingestion pipelines.

The most important settings are:

Setting	Typical value	Purpose
`RETRIEVAL_BACKEND`	`foundry_iq` for new deployments, `ai_search` for existing compatibility or rollback	Selects the retrieval path.
`FOUNDRY_IQ_PATTERN`	`azureBlob` by default, or `searchIndex` for custom GPT-RAG ingestion	Selects the Foundry IQ setup choice.
`KNOWLEDGE_BASE_NAME`	`<env>-knowledge-base`	Foundry IQ knowledge base name.
`KNOWLEDGE_BASE_CONNECTION_ID`	Generated by AILZ	Dedicated Foundry connection for knowledge-base use.
`FOUNDRY_IQ_API_VERSION`	`2026-05-01-preview`	Required for per-user permissions and custom ingestion path `filterAddOn`.
`FOUNDRY_IQ_KNOWLEDGE_RETRIEVAL_BILLING_PLAN`	`free` or `standard`	Controls Azure AI Search agentic retrieval billing.

With the default Blob path, Foundry IQ processes files directly from the documents container. GPT-RAG ingestion is not used in that path. Use FOUNDRY_IQ_PATTERN=searchIndex only when you intentionally keep a custom GPT-RAG ingestion pipeline that writes chunks to Azure AI Search.
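
The sketch below shows one way to switch retrieval paths while guarding against typos in the two most commonly changed settings. It assumes these settings are stored in the azd environment with azd env set, which this page does not state explicitly. The helper name set_retrieval_backend is hypothetical.

```shell
# Hypothetical helper: validate RETRIEVAL_BACKEND and FOUNDRY_IQ_PATTERN
# against the values listed in the table, then write them to the azd environment.
# Assumption: both settings are read from the azd environment.
set_retrieval_backend() {
  backend="$1"
  pattern="${2:-azureBlob}"   # azureBlob is the documented default
  case "$backend" in
    foundry_iq|ai_search) ;;
    *) echo "unknown RETRIEVAL_BACKEND: $backend" >&2; return 2 ;;
  esac
  case "$pattern" in
    azureBlob|searchIndex) ;;
    *) echo "unknown FOUNDRY_IQ_PATTERN: $pattern" >&2; return 2 ;;
  esac
  azd env set RETRIEVAL_BACKEND "$backend" || return 1
  # Assumption: the pattern is only relevant on the Foundry IQ path.
  if [ "$backend" = "foundry_iq" ]; then
    azd env set FOUNDRY_IQ_PATTERN "$pattern" || return 1
  fi
}
```

For example, set_retrieval_backend foundry_iq searchIndex selects the custom ingestion path, and set_retrieval_backend ai_search keeps an existing deployment on Azure AI Search.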

Two optional Foundry IQ Knowledge Sources can run alongside the documents source on the same Knowledge Base. Both are off by default and require signed-in users:

Foundry IQ: Work IQ (Microsoft 365) blends in mail, meetings, files, chats, and people from the signed-in user's M365 world. Gated public preview.
Foundry IQ: Fabric ontology (Microsoft Fabric)
Foundry IQ: Fabric Data Agent (Microsoft Fabric) blends in analytical data from a Fabric ontology (semantic model, lakehouse, warehouse, KQL). Preview. Review data-egress caveats before enabling.

Zero Trust Deployment

For deployments that require network isolation.

Network-isolated deployments use a two-host flow:

Phase	Where to run	Command
Provision infrastructure	Workstation	`azd provision`
Configure data-plane resources	Jumpbox or VNet-connected host	`scripts/postProvision.ps1`
Deploy services	Jumpbox or VNet-connected host	`azd deploy`

Do not run azd deploy from the workstation when NETWORK_ISOLATION=true. The deploy hook blocks that path because private resources and the private ACR build pool are reachable only from inside the VNet.

Network Isolation runbook

Use this runbook for a clean network-isolated deployment:

On your workstation, create or select the azd environment and enable network isolation.
Still on your workstation, run azd provision. This creates the infrastructure and then stops before local data-plane configuration.
Connect to the jumpbox through Azure Bastion, or use another machine with VNet/VPN access.
On the jumpbox, authenticate with the VM managed identity.
On the jumpbox, run scripts/postProvision.ps1 with RUN_FROM_JUMPBOX=true.
On the jumpbox, run azd deploy with RUN_FROM_JUMPBOX=true and ACR_TASK_AGENT_POOL=build-pool.
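
The runbook steps can be sketched as two shell functions, one per host. This is an illustration under assumptions, not the project's own automation: the function names are hypothetical, the jumpbox part uses the Bash variant of the post-provision script, and it expects to be run from the GPT-RAG repository directory after you have connected through Bastion.

```shell
# Steps 1-2: run on the workstation.
workstation_phase() {
  azd env set NETWORK_ISOLATION true &&
    azd provision
}

# Steps 4-6: run on the jumpbox (or another VNet-connected host),
# from the GPT-RAG repository directory.
jumpbox_phase() {
  # Authenticate both CLIs with the VM managed identity.
  az login --identity &&
    azd auth login --managed-identity &&
    azd env set RUN_FROM_JUMPBOX true &&
    azd env set ACR_TASK_AGENT_POOL build-pool &&
    # Configure data-plane resources, then deploy the services.
    ./scripts/postProvision.sh &&
    azd deploy
}
```

Each command is chained with &&, so a failure in any step stops the phase instead of continuing with a half-configured environment.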

BUILD_MODE is normally not required. Component deploy scripts automatically use ACR remote builds when NETWORK_ISOLATION=true or when ACR_TASK_AGENT_POOL is set.
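
The build-mode rule above amounts to a simple either/or check. The following sketch restates it as a shell predicate. It is not the code used by the component deploy scripts, and it assumes both variables are visible as shell variables.

```shell
# Sketch of the documented rule: remote ACR builds are used when
# NETWORK_ISOLATION=true or when ACR_TASK_AGENT_POOL is set to any value.
uses_remote_build() {
  [ "${NETWORK_ISOLATION:-false}" = "true" ] || [ -n "${ACR_TASK_AGENT_POOL:-}" ]
}
```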

Regional preflight

Run preflight before every Zero Trust deployment. It is much faster to fail in the first few minutes than to wait for a long network-isolated deployment and then discover that a regional dependency cannot be created.

azd provision runs the scripts/preProvision hook. The hook invokes scripts/Invoke-RegionalPreflight.ps1 before the Azure Resource Manager deployment starts.

Preflight checks include:

the selected Azure region and provider support,
common regional readiness checks for Azure AI Search, Cosmos DB, Container Apps, AI Foundry, and Cognitive Services,
jumpbox VM SKU availability and restrictions,
Azure OpenAI model quota for the configured deployments.

Preflight is an early warning, not a live capacity reservation. Azure capacity can still change after the check passes, and some regional capacity errors are only returned when Azure creates the resource. Recent examples include Azure AI Search Standard capacity in Sweden Central and Cosmos DB zonal capacity in West Europe.

Use the result this way:

Result	Operator action
`FAIL`	Stop. Fix the subscription, quota, region, or parameter issue before provisioning.
`WARN`	Review the warning before continuing. If it mentions capacity or regional risk, consider changing region first.
Pass	Continue, but keep the deployment logs open because live capacity can still change.

If a region fails or warns on a critical dependency, try another fully supported region instead of waiting 30 minutes or more for a deployment that is likely to fail. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only when you intentionally bypass regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.
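
If you wrap provisioning in your own automation, the result table can be encoded as a small decision function. This is a sketch only: the function name is hypothetical, and the status strings FAIL, WARN, and PASS are assumptions about how you capture the preflight result, not a documented output format.

```shell
# Hypothetical mapping from a preflight status to an operator action.
# Exit codes: 0 = may continue, 1 = stop, 2 = status not recognized.
preflight_action() {
  case "$1" in
    FAIL) echo "stop: fix subscription, quota, region, or parameters"; return 1 ;;
    WARN) echo "review: consider another region if capacity risk is mentioned"; return 0 ;;
    PASS) echo "continue: keep deployment logs open"; return 0 ;;
    *)    echo "unknown preflight result: $1" >&2; return 2 ;;
  esac
}
```

A wrapper could call preflight_action "$status" || exit 1 before starting a long network-isolated deployment.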

Before Provisioning

Enable network isolation in your environment:

azd env set NETWORK_ISOLATION true

Optional v2 parameters can be set before provisioning:

azd env set DEPLOYMENT_MODE standalone
azd env set VM_SIZE Standard_D2s_v3
azd env set ENABLE_COSMOS_ANALYTICAL_STORAGE false

ALLOWED_IP_RANGES is also available for CIDR allow-listing, but because it is an array parameter, prefer editing main.parameters.json or using a parameter overlay rather than storing a complex array in the azd environment.

Make sure you’re signed in with your Azure user account:

az login
azd auth login

Add --tenant for az or --tenant-id for azd if you want a specific tenant.

Provision Infrastructure

azd env set AZURE_SKIP_NETWORK_ISOLATION_WARNING true   # optional for automation; skips the local post-provision prompt
azd provision

Post-Provision Configuration

With NETWORK_ISOLATION=true, data-plane configuration must run from inside the VNet. A workstation should only run azd provision; if it does not have VNet/VPN access, the local post-provision hook will skip data-plane work and tell you to continue from the jumpbox.

Using the Jumpbox VM

1) Reset the VM password in the Azure Portal (required on first access if not set in deployment parameters):

Go to your VM resource → Support + troubleshooting → Reset password → Set new credentials
Default username is testvmuser

2) Connect via Azure Bastion

3) Authenticate with the VM's Managed Identity:

az login --identity
azd auth login --managed-identity

Add --tenant for az or --tenant-id for azd if you want a specific tenant.

4) Run the post-provision script:

PowerShell:

cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
.\scripts\postProvision.ps1

Bash:

cd /mnt/c/github/gpt-rag
azd env set RUN_FROM_JUMPBOX true
./scripts/postProvision.sh

Note: If you have re-initialized or cloned the gpt-rag repo again, refresh your azd environment before running the postProvision script so it points to the existing deployment: azd init -t azure/gpt-rag then azd env refresh. When prompted, select the same Subscription, Resource Group, and Location as the original provisioning so azd correctly links to your environment.

Existing Platform / AI Landing Zone Integrated

Use these settings when GPT-RAG must deploy into an existing enterprise platform, such as a hub-spoke network with centrally managed Private DNS Zones, Log Analytics, Application Insights, Bastion, NAT Gateway, or Azure Firewall.

Core mode: set DEPLOYMENT_MODE to ailz-integrated, then pass the existing resource IDs that your platform team owns. The default remains standalone, so basic deployments do not require these settings.

azd env set DEPLOYMENT_MODE ailz-integrated
azd env set USE_EXISTING_VNET true
azd env set EXISTING_VNET_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>"

Existing Private DNS Zones: set the zone resource IDs for services already managed by the platform. Common values include EXISTING_PRIVATE_DNS_ZONE_OPENAI_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_AISERVICES_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_SEARCH_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_COSMOS_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_BLOB_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_KEYVAULT_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_APPCONFIG_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_CONTAINERAPPS_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_ACR_RESOURCE_ID, and Azure Monitor / App Insights zone IDs.

azd env set EXISTING_PRIVATE_DNS_ZONE_SEARCH_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<dns-rg>/providers/Microsoft.Network/privateDnsZones/privatelink.search.windows.net"
azd env set EXISTING_PRIVATE_DNS_ZONE_OPENAI_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<dns-rg>/providers/Microsoft.Network/privateDnsZones/privatelink.openai.azure.com"
azd env set DNS_ZONE_LINK_SUFFIX "<unique-spoke-name>"
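
Private DNS zone resource IDs all follow the same shape, so a small helper can reduce copy-paste errors when you set several of them. The helper name private_dns_zone_id is hypothetical. The subscription, resource group, and zone name are values your platform team provides.

```shell
# Hypothetical helper: build a Private DNS zone resource ID.
# Arguments: <subscription-id> <dns-resource-group> <zone-name>
private_dns_zone_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/privateDnsZones/%s\n' \
    "$1" "$2" "$3"
}

# Example use (placeholders must be replaced with real values):
#   azd env set EXISTING_PRIVATE_DNS_ZONE_SEARCH_RESOURCE_ID \
#     "$(private_dns_zone_id "<sub>" "<dns-rg>" privatelink.search.windows.net)"
```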

Shared platform resources: set EXISTING_LOG_ANALYTICS_WORKSPACE_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_CONNECTION_STRING, HUB_INTEGRATION_HUB_VNET_RESOURCE_ID, HUB_INTEGRATION_EGRESS_NEXT_HOP_IP, or HUB_INTEGRATION_EXISTING_ROUTE_TABLE_RESOURCE_ID when those resources are centrally managed.

Spoke resource switches: use DEPLOY_JUMPBOX, DEPLOY_BASTION, DEPLOY_NAT_GATEWAY, EXISTING_JUMPBOX_RESOURCE_ID, EXISTING_BASTION_RESOURCE_ID, EXISTING_NAT_GATEWAY_RESOURCE_ID, DEPLOY_AZURE_FIREWALL, and DEPLOY_ACR_TASK_AGENT_POOL to align GPT-RAG with the platform topology. These are optional and preserve the default standalone behavior when unset.

Deploy GPT-RAG Services

Note: For Zero Trust deployments with network isolation, deploy services from the jumpbox or another host with VNet connectivity. If using the jumpbox VM, the repositories are located in the C:\github directory.

Once the GPT-RAG infrastructure is provisioned, you can deploy the services.

To deploy all services at once, navigate to the gpt-rag directory (with the azd environment configured) and run:

cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
azd env set NETWORK_ISOLATION true
azd env set ACR_TASK_AGENT_POOL build-pool
azd deploy

This command deploys each service in sequence. Docker is not required on the jumpbox for network-isolated deployments; component scripts use Azure Container Registry remote builds (az acr build) against the private ACR task agent pool.

The deploy hook uses NETWORK_ISOLATION as the source of truth. When NETWORK_ISOLATION=true, azd deploy fails fast unless it is running from the VNet with RUN_FROM_JUMPBOX=true; the older AZURE_ZERO_TRUST variable is not used.
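
The fail-fast behavior described above can be pictured as the guard below. It is a sketch of the documented rule, not the actual deploy hook, and it assumes both values are available as shell variables. The function name deploy_allowed is hypothetical.

```shell
# Sketch of the documented guard: with NETWORK_ISOLATION=true, deployment is
# allowed only when RUN_FROM_JUMPBOX=true. AZURE_ZERO_TRUST is not consulted.
deploy_allowed() {
  if [ "${NETWORK_ISOLATION:-false}" = "true" ] &&
     [ "${RUN_FROM_JUMPBOX:-false}" != "true" ]; then
    echo "NETWORK_ISOLATION=true: run azd deploy from the jumpbox with RUN_FROM_JUMPBOX=true" >&2
    return 1
  fi
  return 0
}
```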

To update a single service, deploy it individually. The example below uses the orchestrator service; the same approach applies to the other services (frontend, dataingest, mcp).

Deploy Individual Services

Make sure you're logged in to Azure:

az login

Example: Deploying the Orchestrator

Using azd (recommended):

Initialize the template:

azd init -t azure/gpt-rag-orchestrator

Important: Use the same environment name with azd init as in the infrastructure deployment to keep components consistent.

Refresh the environment variables, then deploy:

azd env refresh
azd deploy

Important: Run azd env refresh with the same subscription and resource group used in the infrastructure deployment.

Using a shell script:

Clone the repository, set the App Configuration endpoint, and run the deployment script.

PowerShell (Windows):

git clone https://github.com/Azure/gpt-rag-orchestrator.git
$env:APP_CONFIG_ENDPOINT = "https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
.\scripts\deploy.ps1

Bash (Linux/macOS):

git clone https://github.com/Azure/gpt-rag-orchestrator.git
export APP_CONFIG_ENDPOINT="https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
./scripts/deploy.sh

Permissions

Microsoft Foundry and AI Search Role Assignments

Resource	Role	Assignee	Description
GenAI App Search Service	Search Index Data Reader	Microsoft Foundry Project	Read index data
GenAI App Search Service	Search Service Contributor	Microsoft Foundry Project	Create AI Search connection
GenAI App Storage Account	Storage Blob Data Reader	Microsoft Foundry Project	Read blob data
Microsoft Foundry Account	Cognitive Services User	Search Service	Allow Search Service to access vectorizers

Container App Role Assignments

Resource	Role	Assignee	Description
GenAI App Configuration Store	App Configuration Data Reader	ContainerApp: orchestrator	Read configuration data
GenAI App Configuration Store	App Configuration Data Reader	ContainerApp: frontend	Read configuration data
GenAI App Configuration Store	App Configuration Data Reader	ContainerApp: dataingest	Read configuration data
GenAI App Configuration Store	App Configuration Data Reader	ContainerApp: mcp	Read configuration data
GenAI App Container Registry	AcrPull	ContainerApp: orchestrator	Pull container images
GenAI App Container Registry	AcrPull	ContainerApp: frontend	Pull container images
GenAI App Container Registry	AcrPull	ContainerApp: dataingest	Pull container images
GenAI App Container Registry	AcrPull	ContainerApp: mcp	Pull container images
GenAI App Key Vault	Key Vault Secrets User	ContainerApp: orchestrator	Read secrets
GenAI App Key Vault	Key Vault Secrets User	ContainerApp: frontend	Read secrets
GenAI App Key Vault	Key Vault Secrets User	ContainerApp: dataingest	Read secrets
GenAI App Key Vault	Key Vault Secrets User	ContainerApp: mcp	Read secrets
GenAI App Search Service	Search Index Data Reader	ContainerApp: orchestrator	Read index data
GenAI App Search Service	Search Index Data Contributor	ContainerApp: dataingest	Read/write index data
GenAI App Search Service	Search Index Data Contributor	ContainerApp: mcp	Read/write index data
GenAI App Storage Account	Storage Blob Data Reader	ContainerApp: orchestrator	Read blob data
GenAI App Storage Account	Storage Blob Data Reader	ContainerApp: frontend	Read blob data
GenAI App Storage Account	Storage Blob Data Contributor	ContainerApp: dataingest	Read/write blob data
GenAI App Storage Account	Storage Blob Data Contributor	ContainerApp: mcp	Read/write blob data
GenAI App Cosmos DB	Cosmos DB Built-in Data Contributor	ContainerApp: orchestrator	Read/write Cosmos DB data
Microsoft Foundry Account	Cognitive Services User	ContainerApp: orchestrator	Access Cognitive Services
Microsoft Foundry Account	Cognitive Services User	ContainerApp: dataingest	Access Cognitive Services
Microsoft Foundry Account	Cognitive Services User	ContainerApp: mcp	Access Cognitive Services
Microsoft Foundry Account	Cognitive Services OpenAI User	ContainerApp: orchestrator	Use OpenAI APIs
Microsoft Foundry Account	Cognitive Services OpenAI User	ContainerApp: dataingest	Use OpenAI APIs
Microsoft Foundry Account	Cognitive Services OpenAI User	ContainerApp: mcp	Use OpenAI APIs

Executor Role Assignments

Resource	Role	Assignee	Description
GenAI App Configuration Store	App Configuration Data Owner	Executor	Full control over configuration settings
GenAI App Container Registry	AcrPush	Executor	Push container images
GenAI App Container Registry	AcrPull	Executor	Pull container images
GenAI App Key Vault	Key Vault Contributor	Executor	Manage Key Vault settings
GenAI App Key Vault	Key Vault Secrets Officer	Executor	Create Key Vault secrets
GenAI App Search Service	Search Service Contributor	Executor	Create/update search service elements
GenAI App Search Service	Search Index Data Contributor	Executor	Read/write search index data
GenAI App Search Service	Search Index Data Reader	Executor	Read index data
GenAI App Storage Account	Storage Blob Data Contributor	Executor	Read/write blob data
GenAI App Cosmos DB	Cosmos DB Built-in Data Contributor	Executor	Read/write Cosmos DB data
Microsoft Foundry Account	Cognitive Services OpenAI User	Executor	Use OpenAI APIs

Jumpbox VM Role Assignments

Resource	Role	Assignee	Description
GenAI App Container Apps	Container Apps Contributor	Jumpbox VM	Full control over Container Apps
Azure Managed Identity	Managed Identity Operator	Jumpbox VM	Assign and manage user-assigned identities
GenAI App Container Registry	Container Registry Repository Writer	Jumpbox VM	Write to ACR repositories
GenAI App Container Registry	Container Registry Tasks Contributor	Jumpbox VM	Manage ACR tasks
GenAI App Container Registry	Container Registry Data Access Configuration Administrator	Jumpbox VM	Manage ACR data access configuration
GenAI App Container Registry	AcrPush	Jumpbox VM	Push container images
GenAI App Configuration Store	App Configuration Data Owner	Jumpbox VM	Full control over configuration settings
GenAI App Key Vault	Key Vault Contributor	Jumpbox VM	Manage Key Vault settings
GenAI App Key Vault	Key Vault Secrets Officer	Jumpbox VM	Create Key Vault secrets
GenAI App Search Service	Search Service Contributor	Jumpbox VM	Create/update search service elements
GenAI App Search Service	Search Index Data Contributor	Jumpbox VM	Read/write search index data
GenAI App Storage Account	Storage Blob Data Contributor	Jumpbox VM	Read/write blob data
GenAI App Cosmos DB	Cosmos DB Built-in Data Contributor	Jumpbox VM	Read/write Cosmos DB data
Microsoft Foundry Account	Cognitive Services Contributor	Jumpbox VM	Manage Cognitive Services resources
Microsoft Foundry Account	Cognitive Services OpenAI User	Jumpbox VM	Use OpenAI APIs
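
To spot-check any row in the tables above, you can list the role assignments for a principal at a given scope and look for the expected role name. The sketch below wraps that check in a shell function. The name has_role is hypothetical, and the principal ID and resource scope are values you look up in your own deployment.

```shell
# Hypothetical helper: succeed if <principal-id> holds <role-name> at <scope>.
# By default az lists assignments made at exactly this scope; add
# --include-inherited to the az command to also count parent scopes.
has_role() {
  principal_id="$1"
  scope="$2"
  role="$3"
  az role assignment list --assignee "$principal_id" --scope "$scope" \
    --query "[].roleDefinitionName" -o tsv | grep -Fxq "$role"
}

# Example use (placeholders must be replaced with real values):
#   has_role "<principal-id>" "<acr-resource-id>" "AcrPull" && echo "assigned"
```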