Deployment Guide
Use this page as the canonical installation guide. Start with Basic Deployment for a simple environment, or Zero Trust Deployment when network isolation is required.
Note: You can change parameter values in main.parameters.json or set them with azd env set before running azd provision. This applies only to parameters that support environment variable substitution.
Underlying infrastructure: GPT-RAG provisions the Azure AI Landing Zone (AILZ) Bicep module as its infrastructure foundation. For the full list of parameters, opt-in features (IP allow-lists, BYO Private DNS / Log Analytics, hub-and-spoke integration, etc.), and the v2 migration path, see the AILZ parameterization reference and the v2-migration guide.
Prerequisites
Required Permissions:
- Azure subscription with the Contributor and User Access Administrator roles
- Agreement to Responsible AI terms for Azure AI Services
Required Tools:
- Azure Developer CLI
- PowerShell 7+ (Windows only)
- Git
- Python 3.12
Basic Deployment
Quick setup for demos without network isolation. In this mode, the workstation can run the full flow: provision, post-provision configuration, and service deployment.
azd init -t azure/gpt-rag
az login
azd auth login
azd env set NETWORK_ISOLATION false
azd provision
azd deploy
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
Resource naming: Starting with GPT-RAG v3.1.0 and AI Landing Zone v2.2.0, fresh deployments name resources using the Cloud Adoption Framework pattern (for example, cosmos-<hash>-<env>-<region>-001). No extra variables are required. If you need to keep the pre-v3.1.0 names, set RESOURCE_NAMING_MODE=legacy before azd provision. See the resource naming guide for details, override options, and a before/after table.
azd provision runs GPT-RAG preflight checks before Azure Resource Manager deployment starts. These checks validate the selected region, jumpbox VM SKU restrictions, provider/location support for AI Search, Cosmos DB, Container Apps, and AI Foundry/Cognitive Services, and Azure OpenAI model quota for the configured deployments. If model quota is insufficient, the hook fails early and suggests candidate regions when possible.
Some transient Azure capacity failures are not exposed by reliable pre-create APIs. For example, Cosmos DB can still fail later with regional high-demand ServiceUnavailable; the preflight reports this limitation explicitly. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only to bypass GPT-RAG regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.
For the basic flow, the postProvision hook runs locally after azd provision and configures GPT-RAG data-plane resources such as App Configuration and search setup. Then azd deploy deploys the frontend, orchestrator, and ingestion services.
Retrieval backend
GPT-RAG can retrieve grounding content directly from Azure AI Search or through a
Foundry IQ knowledge base. Starting with GPT-RAG v3.0.2 and AI Landing Zone
v2.1.2, new deployments use Foundry IQ by default through a native Azure Blob
Knowledge Source. Existing deployments can stay on RETRIEVAL_BACKEND=ai_search
until you explicitly migrate.
Use the retrieval backend selection guide to
understand the default Foundry IQ path, when to keep using Azure AI Search, and
when to use the searchIndex pattern for custom GPT-RAG ingestion pipelines.
The most important settings are:
| Setting | Typical value | Purpose |
|---|---|---|
| RETRIEVAL_BACKEND | foundry_iq for new deployments, ai_search for existing compatibility or rollback | Selects the retrieval path. |
| FOUNDRY_IQ_PATTERN | azureBlob by default, or searchIndex for custom GPT-RAG ingestion | Selects the Foundry IQ setup choice. |
| KNOWLEDGE_BASE_NAME | <env>-knowledge-base | Foundry IQ knowledge base name. |
| KNOWLEDGE_BASE_CONNECTION_ID | Generated by AILZ | Dedicated Foundry connection for knowledge-base use. |
| FOUNDRY_IQ_API_VERSION | 2026-05-01-preview | Required for per-user permissions and custom ingestion path filterAddOn. |
| FOUNDRY_IQ_KNOWLEDGE_RETRIEVAL_BILLING_PLAN | free or standard | Controls Azure AI Search agentic retrieval billing. |
With the default Blob path, Foundry IQ processes files directly from the
documents container. GPT-RAG ingestion is not used in that path. Use
FOUNDRY_IQ_PATTERN=searchIndex only when you intentionally keep a custom
GPT-RAG ingestion pipeline that writes chunks to Azure AI Search.
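The interaction between RETRIEVAL_BACKEND and FOUNDRY_IQ_PATTERN can be sketched as a small pre-check. The helper below is hypothetical and not part of GPT-RAG: the function name describe_retrieval_path and its messages are illustrative, and treating an empty pattern as azureBlob is an assumption based on the documented default.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of GPT-RAG): validate a
# RETRIEVAL_BACKEND / FOUNDRY_IQ_PATTERN pair against the values in the
# settings table and print which ingestion path that pair implies.
describe_retrieval_path() {
  backend="$1"
  # Assumption: an empty pattern means the documented default, azureBlob.
  pattern="${2:-azureBlob}"

  case "$backend" in
    ai_search)
      echo "ai_search: retrieval goes directly to Azure AI Search"
      ;;
    foundry_iq)
      case "$pattern" in
        azureBlob)
          echo "foundry_iq/azureBlob: Foundry IQ reads the documents container; GPT-RAG ingestion is not used"
          ;;
        searchIndex)
          echo "foundry_iq/searchIndex: custom GPT-RAG ingestion writes chunks to Azure AI Search"
          ;;
        *)
          echo "unsupported FOUNDRY_IQ_PATTERN: $pattern" >&2
          return 1
          ;;
      esac
      ;;
    *)
      echo "unsupported RETRIEVAL_BACKEND: $backend" >&2
      return 1
      ;;
  esac
}
```

Calling describe_retrieval_path with the values you plan to set is a quick way to catch a misspelled backend or pattern before provisioning.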
Zero Trust Deployment
For deployments that require network isolation.
Network-isolated deployments use a two-host flow:
| Phase | Where to run | Command |
|---|---|---|
| Provision infrastructure | Workstation | azd provision |
| Configure data-plane resources | Jumpbox or VNet-connected host | scripts/postProvision.ps1 |
| Deploy services | Jumpbox or VNet-connected host | azd deploy |
Do not run azd deploy from the workstation when NETWORK_ISOLATION=true. The deploy hook blocks that path because private resources and the private ACR build pool are reachable only from inside the VNet.
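The two-host split can be captured as a small lookup. This is a hypothetical helper, not something shipped with GPT-RAG: the function name host_for_phase and the phase labels provision, postprovision, and deploy are names chosen here for illustration. It only encodes the table above plus the basic-mode rule that the workstation runs every phase.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GPT-RAG): report where a deployment
# phase should run. "jumpbox" stands for the jumpbox or any VNet-connected host.
# Usage: host_for_phase <provision|postprovision|deploy> <true|false>
host_for_phase() {
  phase="$1"
  network_isolation="${2:-false}"

  # Reject phase names that are not in the table.
  case "$phase" in
    provision | postprovision | deploy) ;;
    *)
      echo "unknown phase: $phase" >&2
      return 1
      ;;
  esac

  # With network isolation, only provisioning stays on the workstation.
  if [ "$network_isolation" = "true" ] && [ "$phase" != "provision" ]; then
    echo "jumpbox"
  else
    echo "workstation"
  fi
}
```

For example, host_for_phase deploy true prints jumpbox, while the same call with false prints workstation.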
Network Isolation runbook
Use this runbook for a clean network-isolated deployment:
- On your workstation, create or select the azd environment and enable network isolation.
- Still on your workstation, run azd provision. This creates the infrastructure and then stops before local data-plane configuration.
- Connect to the jumpbox through Azure Bastion, or use another machine with VNet/VPN access.
- On the jumpbox, authenticate with the VM managed identity.
- On the jumpbox, run scripts/postProvision.ps1 with RUN_FROM_JUMPBOX=true.
- On the jumpbox, run azd deploy with RUN_FROM_JUMPBOX=true and ACR_TASK_AGENT_POOL=build-pool.
BUILD_MODE is normally not required. Component deploy scripts automatically use ACR remote builds when NETWORK_ISOLATION=true or when ACR_TASK_AGENT_POOL is set.
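The remote-build rule stated above reduces to a single condition. The sketch below is an illustration only: uses_acr_remote_build is a hypothetical name, and the real component deploy scripts may evaluate these variables differently (for example, with different casing or extra inputs such as BUILD_MODE).

```shell
#!/bin/sh
# Hypothetical restatement of the rule above, not the actual script logic:
# remote builds apply when NETWORK_ISOLATION=true or ACR_TASK_AGENT_POOL is set.
# Usage: uses_acr_remote_build <network_isolation> <agent_pool>
uses_acr_remote_build() {
  network_isolation="$1"
  agent_pool="$2"
  if [ "$network_isolation" = "true" ] || [ -n "$agent_pool" ]; then
    return 0
  fi
  return 1
}

# Report the outcome for the current shell environment.
if uses_acr_remote_build "${NETWORK_ISOLATION:-false}" "${ACR_TASK_AGENT_POOL:-}"; then
  echo "ACR remote build"
else
  echo "BUILD_MODE or script default"
fi
```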
Regional preflight
Run preflight before every Zero Trust deployment. It is much faster to fail in the first few minutes than to wait for a long network-isolated deployment and then discover that a regional dependency cannot be created.
azd provision runs the scripts/preProvision hook. The hook invokes
scripts/Invoke-RegionalPreflight.ps1 before the Azure Resource Manager
deployment starts.
Preflight checks include:
- the selected Azure region and provider support,
- common regional readiness checks for Azure AI Search, Cosmos DB, Container Apps, AI Foundry, and Cognitive Services,
- jumpbox VM SKU availability and restrictions,
- Azure OpenAI model quota for the configured deployments.
Preflight is an early warning, not a live capacity reservation. Azure capacity can still change after the check passes, and some regional capacity errors are only returned when Azure creates the resource. Recent examples include Azure AI Search Standard capacity in Sweden Central and Cosmos DB zonal capacity in West Europe.
Use the result this way:
| Result | Operator action |
|---|---|
| FAIL | Stop. Fix the subscription, quota, region, or parameter issue before provisioning. |
| WARN | Review the warning before continuing. If it mentions capacity or regional risk, consider changing region first. |
| Pass | Continue, but keep the deployment logs open because live capacity can still change. |
If a region fails or warns on a critical dependency, try another fully supported
region instead of waiting 30 minutes or more for a deployment that is likely to
fail. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only when you intentionally
bypass regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.
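For automation, the result table can be turned into a simple gate. This is a minimal sketch under stated assumptions: preflight_action is a hypothetical function, the result tokens are taken from the table above (accepting both Pass and PASS is a guess about casing), and how you obtain the result from the preflight output is left to your pipeline.

```shell
#!/bin/sh
# Hypothetical gate (not part of GPT-RAG): map a preflight result to the
# operator action from the table above.
# Exit status: 0 = may continue, 1 = stop, 2 = unrecognized result.
preflight_action() {
  case "$1" in
    FAIL)
      echo "stop"
      return 1
      ;;
    WARN)
      echo "review"
      return 0
      ;;
    Pass | PASS)
      echo "continue"
      return 0
      ;;
    *)
      echo "unknown"
      return 2
      ;;
  esac
}
```

WARN returns success because the table leaves the decision to the operator; a stricter pipeline could treat it as a stop.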
Before Provisioning
Enable network isolation in your environment:
azd env set NETWORK_ISOLATION true
Optional v2 parameters can be set before provisioning:
azd env set DEPLOYMENT_MODE standalone
azd env set VM_SIZE Standard_D2s_v3
azd env set ENABLE_COSMOS_ANALYTICAL_STORAGE false
ALLOWED_IP_RANGES is also available for CIDR allow-listing, but because it is an array parameter, prefer editing main.parameters.json or using a parameter overlay rather than storing a complex array in the azd environment.
Make sure you're signed in with your Azure user account:
az login
azd auth login
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
Provision Infrastructure
azd env set AZURE_SKIP_NETWORK_ISOLATION_WARNING true # optional for automation; skips the local post-provision prompt
azd provision
Post-Provision Configuration
With NETWORK_ISOLATION=true, data-plane configuration must run from inside the VNet. A workstation should only run azd provision; if it does not have VNet/VPN access, the local post-provision hook will skip data-plane work and tell you to continue from the jumpbox.
Using the Jumpbox VM
1) Reset the VM password in the Azure Portal (required on first access if not set in deployment parameters):
- Go to your VM resource → Support + troubleshooting → Reset password → Set new credentials
- Default username is testvmuser
2) Connect via Azure Bastion
3) Authenticate with the VM's Managed Identity:
az login --identity
azd auth login --managed-identity
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
4) Run the post-provision script:
PowerShell:
cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
.\scripts\postProvision.ps1
Bash:
cd /mnt/c/github/gpt-rag
azd env set RUN_FROM_JUMPBOX true
./scripts/postProvision.sh
Note: If you have re-initialized or cloned the gpt-rag repo again, refresh your azd environment before running the postProvision script so it points to the existing deployment: azd init -t azure/gpt-rag then azd env refresh. When prompted, select the same Subscription, Resource Group, and Location as the original provisioning so azd correctly links to your environment.
Existing Platform / AI Landing Zone Integrated
Use these settings when GPT-RAG must deploy into an existing enterprise platform, such as a hub-spoke network with centrally managed Private DNS Zones, Log Analytics, Application Insights, Bastion, NAT Gateway, or Azure Firewall.
Core mode: set DEPLOYMENT_MODE to ailz-integrated, then pass the existing resource IDs that your platform team owns. The default remains standalone, so basic deployments do not require these settings.
azd env set DEPLOYMENT_MODE ailz-integrated
azd env set USE_EXISTING_VNET true
azd env set EXISTING_VNET_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>"
Existing Private DNS Zones: set the zone resource IDs for services already managed by the platform. Common values include EXISTING_PRIVATE_DNS_ZONE_OPENAI_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_AISERVICES_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_SEARCH_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_COSMOS_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_BLOB_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_KEYVAULT_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_APPCONFIG_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_CONTAINERAPPS_RESOURCE_ID, EXISTING_PRIVATE_DNS_ZONE_ACR_RESOURCE_ID, and Azure Monitor / App Insights zone IDs.
azd env set EXISTING_PRIVATE_DNS_ZONE_SEARCH_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<dns-rg>/providers/Microsoft.Network/privateDnsZones/privatelink.search.windows.net"
azd env set EXISTING_PRIVATE_DNS_ZONE_OPENAI_RESOURCE_ID "/subscriptions/<sub>/resourceGroups/<dns-rg>/providers/Microsoft.Network/privateDnsZones/privatelink.openai.azure.com"
azd env set DNS_ZONE_LINK_SUFFIX "<unique-spoke-name>"
Shared platform resources: set EXISTING_LOG_ANALYTICS_WORKSPACE_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_CONNECTION_STRING, HUB_INTEGRATION_HUB_VNET_RESOURCE_ID, HUB_INTEGRATION_EGRESS_NEXT_HOP_IP, or HUB_INTEGRATION_EXISTING_ROUTE_TABLE_RESOURCE_ID when those resources are centrally managed.
Spoke resource switches: use DEPLOY_JUMPBOX, DEPLOY_BASTION, DEPLOY_NAT_GATEWAY, EXISTING_JUMPBOX_RESOURCE_ID, EXISTING_BASTION_RESOURCE_ID, EXISTING_NAT_GATEWAY_RESOURCE_ID, DEPLOY_AZURE_FIREWALL, and DEPLOY_ACR_TASK_AGENT_POOL to align GPT-RAG with the platform topology. These are optional and preserve the default standalone behavior when unset.
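Because these settings take full Azure resource IDs, a quick shape check before azd env set can catch an unfilled placeholder or the wrong resource type. The helper below is a hypothetical sketch, not part of GPT-RAG or AILZ; looks_like_resource_id is an illustrative name, and the match is case-sensitive even though Azure generally treats resource IDs case-insensitively.

```shell
#!/bin/sh
# Hypothetical shape check (not part of GPT-RAG): succeed when the ID follows
# /subscriptions/<sub>/resourceGroups/<rg>/providers/<type>/<name>
# and no <placeholder> from the documentation is left in it.
# Usage: looks_like_resource_id <id> <provider/type>
looks_like_resource_id() {
  id="$1"
  resource_type="$2"
  case "$id" in
    *"<"* | *">"*)
      # An angle bracket means a documentation placeholder was not replaced.
      return 1
      ;;
    /subscriptions/?*/resourceGroups/?*/providers/"$resource_type"/?*)
      return 0
      ;;
    *)
      return 1
      ;;
  esac
}
```

For example, pass Microsoft.Network/virtualNetworks as the second argument when checking a value intended for EXISTING_VNET_RESOURCE_ID. The check does not confirm that the resource exists.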
Deploy GPT-RAG Services
Note: For Zero Trust deployments with network isolation, deploy services from the jumpbox or another host with VNet connectivity. If using the jumpbox VM, the repositories are located in the C:\github directory.
Once the GPT-RAG infrastructure is provisioned, you can deploy the services.
To deploy all services at once, navigate to the gpt-rag directory (with azd environment configured) and run:
cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
azd env set NETWORK_ISOLATION true
azd env set ACR_TASK_AGENT_POOL build-pool
azd deploy
This command deploys each service in sequence. Docker is not required on the jumpbox for network-isolated deployments; component scripts use Azure Container Registry remote builds (az acr build) against the private ACR task agent pool.
The deploy hook uses NETWORK_ISOLATION as the source of truth. When NETWORK_ISOLATION=true, azd deploy fails fast unless it is running from the VNet with RUN_FROM_JUMPBOX=true; the older AZURE_ZERO_TRUST variable is not used.
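The fail-fast behavior described above can be pictured as a guard like the following. This is an illustration of the stated rule only, not the actual deploy hook: deploy_allowed is a hypothetical name, and the real hook may perform additional checks such as testing VNet reachability.

```shell
#!/bin/sh
# Hypothetical guard mirroring the documented rule (not the real deploy hook):
# with NETWORK_ISOLATION=true, deployment is allowed only when
# RUN_FROM_JUMPBOX=true.
# Usage: deploy_allowed <network_isolation> <run_from_jumpbox>
deploy_allowed() {
  network_isolation="${1:-false}"
  run_from_jumpbox="${2:-false}"
  if [ "$network_isolation" = "true" ] && [ "$run_from_jumpbox" != "true" ]; then
    echo "NETWORK_ISOLATION=true: run azd deploy from the jumpbox or a VNet-connected host with RUN_FROM_JUMPBOX=true" >&2
    return 1
  fi
  return 0
}
```

Note that AZURE_ZERO_TRUST does not appear in the guard, consistent with the statement that the older variable is not used.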
If you prefer to deploy a single service, for example, when updating only that service, you can deploy it individually. Below is an example using the orchestrator service. The same approach applies to other services (frontend, dataingest, mcp).
Deploy Individual Services
Make sure you're logged in to Azure:
az login
Example: Deploying the Orchestrator
Using azd (recommended):
Initialize the template:
azd init -t azure/gpt-rag-orchestrator
Important: Use the same environment name with azd init as in the infrastructure deployment to keep components consistent.
Update environment variables then deploy:
azd env refresh
azd deploy
Important: Run azd env refresh with the same subscription and resource group used in the infrastructure deployment.
Using a shell script:
Clone the repository, set the App Configuration endpoint, and run the deployment script.
PowerShell (Windows):
git clone https://github.com/Azure/gpt-rag-orchestrator.git
$env:APP_CONFIG_ENDPOINT = "https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
.\scripts\deploy.ps1
Bash (Linux/macOS):
git clone https://github.com/Azure/gpt-rag-orchestrator.git
export APP_CONFIG_ENDPOINT="https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
./scripts/deploy.sh
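Since the script-based path depends on APP_CONFIG_ENDPOINT, it can help to confirm the value before running the deployment script. The guard below is a hypothetical sketch, not part of the orchestrator repository: check_app_config_endpoint is an illustrative name, and accepting only the https://<name>.azconfig.io form shown in this guide is an assumption that may not hold for other Azure clouds.

```shell
#!/bin/sh
# Hypothetical guard (not part of the orchestrator repo): refuse to continue
# when APP_CONFIG_ENDPOINT is empty or still holds the documentation placeholder.
# Usage: check_app_config_endpoint "$APP_CONFIG_ENDPOINT"
check_app_config_endpoint() {
  endpoint="$1"
  case "$endpoint" in
    "")
      echo "APP_CONFIG_ENDPOINT is not set" >&2
      return 1
      ;;
    *"<"* | *">"*)
      echo "APP_CONFIG_ENDPOINT still contains a placeholder: $endpoint" >&2
      return 1
      ;;
    https://?*.azconfig.io)
      return 0
      ;;
    *)
      # Assumption: only the endpoint form used in this guide is accepted.
      echo "unexpected APP_CONFIG_ENDPOINT format: $endpoint" >&2
      return 1
      ;;
  esac
}
```

A typical use is to chain the check in front of the script, so that ./scripts/deploy.sh runs only when the check succeeds.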
Permissions
Microsoft Foundry Role and AI Search Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Search Service | Search Index Data Reader | Microsoft Foundry Project | Read index data |
| GenAI App Search Service | Search Service Contributor | Microsoft Foundry Project | Create AI Search connection |
| GenAI App Storage Account | Storage Blob Data Reader | Microsoft Foundry Project | Read blob data |
| Microsoft Foundry Account | Cognitive Services User | Search Service | Allow Search Service to access vectorizers |
Container App Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: orchestrator | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: frontend | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: dataingest | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: mcp | Read configuration data |
| GenAI App Container Registry | AcrPull | ContainerApp: orchestrator | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: frontend | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: dataingest | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: mcp | Pull container images |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: orchestrator | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: frontend | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: dataingest | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: mcp | Read secrets |
| GenAI App Search Service | Search Index Data Reader | ContainerApp: orchestrator | Read index data |
| GenAI App Search Service | Search Index Data Contributor | ContainerApp: dataingest | Read/write index data |
| GenAI App Search Service | Search Index Data Contributor | ContainerApp: mcp | Read/write index data |
| GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: orchestrator | Read blob data |
| GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: frontend | Read blob data |
| GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: dataingest | Read/write blob data |
| GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: mcp | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | ContainerApp: orchestrator | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: orchestrator | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: dataingest | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: mcp | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: orchestrator | Use OpenAI APIs |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: dataingest | Use OpenAI APIs |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: mcp | Use OpenAI APIs |
Executor Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Configuration Store | App Configuration Data Owner | Executor | Full control over configuration settings |
| GenAI App Container Registry | AcrPush | Executor | Push container images |
| GenAI App Container Registry | AcrPull | Executor | Pull container images |
| GenAI App Key Vault | Key Vault Contributor | Executor | Manage Key Vault settings |
| GenAI App Key Vault | Key Vault Secrets Officer | Executor | Create Key Vault secrets |
| GenAI App Search Service | Search Service Contributor | Executor | Create/update search service elements |
| GenAI App Search Service | Search Index Data Contributor | Executor | Read/write search index data |
| GenAI App Search Service | Search Index Data Reader | Executor | Read index data |
| GenAI App Storage Account | Storage Blob Data Contributor | Executor | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Executor | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services OpenAI User | Executor | Use OpenAI APIs |
Jumpbox VM Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Container Apps | Container Apps Contributor | Jumpbox VM | Full control over Container Apps |
| Azure Managed Identity | Managed Identity Operator | Jumpbox VM | Assign and manage user-assigned identities |
| GenAI App Container Registry | Container Registry Repository Writer | Jumpbox VM | Write to ACR repositories |
| GenAI App Container Registry | Container Registry Tasks Contributor | Jumpbox VM | Manage ACR tasks |
| GenAI App Container Registry | Container Registry Data Access Configuration Administrator | Jumpbox VM | Manage ACR data access configuration |
| GenAI App Container Registry | AcrPush | Jumpbox VM | Push container images |
| GenAI App Configuration Store | App Configuration Data Owner | Jumpbox VM | Full control over configuration settings |
| GenAI App Key Vault | Key Vault Contributor | Jumpbox VM | Manage Key Vault settings |
| GenAI App Key Vault | Key Vault Secrets Officer | Jumpbox VM | Create Key Vault secrets |
| GenAI App Search Service | Search Service Contributor | Jumpbox VM | Create/update search service elements |
| GenAI App Search Service | Search Index Data Contributor | Jumpbox VM | Read/write search index data |
| GenAI App Storage Account | Storage Blob Data Contributor | Jumpbox VM | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Jumpbox VM | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services Contributor | Jumpbox VM | Manage Cognitive Services resources |
| Microsoft Foundry Account | Cognitive Services OpenAI User | Jumpbox VM | Use OpenAI APIs |