🚀 Deployment Guide

Choose your preferred deployment method based on project requirements and environment constraints.

Note: You can change parameter values in main.parameters.json or set them with azd env set before running azd provision. This applies only to parameters that support environment variable substitution.
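To see whether a given parameter supports substitution, one option is to look for an azd-style ${VAR} placeholder in main.parameters.json. The helper below is a minimal sketch, not part of GPT-RAG: the function name is hypothetical, and the file path and variable name are whatever you pass in.

```shell
# Hypothetical helper: succeeds when the parameters file ($1) references the
# environment variable ($2) through a ${VAR} placeholder, optionally written
# with a default as ${VAR=default}.
uses_env_substitution() {
  grep -qE '\$\{'"$2"'(=[^}]*)?\}' "$1"
}
```

For example, `uses_env_substitution main.parameters.json NETWORK_ISOLATION && echo "can be set with azd env set"`. If the check fails, edit the value in main.parameters.json directly.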

Underlying infrastructure: GPT-RAG provisions the Azure AI Landing Zone (AILZ) Bicep module as its infrastructure foundation. For the full list of parameters, opt-in features (IP allow-lists, BYO Private DNS / Log Analytics, hub-and-spoke integration, etc.), and the v2 migration path, see the AILZ parameterization reference and the v2-migration guide.

Prerequisites

Required Permissions:

  • Azure subscription with Contributor and User Access Administrator roles
  • Agreement to Responsible AI terms for Azure AI Services

Required Tools:

Basic Deployment

Quick setup for demos without network isolation. In this mode, the workstation can run the full flow: provision, post-provision configuration, and service deployment.

azd init -t azure/gpt-rag
az login
azd auth login
azd env set NETWORK_ISOLATION false
azd provision
azd deploy

Add --tenant for az or --tenant-id for azd if you want a specific tenant.

azd provision runs GPT-RAG preflight checks before Azure Resource Manager deployment starts. These checks validate:

  • the selected region
  • jumpbox VM SKU restrictions
  • provider/location support for AI Search, Cosmos DB, Container Apps, and AI Foundry/Cognitive Services
  • Azure OpenAI model quota for the configured deployments

If model quota is insufficient, the hook fails early and suggests candidate regions when possible.

Some transient Azure capacity failures are not exposed by reliable pre-create APIs. For example, Cosmos DB can still fail later with regional high-demand ServiceUnavailable; the preflight reports this limitation explicitly. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only to bypass GPT-RAG regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.
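The precedence between the two skip flags can be pictured with a small sketch. This is not the actual hook code: the function name and the printed labels are hypothetical, and only the two variable names come from this guide.

```shell
# Hypothetical sketch of how a preflight hook could interpret the skip flags.
# PREFLIGHT_SKIP wins because it bypasses every preflight hook;
# GPT_RAG_REGIONAL_PREFLIGHT_SKIP only bypasses the regional checks.
preflight_mode() {
  if [ "${PREFLIGHT_SKIP:-false}" = "true" ]; then
    echo "skip-all"
  elif [ "${GPT_RAG_REGIONAL_PREFLIGHT_SKIP:-false}" = "true" ]; then
    echo "skip-regional"
  else
    echo "run-all"
  fi
}
```

Prefer the narrower flag when only the regional checks get in the way, so the remaining preflight hooks still run.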

For the basic flow, the postProvision hook runs locally after azd provision and configures GPT-RAG data-plane resources such as App Configuration and search setup. Then azd deploy deploys the frontend, orchestrator, and ingestion services.

Zero Trust Deployment

For deployments that require network isolation.

Network-isolated deployments use a two-host flow:

Phase | Where to run | Command
Provision infrastructure | Workstation | azd provision
Configure data-plane resources | Jumpbox or VNet-connected host | scripts/postProvision.ps1
Deploy services | Jumpbox or VNet-connected host | azd deploy

Do not run azd deploy from the workstation when NETWORK_ISOLATION=true. The deploy hook blocks that path because private resources and the private ACR build pool are reachable only from inside the VNet.

Network Isolation runbook

Use this runbook for a clean network-isolated deployment:

  1. On your workstation, create or select the azd environment and enable network isolation.
  2. Still on your workstation, run azd provision. This creates the infrastructure and then stops before local data-plane configuration.
  3. Connect to the jumpbox through Azure Bastion, or use another machine with VNet/VPN access.
  4. On the jumpbox, authenticate with the VM managed identity.
  5. On the jumpbox, run scripts/postProvision.ps1 with RUN_FROM_JUMPBOX=true.
  6. On the jumpbox, run azd deploy with RUN_FROM_JUMPBOX=true and ACR_TASK_AGENT_POOL=build-pool.
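The runbook steps can be summarized per host with a helper that prints the commands rather than running them, so the plan can be reviewed first. This is a sketch under a few assumptions: the function name is hypothetical, the Bash variant of the post-provision script is used, and sign-in and environment selection on the workstation are left out.

```shell
# Hypothetical helper: prints (does not run) the commands for one side of the
# two-host flow. The commands themselves are the ones used in this guide.
runbook_steps() {
  case "$1" in
    workstation)
      cat <<'EOF'
azd env set NETWORK_ISOLATION true
azd provision
EOF
      ;;
    jumpbox)
      cat <<'EOF'
az login --identity
azd auth login --managed-identity
azd env set RUN_FROM_JUMPBOX true
./scripts/postProvision.sh
azd env set ACR_TASK_AGENT_POOL build-pool
azd deploy
EOF
      ;;
    *)
      echo "usage: runbook_steps workstation|jumpbox" >&2
      return 1
      ;;
  esac
}
```

Run `runbook_steps workstation` or `runbook_steps jumpbox` to list the commands for that host.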

BUILD_MODE is normally not required. Component deploy scripts automatically use ACR remote builds when NETWORK_ISOLATION=true or when ACR_TASK_AGENT_POOL is set.
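The rule described above amounts to a simple either-or check. The function below is a hypothetical sketch of that decision, not the component scripts' actual code.

```shell
# Hypothetical sketch: succeeds when an ACR remote build would be chosen,
# that is, when NETWORK_ISOLATION is "true" or ACR_TASK_AGENT_POOL is non-empty.
uses_remote_build() {
  [ "${NETWORK_ISOLATION:-false}" = "true" ] || [ -n "${ACR_TASK_AGENT_POOL:-}" ]
}
```

For example, `uses_remote_build && echo "remote build"` can be used to confirm which path the current shell environment would select.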

Before Provisioning

Enable network isolation in your environment:

azd env set NETWORK_ISOLATION true

Optional v2 parameters can be set before provisioning:

azd env set DEPLOYMENT_MODE standalone
azd env set VM_SIZE Standard_D2s_v3
azd env set ENABLE_COSMOS_ANALYTICAL_STORAGE false

ALLOWED_IP_RANGES is also available for CIDR allow-listing, but because it is an array parameter, prefer editing main.parameters.json or using a parameter overlay rather than storing a complex array in the azd environment.
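Before adding entries to the allow-list, it can help to check that each one is a well-formed IPv4 CIDR. The helper below is a hypothetical convenience, not part of GPT-RAG, and it only checks the shape of the value (octets 0 to 255, prefix length 0 to 32).

```shell
# Hypothetical helper: succeeds when $1 looks like an IPv4 CIDR such as
# 203.0.113.0/24. It does not check that the range is one you own or intend.
is_ipv4_cidr() {
  printf '%s\n' "$1" | grep -Eq \
    '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])/(3[0-2]|[12]?[0-9])$'
}
```

For example, `is_ipv4_cidr 203.0.113.0/24 || echo "fix this entry"`.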

Make sure you're signed in with your Azure user account:

az login
azd auth login

Add --tenant for az or --tenant-id for azd if you want a specific tenant.

Provision Infrastructure

azd env set AZURE_SKIP_NETWORK_ISOLATION_WARNING true   # optional for automation; skips the local post-provision prompt
azd provision

Post-Provision Configuration

With NETWORK_ISOLATION=true, data-plane configuration must run from inside the VNet. A workstation should only run azd provision; if it does not have VNet/VPN access, the local post-provision hook will skip data-plane work and tell you to continue from the jumpbox.

Using the Jumpbox VM

1) Reset the VM password in the Azure Portal (required on first access if not set in deployment parameters):

  • Go to your VM resource → Support + troubleshooting → Reset password → Set new credentials
  • Default username is testvmuser

2) Connect via Azure Bastion

3) Authenticate with the VM's Managed Identity:

az login --identity
azd auth login --managed-identity

Add --tenant for az or --tenant-id for azd if you want a specific tenant.

4) Run the post-provision script:

PowerShell:

cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
.\scripts\postProvision.ps1

Bash:

cd /mnt/c/github/gpt-rag
azd env set RUN_FROM_JUMPBOX true
./scripts/postProvision.sh

Note: If you have re-initialized or cloned the gpt-rag repo again, refresh your azd environment before running the postProvision script so it points to the existing deployment: azd init -t azure/gpt-rag then azd env refresh. When prompted, select the same Subscription, Resource Group, and Location as the original provisioning so azd correctly links to your environment.

Deploy GPT-RAG Services

Note: For Zero Trust deployments with network isolation, deploy services from the jumpbox or another host with VNet connectivity. If using the jumpbox VM, the repositories are located in the C:\github directory.

Once the GPT-RAG infrastructure is provisioned, you can deploy the services.

To deploy all services at once, navigate to the gpt-rag directory (with azd environment configured) and run:

cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
azd env set NETWORK_ISOLATION true
azd env set ACR_TASK_AGENT_POOL build-pool
azd deploy

This command deploys each service in sequence. Docker is not required on the jumpbox for network-isolated deployments; component scripts use Azure Container Registry remote builds (az acr build) against the private ACR task agent pool.

The deploy hook uses NETWORK_ISOLATION as the source of truth. When NETWORK_ISOLATION=true, azd deploy fails fast unless it is running from the VNet with RUN_FROM_JUMPBOX=true; the older AZURE_ZERO_TRUST variable is not used.
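The fail-fast rule can be sketched as a small guard. This is an illustration only: the function name is hypothetical, and it checks just the two flags, whereas the real hook may also verify VNet connectivity. AZURE_ZERO_TRUST is deliberately not read.

```shell
# Hypothetical sketch of the deploy guard: succeeds when azd deploy may proceed.
deploy_allowed() {
  if [ "${NETWORK_ISOLATION:-false}" != "true" ]; then
    return 0   # basic mode: deploying from the workstation is fine
  fi
  # Network-isolated mode: only allowed when flagged as running from the jumpbox.
  [ "${RUN_FROM_JUMPBOX:-false}" = "true" ]
}
```

A wrapper script could call `deploy_allowed || { echo "Run azd deploy from the jumpbox" >&2; exit 1; }` before invoking azd deploy.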

To update a single service, deploy it individually instead of redeploying everything. The example below uses the orchestrator service; the same approach applies to the other services (frontend, dataingest, mcp).

Deploy Individual Services

Make sure you're logged in to Azure:

az login

Example: Deploying the Orchestrator

Using azd (recommended):

Initialize the template:

azd init -t azure/gpt-rag-orchestrator 

Important: Use the same environment name with azd init as in the infrastructure deployment to keep components consistent.

Refresh the environment variables, then deploy:

azd env refresh
azd deploy 

Important: Run azd env refresh with the same subscription and resource group used in the infrastructure deployment.

Using a shell script:

Clone the repository, set the App Configuration endpoint, and run the deployment script.

PowerShell (Windows):

git clone https://github.com/Azure/gpt-rag-orchestrator.git
$env:APP_CONFIG_ENDPOINT = "https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
.\scripts\deploy.ps1

Bash (Linux/macOS):

git clone https://github.com/Azure/gpt-rag-orchestrator.git
export APP_CONFIG_ENDPOINT="https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
./scripts/deploy.sh
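Because the deployment script depends on APP_CONFIG_ENDPOINT, a quick shape check before running it can catch an unset value or an unreplaced placeholder. The helper below is a hypothetical sketch, not part of the repository, and it assumes the public-cloud azconfig.io suffix shown above.

```shell
# Hypothetical helper: succeeds when $1 looks like
# https://<name>.azconfig.io with the placeholder filled in.
valid_app_config_endpoint() {
  case "${1:-}" in
    *"<"*|*">"*) return 1 ;;             # placeholder was not replaced
    https://?*.azconfig.io) return 0 ;;  # expected shape
    *) return 1 ;;                       # empty, wrong scheme, or wrong suffix
  esac
}
```

For example, `valid_app_config_endpoint "$APP_CONFIG_ENDPOINT" || echo "Set APP_CONFIG_ENDPOINT first" >&2`.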

Permissions

Microsoft Foundry and AI Search Role Assignments

Resource | Role | Assignee | Description
GenAI App Search Service | Search Index Data Reader | Microsoft Foundry Project | Read index data
GenAI App Search Service | Search Service Contributor | Microsoft Foundry Project | Create AI Search connection
GenAI App Storage Account | Storage Blob Data Reader | Microsoft Foundry Project | Read blob data
Microsoft Foundry Account | Cognitive Services User | Search Service | Allow Search Service to access vectorizers

Container App Role Assignments

Resource | Role | Assignee | Description
GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: orchestrator | Read configuration data
GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: frontend | Read configuration data
GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: dataingest | Read configuration data
GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: mcp | Read configuration data
GenAI App Container Registry | AcrPull | ContainerApp: orchestrator | Pull container images
GenAI App Container Registry | AcrPull | ContainerApp: frontend | Pull container images
GenAI App Container Registry | AcrPull | ContainerApp: dataingest | Pull container images
GenAI App Container Registry | AcrPull | ContainerApp: mcp | Pull container images
GenAI App Key Vault | Key Vault Secrets User | ContainerApp: orchestrator | Read secrets
GenAI App Key Vault | Key Vault Secrets User | ContainerApp: frontend | Read secrets
GenAI App Key Vault | Key Vault Secrets User | ContainerApp: dataingest | Read secrets
GenAI App Key Vault | Key Vault Secrets User | ContainerApp: mcp | Read secrets
GenAI App Search Service | Search Index Data Reader | ContainerApp: orchestrator | Read index data
GenAI App Search Service | Search Index Data Contributor | ContainerApp: dataingest | Read/write index data
GenAI App Search Service | Search Index Data Contributor | ContainerApp: mcp | Read/write index data
GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: orchestrator | Read blob data
GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: frontend | Read blob data
GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: dataingest | Read/write blob data
GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: mcp | Read/write blob data
GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | ContainerApp: orchestrator | Read/write Cosmos DB data
Microsoft Foundry Account | Cognitive Services User | ContainerApp: orchestrator | Access Cognitive Services
Microsoft Foundry Account | Cognitive Services User | ContainerApp: dataingest | Access Cognitive Services
Microsoft Foundry Account | Cognitive Services User | ContainerApp: mcp | Access Cognitive Services
Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: orchestrator | Use OpenAI APIs
Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: dataingest | Use OpenAI APIs
Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: mcp | Use OpenAI APIs
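
To check whether a container app identity holds the roles listed above, one approach is to compare the expected role names against what is actually assigned. The sketch below covers the orchestrator rows only; both function names are hypothetical. Cosmos DB Built-in Data Contributor is omitted on the assumption that it is assigned through Cosmos DB's own data-plane RBAC and so would not appear in az role assignment list output.

```shell
# Roles from the table above for ContainerApp: orchestrator
# (the Cosmos DB data-plane role is intentionally left out, see the note above).
required_orchestrator_roles() {
  cat <<'EOF'
App Configuration Data Reader
AcrPull
Key Vault Secrets User
Search Index Data Reader
Storage Blob Data Reader
Cognitive Services User
Cognitive Services OpenAI User
EOF
}

# Reads assigned role names on stdin (one per line) and prints every
# required role that is not among them.
missing_orchestrator_roles() {
  assigned="$(cat)"
  required_orchestrator_roles | while IFS= read -r role; do
    printf '%s\n' "$assigned" | grep -qxF "$role" || printf '%s\n' "$role"
  done
}
```

A possible invocation, with <principal-id> standing in for the container app's identity: `az role assignment list --assignee <principal-id> --all --query "[].roleDefinitionName" -o tsv | missing_orchestrator_roles`. Empty output means none of the listed roles are missing.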

Executor Role Assignments

Resource | Role | Assignee | Description
GenAI App Configuration Store | App Configuration Data Owner | Executor | Full control over configuration settings
GenAI App Container Registry | AcrPush | Executor | Push container images
GenAI App Container Registry | AcrPull | Executor | Pull container images
GenAI App Key Vault | Key Vault Contributor | Executor | Manage Key Vault settings
GenAI App Key Vault | Key Vault Secrets Officer | Executor | Create Key Vault secrets
GenAI App Search Service | Search Service Contributor | Executor | Create/update search service elements
GenAI App Search Service | Search Index Data Contributor | Executor | Read/write search index data
GenAI App Search Service | Search Index Data Reader | Executor | Read index data
GenAI App Storage Account | Storage Blob Data Contributor | Executor | Read/write blob data
GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Executor | Read/write Cosmos DB data
Microsoft Foundry Account | Cognitive Services OpenAI User | Executor | Use OpenAI APIs

Jumpbox VM Role Assignments

Resource | Role | Assignee | Description
GenAI App Container Apps | Container Apps Contributor | Jumpbox VM | Full control over Container Apps
Azure Managed Identity | Managed Identity Operator | Jumpbox VM | Assign and manage user-assigned identities
GenAI App Container Registry | Container Registry Repository Writer | Jumpbox VM | Write to ACR repositories
GenAI App Container Registry | Container Registry Tasks Contributor | Jumpbox VM | Manage ACR tasks
GenAI App Container Registry | Container Registry Data Access Configuration Administrator | Jumpbox VM | Manage ACR data access configuration
GenAI App Container Registry | AcrPush | Jumpbox VM | Push container images
GenAI App Configuration Store | App Configuration Data Owner | Jumpbox VM | Full control over configuration settings
GenAI App Key Vault | Key Vault Contributor | Jumpbox VM | Manage Key Vault settings
GenAI App Key Vault | Key Vault Secrets Officer | Jumpbox VM | Create Key Vault secrets
GenAI App Search Service | Search Service Contributor | Jumpbox VM | Create/update search service elements
GenAI App Search Service | Search Index Data Contributor | Jumpbox VM | Read/write search index data
GenAI App Storage Account | Storage Blob Data Contributor | Jumpbox VM | Read/write blob data
GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Jumpbox VM | Read/write Cosmos DB data
Microsoft Foundry Account | Cognitive Services Contributor | Jumpbox VM | Manage Cognitive Services resources
Microsoft Foundry Account | Cognitive Services OpenAI User | Jumpbox VM | Use OpenAI APIs