Deployment Guide
Choose your preferred deployment method based on project requirements and environment constraints.
Note: You can change parameter values in main.parameters.json or set them with azd env set before running azd provision. This applies only to parameters that support environment variable substitution.
Underlying infrastructure: GPT-RAG provisions the Azure AI Landing Zone (AILZ) Bicep module as its infrastructure foundation. For the full list of parameters, opt-in features (IP allow-lists, BYO Private DNS / Log Analytics, hub-and-spoke integration, etc.), and the v2 migration path, see the AILZ parameterization reference and the v2-migration guide.
Prerequisites
Required Permissions:
- Azure subscription with the Contributor and User Access Administrator roles
- Agreement to Responsible AI terms for Azure AI Services
Required Tools:
- Azure Developer CLI
- PowerShell 7+ (Windows only)
- Git
- Python 3.12
Basic Deployment
Quick setup for demos without network isolation. In this mode, the workstation can run the full flow: provision, post-provision configuration, and service deployment.
azd init -t azure/gpt-rag
az login
azd auth login
azd env set NETWORK_ISOLATION false
azd provision
azd deploy
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
azd provision runs GPT-RAG preflight checks before Azure Resource Manager deployment starts. These checks validate the selected region, jumpbox VM SKU restrictions, provider/location support for AI Search, Cosmos DB, Container Apps, and AI Foundry/Cognitive Services, and Azure OpenAI model quota for the configured deployments. If model quota is insufficient, the hook fails early and suggests candidate regions when possible.
Some transient Azure capacity failures are not exposed by reliable pre-create APIs. For example, Cosmos DB can still fail later with regional high-demand ServiceUnavailable; the preflight reports this limitation explicitly. Use GPT_RAG_REGIONAL_PREFLIGHT_SKIP=true only to bypass GPT-RAG regional checks, or PREFLIGHT_SKIP=true to bypass all preflight hooks.
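As a reading aid, the sketch below models how the two skip switches relate to each other. The function name preflight_groups and the group labels are hypothetical, and the comparison against the literal string true is an assumption; the real hooks may parse these variables differently.

```shell
# Hypothetical model of the skip switches described above; not the real hook.
# $1 = value of PREFLIGHT_SKIP
# $2 = value of GPT_RAG_REGIONAL_PREFLIGHT_SKIP
preflight_groups() {
  if [ "$1" = "true" ]; then
    # PREFLIGHT_SKIP bypasses every preflight hook.
    echo "none"
  elif [ "$2" = "true" ]; then
    # Only the GPT-RAG regional checks are bypassed.
    echo "other-preflight-hooks"
  else
    # Nothing is skipped.
    echo "regional-checks other-preflight-hooks"
  fi
}

# Example: preflight_groups "" "true"
```

The narrower switch is the safer default: it leaves any remaining preflight hooks in place while bypassing only the regional checks.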
For the basic flow, the postProvision hook runs locally after azd provision and configures GPT-RAG data-plane resources such as App Configuration and search setup. Then azd deploy deploys the frontend, orchestrator, and ingestion services.
Zero Trust Deployment
For deployments that require network isolation.
Network-isolated deployments use a two-host flow:
| Phase | Where to run | Command |
|---|---|---|
| Provision infrastructure | Workstation | azd provision |
| Configure data-plane resources | Jumpbox or VNet-connected host | scripts/postProvision.ps1 |
| Deploy services | Jumpbox or VNet-connected host | azd deploy |
Do not run azd deploy from the workstation when NETWORK_ISOLATION=true. The deploy hook blocks that path because private resources and the private ACR build pool are reachable only from inside the VNet.
Network Isolation runbook
Use this runbook for a clean network-isolated deployment:
- On your workstation, create or select the azd environment and enable network isolation.
- Still on your workstation, run azd provision. This creates the infrastructure and then stops before local data-plane configuration.
- Connect to the jumpbox through Azure Bastion, or use another machine with VNet/VPN access.
- On the jumpbox, authenticate with the VM managed identity.
- On the jumpbox, run scripts/postProvision.ps1 with RUN_FROM_JUMPBOX=true.
- On the jumpbox, run azd deploy with RUN_FROM_JUMPBOX=true and ACR_TASK_AGENT_POOL=build-pool.
BUILD_MODE is normally not required. Component deploy scripts automatically use ACR remote builds when NETWORK_ISOLATION=true or when ACR_TASK_AGENT_POOL is set.
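The selection rule above can be pictured with a small shell sketch. The function name select_build_mode and the labels remote and default are hypothetical, and an explicit BUILD_MODE override is deliberately not modeled; the component deploy scripts remain the source of truth.

```shell
# Hypothetical sketch of the rule described above; not the real deploy script.
# $1 = value of NETWORK_ISOLATION
# $2 = value of ACR_TASK_AGENT_POOL (empty when unset)
select_build_mode() {
  if [ "$1" = "true" ] || [ -n "$2" ]; then
    # Either condition alone is enough to choose an ACR remote build.
    echo "remote"
  else
    echo "default"
  fi
}

# Example: select_build_mode "false" "build-pool"
```

Under this reading, setting ACR_TASK_AGENT_POOL is enough to get a remote build even when network isolation is off.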
Before Provisioning
Enable network isolation in your environment:
azd env set NETWORK_ISOLATION true
Optional v2 parameters can be set before provisioning:
azd env set DEPLOYMENT_MODE standalone
azd env set VM_SIZE Standard_D2s_v3
azd env set ENABLE_COSMOS_ANALYTICAL_STORAGE false
ALLOWED_IP_RANGES is also available for CIDR allow-listing, but because it is an array parameter, prefer editing main.parameters.json or using a parameter overlay rather than storing a complex array in the azd environment.
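To show the array shape involved, here is a minimal sketch that turns a list of CIDR strings into a JSON array you could paste into a parameter file. The helper name cidrs_to_json is hypothetical, the sample addresses are documentation ranges, and the helper does not validate the CIDRs or know the parameter's exact key in main.parameters.json.

```shell
# Hypothetical helper: prints its arguments as a JSON array of strings.
# It performs no CIDR validation and assumes the values contain no quotes.
cidrs_to_json() {
  out=""
  for cidr in "$@"; do
    if [ -n "$out" ]; then
      out="$out,"
    fi
    out="$out\"$cidr\""
  done
  printf '[%s]\n' "$out"
}

# Example: cidrs_to_json 203.0.113.0/24 198.51.100.7/32
```

The quoting needed to produce even this small array is the reason a parameter file or overlay is easier to maintain than an environment value.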
Make sure you're signed in with your Azure user account:
az login
azd auth login
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
Provision Infrastructure
azd env set AZURE_SKIP_NETWORK_ISOLATION_WARNING true # optional for automation; skips the local post-provision prompt
azd provision
Post-Provision Configuration
With NETWORK_ISOLATION=true, data-plane configuration must run from inside the VNet. A workstation should only run azd provision; if it does not have VNet/VPN access, the local post-provision hook will skip data-plane work and tell you to continue from the jumpbox.
Using the Jumpbox VM
1) Reset the VM password in the Azure Portal (required on first access if not set in deployment parameters):
- Go to your VM resource > Support + troubleshooting > Reset password > Set new credentials
- Default username is testvmuser
2) Connect via Azure Bastion
3) Authenticate with the VM's Managed Identity:
az login --identity
azd auth login --managed-identity
Add --tenant for az or --tenant-id for azd if you want a specific tenant.
4) Run the post-provision script:
PowerShell:
cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
.\scripts\postProvision.ps1
Bash:
cd /mnt/c/github/gpt-rag
azd env set RUN_FROM_JUMPBOX true
./scripts/postProvision.sh
Note: If you have re-initialized or cloned the gpt-rag repo again, refresh your azd environment before running the postProvision script so it points to the existing deployment: azd init -t azure/gpt-rag then azd env refresh. When prompted, select the same Subscription, Resource Group, and Location as the original provisioning so azd correctly links to your environment.
Deploy GPT-RAG Services
Note: For Zero Trust deployments with network isolation, deploy services from the jumpbox or another host with VNet connectivity. If using the jumpbox VM, the repositories are located in the C:\github directory.
Once the GPT-RAG infrastructure is provisioned, you can deploy the services.
To deploy all services at once, navigate to the gpt-rag directory (with azd environment configured) and run:
cd C:\github\GPT-RAG
azd env set RUN_FROM_JUMPBOX true
azd env set NETWORK_ISOLATION true
azd env set ACR_TASK_AGENT_POOL build-pool
azd deploy
This command deploys each service in sequence. Docker is not required on the jumpbox for network-isolated deployments; component scripts use Azure Container Registry remote builds (az acr build) against the private ACR task agent pool.
The deploy hook uses NETWORK_ISOLATION as the source of truth. When NETWORK_ISOLATION=true, azd deploy fails fast unless it is running from the VNet with RUN_FROM_JUMPBOX=true; the older AZURE_ZERO_TRUST variable is not used.
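The fail-fast rule can be summarized as a two-variable guard. The sketch below is an illustration only: the function name deploy_allowed and the error text are hypothetical, and it compares against the literal string true, which may be stricter than the real deploy hook.

```shell
# Hypothetical sketch of the guard described above; not the real deploy hook.
# $1 = value of NETWORK_ISOLATION
# $2 = value of RUN_FROM_JUMPBOX
deploy_allowed() {
  if [ "$1" = "true" ] && [ "$2" != "true" ]; then
    echo "Blocked: NETWORK_ISOLATION=true requires RUN_FROM_JUMPBOX=true from a VNet-connected host." >&2
    return 1
  fi
  # Either isolation is off, or the run is declared to be inside the VNet.
  return 0
}

# Example: deploy_allowed "true" "false" returns a non-zero status.
```

Note that RUN_FROM_JUMPBOX is only a declaration; setting it on a workstation without VNet access would pass a guard like this one but still fail to reach the private resources.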
To update a single service, deploy it individually. The example below uses the orchestrator service; the same approach applies to the other services (frontend, dataingest, mcp).
Deploy Individual Services
Make sure you're logged in to Azure:
az login
Example: Deploying the Orchestrator
Using azd (recommended):
Initialize the template:
azd init -t azure/gpt-rag-orchestrator
Important: Use the same environment name with azd init as in the infrastructure deployment to keep components consistent.
Refresh the environment variables, then deploy:
azd env refresh
azd deploy
Important: Run azd env refresh with the same subscription and resource group used in the infrastructure deployment.
Using a shell script:
Clone the repository, set the App Configuration endpoint, and run the deployment script.
PowerShell (Windows):
git clone https://github.com/Azure/gpt-rag-orchestrator.git
$env:APP_CONFIG_ENDPOINT = "https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
.\scripts\deploy.ps1
Bash (Linux/macOS):
git clone https://github.com/Azure/gpt-rag-orchestrator.git
export APP_CONFIG_ENDPOINT="https://<your-app-config-name>.azconfig.io"
cd gpt-rag-orchestrator
./scripts/deploy.sh
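Before running the deployment script, it can help to confirm that the endpoint placeholder was actually replaced. The check below is a minimal sketch: the function name check_app_config_endpoint is hypothetical, and it assumes the public-cloud https://<name>.azconfig.io form shown above with no trailing slash.

```shell
# Hypothetical pre-flight check for the APP_CONFIG_ENDPOINT value.
# $1 = endpoint to check
check_app_config_endpoint() {
  case "$1" in
    *"<"*|*">"*)
      # The angle-bracket placeholder from the example was left in place.
      echo "APP_CONFIG_ENDPOINT still contains a placeholder." >&2
      return 1 ;;
    https://?*.azconfig.io)
      return 0 ;;
    *)
      echo "APP_CONFIG_ENDPOINT should look like https://NAME.azconfig.io." >&2
      return 1 ;;
  esac
}

# Example: check_app_config_endpoint "$APP_CONFIG_ENDPOINT" && ./scripts/deploy.sh
```

Endpoints in other Azure clouds use different domain suffixes, so adjust the accepted pattern if this sketch is reused there.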
Permissions
Microsoft Foundry and AI Search Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Search Service | Search Index Data Reader | Microsoft Foundry Project | Read index data |
| GenAI App Search Service | Search Service Contributor | Microsoft Foundry Project | Create AI Search connection |
| GenAI App Storage Account | Storage Blob Data Reader | Microsoft Foundry Project | Read blob data |
| Microsoft Foundry Account | Cognitive Services User | Search Service | Allow Search Service to access vectorizers |
Container App Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: orchestrator | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: frontend | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: dataingest | Read configuration data |
| GenAI App Configuration Store | App Configuration Data Reader | ContainerApp: mcp | Read configuration data |
| GenAI App Container Registry | AcrPull | ContainerApp: orchestrator | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: frontend | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: dataingest | Pull container images |
| GenAI App Container Registry | AcrPull | ContainerApp: mcp | Pull container images |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: orchestrator | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: frontend | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: dataingest | Read secrets |
| GenAI App Key Vault | Key Vault Secrets User | ContainerApp: mcp | Read secrets |
| GenAI App Search Service | Search Index Data Reader | ContainerApp: orchestrator | Read index data |
| GenAI App Search Service | Search Index Data Contributor | ContainerApp: dataingest | Read/write index data |
| GenAI App Search Service | Search Index Data Contributor | ContainerApp: mcp | Read/write index data |
| GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: orchestrator | Read blob data |
| GenAI App Storage Account | Storage Blob Data Reader | ContainerApp: frontend | Read blob data |
| GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: dataingest | Read/write blob data |
| GenAI App Storage Account | Storage Blob Data Contributor | ContainerApp: mcp | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | ContainerApp: orchestrator | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: orchestrator | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: dataingest | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services User | ContainerApp: mcp | Access Cognitive Services |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: orchestrator | Use OpenAI APIs |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: dataingest | Use OpenAI APIs |
| Microsoft Foundry Account | Cognitive Services OpenAI User | ContainerApp: mcp | Use OpenAI APIs |
Executor Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Configuration Store | App Configuration Data Owner | Executor | Full control over configuration settings |
| GenAI App Container Registry | AcrPush | Executor | Push container images |
| GenAI App Container Registry | AcrPull | Executor | Pull container images |
| GenAI App Key Vault | Key Vault Contributor | Executor | Manage Key Vault settings |
| GenAI App Key Vault | Key Vault Secrets Officer | Executor | Create Key Vault secrets |
| GenAI App Search Service | Search Service Contributor | Executor | Create/update search service elements |
| GenAI App Search Service | Search Index Data Contributor | Executor | Read/write search index data |
| GenAI App Search Service | Search Index Data Reader | Executor | Read index data |
| GenAI App Storage Account | Storage Blob Data Contributor | Executor | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Executor | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services OpenAI User | Executor | Use OpenAI APIs |
Jumpbox VM Role Assignments
| Resource | Role | Assignee | Description |
|---|---|---|---|
| GenAI App Container Apps | Container Apps Contributor | Jumpbox VM | Full control over Container Apps |
| Azure Managed Identity | Managed Identity Operator | Jumpbox VM | Assign and manage user-assigned identities |
| GenAI App Container Registry | Container Registry Repository Writer | Jumpbox VM | Write to ACR repositories |
| GenAI App Container Registry | Container Registry Tasks Contributor | Jumpbox VM | Manage ACR tasks |
| GenAI App Container Registry | Container Registry Data Access Configuration Administrator | Jumpbox VM | Manage ACR data access configuration |
| GenAI App Container Registry | AcrPush | Jumpbox VM | Push container images |
| GenAI App Configuration Store | App Configuration Data Owner | Jumpbox VM | Full control over configuration settings |
| GenAI App Key Vault | Key Vault Contributor | Jumpbox VM | Manage Key Vault settings |
| GenAI App Key Vault | Key Vault Secrets Officer | Jumpbox VM | Create Key Vault secrets |
| GenAI App Search Service | Search Service Contributor | Jumpbox VM | Create/update search service elements |
| GenAI App Search Service | Search Index Data Contributor | Jumpbox VM | Read/write search index data |
| GenAI App Storage Account | Storage Blob Data Contributor | Jumpbox VM | Read/write blob data |
| GenAI App Cosmos DB | Cosmos DB Built-in Data Contributor | Jumpbox VM | Read/write Cosmos DB data |
| Microsoft Foundry Account | Cognitive Services Contributor | Jumpbox VM | Manage Cognitive Services resources |
| Microsoft Foundry Account | Cognitive Services OpenAI User | Jumpbox VM | Use OpenAI APIs |