Overview

Architecture

GPT-RAG is modular. The full Zero Trust diagram shows a hardened, full-capability reference architecture; it is not the minimum footprint required to evaluate or run the accelerator. Start with the Basic Deployment baseline, then add network isolation, enterprise integration, public ingress, or AI capabilities only when the scenario requires them.

Full Zero Trust reference

The architecture diagram below is the full network-isolated reference view. Use it when discussing hardened deployments; the complementary diagrams that follow explain the baseline and the optional layers.

Zero Trust Architecture

Download Visio Diagram

Complementary modular views

How to read these diagrams

The modular view is organized around Basic Deployment, Common platform services, and Zero Trust additions. Solid-color chips are standard resources, dashed orange chips are default-on or BYO-capable parameters, and solid orange chips are opt-in add-ons.

Basic Deployment architecture

The baseline corresponds to the classic Basic Deployment flow with NETWORK_ISOLATION=false, DEPLOY_HOSTED_AGENT_ORCHESTRATION=false, and DEPLOY_ADMINISTRATIVE_PANEL=false. It focuses on the default application and data path: users access the frontend, the orchestrator Container App coordinates AI and retrieval, ingestion indexes enterprise content, and shared platform services provide configuration, secrets, identity, storage, search, and conversation state. The unreleased hosted modes replace the chat Container App with a Microsoft Foundry hosted agent; when the administrative panel is enabled, they retain only the ingestion administrative backend.
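As a rough illustration of how the two mode flags above combine, the sketch below maps them to the mode names used on this page (classic, hosted/no-panel, hosted/panel). The function names, the `"true"`/`"false"` string parsing, and the choice to treat a missing flag as `false` are assumptions made for the example, not part of GPT-RAG. The combination of classic orchestration with the administrative panel enabled is not described on this page, so the sketch rejects it rather than guessing.

```python
def parse_flag(value: str) -> bool:
    """Interpret a 'true'/'false' string, ignoring case and surrounding spaces."""
    normalized = value.strip().lower()
    if normalized not in {"true", "false"}:
        raise ValueError(f"expected 'true' or 'false', got {value!r}")
    return normalized == "true"


def resolve_mode(env: dict[str, str]) -> str:
    """Return the mode name implied by the two mode flags.

    Hypothetical helper: missing flags default to 'false' (an assumption).
    """
    hosted = parse_flag(env.get("DEPLOY_HOSTED_AGENT_ORCHESTRATION", "false"))
    panel = parse_flag(env.get("DEPLOY_ADMINISTRATIVE_PANEL", "false"))

    if not hosted and not panel:
        return "classic"          # frontend, orchestrator, ingestion
    if hosted and not panel:
        return "hosted/no-panel"  # no orchestrator Container App
    if hosted and panel:
        return "hosted/panel"     # only the ingestion administrative backend
    # hosted=false with panel=true is not covered by this page.
    raise ValueError("classic orchestration with the panel enabled is not described here")
```

For example, `resolve_mode({"DEPLOY_HOSTED_AGENT_ORCHESTRATION": "true"})` returns `"hosted/no-panel"`, and an empty dictionary resolves to `"classic"`.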

Modular architecture layers

Use the table below for the deployment parameters behind each layer, and the Deployment Guide for the full `azd env set` flows.

Deployment component table

Layer	Posture	Controlled by	Include when
Frontend, chat runtime, ingestion	Mode-selected baseline	`DEPLOY_HOSTED_AGENT_ORCHESTRATION`, `DEPLOY_ADMINISTRATIVE_PANEL`, `manifest.json` components, and `containerAppsList`	Running classic frontend/orchestrator/ingestion, hosted chat without a panel, or hosted chat with only the ingestion administrative backend.
AI Foundry account, project, and model deployments	Required AI control plane	`deployAiFoundry`, `deployAfProject`, `deployAAfAgentSvc`, `modelDeploymentList`	Provisioning Azure AI Foundry / Azure OpenAI and the model deployments used by GPT-RAG.
AI Foundry associated resources	Default-created or BYO-capable	`aiSearchResourceId`, `aiFoundryStorageAccountResourceId`, `aiFoundryCosmosDBAccountResourceId`, `keyVaultResourceId`, `aiFoundryStorageSku`	Letting the AI Foundry module create its required Storage, Search, Cosmos DB, and Key Vault resources, or reusing existing ones.
RAG workload data services	Mode-selected, parameter-controlled	`deploySearchService`, `deployStorageAccount`, `deployCosmosDb`, `storageAccountContainersList`, `databaseContainersList`	Running the indexed-document and file-storage path. Classic retains conversation/panel Cosmos DB; hosted/no-panel omits panel-only Cosmos DB; hosted/panel retains only panel/feedback data.
App Configuration, managed identity / RBAC, Container Apps, Container Registry	Required platform capabilities, topology varies by mode	`deployAppConfig`, `deployContainerApps`, `deployContainerEnv`, `deployContainerRegistry`, `useUAI`, service role lists	Centralizing runtime settings and identities. Hosted/no-panel does not provision an orchestrator Container App; hosted/panel retains only the ingestion administrative backend.
Workload Key Vault and observability	Default support, parameter-controlled or reusable	`deployKeyVault`, `deployLogAnalytics`, `deployAppInsights`, `EXISTING_LOG_ANALYTICS_WORKSPACE_RESOURCE_ID`, `EXISTING_APPLICATION_INSIGHTS_RESOURCE_ID`, `EXISTING_APPLICATION_INSIGHTS_CONNECTION_STRING`	Storing workload secrets and capturing telemetry. Application Insights is created or wired only when an effective Log Analytics workspace is available.
Zero Trust private networking	Optional security posture	`networkIsolation`, `allowedIpRanges`, `useExistingVNet`, `deploySubnets`, `policyManagedPrivateDns`, `EXISTING_PRIVATE_DNS_ZONE_*`	Requiring private endpoints, private DNS, VNet integration, NSGs, and internal Container Apps ingress.
Azure Firewall, Jumpbox, Bastion, NAT Gateway, private ACR build pool	Zero Trust operations/build options	`DEPLOY_AZURE_FIREWALL`, `DEPLOY_JUMPBOX`, `DEPLOY_BASTION`, `DEPLOY_NAT_GATEWAY`, `DEPLOY_ACR_TASK_AGENT_POOL`, `EXISTING_JUMPBOX_RESOURCE_ID`, `EXISTING_BASTION_RESOURCE_ID`, `EXISTING_NAT_GATEWAY_RESOURCE_ID`	Operating from inside the VNet, reusing central access/egress resources, or enabling a private ACR Task agent pool. In GPT-RAG, the ACR agent pool defaults to off and is opt-in.
Application Gateway WAF public ingress	Optional entry layer	`publicIngress.enabled`	Exposing one private Container App through controlled public HTTPS/WAF. See Application Gateway.
Existing platform / AI Landing Zone integration	Optional enterprise integration	`DEPLOYMENT_MODE=ailz-integrated`, `USE_EXISTING_VNET`, `EXISTING_*_RESOURCE_ID`, `HUB_INTEGRATION_*`	Reusing central network, DNS, observability, Bastion, NAT, or hub-spoke resources.
Scenario capabilities	Optional feature add-ons	`DEPLOY_SPEECH_SERVICE`, `DEPLOY_GROUNDING_WITH_BING`, `ENABLE_AGENTIC_RETRIEVAL`	Enabling voice, Bing grounding, or agentic retrieval scenarios. MCP/tool-hosting and NL2SQL application behavior are configured outside the Bicep-deployed infrastructure shown in this diagram.
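
To make the "start with the baseline, then add layers" approach concrete, the sketch below renders `azd env set` command lines for the baseline plus a chosen set of add-ons. It only builds strings and does not run `azd`. The add-on group names (`zero_trust`, `speech`, and so on), the grouping itself, and the use of `true` as the enabling value are assumptions for illustration; the Deployment Guide remains the authoritative source for the actual flows.

```python
import shlex

# Baseline flags as listed for the classic Basic Deployment flow.
BASELINE = {
    "NETWORK_ISOLATION": "false",
    "DEPLOY_HOSTED_AGENT_ORCHESTRATION": "false",
    "DEPLOY_ADMINISTRATIVE_PANEL": "false",
}

# Hypothetical grouping of optional layers; the variable names come from the
# table above, the group names and "true" values are assumptions.
ADD_ONS = {
    "zero_trust": {"NETWORK_ISOLATION": "true"},
    "speech": {"DEPLOY_SPEECH_SERVICE": "true"},
    "bing_grounding": {"DEPLOY_GROUNDING_WITH_BING": "true"},
    "agentic_retrieval": {"ENABLE_AGENTIC_RETRIEVAL": "true"},
}


def azd_env_commands(add_ons=()):
    """Return 'azd env set KEY VALUE' lines for the baseline plus add-ons."""
    settings = dict(BASELINE)
    for name in add_ons:
        # An unknown add-on name raises KeyError instead of being ignored.
        settings.update(ADD_ONS[name])
    return [
        f"azd env set {shlex.quote(key)} {shlex.quote(value)}"
        for key, value in settings.items()
    ]


if __name__ == "__main__":
    for line in azd_env_commands(["zero_trust", "speech"]):
        print(line)
```

Calling `azd_env_commands()` with no arguments yields only the three baseline settings, which mirrors the guidance to add a layer only when the scenario requires it.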

For data ownership, telemetry, retention, and responsibility boundaries, see Governance and responsible operation. The correlated audit event implementation on that page has been available since GPT-RAG v3.7.0 but remains disabled by default; it is not part of the Basic Deployment shown above until an operator enables it.

Key Capabilities

Enterprise-Grade Security
Optional Zero Trust architecture with private endpoints, Azure Key Vault integration, and comprehensive monitoring.
Flexible & Customizable
Modular design with customizable orchestration, multiple interface options, and bring-your-own-resources support.
Multimodal Experience
Native support for text, images, and voice with SharePoint and Fabric connectors for seamless data integration.
Production Ready
Enterprise-ready infrastructure with support for CI/CD pipelines and quality evaluation integration.