
Overview

Architecture

GPT-RAG is modular. The full Zero Trust diagram shows a hardened, full-capability reference architecture; it is not the minimum footprint required to evaluate or run the accelerator. Start with the Basic Deployment baseline, then add network isolation, enterprise integration, public ingress, or AI capabilities only when the scenario requires them.

Full Zero Trust reference

The architecture diagram below is the full network-isolated reference view. Use it when discussing hardened deployments; the complementary diagrams that follow explain the baseline and the optional layers.

[Figure: Zero Trust Architecture]

Download Visio Diagram


Complementary modular views

How to read these diagrams

The modular view is organized around Basic Deployment, Common platform services, and Zero Trust additions. Solid-color chips are standard resources, dashed orange chips are default-on or BYO-capable parameters, and solid orange chips are opt-in add-ons.

Basic Deployment architecture

The baseline corresponds to the Basic Deployment flow with NETWORK_ISOLATION=false. It focuses on the default application and data path: users access the frontend, the orchestrator coordinates AI and retrieval, ingestion indexes enterprise content, and shared platform services provide configuration, secrets, identity, storage, search, and conversation state.
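As a minimal sketch of the baseline setting described above, the snippet below renders the azd env set line for NETWORK_ISOLATION=false. The helper name azd_env_commands is hypothetical and exists only for this illustration; it formats strings and does not call azd. The Deployment Guide remains the reference for the complete flow.

```python
import shlex


def azd_env_commands(settings: dict[str, str]) -> list[str]:
    """Render one 'azd env set KEY VALUE' line per setting.

    Hypothetical helper for illustration: it only builds strings,
    nothing is executed.
    """
    return [
        f"azd env set {shlex.quote(key)} {shlex.quote(value)}"
        for key, value in settings.items()
    ]


# Baseline described above: network isolation switched off.
baseline = {"NETWORK_ISOLATION": "false"}

for command in azd_env_commands(baseline):
    print(command)
```

Printing the command instead of running it keeps the sketch safe to execute anywhere; the resulting line can be pasted into a shell where an azd environment is already selected.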

Modular architecture layers

Use the table below for the deployment parameters behind each layer, and the Deployment Guide for the full azd env set flows.

Deployment component table

| Layer | Posture | Controlled by | Include when |
| --- | --- | --- | --- |
| Frontend, orchestrator, ingestion | Required baseline | manifest.json components and containerAppsList | Running the default GPT-RAG web, orchestration, and ingestion services. |
| AI Foundry account, project, and model deployments | Required AI control plane | deployAiFoundry, deployAfProject, deployAAfAgentSvc, modelDeploymentList | Provisioning Azure AI Foundry / Azure OpenAI and the model deployments used by GPT-RAG. |
| AI Foundry associated resources | Default-created or BYO-capable | aiSearchResourceId, aiFoundryStorageAccountResourceId, aiFoundryCosmosDBAccountResourceId, keyVaultResourceId, aiFoundryStorageSku | Letting the AI Foundry module create its required Storage, Search, Cosmos DB, and Key Vault resources, or reusing existing ones. |
| RAG workload data services | Required for default RAG, parameter-controlled | deploySearchService, deployStorageAccount, deployCosmosDb, storageAccountContainersList, databaseContainersList | Running the standard indexed-document, conversation-state, and file-storage data path. Disable only for a customized topology that replaces those dependencies. |
| App Configuration, managed identity / RBAC, Container Apps, Container Registry | Required platform baseline | deployAppConfig, deployContainerApps, deployContainerEnv, deployContainerRegistry, useUAI, service role lists | Hosting the runtime services and centralizing runtime settings without hard-coded credentials. |
| Workload Key Vault and observability | Default support, parameter-controlled or reusable | deployKeyVault, deployLogAnalytics, deployAppInsights, EXISTING_LOG_ANALYTICS_WORKSPACE_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_RESOURCE_ID, EXISTING_APPLICATION_INSIGHTS_CONNECTION_STRING | Storing workload secrets and capturing telemetry. Application Insights is created or wired only when an effective Log Analytics workspace is available. |
| Zero Trust private networking | Optional security posture | networkIsolation, allowedIpRanges, useExistingVNet, deploySubnets, policyManagedPrivateDns, EXISTING_PRIVATE_DNS_ZONE_* | Requiring private endpoints, private DNS, VNet integration, NSGs, and internal Container Apps ingress. |
| Azure Firewall, Jumpbox, Bastion, NAT Gateway, private ACR build pool | Zero Trust operations/build options | DEPLOY_AZURE_FIREWALL, DEPLOY_JUMPBOX, DEPLOY_BASTION, DEPLOY_NAT_GATEWAY, DEPLOY_ACR_TASK_AGENT_POOL, EXISTING_JUMPBOX_RESOURCE_ID, EXISTING_BASTION_RESOURCE_ID, EXISTING_NAT_GATEWAY_RESOURCE_ID | Operating from inside the VNet, reusing central access/egress resources, or enabling a private ACR Task agent pool. In GPT-RAG, the ACR agent pool defaults to off and is opt-in. |
| Application Gateway WAF public ingress | Optional entry layer | publicIngress.enabled | Exposing one private Container App through controlled public HTTPS/WAF. See Application Gateway. |
| Existing platform / AI Landing Zone integration | Optional enterprise integration | DEPLOYMENT_MODE=ailz-integrated, USE_EXISTING_VNET, EXISTING_*_RESOURCE_ID, HUB_INTEGRATION_* | Reusing central network, DNS, observability, Bastion, NAT, or hub-spoke resources. |
| Scenario capabilities | Optional feature add-ons | DEPLOY_SPEECH_SERVICE, DEPLOY_GROUNDING_WITH_BING, ENABLE_AGENTIC_RETRIEVAL | Enabling voice, Bing grounding, or agentic retrieval scenarios. MCP/tool-hosting and NL2SQL application behavior are configured outside the Bicep-deployed infrastructure shown in this diagram. |
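
One way to use the table when planning a deployment is to list the parameters that belong to the optional layers a scenario needs. The sketch below is illustrative only: the short layer keys (zero_trust, public_ingress, and so on) and the function parameters_to_review are hypothetical names invented for this example, while the parameter names are copied from the "Controlled by" column above. It does not set any values and does not replace the Deployment Guide.

```python
# Optional layers and their "Controlled by" entries, copied from the table above.
# The dictionary keys are hypothetical labels used only in this sketch.
OPTIONAL_LAYERS: dict[str, list[str]] = {
    "zero_trust": [
        "networkIsolation", "allowedIpRanges", "useExistingVNet",
        "deploySubnets", "policyManagedPrivateDns", "EXISTING_PRIVATE_DNS_ZONE_*",
    ],
    "zero_trust_ops": [
        "DEPLOY_AZURE_FIREWALL", "DEPLOY_JUMPBOX", "DEPLOY_BASTION",
        "DEPLOY_NAT_GATEWAY", "DEPLOY_ACR_TASK_AGENT_POOL",
        "EXISTING_JUMPBOX_RESOURCE_ID", "EXISTING_BASTION_RESOURCE_ID",
        "EXISTING_NAT_GATEWAY_RESOURCE_ID",
    ],
    "public_ingress": ["publicIngress.enabled"],
    "ailz_integration": [
        "DEPLOYMENT_MODE=ailz-integrated", "USE_EXISTING_VNET",
        "EXISTING_*_RESOURCE_ID", "HUB_INTEGRATION_*",
    ],
    "scenario_capabilities": [
        "DEPLOY_SPEECH_SERVICE", "DEPLOY_GROUNDING_WITH_BING",
        "ENABLE_AGENTIC_RETRIEVAL",
    ],
}


def parameters_to_review(selected_layers: list[str]) -> list[str]:
    """Return de-duplicated parameter names for the chosen layers, in selection order."""
    unknown = [name for name in selected_layers if name not in OPTIONAL_LAYERS]
    if unknown:
        raise KeyError(f"Unknown layer(s): {unknown}")
    seen: dict[str, None] = {}  # dict preserves insertion order and drops repeats
    for name in selected_layers:
        for param in OPTIONAL_LAYERS[name]:
            seen.setdefault(param, None)
    return list(seen)


if __name__ == "__main__":
    # Example: a hardened deployment that also exposes one app through the WAF.
    for param in parameters_to_review(["zero_trust", "public_ingress"]):
        print(param)
```

Keeping the baseline rows out of the dictionary mirrors the guidance to start from Basic Deployment and add layers only when a scenario requires them.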

Key Capabilities

  • Enterprise-Grade Security
    Optional Zero Trust architecture with private endpoints, Azure Key Vault integration, and comprehensive monitoring.

  • Flexible & Customizable
    Modular design with customizable orchestration, multiple interface options, and bring-your-own-resources support.

  • Multimodal Experience
    Native support for text, images, and voice with SharePoint and Fabric connectors for seamless data integration.

  • Production Ready
    Enterprise-ready infrastructure with support for CI/CD pipelines and quality evaluation integration.

© 2025 GPT-RAG — powered by ❤️ and coffee ☕