Retrieval backend selection
Use this guide when you need to decide how GPT-RAG should retrieve grounding content.
Starting with GPT-RAG v3.0.2 and AI Landing Zone v2.1.2, new deployments use
Foundry IQ by default through a native Azure Blob Knowledge Source. The
deployment provisions the Knowledge Source and Knowledge Base automatically, and
Foundry IQ processes files directly from the documents container.
Existing deployments can stay on Azure AI Search until you explicitly migrate:
RETRIEVAL_BACKEND=ai_search
If you need to keep a custom GPT-RAG ingestion pipeline, use the Foundry IQ
searchIndex path. In that mode, GPT-RAG ingestion processes files, writes
chunks to Azure AI Search, and Foundry IQ retrieves from that existing index.
The two backends
| Backend | What it uses | Best for | Operator impact |
|---|---|---|---|
foundry_iq |
A Foundry IQ knowledge base hosted by Azure AI Search agentic retrieval. | New GPT-RAG v3.0.2+ deployments. | The default native Blob path is provisioned automatically by AI Landing Zone v2.1.2+. |
ai_search |
The GPT-RAG Azure AI Search index, usually ragindex. |
Existing deployments that are not migrating yet, rollback, or compatibility scenarios. | No migration required. Existing deployments can keep this setting until they migrate. |
flowchart LR
User[User asks a question] --> Orchestrator[GPT-RAG orchestrator]
Orchestrator --> Choice{RETRIEVAL_BACKEND}
Choice -->|ai_search| Search[GPT-RAG Azure AI Search index]
Choice -->|foundry_iq| KB[Foundry IQ knowledge base]
KB --> Blob[Default Blob path]
KB --> Index[Custom ingestion path]
Blob --> Documents[documents container]
Index --> SearchIndex[Existing Azure AI Search index]
Foundry IQ setup choices
Foundry IQ has two setup choices:
| Choice | AILZ value | What it means | Use when |
|---|---|---|---|
| Default Blob path | FOUNDRY_IQ_PATTERN=azureBlob |
Foundry IQ reads files directly from the documents container through a native Azure Blob Knowledge Source. Foundry IQ handles source ingestion, chunking, vectorization, index creation, and refresh. GPT-RAG ingestion is not used in this path. |
You want the default new deployment path and do not need a custom GPT-RAG ingestion pipeline. |
| Custom ingestion path | FOUNDRY_IQ_PATTERN=searchIndex |
GPT-RAG ingestion processes files, writes chunks to Azure AI Search, and that existing index is registered as a Foundry IQ searchIndex knowledge source. |
You intentionally keep a custom GPT-RAG ingestion pipeline. |
Use the custom ingestion path only when you intentionally keep document processing outside Foundry IQ. It keeps the GPT-RAG ingestion pipeline and index schema while routing retrieval through the Foundry IQ knowledge base.
Security modes
There are two security mechanisms. They solve different problems and should not be mixed up.
| Security mode | Used by | How it is enforced |
|---|---|---|
| Native ingested permissions | The default Blob path and other sources that ingest permissions, such as Azure Blob RBAC scope, ADLS Gen2 ACLs, SharePoint, OneLake/Fabric, or Purview labels. | The orchestrator forwards the user's delegated Search token in x-ms-query-source-authorization. Foundry IQ evaluates the ingested permissions. |
| GPT-RAG security fields | The custom ingestion path with the existing GPT-RAG index. | The orchestrator sends an OData filterAddOn over GPT-RAG security fields. This is separate from the OBO header. |
Important rules:
- Plain Blob storage provides container-level RBAC for this purpose. Do not rely on it for per-document trimming unless you use Purview labels or an equivalent permission source.
- ADLS Gen2 ACLs, SharePoint, OneLake/Fabric, and Purview labels can support per-document permission trimming when native permission ingestion is configured.
- The custom ingestion path uses the existing GPT-RAG index registered as a
searchIndexknowledge source. GPT-RAG security fields are enforced throughfilterAddOn, not throughx-ms-query-source-authorization. - The
x-ms-query-source-authorizationheader is for native ingested permissions. - Per-user security uses the
2026-05-01-previewAzure AI Search API. - Security-enabled retrieval must fail closed. Missing token, filter, or permission configuration should be treated as an error, not as permission to run an unfiltered query.
Configuration settings
All runtime settings are stored in Azure App Configuration with the gpt-rag
label, or supplied as container environment variables when you use the
containerEnv runtime mode.
| Setting | Default | Used when | Purpose |
|---|---|---|---|
RETRIEVAL_BACKEND |
foundry_iq for new v3.0.2+ deployments |
Always | Selects ai_search or foundry_iq. Existing deployments can keep ai_search until they migrate. |
KNOWLEDGE_BASE_NAME |
Generated during deployment | foundry_iq |
Foundry IQ knowledge base name. |
KNOWLEDGE_BASE_ENDPOINT |
Generated during deployment | foundry_iq |
Azure AI Search endpoint that hosts the knowledge base. |
KNOWLEDGE_BASE_CONNECTION_ID |
Generated during deployment | foundry_iq |
Dedicated AI Foundry connection ID for the knowledge base. Do not reuse SEARCH_CONNECTION_ID. |
FOUNDRY_IQ_PATTERN |
azureBlob |
foundry_iq provisioning |
Selects the default Blob path or the custom ingestion path. Use searchIndex only when you intentionally keep a custom GPT-RAG ingestion pipeline. |
FOUNDRY_IQ_API_VERSION |
2026-05-01-preview |
foundry_iq |
Required for native per-user permissions and custom ingestion path filterAddOn. |
FOUNDRY_IQ_KNOWLEDGE_RETRIEVAL_BILLING_PLAN |
free |
Deployment configuration | Azure AI Search knowledgeRetrieval billing plan. |
FOUNDRY_IQ_KNOWLEDGE_SOURCE_NAME |
Generated during deployment | foundry_iq |
Name of the Foundry IQ knowledge source. |
FOUNDRY_IQ_KNOWLEDGE_SOURCE_KIND |
azureBlob |
foundry_iq |
Knowledge Source kind sent to Foundry IQ. Use searchIndex only for the custom ingestion path. |
FOUNDRY_IQ_STORAGE_CONTAINER_NAME |
documents |
Default Blob path | Storage container that Foundry IQ reads directly. |
FOUNDRY_IQ_INGESTION_PERMISSION_OPTIONS |
["rbacScope"] |
Default Blob path | Permission metadata ingested by the native Azure Blob Knowledge Source. |
FOUNDRY_IQ_FILTER_ADD_ON_ENABLED |
false |
Custom ingestion path | Enables query-time GPT-RAG security filtering through filterAddOn. |
FOUNDRY_IQ_SECURITY_FIELD_NAME |
metadata_security_id |
Custom ingestion path | Field used to build the GPT-RAG security filter. |
FOUNDRY_IQ_MAX_OUTPUT_DOCUMENTS |
5 |
foundry_iq |
Caps the number of documents returned by the knowledge base. |
FOUNDRY_IQ_CONTENT_EXTRACTION_MODE |
standard |
Default Blob path | Native Blob content extraction mode. standard uses the Foundry IQ Content Understanding skill (layout and OCR) so scanned and image-only PDFs are ingested with text. minimal skips Content Understanding and only ingests text already present in the source. The setting is immutable on an existing Knowledge Source. See Content extraction modes. |
The AI Landing Zone also stamps setup values used by deployment configuration:
| Setting | Purpose |
|---|---|
FOUNDRY_IQ_SEARCH_INDEX_NAME |
Existing Azure AI Search index registered as the custom ingestion path knowledge source. |
FOUNDRY_IQ_SEMANTIC_CONFIGURATION_NAME |
Semantic configuration on the existing index. |
FOUNDRY_IQ_SOURCE_DATA_FIELDS |
Source fields exposed by the knowledge source. |
FOUNDRY_IQ_SEARCH_FIELDS |
Searchable fields used by the knowledge source. |
FOUNDRY_IQ_BASE_FILTER |
Optional persisted filter on the knowledge source. |
Billing choice
Foundry IQ retrieval uses Azure AI Search agentic retrieval billing through the
Search service knowledgeRetrieval plan.
| Plan | Use when |
|---|---|
free |
You want to stay within the included allowance. Retrieval can fail when the allowance is exhausted. |
standard |
You explicitly opt in to pay-as-you-go retrieval after the included allowance. |
Set the plan before provisioning:
azd env set FOUNDRY_IQ_KNOWLEDGE_RETRIEVAL_BILLING_PLAN free
Use standard only when the operator has approved the billing change.
Default Foundry IQ deployment
For new GPT-RAG v3.0.2+ deployments with AI Landing Zone v2.1.2+, no manual backend configuration is required. The default configuration is:
RETRIEVAL_BACKEND=foundry_iq
FOUNDRY_IQ_PATTERN=azureBlob
FOUNDRY_IQ_KNOWLEDGE_SOURCE_KIND=azureBlob
FOUNDRY_IQ_STORAGE_CONTAINER_NAME=documents
FOUNDRY_IQ_INGESTION_PERMISSION_OPTIONS=["rbacScope"]
FOUNDRY_IQ_CONTENT_EXTRACTION_MODE=standard
The deployment creates a native Foundry IQ Azure Blob Knowledge Source and a
Knowledge Base. Foundry IQ processes files directly from the documents
container. No separate Knowledge Source or Knowledge Base creation step is
required after deployment.
Content extraction modes and PDF handling
The default Blob path supports two content extraction modes. The mode controls whether Foundry IQ runs the Content Understanding skill on each document while ingesting it.
| Mode | What it does | Use when |
|---|---|---|
standard (default, recommended) |
Foundry IQ runs the Microsoft.Skills.Util.ContentUnderstandingSkill on each document for layout, OCR, and structured extraction. Scanned PDFs, image-only PDFs, and PDFs without a text layer are OCR'd and become searchable. |
Use this for any real-world document library. It is the recommended default. |
minimal |
Foundry IQ extracts only the text already present in the source. PDFs without a text layer ingest with empty content and are not searchable. No Content Understanding billing. | You have validated that the corpus is text-only PDFs (or non-PDF text files) and you want to skip Content Understanding cost. |
Picking minimal for a corpus that contains scanned PDFs is a silent failure:
the file lands in the container, ingestion reports success, and every retrieval
against it returns nothing. The default is standard so you do not have to
think about this.
What standard requires
- The Foundry resource must be in a region that supports Content Understanding. See the region support list.
- On first use, Foundry IQ may require a one-time
PATCH /contentunderstanding/defaultscall against the Foundry resource. If this has not been done, Knowledge Source creation fails withDefaultsNotSet. The GPT-RAG post-provision setup handles this for new deployments; if you hitDefaultsNotSet, see the troubleshooting note at the bottom of this section. standardis billed through Content Understanding meters in addition to the Azure AI SearchknowledgeRetrievalplan.- Per-document limits: 300 pages and 5 minutes of processing time.
Documents that exceed those limits, or scenarios that need custom chunking, custom enrichment, or Excel chunking, should use the alternative GPT-RAG ingestion path (see When to use the GPT-RAG ingestion path).
Switching the mode after deployment
FOUNDRY_IQ_CONTENT_EXTRACTION_MODE is immutable on an existing Knowledge
Source. To change it you have to recreate the Knowledge Source, and because the
Knowledge Base references the Knowledge Source by name, you typically recreate
both. There is no in-place update.
To switch from minimal to standard (or back) on an existing deployment:
-
Update the App Configuration value (or
azd env set) to the new mode:azd env set FOUNDRY_IQ_CONTENT_EXTRACTION_MODE standard -
Delete the existing Foundry IQ Knowledge Source and Knowledge Base for the environment from the Azure AI Search resource (data plane). Use the Search data-plane REST API with the
2026-05-01-previewversion. - Re-run the GPT-RAG post-provision step that creates the Foundry IQ resources, so the Knowledge Source is recreated with the new mode and the Knowledge Base is rebound to it.
- Re-ingest the documents container so the new Knowledge Source processes the files with the chosen mode.
flowchart LR
Old[Existing KS + KB on minimal] --> Delete[Delete KB and KS]
Delete --> Update[Update FOUNDRY_IQ_CONTENT_EXTRACTION_MODE]
Update --> Recreate[Re-run post-provision to recreate KS + KB]
Recreate --> Reingest[Re-ingest documents container]
Troubleshooting
DefaultsNotSetwhen creating the Knowledge Source onstandard. The Foundry resource has not been initialized for Content Understanding. Make a one-timePATCH /contentunderstanding/defaultscall against the Foundry resource and re-run the setup. See the Content Understanding skill reference for the payload.- Scanned PDF ingests with empty content. The Knowledge Source is on
minimal. Either re-create it onstandard(steps above) or run the document through the GPT-RAG ingestion path. - PDF over 300 pages or processing times out.
standardcannot handle it. Use the GPT-RAG ingestion path with Document Intelligence.
When to use the GPT-RAG ingestion path
The default Blob path is the simplest production setup, but the GPT-RAG
ingestion service (gpt-rag-ingestion plus Azure AI Search) is still the right
choice for several scenarios:
| Scenario | Why the GPT-RAG ingestion path |
|---|---|
| PDFs over 300 pages, or single documents that take more than 5 minutes to process. | Exceeds the standard content extraction limits. GPT-RAG ingestion uses Azure AI Document Intelligence and can chunk and process longer documents. |
| Custom chunking strategy (sentence/window/semantic chunking, custom overlap, custom metadata enrichment). | The GPT-RAG ingestion pipeline is the place to customize chunking. The default Blob path uses Foundry IQ's built-in chunker. |
| Excel and spreadsheet chunking. | GPT-RAG ingestion has dedicated handling for Excel sheets. |
Per-document security trimming via custom metadata_security_* fields. |
Use the custom ingestion path (FOUNDRY_IQ_PATTERN=searchIndex) with FOUNDRY_IQ_FILTER_ADD_ON_ENABLED=true, or stay on RETRIEVAL_BACKEND=ai_search. |
| Low-latency ingestion after a runtime upload. | The default Blob path source refresh can take seconds to minutes. GPT-RAG ingestion writes directly into the index. |
Two ways to combine the GPT-RAG ingestion service with Foundry IQ retrieval:
RETRIEVAL_BACKEND=ai_search. Retrieval bypasses Foundry IQ entirely and goes straight to the GPT-RAG Azure AI Search index. Simplest rollback target.RETRIEVAL_BACKEND=foundry_iqwithFOUNDRY_IQ_PATTERN=searchIndex. Keep GPT-RAG ingestion producing the index, but register that index as a Foundry IQsearchIndexKnowledge Source so retrieval still goes through the Foundry IQ Knowledge Base. This is the custom ingestion path described below.
Use Foundry IQ with a custom GPT-RAG ingestion pipeline
Use the custom ingestion path only when you need to keep using a custom GPT-RAG ingestion pipeline.
- Start from a working
ai_searchdeployment. -
Set the Foundry IQ parameters before provisioning:
azd env set RETRIEVAL_BACKEND foundry_iq azd env set FOUNDRY_IQ_PATTERN searchIndex azd env set FOUNDRY_IQ_API_VERSION 2026-05-01-preview azd env set FOUNDRY_IQ_KNOWLEDGE_RETRIEVAL_BILLING_PLAN free azd env set FOUNDRY_IQ_FILTER_ADD_ON_ENABLED true -
Run
azd provision. - Deploy or restart the orchestrator so it reads the backend selector at startup.
Roll back to Azure AI Search
Rollback is intentionally simple.
-
Set the backend selector back to Azure AI Search:
az appconfig kv set ` --name <app-config-name> ` --key RETRIEVAL_BACKEND ` --value ai_search ` --label gpt-rag ` --yes -
Restart the orchestrator Container App so the startup selector is re-read.
- Ask a known retrieval question and confirm citations come from the GPT-RAG Azure AI Search index.
You do not need to delete the knowledge base to roll back. Leave it in place while you investigate.
Migration guidance
- New GPT-RAG v3.0.3+ deployments use the Foundry IQ default Blob path with
FOUNDRY_IQ_CONTENT_EXTRACTION_MODE=standard, which handles scanned and image-only PDFs out of the box. - Existing deployments can stay on
ai_searchuntil you explicitly migrate. - Use the custom ingestion path if you need to keep using a custom GPT-RAG ingestion pipeline.
- Use the GPT-RAG ingestion path for PDFs over 300 pages, custom chunking, Excel chunking, or low-latency runtime uploads.
- Do not claim per-document security for plain Blob unless you use Purview labels or another per-document permission source. The default Blob path uses RBAC scope permissions.
- Keep the rollback command ready during the first production change window.
Known limitations
- The default Blob path on
FOUNDRY_IQ_CONTENT_EXTRACTION_MODE=standardhas per-document limits of 300 pages and 5 minutes of processing time. Documents that exceed those limits need the GPT-RAG ingestion path with Document Intelligence. FOUNDRY_IQ_CONTENT_EXTRACTION_MODEis immutable on an existing Knowledge Source. Switching modes requires deleting and recreating the Knowledge Source and Knowledge Base, and re-ingesting the container.standardmode requires a Foundry resource in a Content Understanding supported region and is billed through Content Understanding meters in addition to theknowledgeRetrievalplan.- Multimodal captioning parity for the default Blob path is deferred. Keep
multimodal on
ai_searchor the custom ingestion path until the native Foundry IQ output is proven equivalent. - Runtime document uploads still need the self-managed GPT-RAG index for low-latency searchability. Use the custom ingestion path for that path.
- The default Blob path source refresh can take seconds to minutes. It is not a low-latency upload path.
- The custom ingestion path requires the existing index to have a semantic configuration. Vector fields must have the expected vectorizer setup.
- The knowledge base and custom ingestion path index must live on the same Search service.
- Per-user security and
filterAddOnrequire2026-05-01-preview.