# Troubleshooting
This page covers common issues, debugging tools, and how to inspect logs in GPT-RAG.
## Showing Response Time Statistics in the Chat UI
The GPT-RAG UI includes a built-in option to display the response time after each agent answer. To enable it, set the `SHOW_STATISTICS` application setting to `true` in your Container App (or App Configuration). Once enabled, each response in the chat shows timing information, helping you identify slow responses and compare performance across queries or configurations.
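Server-side timings are also available in Application Insights. Assuming the components emit standard request telemetry to the `requests` table, a query along these lines summarizes average and 95th-percentile duration per operation (a sketch; adjust the table and column names to match your telemetry):

```kusto
requests
| where timestamp > ago(24h)
| summarize avg(duration), percentile(duration, 95) by name
| order by avg_duration desc
```

Comparing these server-side numbers with the timings shown in the chat UI helps separate backend latency from network or rendering overhead.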
## Enabling Debug Logging
To increase log verbosity for any GPT-RAG component (orchestrator, ingestion, or UI), set the `LOG_LEVEL` environment variable to `DEBUG` in the corresponding Container App. For example, in the Azure Portal go to your Container App → Environment variables and set `LOG_LEVEL` = `DEBUG`. After the container restarts, the application emits detailed logs, including internal function calls, SDK diagnostics, and step-by-step execution traces. Remember to revert to `INFO` or `WARNING` after troubleshooting to avoid excessive log volume and cost.
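To confirm that debug-level entries are actually flowing, you can query Application Insights (covered in the next section). Debug logs typically arrive with `severityLevel` 0 (Verbose), so a quick check is:

```kusto
traces
| where timestamp > ago(15m)
| where severityLevel == 0
| project timestamp, message, cloud_RoleName
| take 20
```

If this returns no rows a few minutes after restarting, verify that the variable was applied to the running revision of the Container App.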
## Viewing Logs in Application Insights
All GPT-RAG components send telemetry to Application Insights. Open your Application Insights resource in the Azure Portal and go to Logs to run KQL queries.
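Component names in the queries below come from the `cloud_RoleName` column. If you are unsure which role names your deployment uses, list them first:

```kusto
traces
| where timestamp > ago(1h)
| summarize count() by cloud_RoleName
```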
To find recent errors across all components:
```kusto
traces
| where timestamp > ago(1h)
| where severityLevel >= 3
| project timestamp, message, cloud_RoleName, severityLevel
| order by timestamp desc
| take 50
```
To see errors specifically in the orchestrator:
```kusto
traces
| where timestamp > ago(24h)
| where cloud_RoleName contains "orchestrator"
| where severityLevel >= 3
| project timestamp, message, operation_Id
| order by timestamp desc
```
To trace a single request end-to-end using its operation ID (you can get this from a previous query or from the UI response headers):
```kusto
traces
| where operation_Id == "YOUR_OPERATION_ID"
| order by timestamp asc
| project timestamp, message, severityLevel, cloud_RoleName
```
To check for rate-limit (429) or throttling issues in ingestion:
```kusto
traces
| where timestamp > ago(24h)
| where cloud_RoleName contains "ingest"
| where message contains "429" or message contains "throttl" or message contains "rate limit"
| project timestamp, message
| order by timestamp desc
```
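To gauge how frequently throttling occurs, and whether it clusters around particular ingestion runs, the same filter can be bucketed by hour:

```kusto
traces
| where timestamp > ago(24h)
| where cloud_RoleName contains "ingest"
| where message contains "429" or message contains "throttl" or message contains "rate limit"
| summarize throttleEvents = count() by bin(timestamp, 1h)
| order by timestamp asc
```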
To view exceptions with stack traces:
```kusto
exceptions
| where timestamp > ago(24h)
| project timestamp, problemId, outerMessage, details, cloud_RoleName
| order by timestamp desc
| take 20
```
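To triage recurring exceptions rather than scrolling through individual occurrences, group them by `problemId`:

```kusto
exceptions
| where timestamp > ago(24h)
| summarize occurrences = count(), lastSeen = max(timestamp) by problemId, cloud_RoleName
| order by occurrences desc
```

This surfaces the noisiest exception types first, which is usually a better starting point than the most recent ones.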
## Known Issues and Fixes
Below is a list of commonly reported issues that have been resolved. If you encounter one of these, make sure you are running the version that includes the fix.
- **OOM container restarts during parallel ingestion** — Ingestion containers could run out of memory when processing multiple large files concurrently. Fixed by adding memory guards, temp-file downloads for large PDFs, and lowering default concurrency. See #438.
- **Re-indexing caused by embedding retry issues** — Transient embedding failures could cause documents to be unnecessarily re-indexed. Fixed in ingestion v2.2.5. See #437.
- **All documents re-indexed when `permissionFilterOption` is enabled** — Enabling document-level security caused a full re-index instead of incremental updates. Fixed in ingestion v2.2.5. See #436.