
What's New

📌 Check out what's coming next (Azure org only)

April 2026

Release 2.6.3 - Ingestion Overhaul, Admin Dashboard, and Cost Optimization

Ingestion Admin Dashboard

A new React-based admin dashboard is available at /dashboard for monitoring and managing ingestion jobs. It provides paginated job and file tables, search, filters, and the ability to unblock stuck files. Processing timings are displayed as stacked color bars showing each phase (download, analysis, chunking, index upload), and per-file cost estimates break down spending by service.
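The stacked color bars described above amount to converting per-phase durations into percentage widths. The sketch below illustrates that idea only; the function and phase keys mirror the documented phases but are not the dashboard's actual code.

```python
def phase_widths(timings):
    """Convert per-phase durations (seconds) into percentage widths
    for a stacked bar. Phase names follow the dashboard's documented
    phases; this helper is illustrative, not the dashboard's code."""
    total = sum(timings.values())
    if total == 0:
        return {phase: 0.0 for phase in timings}
    return {phase: round(100 * t / total, 1) for phase, t in timings.items()}

# Example: analysis dominates this file's processing time.
widths = phase_widths({
    "download": 2.0,
    "analysis": 12.0,
    "chunking": 4.0,
    "index_upload": 2.0,
})
# → {'download': 10.0, 'analysis': 60.0, 'chunking': 20.0, 'index_upload': 10.0}
```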

Content Understanding Integration

Document analysis now uses Azure AI Foundry Content Understanding (prebuilt-layout) by default instead of Document Intelligence, resulting in approximately 69% cost reduction per page.

Reliability and Large File Handling

Files that fail during ingestion are now tracked per attempt. After exceeding the maximum retries (default 3), the file is automatically blocked, preventing repeated reprocessing and unnecessary document analysis costs. Stale jobs stuck after a container crash are auto-recovered after 2 hours. Additionally, large PDFs exceeding the analysis page limit (default 300 pages) are split automatically, and a memory guard skips oversized files to prevent OOM crashes.
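The retry-then-block and page-splitting behaviors above can be sketched as follows. The defaults (3 retries, 300-page limit) come from the release notes; the function names, the state dictionary, and the status values are hypothetical, chosen only to illustrate the logic.

```python
MAX_RETRIES = 3          # documented default
ANALYSIS_PAGE_LIMIT = 300  # documented default

def record_failure(file_state):
    """Track one failed ingestion attempt. Once the retry budget is
    exhausted, mark the file blocked so it is not reprocessed (and
    not re-billed for document analysis)."""
    file_state["attempts"] = file_state.get("attempts", 0) + 1
    if file_state["attempts"] > MAX_RETRIES:
        file_state["status"] = "blocked"
    else:
        file_state["status"] = "retry_pending"
    return file_state

def page_ranges(total_pages, limit=ANALYSIS_PAGE_LIMIT):
    """Split a large PDF into 1-based page ranges of at most `limit`
    pages each, so every part stays under the analysis page limit."""
    return [(start, min(start + limit - 1, total_pages))
            for start in range(1, total_pages + 1, limit)]

# A file that fails four times ends up blocked (3 retries exceeded).
state = {}
for _ in range(4):
    record_failure(state)

# A 750-page PDF is analyzed as three parts.
ranges = page_ranges(750)
# → [(1, 300), (301, 600), (601, 750)]
```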


Release 2.6.1 - Conversation History and Multimodal Improvements

Conversation History

Users can now list, resume, and delete past conversations directly from a sidebar in the chat UI.

Multimodal Improvements

Images now appear inline between response steps instead of grouped at the bottom, with improved validation accuracy.


March 2026

Release 2.5.3 - New Orchestration Strategies, Infrastructure Overhaul, and Multimodality

New Orchestration Strategies

The orchestrator now supports new agentic strategies:

  • Agent Service v2 uses Azure AI Foundry Agent Service v2 for managed orchestration.

  • Microsoft Agent Framework provides lightweight orchestration with direct Foundry access, without the Agent Service.

  • Agent Service + Agent Framework combines Agent Service v2 with the Microsoft Agent Framework for advanced scenarios.

  • Multimodal adds image understanding support for multimodality scenarios.

Infrastructure as External Bicep Module

The Bicep infrastructure has been extracted to the external bicep-ptn-aiml-landing-zone module for better maintainability and reuse, and the deploy scripts have been hardened. #424


January 2026

Release 2.4.0 - Authentication and Document-Level Security

This release introduces Microsoft Entra ID authentication in the frontend, with orchestrator-side user identity validation, plus RBAC-based access control and document-level authorization in retrieval workflows. It propagates user identity context through ingestion and orchestration so Azure AI Search can enforce fine-grained ACL/RBAC permissions end-to-end. #417

How to configure it: Authentication and Document-Level Security
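Document-level trimming in Azure AI Search is typically enforced with an OData filter that matches a document's ACL field against the caller's Entra ID group IDs. The sketch below shows that pattern; the `group_ids` field name is an assumption for illustration, and GPT-RAG's actual index schema and filter construction may differ.

```python
def security_filter(user_group_ids, field="group_ids"):
    """Build an Azure AI Search OData filter that keeps only documents
    whose ACL collection field contains at least one of the caller's
    Entra ID group IDs. The field name 'group_ids' is an assumption,
    not necessarily GPT-RAG's actual schema."""
    ids = ",".join(user_group_ids)
    # search.in matches any element of the collection against the id list.
    return f"{field}/any(g: search.in(g, '{ids}'))"

flt = security_filter(["aad-group-1", "aad-group-2"])
# → "group_ids/any(g: search.in(g, 'aad-group-1,aad-group-2'))"
```

The orchestrator would pass a filter like this with each search request, so retrieval only ever sees documents the signed-in user is authorized to read.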

December 2025

Release 2.3.0 - SharePoint Lists and Azure Direct Models

Azure Direct Models (Microsoft Foundry)

You can use Microsoft Foundry “Direct from Azure” models (for example, Mistral, DeepSeek, Grok, and Llama) through the Foundry inference APIs with Entra ID authentication. #296

How to configure it: Azure Direct Models

Demo Video:

SharePoint Lists

The SharePoint connector now covers both SharePoint Online document libraries (files like PDFs/Office docs) and generic lists (structured fields) so your Azure AI Search index stays in sync with list items and documents. #369

How to configure it: SharePoint Data Source and SharePoint Connector Setup Guide


October 2025

Release 2.2.0 - Agentic Retrieval and Network Flexibility

This release introduces major enhancements to support more flexible and enterprise-ready deployments.

Bring Your Own VNet

Enables organizations to deploy GPT-RAG within their existing virtual network, maintaining full control over network boundaries, DNS, and routing policies. #370

Agentic Retrieval

Adds intelligent, agent-driven retrieval orchestration that dynamically selects and combines information sources for more grounded and context-aware responses. #359


September 2025

Release 2.1.0 - User Feedback Loop

Introduces a mechanism for end-users to provide thumbs-up or thumbs-down feedback on assistant responses, storing these signals alongside conversation history to continuously improve response quality.
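Storing feedback alongside conversation history can be as simple as appending a vote record to the conversation document. The record schema below is illustrative only; the actual storage layout is not described in the release notes.

```python
from datetime import datetime, timezone

def record_feedback(conversation, message_id, vote):
    """Attach a thumbs-up/down signal to a conversation record.
    The record schema here is hypothetical, for illustration."""
    if vote not in ("up", "down"):
        raise ValueError("vote must be 'up' or 'down'")
    conversation.setdefault("feedback", []).append({
        "message_id": message_id,
        "vote": vote,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return conversation

conv = {"id": "conv-1", "messages": []}
record_feedback(conv, "msg-3", "up")
```

Keeping the signal next to the conversation makes it straightforward to later join each vote with the exact prompt and response it refers to.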

© 2025 GPT-RAG — powered by ❤️ and coffee ☕