Introducing Agent-Driven Retrieval-Augmented Generation to Amazon Q Business

Amazon Q Business is an AI-driven enterprise assistant that helps organizations derive value from their data. By connecting to enterprise data sources, users can use Amazon Q Business to quickly get answers, create content, and streamline tasks, from accessing HR policies to improving IT support processes, all while adhering to existing permissions and providing clear citations. At the core of systems like Amazon Q Business is Retrieval-Augmented Generation (RAG), which allows AI models to ground their responses in an organization’s enterprise data.

The evolution of RAG

Traditional RAG implementations typically adhere to a linear method: retrieve pertinent documents or passages based on a user query and then generate a response using these documents as context for the large language model (LLM). While effective for basic factual queries, enterprise environments pose unique challenges that reveal the limitations of this single-step retrieval process.
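
The linear flow described above can be sketched in a few lines of Python. This is a toy illustration, not how Amazon Q Business is implemented: the function names (`retrieve`, `build_prompt`, `single_step_rag`), the word-overlap scoring, and the `generate` callback standing in for an LLM call are all assumptions made for the example.

```python
from typing import Callable


def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by how many words they share with the query (toy scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question as context."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


def single_step_rag(
    query: str, corpus: list[str], generate: Callable[[str], str]
) -> str:
    """One retrieval pass, one generation pass, and no way to try again."""
    passages = retrieve(query, corpus)
    return generate(build_prompt(query, passages))
```

Because retrieval happens exactly once, nothing in this flow can notice that the passages were insufficient, which is the limitation the rest of this post addresses.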

Consider an employee inquiring about the differences between two benefits packages or requesting a comparison of project outcomes over multiple quarters. Such inquiries require synthesizing data from various sources and grasping company-specific nuances, and they often need several retrieval steps to collect comprehensive information on each aspect of the query.

Traditional RAG systems often falter with this complexity, providing incomplete responses or failing to adjust their retrieval strategies when the initial results are insufficient. For more intricate queries, users also have no visibility into the system’s progress, which makes the experience opaque.

Bringing agency to Amazon Q Business

Introducing agency to Amazon Q Business presents a new paradigm for addressing complex enterprise queries through intelligent, agent-based retrieval strategies. By deploying AI agents that dynamically orchestrate retrieval plans with a suite of data navigation tools, Agentic RAG marks a substantial advancement in how AI assistants engage with enterprise data, delivering more accurate and comprehensive responses while maintaining the speed users expect.

With Agentic RAG in Amazon Q Business, users gain several new features, including query decomposition, transparent events, agentic tool utilization, enhanced conversational abilities, and optimized responses. Let’s explore what each of these entails.

Query decomposition and transparent response events

Traditional RAG systems often struggle with complex enterprise queries, especially those that involve multiple steps, composite elements, or comparative analysis. The latest release of Agentic RAG in Amazon Q Business aims to tackle this issue through advanced query decomposition techniques, where AI agents intelligently break down complex questions into discrete, manageable components.

When an employee asks, “Please compare the vacation policies of Washington and California.”, the query is decomposed into two distinct queries: one for Washington vacation policies and another for California vacation policies.
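
To make the idea concrete, here is a minimal sketch of decomposing a comparison query and retrieving for each part concurrently. It is only an illustration under stated assumptions: a production agent would typically rely on an LLM rather than a regular expression to split the question, and `decompose`, `retrieve_all`, and the `search` callback are hypothetical names, not part of any Amazon Q Business API.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical pattern for questions shaped like "compare the <topic> of <A> and <B>".
COMPARE_PATTERN = re.compile(
    r"compare the (?P<topic>.+?) of (?P<first>.+?) and (?P<second>.+?)[?.]?$",
    re.IGNORECASE,
)


def decompose(query: str) -> list[str]:
    """Split a comparison question into one sub-query per compared entity."""
    match = COMPARE_PATTERN.search(query.strip())
    if match is None:
        return [query]  # not a comparison we recognize: keep the query whole
    topic = match.group("topic")
    return [f"{match.group('first')} {topic}", f"{match.group('second')} {topic}"]


def retrieve_all(
    sub_queries: list[str], search: Callable[[str], list[str]]
) -> dict[str, list[str]]:
    """Run the search callback for every sub-query in parallel threads."""
    with ThreadPoolExecutor(max_workers=max(1, len(sub_queries))) as pool:
        results = list(pool.map(search, sub_queries))
    return dict(zip(sub_queries, results))
```

With the example question, `decompose` yields one sub-query for Washington and one for California, and `retrieve_all` collects the passages for each so that a later step can synthesize them into a single answer.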

Because Agentic RAG runs a series of parallel steps to explore the data source and gather thorough information for more accurate query resolution, it now offers real-time visibility into its processing steps, which are displayed on screen as data is retrieved for response generation. Once a response is ready, the steps are collapsed and the response is streamed. The following image illustrates how the decomposed queries are shown alongside the relevant data fetched for response formulation.

This functionality allows users to observe meaningful updates in the system’s workings, including query decomposition patterns, document retrieval paths, and response generation workflows. This granular visibility into the system’s decision-making process boosts user confidence and offers valuable insights into the intricate mechanisms steering accurate response generation.

The agentic solution ensures comprehensive data collection and enables more precise, nuanced responses. The outcome is enhanced responses that maintain both detailed precision and a holistic understanding of complex, multi-faceted business queries, with the LLM synthesizing the retrieved information. As depicted in the image below, the information collected individually for California and Washington vacation policies was synthesized by the LLM and presented in a rich markdown format.

Agentic tool use

The RAG agents can draw on a range of data exploration tools and retrieval methods, choosing among them according to the retrieval plan while maintaining context across multiple conversational turns. These tools include ones built within Amazon Q Business, such as tabular search, which retrieves data through either code generation or tabular linearization across small and large tables embedded in documents (such as DOCX, PPTX, and PDF) or stored in CSV or XLSX files.

Another retrieval tool, long context retrieval, identifies when the complete context of a document is necessary. For instance, if a user asks, “Summarize the 10K of Company X,” the agent detects the query’s intent as a summarization request needing document-level context and deploys the long context retrieval tool, which fetches the complete document (here, the 10-K of Company X) as part of the context for the LLM’s response generation, as shown in the figure below. This selection and deployment of tools is a substantial advance over traditional RAG systems, which often depend on fragmented passage retrieval that can hinder the coherence and completeness of complex document analysis for question answering.
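
The routing decision can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Tool` names mirror the tools described above but are not real API identifiers, and the keyword cues are a crude stand-in for the intent detection an agent would actually perform.

```python
from enum import Enum


class Tool(Enum):
    # Hypothetical labels for the retrieval tools described in this section.
    PASSAGE_RETRIEVAL = "passage_retrieval"
    TABULAR_SEARCH = "tabular_search"
    LONG_CONTEXT_RETRIEVAL = "long_context_retrieval"


# Illustrative cue lists; a real agent would infer intent with a model.
SUMMARY_CUES = ("summarize", "summary", "overview of")
TABLE_CUES = ("total", "average", "sum of", "how many", "per quarter")


def select_tool(query: str) -> Tool:
    """Pick a retrieval tool from simple intent cues in the query text."""
    text = query.lower()
    if any(cue in text for cue in SUMMARY_CUES):
        # Summaries need the whole document, not isolated passages.
        return Tool.LONG_CONTEXT_RETRIEVAL
    if any(cue in text for cue in TABLE_CUES):
        # Aggregations point at data stored in tables.
        return Tool.TABULAR_SEARCH
    return Tool.PASSAGE_RETRIEVAL
```

For the example query, `select_tool("Summarize the 10K of Company X")` returns `Tool.LONG_CONTEXT_RETRIEVAL`, while a plain factual question falls through to passage retrieval.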

Improved conversational capabilities

Agentic RAG introduces multi-turn query capabilities, turning conversations with Amazon Q Business into dynamic, context-aware dialogues. The agent retains conversational context throughout interactions by using short-term memory, so users can ask natural follow-up questions without restating prior context. Moreover, when the agent finds multiple potential answers in your enterprise data, it asks clarifying questions to disambiguate the query before responding. For example, “Q” can refer to any of several Amazon Q implementations; the system recognizes the multiple potential meanings of “Q” and asks for clarification to keep the response accurate and relevant. This dialogue management approach streamlines complex tasks like policy interpretation or technical troubleshooting, as the system incrementally refines its understanding through targeted clarifications and follow-up exchanges.

In the image below, the user queries, “Tell me about Q,” prompting the system to deliver an overview of the various implementations while asking a follow-up question to refine the user’s intent.

Upon successful disambiguation, the system retains both the conversation state and previously retrieved contextual data in memory, so it can generate targeted responses that align with the user’s clarified intent and are more accurate, relevant, and comprehensive.
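
A simplified sketch of this clarify-then-answer loop is shown below. It is not the product’s implementation: the `Conversation` class, the `respond` function, and the `find_candidates` callback (which stands in for a lookup over enterprise data) are hypothetical, and the substring matching is a deliberately naive way to resolve the user’s reply.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversation:
    """Short-term memory: past messages plus any unresolved candidate meanings."""
    history: list[str] = field(default_factory=list)
    pending_candidates: list[str] = field(default_factory=list)


def respond(
    convo: Conversation,
    message: str,
    find_candidates: Callable[[str], list[str]],
) -> str:
    convo.history.append(message)

    # If a clarifying question is pending, try to resolve it from this reply.
    if convo.pending_candidates:
        matches = [c for c in convo.pending_candidates if c.lower() in message.lower()]
        if len(matches) == 1:
            convo.pending_candidates = []
            return f"Answering about {matches[0]}."

    candidates = find_candidates(message)
    if len(candidates) > 1:
        # Ambiguous: remember the options and ask instead of guessing.
        convo.pending_candidates = candidates
        return "Which one do you mean: " + ", ".join(candidates) + "?"

    convo.pending_candidates = []
    topic = candidates[0] if candidates else message
    return f"Answering about {topic}."
```

The key design point is that the candidate list survives between turns, so the follow-up reply is interpreted against the earlier question rather than treated as a brand-new query.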

Agentic response optimization

Agentic RAG introduces dynamic response optimization, where AI agents actively assess and refine their answers. Unlike traditional systems that provide responses even when context is insufficient, these agents continuously evaluate response quality and plan new actions to improve information completeness. They can recognize when initial retrievals lack critical information and independently initiate additional searches or alternative retrieval methods. As a result, when navigating complex topics like compliance policies, the system can capture pertinent updates, exceptions, and interdependencies while preserving context across multiple conversational turns. The diagram below illustrates how Agentic RAG manages conversation history across multiple exchanges, with the agent planning and reasoning over its use of retrieval tools and response generation. Based on initial retrievals, and considering the ongoing conversation state and history, the agent re-strategizes as needed to deliver the most complete and accurate responses to the user’s inquiries.
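
The evaluate-and-retry behavior can be approximated with a short loop. This is a sketch under assumptions rather than the actual agent logic: `answer_with_refinement` is a hypothetical name, each entry in `strategies` stands in for a retrieval tool or reformulated search, and `is_sufficient` stands in for whatever quality check the agent applies.

```python
from typing import Callable

Strategy = Callable[[str], list[str]]


def answer_with_refinement(
    query: str,
    strategies: list[Strategy],
    is_sufficient: Callable[[str, list[str]], bool],
    max_rounds: int = 3,
) -> tuple[list[str], int]:
    """Try retrieval strategies in order until the gathered context is judged sufficient.

    Returns the accumulated passages and the number of rounds that were used.
    """
    gathered: list[str] = []
    rounds = 0
    for strategy in strategies[:max_rounds]:
        rounds += 1
        for passage in strategy(query):
            if passage not in gathered:  # keep earlier context, skip duplicates
                gathered.append(passage)
        if is_sufficient(query, gathered):
            break  # enough context: stop searching and generate the answer
    return gathered, rounds
```

Capping the loop with `max_rounds` is one simple way to keep the extra retrieval passes from growing without bound when no strategy ever satisfies the check.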

Using the Agentic RAG feature

Getting started with the advanced features of Agentic RAG in Amazon Q Business is straightforward and can markedly enhance how your organization engages with its enterprise data. To begin, simply toggle on the Advanced Search option in the Amazon Q Business web interface, as shown in the image below.

Once advanced search is enabled, users will experience richer, more comprehensive responses from Amazon Q Business. Agentic RAG excels at handling complex business scenarios based on your enterprise data, such as comparing performance across AWS Regions, exploring policy implications among departments, or analyzing historical trends in project deliveries. The system breaks down these intricate queries into manageable search tasks while maintaining context throughout the conversation.

For optimal results, users should feel free to pose detailed, multi-part questions. Unlike traditional search systems, Agentic RAG can handle nuanced queries like, “How have our metrics changed across the southeast and northeast regions in 2024?” The system works through such inquiries step by step, visibly showing its progress as it breaks the query into component parts to gather adequate context and generate a thorough and accurate response.

Conclusion

Agentic RAG signifies a substantial advancement for Amazon Q Business, reshaping how organizations utilize their enterprise data while upholding the robust security and compliance expected from AWS services. Through its advanced query processing and contextual comprehension, the system facilitates deeper, more nuanced interactions with enterprise data—from comparative multi-step queries to engaging multi-turn chat experiences. All this unfolds within a secure framework that respects established permissions and access controls, ensuring users receive only authorized information while maintaining rich, contextual responses essential for actionable insights.

By merging advanced retrieval capabilities with intelligent, conversation-aware interactions, Agentic RAG allows organizations to fully harness their data’s potential while adhering to the highest standards of data governance. The outcome is an enriched chat experience and a more capable query-answering engine that maximizes the value of your data assets.

Explore the potential of Amazon Q Business for your organization with your data, and feel free to share your feedback in the comments.

About the authors

Sanjit Misra is a technical product leader at Amazon Web Services, steering innovation on Amazon Q Business, Amazon’s generative AI product. He oversees product development for core Agentic AI features enhancing accuracy and retrieval—including Agentic RAG, conversational disambiguation, tabular search, and long-context retrieval. With over 15 years in product and engineering roles in data, analytics, and AI/ML, Sanjit combines deep technical expertise with a history of delivering business outcomes. He is based in New York City.

Venky Nagapudi is a Senior Manager of Product Management for Amazon Q Business, focusing on RAG features, accuracy enhancement, user identity management, and user subscriptions.

Yi-An Lai is a Senior Applied Scientist with the Amazon Q Business team at Amazon Web Services in Seattle, WA. His expertise encompasses agentic information retrieval, conversational AI systems, LLM tool orchestration, and advanced natural language processing. With over a decade in ML/AI, he is dedicated to developing sophisticated AI solutions that connect cutting-edge research with practical enterprise applications.

Yumo Xu is an Applied Scientist at AWS, specializing in building helpful and responsible AI systems for enterprises. His primary research interests revolve around foundational challenges of machine reasoning and agentic AI. Prior to AWS, Yumo earned his PhD in Natural Language Processing from the University of Edinburgh.

Danilo Neves Ribeiro is an Applied Scientist on the Q Business team in Santa Clara, CA. He is focused on designing innovative solutions for information retrieval, reasoning, language model agents, and conversational experiences for enterprise applications within AWS. He holds a Ph.D. in Computer Science from Northwestern University (2023) and has over three years of experience as an AI/ML scientist.

Kapil Badesara is a Senior Machine Learning Engineer at AWS Q Business, dedicating his efforts to optimizing RAG systems for improved accuracy and efficiency. Based in Seattle, he has over 10 years of experience in building large-scale AI/ML services.

Sunil Singh is an Engineering Manager on the Amazon Q Business team, where he leads the development of next-generation agentic AI solutions designed to enhance Retrieval-Augmented Generation (RAG) systems for greater accuracy and efficiency. Based in Seattle, he has more than 10 years of experience in architecting secure, scalable AI/ML services for enterprise-grade applications.



