About Coveo Search Agents

Beta feature

Coveo Search Agent is currently available as a beta offering. Contact your Customer Success Manager for access to this feature. Your use of this feature is subject to the beta and pre-release terms of your agreement with Coveo, including any applicable beta or pre-release provisions therein. To the extent your agreement does not contain specific beta or pre-release terms, Section 8 (Beta Features) of the Coveo Customer Agreement shall apply. This feature is provided "as-is," without warranty or SLA coverage, and may be modified, suspended, or discontinued at any time. You should not use this feature to process sensitive or regulated data.

The Coveo Search Agent adds a conversational search experience to a Coveo-powered search interface. It uses agentic AI capabilities, such as reasoning and decision-making, to orchestrate multiple rounds of content retrieval and answer generation based on follow-up questions.

Figure: Conversational answer generation component

The Search Agent works with existing Coveo indexing, AI, personalization, recommendation, machine learning, relevance, and security features. This results in a powerful enterprise-ready agentic conversational search experience that generates answers that are relevant, personalized, and secure. For more information, see Search Agent data security.

Instead of relying on a static execution pattern, the Search Agent workflow is dynamic and uses agentic reasoning to adapt to user input. The Search Agent allows users to self-serve and find answers to complex queries through natural dialogue and multi-turn answer generation based on conversational context.

Note

For complete implementation steps for the Coveo Search Agent, see Search Agent implementation overview.

The Coveo Search Agent is designed to be used in a Coveo-powered search interface and uses Coveo AI tools. To use Coveo’s search, retrieval, and answer capabilities in your own AI agent framework or Large Language Model (LLM) applications, you can use the Coveo MCP Server instead of the Coveo Search Agent.

The Search Agent’s value lies in providing a conversational search experience within your search interface, enabling users to refine their queries through natural follow-up questions and receive more relevant, context-aware answers over multiple turns.

Although you can disable the Conversational AI capabilities for your Search Agent, this isn’t recommended, as doing so limits the Search Agent to generating a single answer based solely on the initial user query. Many of the features and capabilities described in this article rely on a conversational experience. Without it, the Search Agent behaves like a traditional answer-generation system and doesn’t support follow-up questions or multi-turn interactions.

What the Search Agent provides

The Coveo Search Agent adds an agentic retrieval-augmented generation (RAG) search experience with conversational, multi-turn interactions to your Coveo-powered search interface. It’s designed to be used in self-service scenarios, enabling users to find answers to complex questions on their own.

The Search Agent provides the following main features:

Fully managed RAG experience

The Coveo Search Agent is an easy-to-deploy agentic RAG solution for your enterprise. Coveo manages the content retrieval, grounding, answer generation, agent orchestration, model interactions, and ongoing optimizations. You configure it for your use case, such as self-service, and deploy it in the Coveo-powered search interface on your website. The Search Agent gives you advanced agentic capabilities without requiring you to build and maintain them yourself.

Conversational, multi-turn interactions

User queries can often be complex, requiring more than a single round of content retrieval to generate an answer that fully addresses the user’s needs.

The Coveo Search Agent uses conversational capabilities to orchestrate multiple rounds of content retrieval and reasoning based on follow-up questions. This allows the agent to generate answers to complex queries that span multiple sources of information. The agent asks clarifying questions when needed, and generates a new answer for every new interaction within a conversation. For a given conversation, each new answer is generated by retrieving new content based on the context of the ongoing conversation.

Agentic capabilities

The Coveo Search Agent is an AI agent that analyzes your enterprise content and the conversation context to deliver relevant answers to user queries.

It maintains the conversation state across multiple turns, remembering prior questions and user choices to understand the user’s intent over time. By analyzing both the current query and the conversation history, the agent determines what the user is trying to accomplish and decides how to respond. To do this, it reformulates the follow-up query to retrieve new relevant passages based on the latest context, reasons over the retrieved information, and determines whether to provide an answer or ask a clarifying question.

Grounded, secure, and relevant answers

The Search Agent uses content retrieved from your Coveo index to ground the generated answers. It uses two layers of content retrieval to ensure relevance, and enforces content permissions so that answers only include information the authenticated user is authorized to access.

Across these rounds of content retrieval and answer generation, the Search Agent relies on an internal conversation ID to confine generated answers to the current conversation, which keeps responses grounded in that conversation’s context.

For more information, see Search Agent data security.

Search Agent overview

The following diagram provides a high-level overview of how the Search Agent operates in response to a user query in a Coveo-powered search interface.

1. A user enters a query in a Coveo-powered search interface configured with a Search Agent.
2. The query is simultaneously sent to both the query pipeline that’s used by the search interface to process queries, and the Search Agent.

The query sent to the query pipeline retrieves the most relevant items from the index, which are displayed as traditional search results in the search interface. The query that’s sent to the Search Agent is used to generate an answer.
- Search results
  
  The query pipeline processes the query as it normally does, applying pipeline rules and machine learning to optimize personalization and relevance, to find the most relevant items from the index.
  
  The search results are based on the initial user query and search context. The results are independent from the Search Agent and don’t update when a user enters a follow-up question. The search results remain the same throughout the conversation, and refresh only when the search context changes and a new conversation begins.
- Answer generation
  
  The Search Agent uses advanced reasoning and decision-making to coordinate the use of Coveo platform features and tools to generate an answer for the user query.
  
  The Search Agent uses two layers of content retrieval to identify the most relevant content to use for answer generation. Through multi-turn conversational search, users can find answers to complex queries by asking follow-up questions. The Search Agent preserves conversational context, allowing it to better interpret user intent and iteratively refine its answers for greater relevance.
  
  Note
  
  The answer generation process is covered in more detail in How the Search Agent works.
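
As a rough illustration of the split described above, the following Python sketch sends one query to two stand-in coroutines at the same time and collects both outputs. The names `query_pipeline`, `search_agent`, and `submit` are hypothetical placeholders, not Coveo APIs; a real search interface would call Coveo services instead of these stubs.

```python
import asyncio
from typing import Dict, List, Union


async def query_pipeline(query: str) -> List[str]:
    """Hypothetical stand-in for the query pipeline: returns search results."""
    await asyncio.sleep(0)  # simulate a non-blocking network call
    return [f"result for '{query}'"]


async def search_agent(query: str) -> str:
    """Hypothetical stand-in for the Search Agent: returns a generated answer."""
    await asyncio.sleep(0)  # simulate a non-blocking network call
    return f"generated answer for '{query}'"


async def submit(query: str) -> Dict[str, Union[List[str], str]]:
    """Send the same query down both paths concurrently and gather the outputs."""
    results, answer = await asyncio.gather(
        query_pipeline(query),
        search_agent(query),
    )
    return {"results": results, "answer": answer}


page = asyncio.run(submit("reset password"))
```

In this sketch the two paths never exchange data, which mirrors the statement that the search results are independent from the Search Agent.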

How the Search Agent works

This section describes the conversational flow and details how the Search Agent retrieves content and generates answers.

Conversational flow

Before looking at how the Search Agent generates answers in a conversational search experience, it’s important to first understand the conversational flow.

The following example illustrates this flow through a typical user search session with the Search Agent:

1. A user enters an initial query using the main search box in a Coveo-powered search interface configured with a Search Agent.

2. The Search Agent generates the initial answer, known as the head answer, for the user query based on the search context at that time, which includes the query itself, and any applied filters, selected facets, and sorting. This marks the start of a new conversation.

3. The user asks a follow-up question using the conversational search.

4. The Search Agent generates an answer for the follow-up. To generate the new answer, the Search Agent analyzes the follow-up query in relation to the conversation context, which includes the search context and the conversation history. This allows the Search Agent to understand the user’s intent and reformulate the query accordingly. The Search Agent uses this reformulated query to retrieve new content and generate the answer.

   Note

   For information on how the Search Agent uses reasoning and decision-making when generating answers, see Search Agent reasoning and decision-making.

5. The user resets the search context. To ensure continuity, coherence, and relevance in the follow-up answers, the search context must remain the same throughout the conversation. Because the search context has changed, the Search Agent ends the previous conversation and generates a new answer based on the updated context. This marks the beginning of a new conversation that users can continue to explore through follow-up questions. For more information, see Conversation context and lifecycle.

Conversation context and lifecycle

In the Search Agent workflow, a conversation consists of a series of interactions that begins with an initial user query and an initial generated answer, referred to as the conversation’s head answer. This is followed by a series of follow-up questions and corresponding generated answers.

To ensure continuity and relevance in the follow-up answers, each conversation is grounded in a fixed search context that’s defined at the moment the initial query is submitted. The search context includes the query itself, and any applied filters, selected facets, sorting, etc., present in the search interface at that time.

The conversation context includes both the search context and the full conversation history, which includes the initial query, subsequent follow-up queries, and the answers generated along the way.

A change in the search context marks the end of the current conversation and the beginning of a new conversation, even if the initial query remains the same.

Figure: Search Agent conversation lifecycle

The search context changes, and a new conversation begins, when a user performs the following on the search interface:

- Submits a new query in the main search box (not a follow-up question).
- Applies or removes facets.
- Switches tabs.
- Modifies sorting.

When a change in search context occurs:

- The Search Agent generates a new answer for the query based on the search context at that time.
- A new conversation begins, which means answers to follow-up questions will now be based on the new search context and the new conversation history.

The conversation context is a key component of the Search Agent answer generation process, enabling the system to produce follow-up answers that remain relevant, coherent, and aligned with the user’s evolving intent throughout the conversation.
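
The lifecycle rules in this section can be pictured with a small Python sketch. It is only an illustration under stated assumptions: `SearchContext`, `Conversation`, and `ConversationTracker` are hypothetical names, the conversation ID is a random value generated locally, and the sketch assumes that resubmitting an identical search context keeps the current conversation, which this article doesn't specify.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SearchContext:
    """Hypothetical snapshot of the search interface when a query is submitted."""
    query: str
    facets: frozenset = frozenset()
    tab: str = "All"
    sort: str = "relevancy"


@dataclass
class Conversation:
    """Hypothetical conversation: a fixed search context plus its history."""
    context: SearchContext
    conversation_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    history: List[Tuple[str, str]] = field(default_factory=list)


class ConversationTracker:
    def __init__(self) -> None:
        self.current: Optional[Conversation] = None

    def on_search_context(self, context: SearchContext, head_answer: str) -> Conversation:
        """Query, facet, tab, or sort action: a changed context starts a new conversation."""
        if self.current is None or context != self.current.context:
            self.current = Conversation(context=context)
            # The head answer is the first entry in the new conversation history.
            self.current.history.append((context.query, head_answer))
        return self.current

    def on_follow_up(self, question: str, answer: str) -> Conversation:
        """A follow-up question extends the history without touching the search context."""
        if self.current is None:
            raise RuntimeError("No conversation has been started yet")
        self.current.history.append((question, answer))
        return self.current
```

Making `SearchContext` immutable and comparable by value keeps the "has the context changed?" check to a single equality test, so a facet change with the same query text still produces a new conversation.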

Content retrieval and answer generation

The Search Agent operates through an iterative cycle of content retrieval and answer generation within a conversational search experience. It applies agentic reasoning to dynamically orchestrate various Coveo platform features and tools, ensuring each response is informed, relevant, and context-aware.

This section provides an overview of the answer generation process and describes how the Search Agent uses reasoning and decision-making when generating answers.

Note

Your enterprise content permissions are enforced during content retrieval. This ensures that an output generated from the retrieved content only shows the information that the authenticated user is allowed to access.

Answer generation overview

To make sure that answers are generated based on the most relevant content, the Search Agent orchestrates two layers of content retrieval before sending the final list of the most relevant passages to the answer generation model.

Note

This section references Coveo Machine Learning (Coveo ML) models and configurations that are required as part of the Search Agent implementation.

To accomplish this, the Search Agent orchestrates the use of the following Coveo platform features and tools:

- The Coveo Search API for first-stage content retrieval to retrieve the most relevant items from the index.
- The Passage Retrieval (CPR) model for second-stage content retrieval to retrieve the most relevant passages from the items retrieved during first-stage content retrieval.
- The answer generation model to create a detailed grounded prompt based on the retrieved passages and the query, which it sends to a third-party generative LLM to generate the final answer.

Note

For information on how the Search Agent uses reasoning and decision-making when generating answers, see Search Agent reasoning and decision-making.

Figure: Search Agent answer generation flow

1. The Search Agent sends the query to the Coveo Search API for first-stage content retrieval, where the most relevant items are retrieved from the index. The Search Agent uses the query pipeline that’s associated with the search interface to apply rules and machine learning to optimize personalization and relevance when retrieving the content.

2. The most relevant items retrieved during first-stage content retrieval, along with the query, are sent to the CPR model for second-stage content retrieval. The CPR model retrieves the most relevant passages from the most relevant items.

   Note

   The maximum number of items the CPR model considers when retrieving the most relevant passages is set by the Items to consider option in the Search Agent configuration.

3. The Search Agent calls the answer generation model and provides it with the most relevant passages obtained during second-stage content retrieval, as well as the query. The answer generation model creates a prompt that includes instructions, the query, and the most relevant passages.

   Note

   The answer generation model that’s used by the Search Agent is created and managed by Coveo personnel.

4. To generate the answer, the Search Agent uses a third-party generative LLM that’s hosted on an external foundation model service server. The prompt that’s created by the answer generation model is sent to the foundation model service where the LLM generates the answer based only on the grounded prompt.

   While the foundation model service hosts the generative LLM and processes your data for the purpose of generating answers, it doesn’t store your data.

5. The generated answer is returned to the Search Agent and displayed in the search interface.
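
To make the two retrieval layers and the grounded prompt more concrete, here is a toy Python sketch. It is not how Coveo implements these stages: the term-overlap scoring, the sentence-based passage splitting, the prompt wording, and the names `first_stage`, `second_stage`, and `build_prompt` are all assumptions chosen for illustration. The `items_to_consider` parameter loosely mirrors the Items to consider option.

```python
import re
from typing import Dict, List


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of distinct query terms found in the text."""
    terms = set(re.findall(r"\w+", query.lower()))
    words = set(re.findall(r"\w+", text.lower()))
    return len(terms & words)


def first_stage(query: str, index: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Stand-in for first-stage retrieval: rank whole items against the query."""
    return sorted(index, key=lambda item: overlap(query, item["body"]), reverse=True)


def second_stage(
    query: str,
    items: List[Dict[str, str]],
    items_to_consider: int,
    max_passages: int,
) -> List[str]:
    """Stand-in for second-stage retrieval: rank passages from the top items only."""
    passages = [
        sentence.strip()
        for item in items[:items_to_consider]
        for sentence in item["body"].split(".")
        if sentence.strip()
    ]
    ranked = sorted(passages, key=lambda p: overlap(query, p), reverse=True)
    return ranked[:max_passages]


def build_prompt(query: str, passages: List[str]) -> str:
    """Assemble a grounded prompt: instructions, retrieved passages, then the query."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the passages below.\n"
        f"Passages:\n{context}\n"
        f"Question: {query}"
    )
```

The point of the sketch is the shape of the flow: item-level ranking narrows the index, passage-level ranking narrows the items, and only the surviving passages reach the prompt that is sent to the LLM.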

Search Agent reasoning and decision-making

Instead of relying on a static execution pattern of content retrieval and answer generation, the Search Agent uses agentic reasoning and decision-making to orchestrate an iterative, multi-step process that determines the best course of action for generating answers in a conversational search experience.

New answers aren’t generated in isolation or by initiating a completely new search, but rather by building on the current conversation. With each follow-up question, the Search Agent triggers a new round of content retrieval based on the updated conversational context.

For a given query, the Search Agent:

1. Analyzes the query in relation to the conversation context, which includes the search context and the conversation history, to understand the user’s intent.

2. Applies reasoning based on its analysis to either reformulate the query, taking the user’s intent and the conversation context into account, or ask the user a clarifying question to gather more information before reformulating the query.

3. Initiates a new round of content retrieval using the reformulated query.

   Note

   The Search Agent can also decide to respond using the current context and previously retrieved passages, without performing a new search when it isn’t needed.

4. Evaluates whether the retrieved content is sufficiently relevant to the query, taking the full conversation context into account. This evaluation occurs after both the first-stage and second-stage content retrieval steps. If the Search Agent determines that the content isn’t relevant enough after these stages, it returns to the reasoning step to either refine the query or ask a clarifying question. If the content meets the relevance threshold, the agent proceeds to generate a response.

5. Maintains the conversation context and history in memory, allowing it to build on previous interactions and responses to iteratively refine its understanding of the user’s intent and generate more relevant answers as the conversation progresses.

Note

For the initial query, the Search Agent initiates the answer generation process without any query reformulations or clarifying questions, as the Search Agent is designed to provide an answer to the initial query whenever possible without asking the user for more information. The answer is generated based on the initial query and search context, which includes any applied filters, selected facets, sorting, etc. at the moment the initial query is submitted.
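
The reasoning loop for a follow-up question can be sketched in Python as follows. This is a simplified illustration, not the Search Agent's actual logic: the function `run_follow_up`, the injected callables, the fixed clarifying message, and the `max_rounds` cap are all hypothetical, and the option of answering from previously retrieved passages without a new search is left out.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (question, answer) pairs from earlier turns

CLARIFY = "Could you tell me more about what you are looking for?"


def run_follow_up(
    query: str,
    history: History,
    reformulate: Callable[[str, History], Optional[str]],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, List[str]], bool],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 2,
) -> Tuple[str, str]:
    """Return ("answer", text) or ("clarify", text) for one follow-up question."""
    current = query
    for _ in range(max_rounds):
        # Reasoning step: reformulate with the conversation context, or give up
        # (None) when the intent is too unclear to build a query.
        rewritten = reformulate(current, history)
        if rewritten is None:
            return ("clarify", CLARIFY)
        passages = retrieve(rewritten)
        if is_relevant(rewritten, passages):
            answer = generate(rewritten, passages)
            history.append((query, answer))  # keep the turn for later follow-ups
            return ("answer", answer)
        current = rewritten  # not relevant enough: go back and refine the query
    # Assumption: once the round budget is spent, fall back to a clarifying question.
    return ("clarify", CLARIFY)
```

Passing the reformulation, retrieval, relevance check, and generation steps in as callables keeps the control flow visible on its own, separate from whatever models or services perform each step.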

What’s next?

Implement the Search Agent in your Coveo organization.