About Semantic Encoder (SE)

When used in a Coveo-powered search interface with Relevance Generative Answering (RGA), a Coveo Machine Learning (Coveo ML) Semantic Encoder (SE) model uses vector search to retrieve items from your index based on their semantic similarity to the query. To do this, the model first creates embeddings for the enterprise content that you specify, and then references those embeddings at query time.

When a user enters a query, the query passes through a query pipeline where pipeline rules and machine learning are applied to optimize relevance. In addition to traditional lexical (keyword) search, the SE model adds vector-based search capabilities to your search interface. Vector search improves search results by using embeddings to retrieve the items in the index that are most semantically similar to the query. The most relevant search results are then sent to the RGA model for answer generation (see RGA overview).
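As an illustration of retrieval by semantic similarity, the following Python sketch ranks items by the cosine similarity between a query embedding and item embeddings. It is not Coveo's implementation: the item IDs, the three-dimensional vectors, and the `cosine_similarity` and `top_k` helpers are all hypothetical. In practice an embedding model produces the vectors, and they have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_embedding, item_embeddings, k=2):
    """Return the IDs of the k items most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, vector), item_id)
        for item_id, vector in item_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical, hand-made embeddings (assumption: created ahead of query time).
ITEM_EMBEDDINGS = {
    "ml-overview": [0.9, 0.1, 0.0],
    "ml-benefits": [0.6, 0.6, 0.1],
    "billing-faq": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the user's query (assumption: created at query time).
QUERY_EMBEDDING = [0.85, 0.2, 0.05]

results = top_k(QUERY_EMBEDDING, ITEM_EMBEDDINGS, k=2)
print(results)  # ['ml-overview', 'ml-benefits']
```

The item embeddings are computed once, ahead of time, and only the query is embedded at query time, which mirrors the two phases described above.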

Traditional lexical search relies on matching keywords or phrases that appear in a query with the words and phrases in items. Due to the exact-match nature of lexical search, user queries that yield the best results tend to be concise and precise. The user needs to know what they’re searching for and use the proper keywords.

However, what if a user doesn’t know exactly what they’re looking for, or what if the query is more complex? With the emergence of generative AI, customer expectations are evolving and search interfaces must be able to understand and provide answers for more complex natural language queries. A complex query is one where the user provides context and asks a question.

For instance, if a user enters the query “What is Coveo machine learning and what are its benefits”, lexical search results will include items that discuss “Coveo machine learning”. The results may also contain items with many occurrences of the word “benefit”, which may not be relevant to the query. Lexical search is fast, cost-efficient, and has a proven track record in many enterprises. However, it doesn’t consider the meaning of words or their context. While lexical search isn’t suited to finding similarities in items based on meaning, vector-based search is designed for just that purpose.
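The keyword-matching limitation can be seen in a deliberately simplified Python sketch. It only counts how often query terms appear in an item, which is an assumption made for illustration: production lexical engines use more refined scoring. The item IDs, item texts, stop word list, and the `tokenize` and `keyword_score` helpers are hypothetical.

```python
import re

# Hypothetical stop word list; real engines use larger, language-specific ones.
STOP_WORDS = {"what", "is", "and", "are", "its"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_score(query, item_text):
    """Count item tokens that exactly match a non-stop-word query term."""
    query_terms = set(tokenize(query)) - STOP_WORDS
    return sum(1 for token in tokenize(item_text) if token in query_terms)


QUERY = "What is Coveo machine learning and what are its benefits"

# Hypothetical item texts.
ITEMS = {
    "ml-intro": "Coveo machine learning models improve relevance automatically.",
    "hr-benefits": (
        "Employee benefits: health benefits, dental benefits, "
        "and what benefits are taxable."
    ),
}

scores = {item_id: keyword_score(QUERY, text) for item_id, text in ITEMS.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {'ml-intro': 3, 'hr-benefits': 4}
print(ranked)  # ['hr-benefits', 'ml-intro']
```

In this toy example, the unrelated item outranks the relevant one purely because it repeats a query keyword, with no regard for what the query means.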

When combined with Relevance Generative Answering (RGA), an SE model allows a Coveo-powered search interface to extract the meaning of a complex natural language query and find the most relevant items. This, in turn, ensures that the RGA model uses the most relevant content to generate answers.

What’s next?