About Catalog Semantic Encoder (CSE)

This is for: System Administrator
Important

Contact your Coveo representative to enable Catalog Semantic Encoder (CSE) in your Coveo organization.

Coveo Machine Learning Catalog Semantic Encoder (CSE) enhances product discovery by interpreting the meaning behind user queries. When integrated within a Coveo-powered commerce search interface, CSE leverages vector search and natural language processing (NLP) to retrieve products based on their semantic similarity to a query, going beyond what traditional keyword-based search can match.

In digital commerce environments, product descriptions and attributes often differ from the language shoppers use in queries. CSE bridges this gap by aligning customer queries more closely with the content included in your catalog data.

By using CSE in a Coveo-powered commerce search interface, you:

  • Enhance relevance: Improve the quality of search results by interpreting the intent behind user queries, complementing exact keyword matches.

  • Handle complex queries: Manage verbose, vague, or conversational queries more effectively by understanding the query’s underlying meaning.

  • Reduce manual tuning efforts: Automatically identify relationships between products and queries, minimizing the need for manually defined synonyms or boosting rules.

Prerequisites

Before contacting your Coveo representative to enable CSE in your Coveo organization, make sure that you:

How CSE works

CSE uses multilingual semantic encoders to create a vector representation of the textual information found in your catalog data and places products into a high-dimensional vector space. At query time, CSE converts the query into a vector in the same high-dimensional space as the product vectors. It then computes the similarity between the query vector and the product vectors to retrieve the most relevant products.

Here’s how the process works:

  1. When the model is trained, it uses the information contained in your catalog data to place the products into a high-dimensional vector space where similar products are close together, and dissimilar products are far apart.

  2. When a user enters a query, CSE encodes the query into a vector in the same high-dimensional space as the product vectors. This encoding captures the semantic meaning of the query.

  3. The CSE model then computes the similarity between the query vector and the product vectors to retrieve the products that are closest to the query in the vector space. This allows CSE to retrieve products that are semantically similar to the query, even if the query doesn’t contain the exact words used in the product data.

  4. Finally, CSE works with Coveo ranking algorithms to optimize the ranking of the results based on both semantic and keyword relevance. Note that CSE is designed to work seamlessly with Coveo AI ranking models, such as Automatic Relevance Tuning (ART) and Intent-Aware Product Ranking (IAPR).
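The retrieval steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses cosine similarity as the similarity measure and hand-written three-dimensional vectors in place of real encoder output. The actual CSE encoders, vector dimensions, and similarity function aren't described here, and the names PRODUCT_VECTORS, cosine_similarity, and retrieve are hypothetical.

```python
import math

# Hypothetical product vectors, written by hand for illustration only.
# A real semantic encoder would produce vectors with far more dimensions.
PRODUCT_VECTORS = {
    "4K monitor": [0.9, 0.1, 0.0],
    "ergonomic athletic sneakers": [0.0, 0.2, 0.9],
    "wireless keyboard": [0.6, 0.7, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, product_vectors, top_k=2):
    """Rank products by similarity to the query vector and keep the top_k closest."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in product_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumed encoding of the query "high-definition display" (not real encoder output).
query_vector = [0.8, 0.2, 0.1]
results = retrieve(query_vector, PRODUCT_VECTORS)

for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, "4K monitor" ranks first even though the query shares no words with it, which is the behavior step 3 describes. The ranking optimization in step 4 is not modeled here.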

Important

While CSE significantly enhances search capabilities through semantic understanding, it achieves optimal results when used alongside other Coveo AI and search features. Semantic matching alone may not always capture the full context of a query, so it benefits from complementary keyword-based search capabilities.

For example, semantic matching can be less precise with very technical and specialized terminology because such terms may lack sufficient representation in general semantic training data. Short or ambiguous queries might also have limited semantic context, necessitating additional keyword-based support or tuning for accurate results.

Use case examples

Here are some examples of how CSE can enhance product discovery in a Coveo-powered commerce search interface:

  • A visitor searches for "high-definition display". Traditionally, this might not match products described in catalog data as "4K monitor". With CSE, the system identifies these phrases as semantically similar and returns the relevant products.

  • A query like "comfortable running shoes" can match products described as "ergonomic athletic sneakers" in catalog data.

Here are some additional examples where CSE could be less effective when used alone:

  • Short queries such as "XS shirt" might return less precise matches due to their limited semantic context and the ambiguity inherent in short or highly abbreviated terms.

  • Highly technical terms, brand-specific jargon, or product codes like "SKU12345" may not be represented well in general semantic vector spaces.

In these scenarios, keyword-based search combined with intent-aware ranking, popularity-based models, and targeted manual tuning helps ensure that such queries return precise and relevant results.
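To illustrate why keyword matching complements semantic matching for a query such as "SKU12345", here is a minimal Python sketch. It is not how Coveo combines signals: the token-overlap keyword score, the linear blend, the semantic_weight parameter, and the semantic similarity values are all assumptions made up for the example.

```python
def keyword_score(query, text):
    """Fraction of query tokens that appear verbatim in the product text."""
    query_tokens = query.lower().split()
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    matches = sum(1 for token in query_tokens if token in text_tokens)
    return matches / len(query_tokens)


def hybrid_score(keyword, semantic, semantic_weight=0.5):
    """Linear blend of a keyword score and a semantic score (an assumed formula)."""
    return (1 - semantic_weight) * keyword + semantic_weight * semantic


# Hypothetical candidates with made-up semantic similarities for the query.
# The product code carries little semantic meaning, so both values are low
# and the wrong product happens to score slightly higher.
candidates = [
    {"text": "USB-C cable SKU12345", "semantic": 0.30},
    {"text": "USB-C charger SKU99999", "semantic": 0.35},
]
query = "SKU12345"

# Ranking by semantic similarity alone puts the wrong product first.
semantic_only = sorted(candidates, key=lambda c: c["semantic"], reverse=True)

# Blending in the exact keyword match restores the expected order.
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(keyword_score(query, c["text"]), c["semantic"]),
    reverse=True,
)

print("Semantic only:", semantic_only[0]["text"])
print("Hybrid:", ranked[0]["text"])
```

In this toy setup, the exact token match on the product code outweighs the small difference in semantic similarity, so the product that actually carries the code is ranked first.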