About Catalog Semantic Encoder (CSE)
Contact your Coveo representative to enable Catalog Semantic Encoder (CSE) in your Coveo organization.
Coveo Machine Learning Catalog Semantic Encoder (CSE) enhances product discovery by interpreting the meaning behind user queries. When integrated within a Coveo-powered commerce search interface, CSE leverages vector search and natural language processing (NLP) to retrieve products based on their semantic similarity to the query, going beyond traditional keyword-based matching.
In digital commerce environments, product descriptions and attributes often differ from the language shoppers use in queries. CSE bridges this gap by aligning customer queries more closely with the content included in your catalog data.
By using CSE in a Coveo-powered commerce search interface, you:
- Enhance relevance: Improve the quality of search results by interpreting the intent behind user queries, complementing exact keyword matches.
- Handle complex queries: Manage verbose, vague, or conversational queries more effectively by understanding the query’s underlying meaning.
- Reduce manual tuning efforts: Automatically identify relationships between products and queries, minimizing the need for manual synonyms or boosting rules.
Prerequisites
Before contacting your Coveo representative to enable CSE in your Coveo organization, make sure that you:
- Have configured a catalog entity.
- Have populated and mapped the standard commerce fields in your catalog entity.
- Have configured commerce search pages.
- Use a tracking ID to identify the storefront where you want to integrate the model.
How CSE works
CSE uses multilingual semantic encoders to create a vector representation of the textual information found in your catalog data and places products into a high-dimensional vector space. At query time, CSE converts the query into a vector in the same high-dimensional space as the product vectors. It then computes the similarity between the query vector and the product vectors to retrieve the most relevant products.
Here’s how the process works:
- When the model is trained, it uses the information contained in your catalog data to place the products into a high-dimensional vector space where similar products are close together, and dissimilar products are far apart.
- When a user enters a query, CSE encodes the query into a vector in the same high-dimensional space as the product vectors. This encoding captures the semantic meaning of the query.
- The CSE model then computes the similarity between the query vector and the product vectors to retrieve the products that are closest to the query in the vector space. This allows CSE to retrieve products that are semantically similar to the query, even if the query doesn’t contain the exact words used in the product data.
- Finally, CSE works with Coveo ranking algorithms to optimize the ranking of the results based on both semantic and keyword relevance. Note that CSE is designed to work seamlessly with Coveo AI ranking models, such as Automatic Relevance Tuning (ART) and Intent-Aware Product Ranking (IAPR).
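The retrieval step described above can be sketched in a few lines of Python. This is a toy illustration, not Coveo's implementation: the three-dimensional vectors are hand-written stand-ins for encoder output, cosine similarity is assumed as the similarity measure, and the names `cosine_similarity`, `retrieve`, and `PRODUCT_VECTORS` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def retrieve(query_vector, product_vectors, top_k=2):
    """Rank products by similarity to the query and keep the top_k."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in product_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; a real encoder would produce far more dimensions.
PRODUCT_VECTORS = {
    "4K monitor": [0.9, 0.1, 0.0],
    "HD television": [0.8, 0.3, 0.1],
    "ergonomic athletic sneakers": [0.0, 0.2, 0.9],
}

# Stand-in for the encoded query "high-definition display".
QUERY_VECTOR = [0.9, 0.15, 0.0]

results = retrieve(QUERY_VECTOR, PRODUCT_VECTORS)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, the two display products rank above the sneakers even though the query shares no words with their names, which is the behavior the steps above describe.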
While CSE significantly enhances search capabilities through semantic understanding, optimal results are achieved when used alongside other Coveo AI and search features. Semantic matching alone may not always capture the full context of a query, and thus benefits from complementary keyword-based search capabilities. For example, semantic matching can be less precise with very technical and specialized terminology because such terms may lack sufficient representation in general semantic training data. Short or ambiguous queries might also have limited semantic context, necessitating additional keyword-based support or tuning for accurate results.
How CSE processes queries
CSE uses a semantic query, which is a clean version of the user’s search query that represents the core meaning or intent. The semantic query sits between the user’s raw input (the original query) and the fully processed query sent to the index (index query).
The semantic query is extracted from the basic query expression and includes only keywords and phrases. It excludes elements that don’t contribute to the core meaning, such as:
- Query syntax operators (such as `AND`, `OR`, and `NOT`)
- Field expressions (such as `@field=value`)
- Thesaurus rule expansions
- Stop word removals
For example, if you have the following rules:
- Thesaurus rule expanding `tv` to `television`
- Stop word rule removing `stand` from queries
The raw query `tv stand for the living room AND @brand=Acme` yields a semantic query like `tv stand living room Acme`.
- Field expressions and query syntax operators are removed, leaving only the keywords that contribute to the core meaning.
- Thesaurus expansions aren’t applied, so `tv` remains `tv` in the semantic query.
- Stop words aren’t removed, so `stand` remains `stand` in the semantic query.
This keyword-focused approach ensures that CSE can effectively interpret the semantic meaning of the query without being affected by complex search syntax or query transformations applied later in the query pipeline.
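As a rough illustration of stripping query syntax, the sketch below removes the operators `AND`, `OR`, and `NOT` and any `@field=value` expression from a raw query. It is an assumption-laden approximation rather than the extraction CSE performs: `extract_semantic_query` is a hypothetical helper, it drops each field expression whole, and it doesn't model how the actual feature treats field values or filler words.

```python
import re

# Matches a field expression such as @brand=Acme or @brand=="Acme Inc".
FIELD_EXPRESSION = re.compile(r'@\w+\s*=+\s*(?:"[^"]*"|\S+)')

# Uppercase query syntax operators to discard (assumed to be case-sensitive).
OPERATORS = {"AND", "OR", "NOT"}


def extract_semantic_query(raw_query):
    """Keep only plain keywords from a raw query (hypothetical helper)."""
    without_fields = FIELD_EXPRESSION.sub(" ", raw_query)
    keywords = [
        token for token in without_fields.split() if token not in OPERATORS
    ]
    return " ".join(keywords)


print(extract_semantic_query("tv stand AND NOT @color=black"))
```

No thesaurus or stop word rule is consulted, so `tv` and `stand` pass through unchanged, in line with the description above.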
Query correction interaction
When the query correction feature is applied to a user’s query:
- The index returns items based on the corrected query.
- CSE boosts items based on the semantic query, which is extracted before query correction is applied and therefore still contains any typos from the original query.
This means that CSE can help surface relevant products even when the original query contains typos, potentially complementing the query correction mechanism by retrieving semantically similar products based on the user’s intent.
Redirect and trigger interaction
While CSE doesn’t directly modify the redirect trigger rules, it can affect their execution.
For example, if a redirect trigger is set up to redirect users to a product detail page (PDP) when they query for a SKU or part number, CSE could block the redirection from occurring if it interprets the query semantically and retrieves other products instead.
To prevent this from happening, you can configure query pipeline conditions on the CSE model association so that the model isn’t triggered for specific query patterns.
For example, you could set a condition to prevent CSE from being applied to all-numerical queries (for SKUs or product codes).
This ensures that CSE enhances semantic search for natural language queries while allowing business-critical redirects to function as intended.
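The logic behind such a condition can be sketched as a simple predicate. The snippet below is plain Python for illustration only; it isn't Coveo query pipeline condition syntax, and `should_apply_semantic_model` is a hypothetical name.

```python
import re

# A query made up only of digits is treated as a SKU or product code.
ALL_NUMERIC = re.compile(r"\d+")


def should_apply_semantic_model(query):
    """Return False for all-numerical queries so redirects can fire."""
    return ALL_NUMERIC.fullmatch(query.strip()) is None


for query in ["12345", "comfortable running shoes", "SKU12345"]:
    print(query, should_apply_semantic_model(query))
```

Note that a mixed code such as `SKU12345` isn't all-numerical, so this particular pattern would still let the model run; a broader pattern would be needed to exclude codes of that shape.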
Use case examples
Here are some examples of how CSE can enhance product discovery in a Coveo-powered commerce search interface:
- A visitor searches for `high-definition display`. Traditionally, this might not match products described in catalog data as `4K monitor`. With CSE, the system identifies these phrases as semantically similar, successfully returning relevant products.
- A query like `comfortable running shoes` will effectively match products described as `ergonomic athletic sneakers` in catalog data.
Here are some additional examples where CSE could be less effective when used alone:
- Short queries such as `XS shirt` might return less precise matches due to their limited semantic context and the ambiguity inherent in short or highly abbreviated terms.
- Highly technical terms, brand-specific jargon, or product codes like `SKU12345` may not be represented well in general semantic vector spaces.
In these scenarios, keyword-based search combined with intent-aware ranking, popularity-based models, and targeted manual tuning helps ensure that such queries return precise and relevant results.