Manage thesaurus rules
The thesaurus of a Coveo organization is a list of equivalent words used to transparently add keywords or phrases to the query entered by a user before it’s sent to the index.
The list of thesaurus rules for the index of an organization is empty by default, but members with the required privileges can define query pipeline thesaurus rules in their organization. Thesaurus rules are defined independently for each pipeline.
For example, suppose your index contains several items pertaining to the unfortunately named ACME CTRLR game controller (user manuals, troubleshooting articles, etc.).
Usage analytics reports indicate that a sizable portion of end users who are obviously looking for information on this product in your Coveo-powered community portal are actually searching for acme pad, and not getting any relevant results.
To address the issue, you create a thesaurus rule that includes acme ctrlr when acme pad is part of the user’s query.
Prerequisites
Before creating a rule, first make sure that you have the following:
-
Access to a search page
You need access to a Coveo-powered search interface to be able to test the rule that you create.
-
An existing query pipeline
The queries from the search page must travel through a specific query pipeline.
-
Required privileges
You need specific privileges to be able to add and edit rules in a query pipeline.
Once you meet these requirements, you can create a rule on the Query Pipelines (platform-ca | platform-eu | platform-au) page. To test the rule, use the A/B test feature to compare the results of the rule with the results of the original pipeline.
Common use cases
You can use thesaurus rules for various reasons.
-
Different terminology
You use different terminology to designate the same reality (see "Synonym" thesaurus rule type).
Example: You have two versions of a document that you send to new employees. Depending on the version, the document is named New Employee Guide or New Employee Manual.
-
Acronyms
Your users use acronyms in their search and you want to fine-tune what they get as results (see "Synonym" thesaurus rule type).
Example: You notice a high query count for b2b. Therefore, you set a thesaurus rule so that items that only contain business-to-business are also returned as search results.
-
Name changes
Your users search for a product name that has recently been changed, and some items still refer to the old name (see "One-way synonym" thesaurus rule type).
Example: One of your products named Nice Product has changed to Awesome Product. Therefore, you set a thesaurus rule so that users who search for Nice Product also obtain items related to Awesome Product as search results.
-
Number normalization
Your service agents search for Salesforce case numbers with leading zeros. You want the search to also include case numbers without the leading zeros (see "One-way synonym" thesaurus rule type and Use Java-style regular expressions).
Examples:
-
When someone searches for 00001008, you want the system to automatically search for 00001008 OR 1008.
The matching regular expression could be:
/[0]*(?<num>[1-9]{1}[0-9]*)/
where num is a captured group name. Each named capturing group must be enclosed in parentheses, ().
The replacement expression would be:
num
-
Inversely, you want people searching for 1008 to automatically also search for 1008 OR 00001008.
The matching regular expression can be:
/(?<num>[0-9]{4})/
where num is a captured group name. Each named capturing group must be enclosed in parentheses, ().
The replacement expression would be:
0000_num_
-
Query typos
You want to correct common typos in user queries (see "Replace" thesaurus rule type).
Example: You have a new product line named Wild Rose, but you notice that users often search for wildrose, without the space between the two words. Therefore, you set a thesaurus rule so that when users search for wildrose, Coveo considers the query to be Wild Rose.
In this example, the Original expressions field contains the expression "wildrose" and the Replacement expressions field contains the string "Wild Rose". The double quotes ensure that the system searches for and replaces the exact term.
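Before saving a regular expression in a rule, it can help to try it against sample terms. The following Java sketch exercises the two number normalization patterns from the use cases above with java.util.regex. The expand helper, its prefix parameter, and the OR formatting are hypothetical and only imitate the effect of a one-way synonym; the sketch also assumes that a pattern must match the whole term.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseNumberExpansion {
    // Patterns taken from the number normalization examples, without the / / delimiters.
    static final Pattern STRIP_ZEROS = Pattern.compile("[0]*(?<num>[1-9]{1}[0-9]*)");
    static final Pattern FOUR_DIGITS = Pattern.compile("(?<num>[0-9]{4})");

    // Hypothetical helper: imitates a one-way synonym by OR-ing the original
    // term with an additional term built from the captured group "num".
    static String expand(Pattern pattern, String prefix, String term) {
        Matcher m = pattern.matcher(term);
        if (!m.matches()) {
            return term; // assumption: the rule only applies to a whole-term match
        }
        String additional = prefix + m.group("num");
        return additional.equals(term) ? term : term + " OR " + additional;
    }

    public static void main(String[] args) {
        System.out.println(expand(STRIP_ZEROS, "", "00001008"));  // 00001008 OR 1008
        System.out.println(expand(FOUR_DIGITS, "0000", "1008"));  // 1008 OR 00001008
    }
}
```

Note that the second pattern only matches four-digit numbers, so a five-digit case number would be left untouched.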
Create thesaurus rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to add a rule, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the upper-right corner of the page, click Add Rule to access the Add a Thesaurus Rule subpage.
-
On the Add a Thesaurus Rule subpage, under Type, select the type of thesaurus rule you want to add. Options are Synonym, One-way synonym, Replace, and Match terms exactly.
-
Depending on your selection, you must enter expressions (keywords or phrases):
-
If you selected Synonym, in the Expressions inputs, enter the desired expressions.
-
If you selected One-way synonym, in the Original expressions and Additional expressions inputs, enter the desired expressions.
-
If you selected Replace, in the Original expressions and Replacement expressions inputs, enter the desired expressions.
-
If you selected Match terms exactly, in the Original expressions and (optionally) the Exact match replacement expressions inputs, enter the desired expressions.
-
-
On the right-hand side, under Condition, you can optionally select a query pipeline condition in the dropdown menu or create a new one. Your rule applies to queries meeting this condition.
-
Under Description, optionally enter text with information that could help you and your colleagues to manage the rule in the future.
-
Click Add Rule.
The new thesaurus rule is effective immediately.
Edit thesaurus rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to edit a rule, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
Click the rule you want to edit, and then click Edit in the Action bar to access the Edit a Thesaurus Rule subpage.
-
On the Edit a Thesaurus Rule subpage, under Type, select the type of thesaurus rule you want to edit. Options are Synonym, One-way synonym, Replace, and Match terms exactly.
-
Depending on your selection, you must enter expressions (keywords or phrases):
-
If you selected Synonym, in the Expressions inputs, enter the desired expressions.
-
If you selected One-way synonym, in the Original expressions and Additional expressions inputs, enter the desired expressions.
-
If you selected Replace, in the Original expressions and Replacement expressions inputs, enter the desired expressions.
-
If you selected Match terms exactly, in the Original expressions and (optionally) the Exact match replacement expressions inputs, enter the desired expressions.
-
-
On the right-hand side, under Condition, you can optionally select a query pipeline condition in the dropdown menu or create a new one. Your rule applies to queries meeting this condition.
-
Under Description, optionally enter text with information that could help you and your colleagues to manage the rule in the future.
-
Click Save.
The edited thesaurus rule is effective immediately.
Duplicate thesaurus rules
When creating thesaurus rules in a query pipeline, you may want to create a new rule that’s similarly configured to an existing one. An efficient way to do this is to duplicate the existing rule and then modify it as needed.
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline for which you want to duplicate rules, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the Thesaurus subtab, select each checkbox next to the rules you want to duplicate within the same pipeline (typically to create slightly different rules).
-
In the Action bar, click Duplicate.
The duplicated thesaurus rules appear at the bottom of the list in the pipeline component tab.
Copy thesaurus rules to another pipeline
When you have multiple query pipelines that serve different purposes but require similar rules, you may want to copy thesaurus rules from one pipeline to another.
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline from which you want to copy rules, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the Thesaurus subtab, select each checkbox next to the rules you want to copy to another pipeline.
-
In the Action bar, click More, and then click Copy to….
-
In the dialog that appears, select the target pipeline to which you want to copy the rules, and then click Copy.
Review information about the rule’s creation or last modification
You can verify who created or last modified a given thesaurus rule by inspecting the Details column of the Thesaurus subtab. The Details column also indicates the date and time at which the rule was created or last modified.
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline containing the rule whose details you want to review, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the Thesaurus subtab, inspect the information of the Details column for the desired rule.
Delete thesaurus rules
You can delete thesaurus rules from a query pipeline.
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline for which you want to delete rules, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the Thesaurus subtab, select each checkbox next to the rules you want to delete.
-
In the Action bar, click More, and then click Delete.
-
Click Delete to confirm.
Change the rule order
Query pipeline rules are executed in the order in which they appear on the page until a condition is satisfied.
In the context of thesaurus rules, this also means that only one thesaurus rule can apply per expression (keyword or phrase). If a given query matches multiple thesaurus rules that expand the same expression, only the first matching rule in the query pipeline applies.
Example: Suppose that two Synonym thesaurus rules both contain the same HDMI expression, and that the first one in the list is the include HDMI, "HDMI cable" when any are present rule.
Since that rule is the first to appear in the list, it's the only rule that applies if a user query contains either HDMI or HDMI cable.
If you want the thesaurus to also consider "high definition multimedia interface", you must add that expression to the same rule, so that a single thesaurus rule contains all three expressions.
To change the position of a rule in the list:
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline for which you want to manage the rules' execution order, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Thesaurus.
-
In the Thesaurus subtab, click the rule whose position you want to change.
-
In the Action bar, click Move up or Move down to change the position of the rule.
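The first-match behavior described at the start of this section can be pictured with a small Java sketch. It's a toy model, not Coveo's implementation: each rule is reduced to its list of expressions, the second rule's content is an assumption, and ruleFor is a hypothetical helper that returns the first rule, in list order, containing a given expression.

```java
import java.util.List;
import java.util.Optional;

class FirstMatchingRule {
    // Toy model: a Synonym rule is represented only by its list of expressions.
    // Returns the first rule, in pipeline order, that contains the expression.
    static Optional<List<String>> ruleFor(List<List<String>> rulesInOrder, String expression) {
        return rulesInOrder.stream()
                .filter(rule -> rule.stream().anyMatch(e -> e.equalsIgnoreCase(expression)))
                .findFirst();
    }

    public static void main(String[] args) {
        // The second rule is an assumed example of a rule sharing the HDMI expression.
        List<List<String>> rules = List.of(
                List.of("HDMI", "HDMI cable"),
                List.of("HDMI", "high definition multimedia interface"));

        // Only the first rule containing HDMI is selected.
        System.out.println(ruleFor(rules, "hdmi").orElseThrow()); // [HDMI, HDMI cable]
    }
}
```

In this model, moving the second rule up would make it the one that applies to HDMI, which is why grouping all equivalent expressions into one rule is the more robust option.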
Leading practices
Consider the following leading practices when creating thesaurus rules:
Use thesaurus rules for legitimate reasons
-
Thesaurus rules are case-insensitive. Therefore, entering casing variants is unnecessary.
-
Identify searched keywords that don’t return optimal results because users aren’t entering the indexed synonym keywords, and then create a thesaurus entry that expands the query to the appropriate synonyms.
-
Be careful to enter only legitimate synonyms to prevent excessive search result broadening that can negatively affect search results ranking and confuse users.
-
Avoid using the thesaurus to expand a typo to its correct form. Based on the relative occurrences of a typo and its correct form in the index, the Did You Mean feature automatically corrects or suggests the better spelling.
Use thesaurus rules sparingly
-
When a query pipeline contains Coveo Machine Learning (Coveo ML) models, avoid or minimize the use of thesaurus rules. Thesaurus rules are static and can therefore negatively impact Coveo ML models, which follow trends. Therefore, create thesaurus rules with caution.
Note: However, if used carefully, thesaurus rules can be useful for training ART models. For more information, see Use thesaurus rules to train ART models.
-
For synonym rules, the thesaurus entry expansion is reciprocal: every keyword or expression in the thesaurus entry expands to all the others. Avoid entering too many synonyms in a given entry, since doing so drastically increases the length of the query.
-
Be aware that only one thesaurus rule can apply per expression (keyword or phrase). If a given query matches multiple thesaurus rules that expand the same expression, only the first matching rule in the query pipeline applies. This means that you must group equivalent expressions into a single thesaurus entry.
Example: Suppose that two thesaurus rules both expand the same Original expression ("HD TV"), the first one to "high-definition television" and the second one to "4K television".
Since the thesaurus rule that expands "HD TV" to "high-definition television" is the first to appear in the list of thesaurus rules, this is the only rule that applies if a user query contains HD TV.
If you want "HD TV" to expand to both "high-definition television" and "4K television", you must group them into a single rule.
-
Thesaurus rules apply before the stemming expansion made by the index, meaning that thesaurus entries are only expanded for exact matches (see About stemming). While you can consider entering multiple thesaurus rules for each stem variant (for example, singular/plural, conjugation, one versus two-word, and other synonym variants), the leading practice is to create a single thesaurus rule that covers the term and all its variants using a regular expression.
Example: When a user searches for kitty or kitten, you want the system to also automatically search for cat. Instead of creating a distinct thesaurus rule for each variant, you create a single rule whose original expression is a regular expression that matches both variants.
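As a rough illustration of the kitty/kitten example, the Java sketch below checks which terms a single regular expression would cover. The pattern kitt(?:y|en) is a hypothetical choice, not one given in this article, and the sketch assumes whole-term, case-insensitive matching.

```java
import java.util.regex.Pattern;

class StemVariantRule {
    // Hypothetical pattern covering both variants with one expression.
    static final Pattern KITTY_OR_KITTEN =
            Pattern.compile("kitt(?:y|en)", Pattern.CASE_INSENSITIVE);

    // True when the whole term matches the pattern.
    static boolean triggers(String term) {
        return KITTY_OR_KITTEN.matcher(term).matches();
    }

    public static void main(String[] args) {
        for (String term : new String[] {"kitty", "Kitten", "kittens", "cat"}) {
            System.out.println(term + " -> " + triggers(term));
        }
    }
}
```

In this sketch, kittens isn't covered, because the pattern has to match the whole term. A variant that must trigger the rule therefore needs to be included in the expression explicitly.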
Use thesaurus rules to train ART models
-
Adding thesaurus rules to a query pipeline allows you to get the same search results for the synonym or acronym of a term. This helps to train the ART model as users are more likely to click the same search results after querying the terms you’ve included in your thesaurus rules.
-
Make sure to remove thesaurus rules once the data period for training your ART model has expired. Otherwise, the wrong results can be returned in response to users' queries, as demonstrated by the following example.
Example: During a shopping session, a customer typically browses through multiple items after performing a query.
Taking advantage of this, you might want to promote a related item that wasn't searched for in the user’s query. For instance, you could set up a thesaurus rule that includes search results for sports socks every time a user searches for running shoes.
Since ART learns from searches and clicks to boost search results, the model will establish a relation between the query running shoes and the index items representing sports socks that were clicked after the user searched for running shoes.
After your ART model has been trained, you should remove this thesaurus rule. Otherwise, all the index items representing sports socks are included in your search results when the user queries for running shoes. This isn't desirable, as only the items representing sports socks that were clicked after searching for running shoes should be included in the search results.
For more information on linking queries to results, see Troubleshoot ART models.
Note: The search results for queries that include the terms you created the rule for will remain the same even after you remove the thesaurus rules (unless the Match the query option was selected when associating the model). To ensure that this is the case, you can test your ART model by comparing the results when it’s connected to a query pipeline that contains the thesaurus rules with the results when it's connected to a query pipeline that doesn’t.
Apply thesaurus rules conditionally when appropriate
In most cases, thesaurus rules don’t need query pipeline conditions. However, there are certain scenarios in which you must add a condition:
-
If the pipeline is used by different search interfaces (each denoted by its own search hub).
-
If the thesaurus entry is specific to a single language.
-
If the thesaurus entry is only used to transform the query (for example, in a Commerce application, you might use thesaurus rules to modify user input and extract certain values only when a specific condition is satisfied).
Test your thesaurus rules
-
Immediately test your thesaurus entry creation or modification in the search interface. You can use the Content Browser (platform-ca | platform-eu | platform-au) search interface to ensure that the rule works as expected.
-
Run A/B tests to monitor the effectiveness of your thesaurus entry on your search results relevance.
Handling contiguity characters
Contiguity characters, such as hyphens (-) or underscores (_), play a crucial role in term matching.
To ensure a rule is triggered as expected, these characters need to be explicitly included in your rule.
When creating a thesaurus rule for a term that includes a contiguity character, like e-mail, the rule must exactly match the term, including the hyphen.
This ensures that the rule is applied correctly, recognizing e-mail as a distinct term from email.
Understanding quoted terms
Terms enclosed in double quotes are treated as exact phrases by thesaurus rules.
In contrast, terms not enclosed in double quotes are treated individually and can trigger thesaurus rules based on partial matches. This means a rule can apply to any individual term within a search query, providing a broader scope for rule application.
Example: Terms in the query: "customer support"
In this scenario, a thesaurus rule that targets the exact phrase "customer support" will be triggered.
However, if the rule is defined for customer or support as separate terms without quotes, it will not apply to the quoted phrase "customer support" in a search query.
This distinction ensures that only precise matches to phrases in double quotes are affected by thesaurus rules, allowing for more targeted modifications to search queries.
Reference
When creating thesaurus rules, consider that they apply to:
-
Free text queries.
-
Large query expression (lq) keywords extracted by Intelligent Term Detection (ITD).
Note
Thesaurus rules don’t apply to:
|
Use Java-style regular expressions
When creating a thesaurus rule, you can use Java-style regular expressions (see java.util.regex Class Pattern) to match and even replace values in thesaurus entries.
You must include the / / delimiters for the matching keyword.
If you use named-capturing groups, the syntax to include a named-capturing group in the replacement keyword is groupName.
Example: You want to separate two product name parts that are concatenated (for example, replacing iphone6 with iphone 6).
The matching expression can be: /iphone(?<ver>[0-9])/ where ver is a captured group name.
The replacement expression would be: iphone _ver_
Note that in the above example, the first part of the expression (iphone) must be present in the user query for the thesaurus rule to apply.
If you want this expression to apply to another product name (ipad, for example), you can use the . and * regex characters so that the thesaurus rule can match the keywords used before the matching expression in the user query.
For example, if you want your thesaurus rule to replace iphone6 or ipad6 with iphone 6 or ipad 6, you could enter the following regex in the Original expressions section:
/i.*(?<ver>[0-9])/
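The two expressions above can be compared with java.util.regex, as in the following Java sketch. The NAME_AND_VERSION pattern and the split helper are hypothetical additions that capture the product name as well as the version, and the ${name} replacement syntax belongs to Java's Matcher.replaceAll, not to the thesaurus replacement field.

```java
import java.util.regex.Pattern;

class ProductNameSplit {
    // Patterns from this section, without the / / delimiters.
    static final Pattern IPHONE_ONLY = Pattern.compile("iphone(?<ver>[0-9])");
    static final Pattern ANY_I_PRODUCT = Pattern.compile("i.*(?<ver>[0-9])");

    // Hypothetical variant that also captures the product name.
    static final Pattern NAME_AND_VERSION = Pattern.compile("(?<name>i[a-z]+)(?<ver>[0-9])");

    // Inserts a space between the product name and its version digit.
    // ${name} and ${ver} are java.util.regex replacement references.
    static String split(String term) {
        return NAME_AND_VERSION.matcher(term).replaceAll("${name} ${ver}");
    }

    public static void main(String[] args) {
        System.out.println(IPHONE_ONLY.matcher("ipad6").matches());   // false
        System.out.println(ANY_I_PRODUCT.matcher("ipad6").matches()); // true
        System.out.println(split("iphone6"));                         // iphone 6
        System.out.println(split("ipad6"));                           // ipad 6
    }
}
```

Because .* matches any characters, the broader expression also matches unrelated terms that start with i and end with a digit, so testing it against real queries before saving the rule is worthwhile.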
Thesaurus rule types
When creating or editing a thesaurus rule from the Query Pipelines (platform-ca | platform-eu | platform-au) page of the Administration Console, you can choose one of the following thesaurus sub-types:
Synonym
Searches the index for all thesaurus expressions as soon as one expression is part of the user query.
Synonym rules are evaluated in the order they’re defined. This means that when a query is sent, the Synonym thesaurus rule type evaluates the terms defined in the rule in order until it finds a match with the queried keywords. Therefore, when defining Synonym rules that are meant to expand terms that share a single prefix, you should define more meaningful terms in the first position of the statement to avoid relevance issues.
Example: Consider the following statement:
alias "vacation", "vacation leave", "vacation policy"
When a user queries vacation policy, their query is parsed as follows:
(vacation OR (vacation leave) OR (vacation policy)) policy
However, when defining the same statement using vacation policy in first position as follows:
alias "vacation policy", "vacation leave", "vacation"
The same vacation policy query is parsed as follows:
(vacation policy) OR (vacation leave) OR vacation
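To see why the position of each term matters, the following Java sketch reproduces the two parsed queries above with a deliberately simplified model. It assumes that, at each position in the query, the first listed expression that matches is the one that triggers the expansion. This is an assumption made for illustration; the actual parsing performed by Coveo may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SynonymOrderSketch {
    // Toy model only: expands the first listed expression that matches at each
    // position of the query. This is an assumption, not Coveo's actual parser.
    static String parse(List<String> expressions, String query) {
        List<String> tokens = Arrays.asList(query.trim().split("\\s+"));
        // The expansion lists every expression of the rule, in declared order.
        String group = expressions.stream()
                .map(e -> e.contains(" ") ? "(" + e + ")" : e)
                .collect(Collectors.joining(" OR "));
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int consumed = 0;
            for (String expression : expressions) {   // declared order matters
                List<String> words = Arrays.asList(expression.split("\\s+"));
                int end = i + words.size();
                if (end <= tokens.size() && tokens.subList(i, end).equals(words)) {
                    consumed = words.size();
                    break;                             // first match wins
                }
            }
            if (consumed > 0) {
                parts.add(group);
                i += consumed;
            } else {
                parts.add(tokens.get(i));
                i++;
            }
        }
        // Parenthesize the expansion only when other terms remain around it.
        if (parts.size() == 1) {
            return parts.get(0);
        }
        return parts.stream()
                .map(p -> p.equals(group) ? "(" + p + ")" : p)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(parse(
                List.of("vacation", "vacation leave", "vacation policy"), "vacation policy"));
        System.out.println(parse(
                List.of("vacation policy", "vacation leave", "vacation"), "vacation policy"));
    }
}
```

With vacation listed first, the model consumes only the first word and leaves policy as a separate required term. With vacation policy listed first, it consumes the whole query.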
One-way synonym
Also searches the index for the additional (target) expressions as soon as one of the original expressions is part of the user query. However, unlike the Synonym sub-type, the "One-way synonym" sub-type doesn’t add the original expressions when the target expressions are queried.
Leading practice
You can enter expressions between double quotes to expand an exact phrase. This is useful to expand acronyms or initialisms.
Replace
Overwrites specific end-users' expressions when queried.
Leading practice
The Replace rule type should only be created when you’re certain that your index doesn’t, and never will, contain the expressions to substitute. The "One-way synonym" rule type should be considered first.
Match terms exactly
The Match terms exactly thesaurus rule type lets you specify Original expressions and Exact match replacement expressions.
When you only specify Original expressions, the specified expressions are turned into their corresponding exact phrase match expression.
Example: Consider a rule whose Original expressions field contains king of the jungle.
When a user searches for king of the jungle in a search box, their query becomes "king of the jungle".
When you also specify Exact match replacement expressions, the specified Original expressions are replaced with the exact phrase match expression of the specified Exact match replacement expressions.
Example: Consider a rule whose Original expressions field contains lion and whose Exact match replacement expressions field contains king of the jungle and big cat.
When a user searches for lion in a search box, their query becomes "king of the jungle" OR "big cat".
QPL syntax
When creating a thesaurus rule with code or editing the code of an existing thesaurus rule, use the following query pipeline language (QPL) syntax:
-
For the One-way synonym thesaurus rule type:
<terms> to <otherTerms>
-
For the Replace thesaurus rule type:
replace <terms> to <otherTerms>
-
For the Match terms exactly thesaurus rule type:
quote <terms> to <otherTerms>
The following table summarizes how statements using each of the different thesaurus sub-features would process the basic part (q) of the combined query expression, assuming its current value is kitty cat:
Statement definition | Processed q expression |
---|---|
|
|
|
|
|
|
|
|
|
|
Parameters
terms
A comma-separated list of quoted strings and/or regular expressions where each quoted string must contain one or more basic query terms (for example, "foo bar", "baz", /^meo+w$/).
When using the |
otherTerms
A comma-separated list of quoted strings where each quoted string must contain one or more basic query terms (for example, "hello world", "biz").
Order of execution
Thesaurus rules apply before the stemming expansion made by the index, meaning that thesaurus entries are only expanded for exact matches (see About stemming).
The following diagram illustrates the overall order of execution of query pipeline features:
Required privileges
By default, members with the required privileges can view and edit elements of the Query Pipelines (platform-ca | platform-eu | platform-au) page.
The following table indicates the required privileges to view or edit thesaurus rules (see Manage privileges and Privilege reference).
Action | Service - Domain | Required access level |
---|---|---|
View thesaurus rules | Organization - Organization | View |
Edit thesaurus rules | Organization - Organization | View |
 | Search - Query pipelines | Edit |