Manage stop word rules
Stop words are words that are filtered out from a query before it’s sent to the index. By filtering out certain words, other keywords are given more importance, which helps in providing users with more relevant search results.
By default, Coveo search results contain all searched terms, but the Coveo ranking algorithm gives less importance to frequently used terms such as articles (for example, a, the), prepositions (for example, of, in), and pronouns (for example, my, them). Consequently, the presence of these terms in an item doesn’t significantly increase the item ranking in search results.
Depending on your specific needs, however, you may want to add a list of words to ignore in queries. For instance, terms that are very common in your industry may appear in many of your items and therefore offer little differentiation between items in search results.
The list of stop words for the index of a Coveo organization is empty by default, but members with the required privileges can add stop word rules, which are defined independently for each query pipeline.
A user enters the following natural language query:
Expert doctor treatment for patient with headaches
In a clinical setting, you may want to create a list of stop words such as doctor, patient, and expert, as they’re frequently repeated across many documents and don’t add much value.
Adding these common words as stop words, and therefore filtering them out of the query, gives more importance to the more meaningful keywords of the query (treatment and headaches).
When these common terms are included as stop words, the query sent to the index is:
treatment headaches
The search results for only these two keywords will most likely include more relevant items.
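As a rough illustration of the example above, the following Python sketch removes a stop word list from the query. The `remove_stop_words` helper and its whitespace tokenization are hypothetical and don’t reflect how Coveo parses queries internally. The list also assumes that for and with are stop words, since they’re absent from the resulting query.

```python
def remove_stop_words(query: str, stop_words: set[str]) -> str:
    """Drop stop words (case-insensitively) and keep the remaining keywords in order."""
    kept = [term for term in query.split() if term.lower() not in stop_words]
    return " ".join(kept)


# Hypothetical stop word list for the clinical example.
# "for" and "with" are assumed to be stop words as well.
clinical_stop_words = {"doctor", "patient", "expert", "for", "with"}

filtered = remove_stop_words(
    "Expert doctor treatment for patient with headaches", clinical_stop_words
)
print(filtered)  # treatment headaches
```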
Prerequisites
Before creating a rule, first make sure that you have the following:
-
Access to a search page
You need access to a Coveo-powered search interface to be able to test the rule that you create.
-
An existing query pipeline
The queries from the search page must travel through a specific query pipeline.
-
Required privileges
You need specific privileges to be able to add and edit rules in a query pipeline.
Once you meet these requirements, you can create a rule on the Query Pipelines (platform-ca | platform-eu | platform-au) page. To test the rule, use the A/B test feature to compare the results of the rule with the results of the original pipeline.
Leading practices
When managing stop word rules, consider the following recommendations and tips:
Use stop words sparingly
-
Avoid adding more than a dozen stop words. Too many stop words can negatively impact search results since most of them may still convey some meaning and provide syntactical information used by the search engine to better match content.
-
The index assigns a semantic value to every term by taking into account their frequency in the index. Very frequent terms in indexed items are considered to carry less meaning. Consequently, the index already attributes minimal ranking weight for the occurrence of stop words in search results. Therefore, it’s recommended to add stop word rules only for specific use cases.
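To make the idea of frequency-based weighting concrete, here is a minimal sketch that uses a generic inverse document frequency (IDF) measure. This is a textbook formula chosen for illustration only; it’s not Coveo’s actual ranking algorithm, and the `inverse_document_frequency` helper and the sample documents are hypothetical.

```python
import math


def inverse_document_frequency(term: str, documents: list[list[str]]) -> float:
    """Return log(N / df): the more documents contain the term, the lower the value."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        raise ValueError(f"{term!r} does not occur in any document")
    return math.log(len(documents) / containing)


# Hypothetical, pre-tokenized mini index.
docs = [
    ["the", "patient", "reports", "headaches"],
    ["the", "patient", "needs", "treatment"],
    ["the", "doctor", "sees", "the", "patient"],
    ["the", "treatment", "plan"],
]

for term in ("the", "patient", "headaches"):
    print(term, round(inverse_document_frequency(term, docs), 3))
```

In this toy index, a term that occurs in every document (the) gets a weight of zero, while a rare term (headaches) gets the highest weight, which is why very frequent terms rarely need an explicit stop word rule.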
Note: You might consider adding stop word rules to exclude bad keywords from queries so that they don’t affect the Coveo ML model learning process. However, if end users perform queries containing only banned words, the model learning process could still be affected, depending on the returned search results, if any (see Stop word special cases).
For more information on managing blocklisted words, see Blocklists.
Apply conditions
Typically, stop word rules should only apply when a certain condition is fulfilled.
In general, you should ensure that this is the case by associating such a rule, the query pipeline it’s defined in, or both, with a query pipeline condition.
Consider using the partialMatch parameter
If, for a specific implementation, you expect a high number of long natural language queries, consider using the partialMatch parameter in your search interface as an alternative to adding all possible stop words.
Using this search interface option, you can define a minimum number of keywords to be found in a search result before this search result is returned (see enablePartialMatch).
This way, the index favors the most important keywords and stop words become optional.
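The following Python sketch shows what enabling partial matching might look like when building a search request body. It assumes that `partialMatch`, `partialMatchKeywords`, and `partialMatchThreshold` are the relevant Search API query parameters, and the `build_query_body` helper and the values shown are illustrative only. Verify the parameter names and semantics in the Search API reference before relying on them.

```python
import json


def build_query_body(
    query: str, min_query_keywords: int = 4, threshold: str = "50%"
) -> dict:
    """Build a request body that asks the index to accept partial keyword matches."""
    return {
        "q": query,
        # Assumed parameter names; the values are examples, not recommendations.
        "partialMatch": True,
        # Assumed meaning: how many keywords the query needs before
        # partial matching is applied.
        "partialMatchKeywords": min_query_keywords,
        # Assumed meaning: how many of those keywords an item must contain.
        "partialMatchThreshold": threshold,
    }


body = build_query_body("Expert doctor treatment for patient with headaches")
print(json.dumps(body, indent=2))
```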
Test your stop words
When creating stop word rules, you should always perform tests to ensure that your stop words don’t negatively impact the search experience in cases other than the one you’re trying to improve (see Test stop words relevance).
Stop word special cases
The stop words management rule is simple.
Stop words are removed only when they’re part of an AND (explicit or implicit) or OR expression, or sub-expression.
Stop words aren’t removed when they occur alone, because otherwise the removal could create an invalid expression.
Coveo indexes all terms contained in your source items, including the stop words. This allows Coveo to manage exceptions and keep stop words in the query in the following cases:
-
The query pipeline condition associated with a stop word rule isn’t fulfilled.
Example: Since you applied the Language is en_US condition on your stop word rule, the common word it contains won’t be removed if present in a query from a user in Germany.
-
Stop words within a phrase search.
Example: A user is looking for an item that contains a very specific phrase and encloses the phrase between double quotes in the search box:
"in the plan for year 2015"
Even if the common terms in, the, and for are stop words, all keywords of this query are sent to the index, so only items containing contiguous occurrences of these keywords in the same order are returned.
-
A query containing only stop words.
Example: A user searches for:
to be or not to be
If all the keywords of this query are stop words, they’re all kept and sent to the index.
-
A stop word is an argument of the NOT or NEAR operators.
Examples:
-
A user searches for:
how NEAR:10 export
If how is a stop word, it’s kept because it’s an argument of the NEAR operator, so the query returns items in which how and export occur within ten terms of each other.
-
A user searches for:
(NOT how export)
If how is a stop word, it’s kept because it’s an argument of the NOT operator, which has precedence over the implicit AND operator, so the query returns items that contain export but not how.
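The special cases above can be sketched as a small Python function. This is a simplified illustration under stated assumptions, not Coveo’s implementation: the `apply_stop_words` helper is hypothetical, operators are assumed to be written in uppercase, and the tokenizer only understands whitespace and double-quoted phrases.

```python
import re

# Matches NEAR or NEAR:<distance>; operators are assumed to be uppercase.
NEAR_OPERATOR = re.compile(r"NEAR(:\d+)?")


def apply_stop_words(query: str, stop_words: set[str], condition_met: bool = True) -> str:
    """Remove stop words from a query, except in the special cases listed above."""
    # Case 1: the rule's condition isn't fulfilled, so the rule doesn't apply.
    if not condition_met:
        return query

    # Keep each double-quoted phrase as a single token; split the rest on whitespace.
    tokens = re.findall(r'"[^"]*"|\S+', query)

    def is_stop(token: str) -> bool:
        return token.lower() in stop_words

    # Case 3: a query made only of stop words is sent as is.
    if all(is_stop(token) for token in tokens):
        return query

    kept = []
    for i, token in enumerate(tokens):
        previous = tokens[i - 1].lstrip("(") if i > 0 else ""
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        # Case 4: arguments of the NOT or NEAR operators are kept.
        is_operator_argument = (
            previous == "NOT"
            or NEAR_OPERATOR.fullmatch(previous) is not None
            or NEAR_OPERATOR.fullmatch(following) is not None
        )
        # Case 2: quoted phrases never match is_stop, so they pass through untouched.
        if is_operator_argument or not is_stop(token):
            kept.append(token)
    return " ".join(kept)


stop_words = {"how", "to", "be", "or", "not", "in", "the", "for"}

print(apply_stop_words("how to export the plan", stop_words))  # export plan
print(apply_stop_words('"in the plan for year 2015"', stop_words))  # unchanged
print(apply_stop_words("to be or not to be", stop_words))  # unchanged
print(apply_stop_words("how NEAR:10 export", stop_words))  # unchanged
print(apply_stop_words("(NOT how export)", stop_words))  # unchanged
print(apply_stop_words("how to export", stop_words, condition_met=False))  # unchanged
```

Only the first query loses its stop words. In every other call, one of the special cases applies and the query is returned unchanged.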
Create stop word rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to add a rule, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Stop words.
-
Click Add stop word.
-
In the input, enter one or more comma-separated words to ignore when they appear in queries, and then press Enter.
-
(Optional) Click Add condition to set a condition for when the stop word rule applies.
-
In the Select a condition panel that opens, in the Select a condition dropdown menu, select one of the available conditions, or create a new one by clicking Create a new condition.
-
Click Apply Condition.
-
Your new rule is now active.
Edit stop word rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to edit a rule, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Stop words.
-
Click the rule you want to edit.
-
In the input, enter one or more comma-separated words to ignore when they appear in queries, and then press Enter.
-
(Optional) Click Add condition to set a condition for when the stop word rule applies.
-
In the Select a condition panel that opens, in the Select a condition dropdown menu, select one of the available conditions, or create a new one by clicking Create a new condition.
-
Click Apply Condition.
-
Your edited rule is now active.
Duplicate stop word rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline for which you want to duplicate query pipeline rules, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Stop words.
-
In the Stop words subtab, click the rule you want to duplicate within the same pipeline (typically to create a slightly different rule).
-
At the end of the row of the desired rule, click the menu icon, and then select Duplicate.
The duplicated rule appears at the bottom of the list in the pipeline component tab.
Review information about the rule’s creation or last modification
You can verify who created or last modified a given stop word rule by inspecting the Details column of the Stop Words subtab. The Details column also indicates the date and time the rule was created or last modified.
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline containing the rule for which you want to inspect the information of the Details column, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Stop Words.
-
In the Stop Words subtab, inspect the information of the Details column for the desired rule.
Delete stop word rules
-
On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline for which you want to delete query pipeline rules, and then click Edit components in the Action bar.
-
On the page that opens, select the Search Terms tab.
-
In the Search Terms tab, on the left side of the page, select Stop words.
-
In the Stop words subtab, click the rule you want to delete.
-
Click Delete to confirm.
Reference
Order of execution
The following diagram illustrates the overall order of execution of query pipeline features:
Required privileges
By default, members with the required privileges can view and edit elements of the Query Pipelines (platform-ca | platform-eu | platform-au) page.
The following table indicates the required privileges to view or edit stop word rules (see Manage privileges and Privilege reference).
| Action | Service - Domain | Required access level |
|---|---|---|
| View stop word rules | Organization - Organization | View |
| Edit stop word rules | Organization - Organization | View |
| | Search - Query pipelines | Edit |