Manage Stop Word Rules

Stop words are words that are filtered out of a query entered by an end user before the query is sent to the index. Removing them allows items that don’t contain the stop words to be included in the results, increases the importance of the other keywords, and helps return more relevant search results.

Stop words are typically very common words considered to carry less meaning such as articles, prepositions, and pronouns. By default, all Coveo search results contain all searched terms, but the Coveo ranking algorithm considers very frequent terms to carry less meaning and their presence in an item doesn’t significantly increase the item ranking.

EXAMPLE

A user enters the following natural language query:

How do I change my password for the intranet

The common words do, I, my, for, and the restrict the returned search results and dilute the most meaningful keywords of the query (change, password, and intranet).

When these common terms are included in the stop word rules, the query sent to the index is:

how change password intranet

The search results for only these four keywords will most likely include more relevant items.
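The effect of such a rule on the example query can be illustrated with a minimal Python sketch. It isn’t how the query pipeline is implemented: the remove_stop_words function and its lowercase, whitespace-based tokenization are assumptions made purely for illustration.

```python
from typing import Iterable


def remove_stop_words(query: str, stop_words: Iterable[str]) -> str:
    """Return the query in lowercase with every stop word dropped.

    Hypothetical helper: terms are split on whitespace and compared
    case-insensitively against the stop word list.
    """
    stops = {word.lower() for word in stop_words}
    kept = [term for term in query.lower().split() if term not in stops]
    return " ".join(kept)


stop_words = ["do", "I", "my", "for", "the"]
query = "How do I change my password for the intranet"
print(remove_stop_words(query, stop_words))  # how change password intranet
```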

Although the ranking algorithm already gives little weight to very frequent terms, you can add stop words that you want removed from queries if you see that they affect ranking and search results.

The list of stop words for the index of a Coveo organization is empty by default, but users with the required privileges can add stop word rules, which are defined independently for each query pipeline.

Stop Word Special Cases

The stop words management rule is simple. Stop words are removed only when they’re part of an AND (explicit or implicit) or OR expression, or sub-expression. Stop words aren’t removed when they occur alone, because otherwise the removal could create an invalid expression.

The Coveo Platform indexes all terms contained in your source items, including the stop words. This allows the Coveo Platform to manage exceptions and keep stop words in the query in the following cases:

  • The query pipeline condition associated with a stop word rule isn’t fulfilled.

    EXAMPLE

    Since you applied the Language is en_US condition on your stop word rule, the common word it contains won’t be removed if present in a query from a user in Germany.

  • Stop words within a phrase search.

    EXAMPLE

    A user is looking for an item that contains a very specific phrase and encloses the phrase between double-quotes in the search box:

    "in the plan for year 2015"

    Even if the common terms in, the, and for are stop words, all keywords of this query are sent to the index, so only items containing these keywords contiguously and in the same order are returned.

  • A query containing only stop words.

    EXAMPLE

    A user searches for:

    to be or not to be

    If all the keywords of this query are stop words, they’re all kept and sent to the index.

  • A stop word is an argument of the NOT or NEAR operators.

    EXAMPLE
    • A user searches for:

      how NEAR:10 export

      Even if how is a stop word, it’s kept because it’s an argument of the NEAR operator, so the query returns items in which how and export occur within ten terms of each other.

    • A user searches for:

      (NOT how export)

      Even if how is a stop word, it’s kept because it’s an argument of the NOT operator, which has precedence over the implicit AND operator, so the query returns items that contain export but not how.
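Three of the exceptions above (phrase searches, queries containing only stop words, and arguments of the NOT or NEAR operators) can be sketched in Python as follows. This is a simplified illustration, not the actual Coveo implementation: the filter_query function and its tokenizer are hypothetical, operators are assumed to be written in uppercase, and cleaning up an explicit AND or OR left dangling after a removal is deliberately out of scope.

```python
import re
from typing import Iterable, List

# A quoted phrase, a single parenthesis, or a run of other characters.
TOKEN_RE = re.compile(r'"[^"]*"|[()]|[^\s()"]+')
NEAR_RE = re.compile(r"NEAR:\d+")


def is_near(token: str) -> bool:
    return NEAR_RE.fullmatch(token) is not None


def is_operator(token: str) -> bool:
    # Assumption: operators are uppercase, so a lowercase "or" is a plain term.
    return token in ("AND", "OR", "NOT") or is_near(token)


def filter_query(query: str, stop_words: Iterable[str]) -> str:
    """Remove stop words except in the special cases described above."""
    stops = {word.lower() for word in stop_words}
    tokens = TOKEN_RE.findall(query)
    kept: List[str] = []
    for i, tok in enumerate(tokens):
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        protected = (
            tok.startswith('"')        # phrase search: kept as a whole
            or tok in ("(", ")")       # grouping is passed through
            or is_operator(tok)
            or prev_tok == "NOT"       # argument of NOT
            or is_near(prev_tok)       # right-hand argument of NEAR
            or is_near(next_tok)       # left-hand argument of NEAR
        )
        if protected or tok.lower() not in stops:
            kept.append(tok)
    # A query made only of stop words is sent unchanged.
    if not any(not is_operator(t) and t not in ("(", ")") for t in kept):
        return query
    result = " ".join(kept)
    return re.sub(r"\s+\)", ")", re.sub(r"\(\s+", "(", result))


print(filter_query("how export", ["how"]))          # export
print(filter_query("how NEAR:10 export", ["how"]))  # how NEAR:10 export
print(filter_query("(NOT how export)", ["how"]))    # (NOT how export)
print(filter_query("to be or not to be", ["to", "be", "or", "not"]))  # to be or not to be
```

Checking the neighboring tokens is the simplest way to express "argument of an operator" in a flat token list; a real query parser would work on an expression tree instead.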

Create Stop Word Rules

  1. On the Query Pipelines page, click the query pipeline in which you want to add a rule, and then in the Action bar, click Edit Components.

  2. On the page that opens, select the Search Terms tab.

  3. In the Search Terms tab, on the left-hand side of the page, select Stop words.

  4. Click Add stop word.

  5. In the input, enter one or more common words, separated by commas, that must be ignored when they appear in queries.

  6. On the right-hand side, under Condition, click Apply condition to optionally select a query pipeline condition.

    • In the Select a condition panel that opens, in the Select a condition drop-down menu, select one of the available conditions, or create a new one by clicking Create a new condition.

  7. Press Enter.

Your new rule is now effective.

Edit Stop Word Rules

  1. On the Query Pipelines page, click the query pipeline in which you want to edit a rule, and then in the Action bar, click Edit Components.

  2. On the page that opens, select the Search Terms tab.

  3. In the Search Terms tab, on the left-hand side of the page, select Stop words.

  4. Click the rule you want to edit.

  5. In the input, enter one or more common words, separated by commas, that must be ignored when they appear in queries.

  6. On the right-hand side, under Condition, click Apply condition to optionally select a query pipeline condition.

    • In the Select a condition panel that opens, in the Select a condition drop-down menu, select one of the available conditions, or create a new one by clicking Create a new condition.

  7. Press Enter.

Your edited rule is now effective.

Duplicate Stop Word Rules

  1. On the Query Pipelines page, click the query pipeline for which you want to duplicate query pipeline rules, and then in the Action bar, click Edit components.

  2. On the page that opens, select the Search Terms tab.

  3. In the Search Terms tab, on the left-hand side of the page, select Stop words.

  4. In the Stop words subtab, click the rule you want to duplicate within the same pipeline (typically to create a slightly different rule).

  5. At the end of the row of the desired rule, click Menu, and then select Duplicate.

The duplicated rule appears at the bottom of the list in the pipeline component tab.

Delete Stop Word Rules

  1. On the Query Pipelines page, click the query pipeline for which you want to delete query pipeline rules, and then in the Action bar, click Edit components.

  2. On the page that opens, select the Search Terms tab.

  3. In the Search Terms tab, on the left-hand side of the page, select Stop words.

  4. In the Stop words subtab, click the rule you want to delete.

  5. At the end of the row of the desired rule, click Menu, and then select Delete.

Leading Practices

When managing stop word rules, consider the following recommendations and tips:

Use Stop Words Sparingly

  • Avoid adding more than a dozen stop words. Too many stop words can negatively impact search results since most of them may still convey some meaning and provide syntactical information used by the search engine to better match content.

  • The index assigns a semantic value to every term by taking into account its frequency in the index. Very frequent terms in indexed items are considered to carry less meaning. Consequently, the index already attributes minimal ranking weight to the occurrence of stop words in search results. Therefore, it’s recommended to add stop word rules only for specific use cases.

    You might consider adding stop word rules to exclude bad keywords from queries so that they don’t affect the Coveo ML model learning process. However, if end users perform queries containing only banned words, these words are kept (see Stop Word Special Cases), so the model learning process could still be affected depending on the returned search results, if any.

    For more information on the management of the blocklist words, see Add Coveo Machine Learning Blocklist Words.
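The idea that very frequent terms carry less weight can be illustrated with inverse document frequency (IDF), a generic information retrieval measure. The sketch below is not the Coveo ranking formula, which this article doesn’t describe; the inverse_document_frequency function and the sample items are assumptions used only to show the principle.

```python
import math
from typing import Dict, List


def inverse_document_frequency(documents: List[str]) -> Dict[str, float]:
    """Return log(N / df) for each term, where df counts items containing it."""
    total = len(documents)
    doc_freq: Dict[str, int] = {}
    for doc in documents:
        # Count each term at most once per item.
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(total / df) for term, df in doc_freq.items()}


docs = [
    "change the password for the intranet",
    "reset the password",
    "the intranet home page",
    "the cafeteria menu",
]
idf = inverse_document_frequency(docs)
print(round(idf["the"], 2))        # 0.0
print(round(idf["password"], 2))   # 0.69
print(round(idf["cafeteria"], 2))  # 1.39
```

In this toy collection, the appears in every item and gets a weight of zero without any stop word rule, which is why such rules are usually needed only for specific use cases.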

Apply Conditions

Typically, stop word rules should only apply when a certain condition is fulfilled.

In general, you should ensure that this is the case by associating such a rule, and/or the query pipeline it’s defined in, with a query pipeline condition.

Consider Using the partialMatch Parameter

If for a specific implementation you expect a high number of long natural language queries, consider using the partialMatch parameter in your search interface as an alternative to adding all possible stop words. Using this search interface option, you can define a minimum number of keywords to be found in a search result before this search result is returned (see enablePartialMatch). This way, the index favors the most important keywords and stop words become optional.
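The principle behind partial matching can be sketched in Python as follows. This only illustrates the idea of requiring a minimum number of query keywords in an item; the partial_match function and its min_keywords argument are hypothetical names, not the search interface options themselves.

```python
from typing import Iterable


def partial_match(item_text: str, keywords: Iterable[str], min_keywords: int) -> bool:
    """Return True when the item contains at least min_keywords distinct keywords."""
    item_terms = set(item_text.lower().split())
    query_terms = {keyword.lower() for keyword in keywords}
    found = len(query_terms & item_terms)
    return found >= min_keywords


keywords = "how do i change my password for the intranet".split()
item = "Steps to change your intranet password"

# The item contains change, intranet, and password: three of the nine keywords.
print(partial_match(item, keywords, min_keywords=3))              # True
print(partial_match(item, keywords, min_keywords=len(keywords)))  # False
```

With the threshold set to three, the item is returned even though it contains none of the common words, so there is no need to list each of them as a stop word.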

Test Your Stop Words

When creating stop word rules, you should always perform tests to ensure that your stop words don’t negatively impact the search experience in cases other than the one you’re trying to improve (see Test Stop Words Relevance).

Reference

Order of Execution

The following diagram highlights in orange the position of stop word rules in the overall order of execution of query pipeline features.

[Diagram: Apply stop word rules]

Required Privileges

By default, members of the Administrators and Relevance Managers built-in groups can view and edit elements of the Query Pipelines page.

The following table indicates the required privileges to view or edit elements of the Query Pipelines page (see Manage Privileges and Privilege Reference).

Action          Service - Domain          Required access level
View stop word  Search - Query pipelines  View
Edit stop word  Search - Query pipelines  Edit
