About Stop Words

Stop words are typically very common words that are considered to carry little meaning, such as articles, prepositions, and pronouns. By default, Coveo search results contain all searched terms, but the Coveo ranking algorithm treats very frequent terms as less meaningful, so their presence in an item doesn’t significantly increase the item’s ranking.

However, you can define stop words to be ignored in queries when you see that they negatively affect ranking and search results.

When stop words are defined, they’re removed from a query entered by a user before it’s sent to the index. This allows items that don’t contain the stop words to be included in the results, increases the importance of the other keywords, and helps return more relevant search results.

A user enters the following natural language query:

How do I change my password for the intranet

The common words do, I, my, for, and the restrict the returned search results and dilute the most meaningful keywords of the query (change, password, and intranet).

When these common terms are included in the stop word rules, the query sent to the index is:

how change password intranet

The search results for only these four keywords will most likely include more relevant items.
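
The removal step in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Coveo Platform implements it: the function name remove_stop_words and the STOP_WORDS list are hypothetical, matching is assumed to be case-insensitive, and the special cases that keep stop words in a query are ignored here.

```python
def remove_stop_words(query, stop_words):
    """Return the query in lowercase with stop words removed.

    Illustrative only: assumes whitespace-separated keywords and
    case-insensitive matching; operators and quoted phrases are not handled.
    """
    stop = {word.lower() for word in stop_words}
    kept = [term.lower() for term in query.split() if term.lower() not in stop]
    return " ".join(kept)


# Hypothetical stop word list taken from the example above.
STOP_WORDS = ["do", "I", "my", "for", "the"]

print(remove_stop_words("How do I change my password for the intranet", STOP_WORDS))
# prints: how change password intranet
```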

Stop Word Special Cases

The rule for handling stop words is simple. Stop words are removed only when they’re part of an AND (explicit or implicit) or OR expression or sub-expression. They aren’t removed when they occur alone, because the removal could otherwise create an invalid expression.

The Coveo Platform indexes all terms contained in your source items, including the stop words. This allows the Coveo Platform to manage exceptions and keep stop words in the query in the following cases:

  • The condition associated with a stop word rule isn’t fulfilled.

    For example, if you applied the Language is en_US condition to your stop word rule, the common words it contains won’t be removed from a query sent by a user in Germany.

  • Stop words within a phrase search.

    A user is looking for an item that contains a very specific phrase and encloses the phrase between double-quotes in the search box:

    "in the plan for year 2015"
    

    Even if the common terms in, the, and for are stop words, all keywords of this query are sent to the index, so only items containing these keywords contiguously and in the same order are returned.

  • A query containing only stop words.

    A user searches for:

    to be or not to be
    

    If all the keywords of this query are stop words, they’re all kept and sent to the index.

  • A stop word is an argument of the NOT or NEAR operators.

    • A user searches for:

      how NEAR:10 export
      

      If how is a stop word, it’s kept because it’s an argument of the NEAR operator, so the query returns items in which how and export occur within ten terms of each other.

    • A user searches for:

      (NOT how export)
      

      If how is a stop word, it’s kept because it’s an argument of the NOT operator, and the NOT operator has precedence over the implicit AND operator. The query therefore returns items that contain export but not how.
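
The special cases listed above can be combined into one small Python sketch. It is a simplified illustration under stated assumptions, not the Coveo Platform's query parser: the names filter_query and rule_applies are hypothetical, operators are assumed to be written in uppercase, and parentheses and operators left dangling after a removal are not handled.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def is_operator(token):
    """Assumption: operators are uppercase AND, OR, NOT, or NEAR:n."""
    return token in {"AND", "OR", "NOT"} or token.startswith("NEAR:")


def filter_query(query, stop_words, rule_applies=True):
    """Remove stop words while honoring the special cases described above."""
    # Case 1: the rule condition (for example, a language) isn't fulfilled.
    if not rule_applies:
        return query

    stop = {word.lower() for word in stop_words}
    tokens = TOKEN.findall(query)

    # Case 4: arguments of NOT and NEAR are protected from removal.
    protected = set()
    for i, token in enumerate(tokens):
        if token == "NOT":
            protected.add(i + 1)
        elif token.startswith("NEAR:"):
            protected.update({i - 1, i + 1})

    kept = []
    for i, token in enumerate(tokens):
        is_phrase = token.startswith('"')  # Case 2: phrase searches stay intact.
        if is_operator(token) or is_phrase or i in protected:
            kept.append(token)
        elif token.lower() not in stop:
            kept.append(token)

    # Case 3: if no search term survives, the query held only stop words.
    if not any(not is_operator(token) for token in kept):
        return query
    return " ".join(kept)
```

With how, do, and I defined as stop words, filter_query("how do I export", ["how", "do", "I"]) returns export, while filter_query("how NEAR:10 export", ["how"]) returns the query unchanged.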

What’s Next?

Learn which words should be stop words.
