Understanding Stemming

Stemming is a process that reduces words to their stem, base, or root form. The Coveo Cloud platform uses the stem of each queried term to expand the query, searching for both the original term and related terms that share the same root. This automatic query expansion often helps you find what you are looking for by returning relevant results that would not otherwise appear.

Searching for a term typed in its singular form returns items containing both the singular and plural forms of the term, and vice versa.

The words search, searching, and searched share the same root or stem: search-. When you query searching, Coveo Cloud returns items containing the words searching, search, searches, and searched.
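The expansion described above can be sketched in a few lines of Python. This is only a toy illustration: the suffix list, the toy_stem and expand functions, and the sample vocabulary are all hypothetical, and the actual Coveo Cloud stemming algorithm is not documented here.

```python
# Toy suffix-stripping stemmer for illustration only; NOT the Coveo Cloud algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(term: str, vocabulary: list[str]) -> list[str]:
    """Return every vocabulary word that shares the stem of `term`."""
    stem = toy_stem(term)
    return sorted(w for w in vocabulary if toy_stem(w) == stem)


# Hypothetical set of words found in indexed items.
vocabulary = ["search", "searching", "searches", "searched", "season"]
expanded = expand("searching", vocabulary)
print(expanded)  # ['search', 'searched', 'searches', 'searching']
```

The word season is left out of the expansion because it does not reduce to the stem search.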

However, returned items containing the original form of the queried terms are ranked higher.

While expanded queries are generally useful, you can disable the stemming expansion when you want to search for a specific term or phrase (see Searching for an Exact Term and Searching for a Phrase).

Stemming rules vary from one language to another, as a term can yield different stems in different languages.

The term attention can stem to attentio in English and attenti in French.

Even when a term stems to the same root in two different languages, the respective stem classes (the sets of words sharing that stem) may well differ. Coveo Cloud overcomes this problem: at indexing time, it detects and saves the language of each indexed item. When expanding query terms, it applies the appropriate language-specific stemming algorithm to each indexed item (see Supported Languages - Coveo Cloud V2).
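The idea of choosing the stemmer from the language saved with each item can be sketched as follows. Everything in this example is an assumption made for illustration: the STEMS lookup tables stand in for real stemming algorithms, the entries other than attention are invented, and the item structure does not reflect the actual Coveo Cloud index.

```python
# Hypothetical per-language stem tables standing in for real stemming algorithms.
# Only the stems of "attention" come from the example above; the rest is invented.
STEMS = {
    "en": {"attention": "attentio", "attentions": "attentio"},
    "fr": {"attention": "attenti", "attentif": "attenti"},
}


def stem(word: str, language: str) -> str:
    """Look up the stem for `word` in `language`; unknown words stem to themselves."""
    word = word.lower()
    return STEMS.get(language, {}).get(word, word)


# Each indexed item carries the language detected at indexing time.
items = [
    {"id": 1, "language": "en", "words": ["attentions"]},
    {"id": 2, "language": "fr", "words": ["attentif"]},
    {"id": 3, "language": "en", "words": ["attentif"]},
]


def matches(query_term: str, item: dict) -> bool:
    """Stem the query term and the item words with the item's own language."""
    language = item["language"]
    target = stem(query_term, language)
    return any(stem(word, language) == target for word in item["words"])


matching_ids = [item["id"] for item in items if matches("attention", item)]
print(matching_ids)  # [1, 2]
```

Item 3 is not returned: it is flagged as English, so the French stem class is not applied to it.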

Stemming confusion can also occur when the stemming algorithm groups unrelated words under the same stem.

In English, the terms university and universe stem to the same root, although they are not related.

Coveo Cloud further minimizes possible stemming errors by calculating a correlation factor between the searched term and every possible expansion. In search results, highly correlated expansions are ranked higher than poorly correlated ones.
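As a rough sketch of how correlation-weighted ranking could behave, consider the following Python example. How Coveo Cloud actually computes the correlation factor is not described here, so the correlation values, the score function, and the sample items are all hypothetical.

```python
# Hypothetical correlation factors between the searched term "university"
# and each possible expansion. The real computation is not documented here.
correlation = {"university": 1.0, "universities": 0.9, "universe": 0.2}


def score(text: str) -> float:
    """Score an item by its best-correlated word; 0.0 when nothing matches."""
    return max((correlation.get(w, 0.0) for w in text.lower().split()), default=0.0)


# Hypothetical indexed items.
items = {
    "doc_a": "the universe is expanding",
    "doc_b": "apply to a university",
    "doc_c": "ranking of universities",
}

ranked = sorted(items, key=lambda name: score(items[name]), reverse=True)
print(ranked)  # ['doc_b', 'doc_c', 'doc_a']
```

The item that only contains the poorly correlated expansion universe is still returned, but it lands at the bottom of the list.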

Stemming applies to free text queries, but often not to field queries.

Specifically, field queries that use the == operator or a phrase search are not stemmed.

Also, only string-type fields with the Stemming option selected are stemmed (see Add or Edit Fields).
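The field query rules above can be summarized as a small decision function. This is a sketch of the stated rules only: the is_stemmed function and its parameters are hypothetical names, and the = operator is assumed here to be the non-exact field operator.

```python
def is_stemmed(field_type: str, stemming_enabled: bool, operator: str, is_phrase: bool) -> bool:
    """Return True when a field query would be stemmed under the rules above."""
    # Only string fields with the Stemming option selected qualify.
    if field_type != "string" or not stemming_enabled:
        return False
    # The == operator and phrase searches are never stemmed.
    if operator == "==" or is_phrase:
        return False
    return True


print(is_stemmed("string", True, "=", False))   # True
print(is_stemmed("string", True, "==", False))  # False
```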