Stemming is a process that reduces words to their stem, base, or root form. The index uses the stem of each queried term to expand the query, searching for the original term as well as indexed terms that share the same root. This automatic query expansion often returns relevant results that wouldn’t appear otherwise.
Stemming only applies to words of four characters or more.
Searching for a term typed in its singular form returns items containing both the singular and plural forms of the term, and vice versa.
The words search, searching and searched share the same root or stem: search-. When you query searching, the index returns items containing the words searching, search, searches, and searched.
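As a rough illustration of stem-based query expansion, here is a minimal Python sketch. The suffix list and the toy_stem and expand_query helpers are hypothetical and much cruder than the language-specific algorithms a real index uses; only the four-character minimum comes from the text above.

```python
# Toy suffix-stripping stemmer: an illustration only, NOT Coveo's algorithm.
SUFFIXES = ("ing", "ed", "es", "s")  # assumed suffix list for the example
MIN_LENGTH = 4  # stemming only applies to words of four characters or more


def toy_stem(word: str) -> str:
    """Strip the first matching suffix from words long enough to be stemmed."""
    word = word.lower()
    if len(word) < MIN_LENGTH:
        return word
    for suffix in SUFFIXES:
        # Keep at least three characters so very short stems are not produced.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(term: str, indexed_terms: set) -> set:
    """Return the original term plus every indexed term sharing its stem."""
    stem = toy_stem(term)
    return {term} | {t for t in indexed_terms if toy_stem(t) == stem}


indexed_terms = {"search", "searches", "searched", "searching", "sear", "seas"}
print(sorted(expand_query("searching", indexed_terms)))
# prints ['search', 'searched', 'searches', 'searching']
```

Only terms that are actually present in indexed_terms can be returned as expansions, which mirrors the idea that the index expands a query with terms it has already indexed.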
For your index to use a certain term as a stem expansion, that term must appear in at least one indexed item, where:
- The search interface language matches one of the languages of that item.
- The term appears in the body of that item.
In addition to only using terms from indexed items in the same language as the search interface, Coveo employs language-specific stemming algorithms in order to improve the relevance of stem expansions (see Supported Languages).
The term attention can stem to attentio in English and attenti in French.
The stem expansion attentif is only relevant in French.
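The two eligibility conditions listed earlier (matching language, term present in the item body) can be sketched as a simple filter. The Item class and the eligible_expansions function are hypothetical names introduced for this example; they do not reflect how the index stores items internally.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical, simplified view of an indexed item."""
    languages: set   # languages detected for the item
    body_terms: set  # terms that appear in the item body


def eligible_expansions(candidates: set, items: list, interface_language: str) -> set:
    """Keep candidates found in the body of an item in the interface language."""
    eligible = set()
    for item in items:
        if interface_language in item.languages:
            eligible |= candidates & item.body_terms
    return eligible


items = [
    Item(languages={"en"}, body_terms={"attention", "attentive"}),
    Item(languages={"fr"}, body_terms={"attention", "attentif"}),
]
candidates = {"attentive", "attentif", "attentions"}

print(eligible_expansions(candidates, items, "fr"))  # prints {'attentif'}
print(eligible_expansions(candidates, items, "en"))  # prints {'attentive'}
```

The candidate attentions is never returned because it appears in no indexed item, and attentif is only returned when the search interface language is French.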
By default, the index assumes the search interface language to be the main language it detects in indexed items. However, you can set the forwardLanguageToCoveoIndex Search API query parameter to true to force the index to use the language passed in the locale Search API query parameter.
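A minimal sketch of the parameters such a query might carry, written in Python. The build_query_payload helper is hypothetical, and whether the parameters travel in a JSON body or in the query string depends on how you call the Search API, so treat the shape below as an assumption and check the Search API reference before relying on it.

```python
import json


def build_query_payload(query: str, locale: str, force_locale: bool) -> str:
    """Serialize query parameters; the payload shape is an assumption."""
    payload = {"q": query, "locale": locale}
    if force_locale:
        # Ask the index to use the language passed in locale
        # instead of the language it detects in indexed items.
        payload["forwardLanguageToCoveoIndex"] = True
    return json.dumps(payload)


print(build_query_payload("attention", "fr", True))
# prints {"q": "attention", "locale": "fr", "forwardLanguageToCoveoIndex": true}
```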
Stemming in Field Queries
You can also leverage stemming in field queries by enabling the stemming option of the target fields (see Add or Edit a Field - Stemming). That being said, keep in mind that doing so can impede performance.
The index ranks items containing the original form of queried terms higher. Moreover, the index calculates a correlation factor between the searched term and every possible expansion. In search results, highly correlated expansions rank higher than poorly correlated ones. This reduces the risk of stemming confusion that could occur when unrelated words share the same stem.
In English, the terms university and universe stem to the same root, although they’re not semantically related.
When you search for universe, the Coveo index expands your query using terms from the univer stem class, which can include university. However, since the terms universe and university rarely co-occur in your indexed items, items containing university rank lower.
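To make the co-occurrence idea concrete, here is a small Python sketch. The source does not specify how the correlation factor is computed, so this example substitutes a Jaccard overlap of the item sets containing each term; the cooccurrence and rank_expansions functions and the sample items are assumptions made for illustration.

```python
def cooccurrence(term_a: str, term_b: str, items: list) -> float:
    """Jaccard overlap of the items containing each term (a stand-in metric)."""
    with_a = {i for i, terms in enumerate(items) if term_a in terms}
    with_b = {i for i, terms in enumerate(items) if term_b in terms}
    union = with_a | with_b
    return len(with_a & with_b) / len(union) if union else 0.0


def rank_expansions(term: str, expansions: list, items: list) -> list:
    """Order expansions from most to least correlated with the queried term."""
    return sorted(expansions, key=lambda e: cooccurrence(term, e, items), reverse=True)


# Each set holds the terms of one hypothetical indexed item.
items = [
    {"universe", "universes", "galaxy"},
    {"universe", "universes", "cosmos"},
    {"university", "campus"},
    {"universe", "university"},
]

print(rank_expansions("universe", ["university", "universes"], items))
# prints ['universes', 'university']
```

In this sample, universes shares two of three items with universe, while university shares one of four, so the expansion university is ranked last.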