About stemming
Stemming is a process that reduces words to their stem, base, or root form. The index uses the stem of each queried term to expand the query, searching for the original term as well as indexed terms that share the same root. This automatic query expansion often returns relevant results that wouldn’t appear otherwise.
Note
Stemming only applies to words of four characters or more.
- Searching for a term typed in its singular form returns items containing the singular and plural forms of the term, and vice versa.
- The words search, searching, and searched share the same root or stem: search-. When you query searching, the index returns items containing the words searching, search, searches, and searched.
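The expansion described above can be sketched with a toy example. This is a minimal illustration only: `toy_stem`, `expand`, and the suffix list are hypothetical, and the naive suffix stripping stands in for the index's actual stemming algorithm, which this page doesn't describe.

```python
# Toy illustration only: a naive suffix stripper, not the index's real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")
MIN_LENGTH = 4  # stemming only applies to words of four characters or more


def toy_stem(word: str) -> str:
    """Return a crude stem by stripping one common English suffix."""
    word = word.lower()
    if len(word) < MIN_LENGTH:
        return word
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not over-stripped.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(term: str, indexed_terms: set[str]) -> set[str]:
    """Return the queried term plus indexed terms that share its toy stem."""
    stem = toy_stem(term)
    return {term} | {t for t in indexed_terms if toy_stem(t) == stem}


vocabulary = {"search", "searches", "searched", "searching", "research", "sea"}
print(sorted(expand("searching", vocabulary)))
# ['search', 'searched', 'searches', 'searching']
```

Only terms present in the vocabulary are returned as expansions, and the three-letter word sea is left unstemmed.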
Stem expansions
For your index to use a certain term as a stem expansion, that term must appear in at least one indexed item, where:
- The search interface language matches one of the languages of that item.
- The term appears in the body of that item.
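The two conditions above can be expressed as a simple predicate. The sketch below is a simplified model, not the index's implementation: the `Item` class, its fields, and `is_eligible_expansion` are hypothetical names, and whitespace tokenization is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical, simplified view of an indexed item."""
    languages: set[str]
    body: str
    title: str = ""


def is_eligible_expansion(term: str, items: list[Item], interface_language: str) -> bool:
    """True if some item in the interface language contains the term in its body."""
    term = term.lower()
    return any(
        interface_language in item.languages and term in item.body.lower().split()
        for item in items
    )


items = [
    Item(languages={"English"}, body="Searching the index is fast"),
    Item(languages={"French"}, body="Un lecteur attentif"),
    Item(languages={"English"}, body="Release notes", title="Searched terms"),
]

print(is_eligible_expansion("searching", items, "English"))  # True
print(is_eligible_expansion("attentif", items, "English"))   # False: French item only
print(is_eligible_expansion("searched", items, "English"))   # False: title only, not body
```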
Languages
In addition to only using terms from indexed items in the same language as the search interface, Coveo employs language-specific stemming algorithms to improve the relevance of stem expansions. For more information, see our list of supported languages.
The term attention can stem to attentio in English and attenti in French.
The stem expansion attentif is only relevant in French.
By default, the index assumes the search interface language to be the main language it detects in indexed items.
However, you can set the forwardLanguageToCoveoIndex Search API query parameter to true to force the index to use the language passed in the locale Search API query parameter.
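As a rough sketch, the two parameters could be combined as follows. The names forwardLanguageToCoveoIndex and locale come from the text above; the `q` parameter name, the `fr` locale value, the `build_search_params` helper, and the query-string encoding are assumptions made for illustration, so check the Search API reference for the exact request format.

```python
from urllib.parse import urlencode


def build_search_params(query: str, locale: str) -> dict[str, str]:
    """Build query parameters that ask the index to honor the given locale."""
    return {
        "q": query,  # assumed name of the basic query expression parameter
        "locale": locale,  # language the index should use
        "forwardLanguageToCoveoIndex": "true",  # force the index to use locale
    }


params = build_search_params("attention", "fr")
print(urlencode(params))
# q=attention&locale=fr&forwardLanguageToCoveoIndex=true
```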
Stemming in field queries
You can also leverage stemming in field queries by enabling the stemming option of the target fields.
That being said, keep in mind that doing so can impede performance.
Ranking
The index gives higher result ranking to items containing the original form of queried terms. Moreover, the index calculates a correlation factor between the searched term and every possible expansion. In search results, highly correlated expansions rank higher than poorly correlated ones. This reduces the risk of stemming confusion, which can occur when unrelated words share the same stem.
In English, the terms university and universe stem to the same root, although they’re not semantically related.
When you search for universe, the Coveo index expands your query using terms from the univer stem class, which can include university. However, since the terms universe and university rarely co-occur in your indexed items, items containing university rank lower.
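To make the co-occurrence idea concrete, here is a minimal sketch that scores expansions by how often they appear in the same items as the searched term. The page doesn't specify how the correlation factor is computed; the Jaccard overlap used here, the `cooccurrence` function, and the sample items are all assumptions chosen for illustration.

```python
def cooccurrence(term: str, expansion: str, items: list[set[str]]) -> float:
    """Jaccard overlap between the sets of items containing each term."""
    with_term = {i for i, words in enumerate(items) if term in words}
    with_expansion = {i for i, words in enumerate(items) if expansion in words}
    union = with_term | with_expansion
    return len(with_term & with_expansion) / len(union) if union else 0.0


# Each set holds the distinct words of one hypothetical indexed item.
items = [
    {"universe", "universes", "galaxy"},
    {"universe", "universes", "star"},
    {"university", "campus"},
    {"university", "universe", "physics"},
    {"universes", "cosmology"},
]

scores = {e: cooccurrence("universe", e, items) for e in ("universes", "university")}
print(scores)
# {'universes': 0.5, 'university': 0.25}
print(sorted(scores, key=scores.get, reverse=True))
# ['universes', 'university']
```

In this toy data, universes co-occurs with universe more often than university does, so it would be treated as the stronger expansion.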
Disabling stemming
While expanded queries are generally useful, you can disable the stemming expansion when you want to search for an exact term or an exact phrase.