Supported languages for indexing features

Important
Language support and relevance disclaimer

The Coveo Platform supports a wide range of languages across indexing, machine learning models, and user interfaces; however, the depth of linguistic processing varies by language. Support for a language doesn’t constitute a commitment to equivalent relevance, accuracy, or feature parity.

For example, queries and indexed content in languages such as English, French, German, and Spanish achieve higher and more consistent relevance because language analysis is more comprehensive and platform components are optimized for them. Conversely, for languages that are technically supported but for which the platform isn’t fully optimized, search relevance, generated answers, and overall accuracy can be lower or less consistent. Accordingly, language support doesn’t guarantee equal relevance quality, and observed outcomes can vary depending on the language.

The table below lists the 58 languages that Coveo supports for indexing and indicates, for each one, whether the encoding, excerpt, thesaurus, Did you mean, language detection, stemming, and decompounding features are available. Coveo can also index content in languages other than those listed, as long as:

  • The language uses spaces to separate words.

  • The item encodes characters in Unicode.

For languages meeting these requirements, most language features are supported except language detection, decompounding, and stemming.
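As a rough illustration of the rule above, the following Python sketch looks up feature availability by locale code and falls back to a reduced feature set for unlisted languages that meet both requirements. It is not part of any Coveo API: the names `LanguageFeatures`, `LISTED`, `UNLISTED_FALLBACK`, and `features_for` are hypothetical, only a few rows of the table are copied in, and treating Did you mean as available in the fallback is an assumption (the text only says that "most" features are supported).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageFeatures:
    """One table row. All names here are hypothetical, for illustration only."""
    encoding_excerpt_thesaurus: bool
    did_you_mean: bool
    language_detection: bool
    stemming: bool
    decompounding: Optional[bool]  # None stands for "N/A" in the table


# A few rows copied from the table; the full table has 58 entries.
LISTED = {
    "en": LanguageFeatures(True, True, True, True, None),
    "de": LanguageFeatures(True, True, True, True, True),
    "zh": LanguageFeatures(True, True, True, False, None),
    "eu": LanguageFeatures(True, True, False, True, None),
    "vi": LanguageFeatures(True, False, True, False, None),
}

# Unlisted languages: no language detection, stemming, or decompounding.
# Marking "Did you mean" as available is an assumption, not a documented fact.
UNLISTED_FALLBACK = LanguageFeatures(True, True, False, False, False)


def features_for(
    locale: str, uses_spaces: bool = True, is_unicode: bool = True
) -> Optional[LanguageFeatures]:
    """Return the feature set for a locale, or None if it can't be indexed."""
    if locale in LISTED:
        return LISTED[locale]
    if uses_spaces and is_unicode:
        return UNLISTED_FALLBACK
    return None


if __name__ == "__main__":
    print(features_for("de"))
    print(features_for("ga"))  # "ga" (Irish) is an arbitrary unlisted example
    print(features_for("ga", uses_spaces=False))
```

Returning `None` when a requirement isn't met keeps "can't be indexed" distinct from "indexed with fewer features", which mirrors the two cases described in the text.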

Note

Because result ranking is partly based on summarization and key-concept extraction techniques, queries in English, French, German, and Spanish return the most relevant results most consistently.

| Language | Locale | Encoding, excerpt, and thesaurus | Did you mean | Language detection | Stemming | Decompounding |
|---|---|---|---|---|---|---|
| English[1] | en | Yes | Yes | Yes | Yes | N/A |
| French[1] | fr | Yes | Yes | Yes | Yes | N/A |
| German[1] | de | Yes | Yes | Yes | Yes | Yes |
| Spanish[1] | es | Yes | Yes | Yes | Yes | N/A |
| Danish | da | Yes | Yes | Yes | Yes | Yes |
| Dutch | nl | Yes | Yes | Yes | Yes | Yes |
| Finnish | fi | Yes | Yes | Yes | Yes | Yes |
| Hungarian | hu | Yes | Yes | Yes | Yes | N/A |
| Italian | it | Yes | Yes | Yes | Yes | N/A |
| Norwegian | no | Yes | Yes | Yes | Yes | Yes |
| Portuguese | pt | Yes | Yes | Yes | Yes | N/A |
| Swedish | sv | Yes | Yes | Yes | Yes | Yes |
| Turkish | tr | Yes | Yes | Yes | Yes | N/A |
| Catalan | ca | Yes | Yes | Yes | Yes | N/A |
| Romanian | ro | Yes | Yes | Yes | Yes | N/A |
| Valencian | ca | Yes | Yes | Yes | Yes | N/A |
| Armenian | hy | Yes | Yes | Yes | Yes | N/A |
| Russian | ru | Yes | Yes | Yes | Yes | N/A |
| Chinese (traditional and simplified)[2],[3] | zh | Yes | Yes | Yes | No | N/A |
| Greek | el | Yes | Yes | Yes | Yes | N/A |
| Hindi | hi | Yes | Yes | Yes | Yes | N/A |
| Japanese[2],[3] | ja | Yes | Yes | Yes | No | N/A |
| Korean[2],[3] | ko | Yes | Yes | Yes | No | N/A |
| Thai | th | Yes | Yes | Yes | No | N/A |
| Arabic | ar | Yes | Yes | Yes | Yes | N/A |
| Basque | eu | Yes | Yes | No | Yes | N/A |
| Lithuanian | lt | Yes | Yes | Yes | Yes | N/A |
| Czech | cs | Yes | Yes | Yes | Yes | N/A |
| Indonesian | id | Yes | Yes | Yes | Yes | N/A |
| Polish | pl | Yes | Yes | Yes | Yes | N/A |
| Albanian | sq | Yes | Yes | Yes | No | N/A |
| Afrikaans | af | Yes | Yes | Yes | No | N/A |
| Belarusian | be | Yes | Yes | Yes | No | N/A |
| Bulgarian | bg | Yes | Yes | Yes | Yes | N/A |
| Burmese | my | Yes | Yes | Yes | No | N/A |
| Croatian | hr | Yes | Yes | Yes | Yes | N/A |
| Esperanto | eo | Yes | Yes | Yes | No | N/A |
| Estonian | et | Yes | Yes | Yes | Yes | N/A |
| Filipino | fil | Yes | Yes | No | No | N/A |
| Hebrew | he | Yes | Yes | Yes | No | N/A |
| Icelandic | is | Yes | Yes | Yes | No | N/A |
| Kazakh | kk | Yes | Yes | Yes | No | N/A |
| Latvian | lv | Yes | Yes | Yes | No | N/A |
| Macedonian | mk | Yes | Yes | Yes | No | N/A |
| Malay | ms | Yes | Yes | No | No | N/A |
| Moldovan | ro | Yes | Yes | Yes | Yes | N/A |
| Mongolian | mn | Yes | Yes | No | No | N/A |
| Norwegian Bokmål[4] | nb | Yes | Yes | No | Yes | Yes |
| Persian | fa | Yes | Yes | Yes | No | N/A |
| Serbian | sr | Yes | Yes | Yes | Yes | N/A |
| Slovak | sk | Yes | Yes | Yes | Yes | N/A |
| Slovenian | sl | Yes | Yes | Yes | Yes | N/A |
| Swahili | sw | Yes | Yes | Yes | No | N/A |
| Tagalog | tl | Yes | Yes | Yes | No | N/A |
| Ukrainian | uk | Yes | Yes | Yes | No | N/A |
| Uzbek | uz | Yes | Yes | Yes | No | N/A |
| Vietnamese | vi | Yes | No | Yes | No | N/A |
| Yiddish | yi | Yes | Yes | No | Yes | N/A |

1. The index can also generate item summaries.

2. Wildcards in queries are not supported.

3. A specialized dictionary-based tokenizer is used to split CJK characters into words, which can impact search result relevance.

4. Content in this language is detected as Norwegian (no).

Troubleshooting language indexing issues

Note

Context and symptoms

You see inconsistent results when searching in a specific language on your website.

Cause

  • For most languages, stemming is limited to the Snowball stemming algorithms. Some of their rules are incomplete, so word variations aren’t always stemmed correctly.

    For example, when searching in German, the words passieren, passierend, and passiert might result in different stems, while passieren and passiere share the same stem.

  • The correct language isn’t being detected. Stemming is applied only if the language is detected for both the query and the indexed documents.
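Both causes can be illustrated with a small Python sketch. It deliberately uses a toy suffix-stripping stemmer with only two rules, not the actual Snowball German algorithm, and the names `toy_stem` and `terms_match` are hypothetical. Requiring the query and document languages to be identical before stemming is also an assumption; the text only states that a language must be detected on both sides.

```python
from typing import Optional

# Toy suffix rules, checked in order (longest first).
# These are NOT the Snowball German rules; they only mimic an incomplete rule set.
TOY_SUFFIXES = ("en", "e")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word  # no rule applies, so the word is its own stem


def terms_match(
    query_term: str,
    query_lang: Optional[str],
    doc_term: str,
    doc_lang: Optional[str],
) -> bool:
    """Compare stems only when a language was detected on both sides.

    Assumption: the two detected languages must also be the same.
    Otherwise, fall back to an exact (case-insensitive) comparison.
    """
    q, d = query_term.lower(), doc_term.lower()
    if query_lang is not None and query_lang == doc_lang:
        return toy_stem(q) == toy_stem(d)
    return q == d


if __name__ == "__main__":
    for word in ("passieren", "passiere", "passierend", "passiert"):
        print(word, "->", toy_stem(word))
    print(terms_match("passiere", "de", "passieren", "de"))
    print(terms_match("passiert", "de", "passieren", "de"))
    print(terms_match("passiere", None, "passieren", "de"))
```

With these toy rules, passieren and passiere reduce to the same stem, while passierend and passiert are left unchanged because no rule covers their endings, so they don't match the other forms. When no language is detected for the query, even passiere and passieren stop matching, because the comparison falls back to exact terms.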

Resolution