Advanced Model Configurations API

The Machine Learning Advanced Model Configurations API lets you manage advanced configuration files for your Coveo Machine Learning (Coveo ML) models.

You can use the API to manage the following advanced model configurations:

  • Blocklist

  • Stop words

  • Default queries

  • ID mappings

Note

Interactive reference documentation is available through Swagger UI (see Coveo Machine Learning API - Advanced Model Configurations API).

Note

Unless otherwise specified, all REST API string fields are case-sensitive. Search queries are one exception: they aren’t case-sensitive.

To keep behavior consistent across the Coveo Platform, the same value passed to different REST APIs must always be passed using the same case. For example, if the unique product identifier, such as ec_product_id, is passed to the Commerce API in lowercase, then it should also be passed to the Usage Analytics Write API in lowercase.

Blocklist

For Automatic Relevance Tuning (ART) and Query Suggestion (QS) models, you can configure a blocklist file that lists the terms to be ignored by the model. Specifically:

  • For ART, if a blocklisted term appears in a query, the model ignores the entire query. This means the model won’t learn from the query, and the query won’t influence the relevance of results.

  • For QS, if a blocklisted term appears in a query, the model ignores the entire query. This means the model won’t learn from the query, and queries that contain the blocklisted term will never be suggested by the model.

A blocklist is meant to prevent queries that contain undesirable terms from influencing a model’s output, for example to filter out offensive words or to protect a brand.

Note

A blocklist shouldn’t be used for language-based filtering, for example by listing terms in a language that you don’t want the model to learn from. Instead, to restrict the model to a specific language, use a custom filter for ART or QS.

Important

Terms defined in a default queries file override those defined in a blocklist file.

If a term appears in both a blocklist and default queries file, it won’t be blocked by the model and can still be used or suggested.

Example

You configured a blocklist file that contains the following terms: knights, sabres, and horse.

A user performs the following query: Why do knights use swords?

Since the blocklist file contains the term knights, the entire query is ignored by the model.
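The behavior in this example can be sketched in Python. This is only an illustration, not Coveo’s implementation: the function name is_blocked is hypothetical, and the case-insensitive, whole-word matching is an assumption, since the way the model tokenizes and compares terms isn’t described here.

```python
import re


def is_blocked(query: str, blocklist: set) -> bool:
    """Return True if any blocklisted term appears in the query as whole words."""
    # Assumption: lowercase the query and split it on non-word characters.
    normalized = " ".join(re.findall(r"\w+", query.lower()))
    for term in blocklist:
        term_norm = " ".join(re.findall(r"\w+", term.lower()))
        if not term_norm:
            continue  # skip empty entries
        # Match the term only on word boundaries, so "knights" does not
        # match inside a longer word such as "knightsbridge".
        if re.search(rf"(?<!\w){re.escape(term_norm)}(?!\w)", normalized):
            return True
    return False


blocklist = {"knights", "sabres", "horse"}
queries = ["Why do knights use swords?", "How to sharpen a sword"]

# A query with at least one blocklisted term is dropped entirely.
kept = [q for q in queries if not is_blocked(q, blocklist)]
print(kept)  # ['How to sharpen a sword']
```

Note that the whole query is discarded, not just the blocklisted word, which matches the behavior described above for both ART and QS.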

Blocklist file configuration

The blocklist file must be a UTF-8 encoded CSV file, with each term listed as a separate row in a single column.

Once the blocklist file is ready, you can upload it to your model.

Note

The CSV file must contain only the terms to block, and shouldn’t contain headers. Every term in the file is treated as a term to block. In the example CSV file below, knight will be blocked even if it’s intended to be a header in the CSV file.

knight
black knight
dark knight
sword
sabre
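A file in this format can be produced with Python’s standard csv module. The sketch below is one possible approach: the helper name write_term_file and the file name blocklist.csv are hypothetical, and trimming whitespace and dropping duplicates are extra precautions that this page doesn’t require.

```python
import csv
import os
import tempfile


def write_term_file(path, terms):
    """Write terms to a UTF-8, single-column CSV file with no header row."""
    cleaned = []
    seen = set()
    for term in terms:
        t = term.strip()
        # Assumption: empty rows and exact duplicates are not useful, so skip them.
        if t and t not in seen:
            seen.add(t)
            cleaned.append(t)
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        for t in cleaned:
            writer.writerow([t])  # one term per row, single column
    return cleaned


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blocklist.csv")  # hypothetical file name
    written = write_term_file(
        path, ["knight", "black knight", "dark knight", "sword", "sabre"]
    )
    print(written)
```

Because the file has no header row, the first value passed to the helper becomes the first blocked term. The same helper works for a stop words file, which uses the same single-column layout.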

Stop words

For Automatic Relevance Tuning (ART) and Query Suggestion (QS) models, you can configure a stop words file that lists the words to be ignored by the model when analyzing user queries.

Stop words are typically common words, such as articles (a, an, the, etc.), prepositions (on, in, at, etc.), and pronouns (he, she, it, etc.). Ignoring them reduces the impact of common words on the relevance of the model output. Without a stop words file, and given a lack of sufficient relevant content, the output of an ART or QS model may be influenced by these words.

Example

You configured a stop words file that contains the following terms: do, I, my, for, the.

A user performs the following query: How do I change my password for the intranet.

Since the stop words file contains the words do, I, my, for, and the, the query is analyzed as follows by the model:

how change password intranet.
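The following Python sketch reproduces this example. It’s a simplified illustration under stated assumptions: the function name strip_stop_words is hypothetical, and lowercasing and splitting on non-word characters stand in for whatever analysis the model actually performs.

```python
import re


def strip_stop_words(query, stop_words):
    """Remove stop words from a query and return the remaining words."""
    # Assumption: stop words are compared case-insensitively.
    stop = {w.lower() for w in stop_words}
    # Keep apostrophes inside tokens so that words such as "isn't" stay whole.
    tokens = re.findall(r"[\w']+", query.lower())
    return " ".join(t for t in tokens if t not in stop)


stop_words = ["do", "I", "my", "for", "the"]
result = strip_stop_words("How do I change my password for the intranet", stop_words)
print(result)  # how change password intranet
```

Unlike a blocklist, which discards the entire query, stop words only remove the listed words and leave the rest of the query available to the model.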

Stop words file configuration

The stop words file must be a UTF-8 encoded CSV file, with each word listed as a separate row in a single column.

Once the stop words file is ready, you can upload it to your model.

Note

The CSV file must contain only a list of stop words, and shouldn’t contain headers. Every term in the file is treated as a stop word. In the example CSV file below, how is considered to be a stop word even if it’s intended to be a header in the CSV file.

Example
how
a
in
to
for
on
the
and
I
is
of
do
can
not
or
isn't

Default queries

For Query Suggestion (QS) models, you can configure a default queries file that contains a list of queries to be added as suggestion candidates.

This is useful in test environments, to make sure that a QS model returns suggestions, or with a new model, to help it provide suggestions by including queries that originate from an existing site.

Note

The limit is 5,000 default queries per language. If the file contains more than 5,000 queries for a given language, the model only considers the 5,000 most frequently performed queries.

Important

Terms defined in a default queries file override those defined in a blocklist file.

If a term appears in both a blocklist and default queries file, it won’t be blocked by the model and can still be used or suggested.

Note

To ensure optimal performance, a Query Suggestion (QS) model limits the number of possible suggestions per language to a preset maximum. The limit is enforced after the most relevant query suggestions are identified and ranked, and after any manually defined default query suggestions are applied. The enforced limit is large enough to not negatively impact the quality of the suggestions. The most relevant suggestions are always recommended to the user, regardless of the enforced limit. The limit, however, may explain why a query that appears as a candidate in your data isn’t suggested for a given user query.

Default queries file configuration

The default queries file must be a UTF-8 encoded CSV file. The queries must be listed in a two-column table, where the columns are separated by a comma (,). The first column must contain the queries, whereas the second column can optionally contain an integer value representing the relative importance of each query. For example, a common value would be the past occurrence count of these queries. If there’s no value, all queries are considered to be of equal importance.

Once the default queries file is ready, you can upload it to your model.

Note

The CSV file must contain only a list of queries, and shouldn’t contain headers. Every row in the file is treated as a default query. In the example CSV file below, black knight is considered to be a query even if it’s intended to be a header in the CSV file.

black knight
knight,1200
dark knight,400
sword,250
sabre
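The example file above can be read with Python’s csv module as sketched below. The function name parse_default_queries is hypothetical, and representing a missing second column as None is a choice made for this illustration only.

```python
import csv
import io


def parse_default_queries(text):
    """Parse default queries CSV text into (query, weight) pairs.

    The weight is None when the optional second column is missing or empty.
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        query = row[0].strip()
        weight = None
        if len(row) > 1 and row[1].strip():
            # int() raises ValueError if the value is not an integer.
            weight = int(row[1])
        rows.append((query, weight))
    return rows


sample = "black knight\nknight,1200\ndark knight,400\nsword,250\nsabre\n"
parsed = parse_default_queries(sample)
for query, weight in parsed:
    print(query, weight)
```

Running a check like this before uploading can catch a stray header row or a non-integer importance value early.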

Important

Default queries configuration files are set per language using a language code, such as en or fr. However, a file can also be configured using the commons language value. When you upload a file that uses the commons language value, the queries in that file are automatically added to every language-specific default queries file in your model. Languages without a default queries file aren’t affected by the commons file.

For example, your model processes queries in English (en), French (fr), and Spanish (es), but your model includes only en and fr default queries configuration files. Uploading a commons file will add its list of queries to the en and fr configuration files, but not to any other language.
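The commons behavior described above can be modeled as follows. This is a conceptual sketch only: the function apply_commons and the in-memory dictionary are hypothetical, and skipping queries that a language file already contains is an assumption.

```python
def apply_commons(per_language, commons):
    """Add commons queries to each language that already has a file.

    per_language maps a language code to its list of default queries.
    Languages that are not keys in per_language are left untouched.
    """
    merged = {}
    for lang, queries in per_language.items():
        combined = list(queries)
        for query in commons:
            # Assumption: a query already present is not added a second time.
            if query not in combined:
                combined.append(query)
        merged[lang] = combined
    return merged


# The model handles en, fr, and es, but only en and fr have a file.
files = {"en": ["knight"], "fr": ["chevalier"]}
result = apply_commons(files, ["sword"])
print(result)  # {'en': ['knight', 'sword'], 'fr': ['chevalier', 'sword']}
```

In this model, es gets no default queries at all, because it has no language-specific file for the commons queries to be added to.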

ID mappings

Each indexed item is assigned a permanentid that shouldn’t change over time. However, in some situations, most commonly when the source is changed, an item may be assigned a new permanentid. In this situation, the model requires an ID mapping file that links the old IDs to the new ones. This ensures that the model can still use the Coveo Analytics events that were recorded using the old IDs.

You can configure an ID mapping file for any type of model.

ID mapping file configuration

The ID mapping file must be a UTF-8 encoded CSV file. The mappings must be listed in a two-column table, where the columns are separated by a comma (,).

The first row of the table is a header row. Its first entry must contain the old field name (urihash in the example below), and its second entry must contain the new field name (permanentid in the example below). In every other row, the first column must contain the old item ID, and the second column must contain the ID that the model should now use.

Once the ID mapping file is ready, you can upload it to your model.

Note

Unlike the CSV file that’s used for the blocklist, stop words, and default queries, the first row of the CSV file for ID mapping must be a header row.

Example
urihash,permanentid
waF9ZfCfOtNtLBrw,4897a0839e4f5fdb757050bb9c7e9128d3b30a6064656001c5e1dceb922a
naQndYJbCSR0iXAk,d2cd76589dd14f0cd6b430cb241af55010737023ddb7eb68796759d7edeb
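To illustrate how such a file links old IDs to new ones, the sketch below loads the example above and translates an old ID. The function names load_id_mapping and remap are hypothetical, and this only models the lookup, not how the model itself consumes the file.

```python
import csv
import io


def load_id_mapping(text):
    """Parse ID mapping CSV text into (header, {old_id: new_id})."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # the first row must be the header row
    if len(header) != 2:
        raise ValueError("expected a two-column header row")
    # Rows that do not have exactly two columns are skipped.
    mapping = {row[0]: row[1] for row in reader if len(row) == 2}
    return header, mapping


def remap(item_id, mapping):
    """Return the new ID for an old ID, or the ID unchanged if it is unmapped."""
    return mapping.get(item_id, item_id)


sample = (
    "urihash,permanentid\n"
    "waF9ZfCfOtNtLBrw,4897a0839e4f5fdb757050bb9c7e9128d3b30a6064656001c5e1dceb922a\n"
    "naQndYJbCSR0iXAk,d2cd76589dd14f0cd6b430cb241af55010737023ddb7eb68796759d7edeb\n"
)
header, mapping = load_id_mapping(sample)
print(header)  # ['urihash', 'permanentid']
print(remap("waF9ZfCfOtNtLBrw", mapping))
```

An event recorded with the old urihash value can then be associated with the item’s current permanentid.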