Machine Learning Overview

Coveo Machine Learning (Coveo ML) is a cloud and analytics-based machine learning service that continually analyzes search behavior patterns to understand which results and content lead to the best outcomes, such as customer self-service success. In addition to intuitively enhancing search results so the best-performing content always rises to the top, Coveo ML automatically delivers the most relevant search results and proactive recommendations with minimal effort.

Coveo ML continuously learns from evolving user activity and rapidly adapts recommendations following changes such as seasons, new product adoption, or industry news.

Coveo ML is available with Coveo-powered applications such as Coveo for Salesforce - Community Cloud Edition and Coveo for Sitecore - Pro Cloud Edition (see Coveo for Sitecore Editions and Pricing).

Coveo ML offers five features:

A consumer electronics retailer has many online community visitors seeking help configuring a popular media player console. Using queries such as media console help, many of them found a particular article to be very helpful, and it proved successful in preventing ticket submissions. Coveo ML ART learns and automatically boosts the relevance of this article for new visitors running similar queries.

After the company releases a new media console model that quickly becomes very popular, visitors searching for media console help find an article on the new model to be more helpful. ART automatically learns this new trend and updates its recommendations.


Behind the scenes, Coveo ML features process usage analytics data to build and maintain complex Coveo-managed predictive models to make recommendations. You can use the Coveo Administration Console to activate and configure these Coveo ML features with just a few clicks.

Models

The Coveo Machine Learning (Coveo ML) features leverage usage analytics data by creating and training algorithmic models to predict and recommend which content is most helpful to users. Coveo ML is a service managed by Coveo that you can think of as a 'scientist in a box' that handles the model complexity for you.

Members of the Administrators and Relevance Managers built-in groups can activate and configure a Coveo ML feature in minutes through the Administration Console. Behind the scenes, a predictive model is automatically built and is typically ready to make recommendations within 30 to 60 minutes.

The time required to build a model depends mainly on the system load (i.e., the number of model requests in the queue) and the size of the training set. Therefore, even models with small training sets can take several minutes to build when the request is in the queue waiting for resources.

A Coveo ML feature starts making recommendations as soon as sufficient data is available to the model. Some features have threshold values. Consequently, if you've just started collecting usage analytics data and enable a Coveo ML feature, it may take some time, depending on your search traffic, before an operational model actually starts making recommendations. These recommendations then improve as more data becomes available.

Training and Retraining

A model is trained with usage analytics data from a given recent period and regularly retrained.

By default, a Coveo ML model is built on usage analytics data from the 3 months preceding the build, to ensure that sufficient data is available, and is retrained every week to maintain model freshness.

The more data is available for the model to learn from, the better the recommendations will be. As a general guide, a usage analytics data set of 10,000 queries or more typically allows a Coveo ML feature model to provide very relevant recommendations. You can look at your Coveo Usage Analytics (Coveo UA) data to evaluate the volume of queries on your search hub, and ensure that your Coveo ML features are configured with a training Data Period that corresponds to at least 10,000 queries. When your search hub serves a very high volume of queries, consider reducing the Data Period so that the model learns only from more recent user behavior and is more responsive to trends.

A Coveo ML feature model is regularly retrained on a more recent Coveo UA data set to ensure that recent user behavior is learned and model freshness is maintained.

Set your Coveo ML model training Frequency parameter in relation to the Data Period value. Select a longer time interval for a larger Data Period and a shorter time interval for a smaller Data Period, as recommended in the following table.

                 Frequency
Data Period      Daily    Weekly    Monthly
1 week
1 month
3 months
6 months
1 year

✓ = recommended   ○ = available

  • Because retraining a model very frequently on a long data period would have very little effect while consuming significant Coveo ML service resources, some Data Period and training Frequency parameter value combinations aren't allowed.

  • If your Coveo organization hasn't yet collected enough data to meet these requirements, but your search interface logs more than 55 visits per day in which a user query is followed by a click for a specific language, you can use the following configuration depending on how long you've been collecting data:

    Data collected for   Data Period   Frequency (Daily / Weekly / Monthly)
    1 to 29 days         1 month
    1 month              3 months

Sub-Models

By default, models are built for each combination of languages, search hubs, and tabs, because these attributes normally define different types of users and use cases.

Query suggestions that were recommended based on your internal search interface logged events aren’t recommended in your external search interface.

A model learns separately from search visits made in search interfaces offered in different languages, since keywords often differ from one language to another. You can review the number of recommended items for each submodel of ART models in the Administration Console (see Reviewing Coveo Machine Learning Model Information).

  • The number of submodels doesn't matter in itself. However, the quality of each submodel depends on the number of events that were used to build it.

  • Submodels aren’t grouped, meaning that submodels built on very different user behaviors don’t negatively impact the quality of the parent model.

  • The variation in dataset sizes used to build submodels has no negative impacts on the parent model quality.
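The default split described above can be illustrated with a short sketch (illustration only; the field names here are hypothetical, not the actual Coveo UA event schema):

```python
# Illustration only: group usage-analytics events by (language, searchHub, tab),
# the attribute combination along which Coveo ML builds sub-models by default.
# Field names are hypothetical, not the actual Coveo UA event schema.
from collections import defaultdict

def split_into_submodels(events):
    """Partition events so each sub-model trains only on its own slice."""
    submodels = defaultdict(list)
    for event in events:
        key = (event["language"], event["searchHub"], event["tab"])
        submodels[key].append(event)
    return submodels

events = [
    {"language": "en", "searchHub": "CommunityHub", "tab": "All", "query": "media console help"},
    {"language": "en", "searchHub": "CommunityHub", "tab": "All", "query": "DFT-400"},
    {"language": "el", "searchHub": "CommunityHub", "tab": "All", "query": "DFT-400"},
]
submodels = split_into_submodels(events)
# The Greek slice trains separately, so Greek user behavior isn't drowned
# out by the much larger English slice.
```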

Your Community search page and your content are available in several languages, but 75% of the queries are made on the English version of the search page and only 4% on the Greek one.

On the Greek search page, a user searches for DFT-400, a product name that's the same in all languages. Because a submodel learned only the user behavior for the Greek search page, Automatic Relevance Tuning (ART) can recommend relevant Greek items for the DFT-400 product. Without language submodels, ART would most likely recommend English DFT-400 items instead, which wouldn't be included in search results because they're not part of the Greek search interface scope.

Different search hubs or interfaces typically serve different purposes, where users expect different results for the same query. Submodels filter out recommendations that don't match the current hub and interface combination, to prevent recommending items outside the expected scope.

When you want the relevance on one search page to influence other search interfaces for a unified experience, you can set the filterFields custom model parameter value accordingly. If the parameter value only contains the desired search page, the model will provide recommendations or suggestions based on user behavior on that specific search page even if the model is active on a search interface in another hub. Before modifying the value, we strongly recommend that you consult your Coveo Customer Success Manager (CSM) or Coveo Support for appropriate guidance. Moreover, you should test changes thoroughly in a sandbox environment before deploying in production.
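As a hypothetical illustration only (the exact JSON shape and accepted values vary by model type and organization, and should be confirmed with Coveo Support; `originLevel1` is the Coveo UA dimension that typically corresponds to the search hub), such a custom model parameter could look like:

```json
{
  "filterFields": ["originLevel1"]
}
```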
