Coveo machine learning model development and evaluation

Coveo conducts research, development, and evaluation activities to develop and improve its Coveo Machine Learning (Coveo ML) models. Specifically, Coveo's evaluation processes include the following:

Model offline evaluations

Coveo continually develops and evaluates new and existing Coveo Machine Learning (Coveo ML) models to introduce new features, enhance current functionality, and address customer-specific needs or identified problems. To ensure that these ML models perform well in real-world conditions, Coveo uses subsets of customer data when evaluating them.

Offline evaluations are performed in a controlled environment within the Coveo Platform. These evaluations are necessary to develop high-quality ML models that adapt to the unique characteristics of customer data, without having to deploy and test them online. Offline evaluation is a cornerstone of the ML model development lifecycle, allowing Coveo to meticulously assess and enhance the performance and reliability of its ML models. This process ensures robustness and trustworthiness prior to deployment.

Customers can always opt out of offline evaluations, which means that Coveo will not use their data for the purposes described above. However, this may impact the overall robustness of updated or new models in these customers' online environments. While Coveo still ensures that deployment of updated or new models will result in improved outcomes for all customers (see Model online testing), opting out removes the possibility for Coveo to evaluate and fine-tune the performance of the model on these customers' specific datasets prior to deployment.

Example

The Coveo ML team identifies a need to improve the Intent-Aware Product Ranking (IAPR) model, which uses an algorithm to personalize search result ranking based on each end user’s intent.

Coveo begins by improving the existing training pipeline, and then uses it to train the new IAPR model separately on various datasets. The Coveo ML team subsequently evaluates the new IAPR model against historical data to confirm that it outperforms the current model. Once the new model demonstrates improvements in offline metrics, the Coveo ML team finalizes the training pipeline code and deploys it to production.
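This kind of offline comparison can be sketched with a standard ranking metric such as NDCG. The documentation doesn't specify which metrics or data Coveo actually uses, so the models, relevance labels, and threshold below are purely illustrative:

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """Normalized DCG: 1.0 means the ranking placed relevant items ideally."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def mean_ndcg(rankings, k=4):
    """Average NDCG across a set of historical queries."""
    return sum(ndcg_at_k(r, k) for r in rankings) / len(rankings)

# Hypothetical historical data: per-query relevance labels listed in the
# order each model ranked the products (1 = clicked/purchased, 0 = ignored).
current_model_rankings = [[0, 1, 0, 1], [1, 0, 0, 0]]
new_model_rankings     = [[1, 1, 0, 0], [1, 0, 0, 0]]

current_score = mean_ndcg(current_model_rankings)
new_score = mean_ndcg(new_model_rankings)
print(f"current: {current_score:.3f}, new: {new_score:.3f}")

# Proceed to deployment only if the offline metric improves.
assert new_score >= current_score
```

Because the evaluation runs entirely on stored historical data, candidate models can be compared at this stage without exposing any real user to an unproven ranking.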

As the IAPR model uses a personalization algorithm, its real performance can only be accurately assessed online with real users. Therefore, Coveo will conduct live testing of the IAPR model, as described in Model online testing.

Model online testing

To ensure continuous enhancement of Coveo Machine Learning (Coveo ML) models, Coveo conducts tests aimed at validating the performance of Coveo’s ML features in real-world conditions for specific use cases. This is referred to as model online testing.

Model online testing not only prevents underperforming model updates from being deployed, but also accelerates the pace of innovation, which directly translates into enhancements of Coveo’s ML functionalities. While customers have the option to opt out of online testing at any time, doing so could lead to the deployment of functionalities that accidentally degrade the existing performance of Coveo ML features without Coveo being aware beforehand. Therefore, Coveo recommends that customers leverage model online testing to ensure access to the latest ML innovations in their specific use cases.

Example

A customer is currently using an Intent-Aware Product Ranking (IAPR) model to personalize search result ranking based on each end user’s intent. As this model relies on a vector representation of products, called the “product vector space”, minor adjustments to the way the vectors are built can result in meaningful improvements to the end user’s overall experience.
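To illustrate what a product vector space enables, the following sketch builds toy bag-of-words vectors for products and ranks them against an inferred user intent. The real IAPR representation is far richer, and its construction isn't described in this documentation; the vocabulary, products, and similarity measure here are assumptions for illustration only:

```python
import math
from collections import Counter

def product_vector(description, vocab):
    """Toy bag-of-words embedding: one dimension per vocabulary word."""
    counts = Counter(description.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors in the product space."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical catalog and vocabulary.
vocab = ["running", "shoes", "jacket", "waterproof"]
products = {
    "trail-shoe": product_vector("running shoes", vocab),
    "rain-jacket": product_vector("waterproof jacket", vocab),
}

# An intent inferred from the query "running shoes" ranks products
# by their proximity in the vector space.
intent = product_vector("running shoes", vocab)
ranked = sorted(products, key=lambda p: cosine(products[p], intent), reverse=True)
print(ranked)  # ['trail-shoe', 'rain-jacket']
```

Any change to how the vectors are built shifts these distances, which is why even small adjustments to the construction parameters can move results up or down for a given intent.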

During offline model evaluation, Coveo identifies a possible optimization in a parameter that’s used to construct the product vector space. This optimization could lead to increased search conversion rates. Before deploying this change in a customer’s production environment for all users, Coveo opts instead to divert a portion of end-user “traffic” to this new version of the IAPR model. Coveo then performs rigorous A/B testing to verify the assumptions made during offline evaluation.
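A minimal sketch of this kind of traffic diversion and A/B verification, assuming deterministic hash-based bucketing and a two-proportion z-test on conversion rates (the documentation doesn't specify Coveo's actual mechanism or statistical test):

```python
import hashlib
import math

def assign_variant(user_id, treatment_share=0.1):
    """Deterministically divert a share of traffic to the new model by
    hashing the user ID, so each user always sees the same variant."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 1000 / 1000
    return "new_model" if bucket < treatment_share else "current_model"

def z_test_conversions(conv_a, n_a, conv_b, n_b):
    """One-sided two-proportion z-test: is variant B's conversion
    rate significantly higher than variant A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # P(Z > z) under the standard normal distribution.
    p_value = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_value

# Hypothetical observed conversions: 50/1000 on the current model,
# 80/1000 on the new one.
z, p = z_test_conversions(50, 1000, 80, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Holding out most traffic on the current model limits the blast radius of a bad change, while the significance test guards against promoting a variant whose apparent lift is just noise.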

Upon confirmation that the adjustment yields a positive impact, it’s then deployed to that customer’s production environment, which allows the customer to benefit from the enhanced performance as quickly as possible. Conversely, should model online testing indicate negative outcomes, the change is reverted to prevent any degradation in the model’s performance.