Query Suggestion (QS) model card

What’s a model card?

A model card is a document that provides a summary of key information about a Coveo Machine Learning (Coveo ML) model. It details the model’s purpose, intended use, performance, and limitations.

Model details

The Coveo Query Suggestion (QS) model provides the end users of Coveo customers with relevant query suggestions in real time. Its main goal is to enhance the search experience in Coveo-powered search solutions. As end users type in a Coveo search box implemented by a Coveo customer, the QS model displays query suggestions. To make these suggestions as relevant and accurate as possible, the model relies on past usage data and popular trends.

  • Development team: Coveo ML team

  • Initial release date: November 21, 2015. Major changes can occur and are communicated via Coveo release notes.

  • Activation: The QS model is created and assigned to query pipelines using the Coveo Administration Console.

Intended use

  • Intended purpose:

    • Primary use case: To enhance the end users' experience by suggesting potential queries as they type into the Coveo search box.

    • Secondary use case: To enhance the end users' search accuracy by improving query corrections.

  • Intended output:

    • Primary use case: Query suggestions will be provided as the end user types in the Coveo search box.

    • Secondary use case: When the query correction system can’t provide a correction based on the index, the QS model uses the first query suggestion to find a match.

  • Intended users: End users of Coveo customers.
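The fallback described in the secondary use case can be pictured with a minimal Python sketch. The function name and its parameters are hypothetical and only illustrate the order of precedence. They aren't part of any Coveo API.

```python
from typing import Optional, Sequence


def resolve_correction(
    query: str,
    index_correction: Optional[str],
    suggestions: Sequence[str],
) -> str:
    """Pick the query to run when the original query needs a correction.

    index_correction: correction proposed by the index-based system, or None
    suggestions: ranked query suggestions for the typed query (best first)
    """
    if index_correction is not None:
        # The index-based correction takes precedence when it exists.
        return index_correction
    if suggestions:
        # Fallback: use the first (highest-ranked) query suggestion.
        return suggestions[0]
    # Nothing better is available, so keep the query as typed.
    return query
```

For example, `resolve_correction("pritner", None, ["printer setup", "printer ink"])` returns `"printer setup"`.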

Note

When choosing a use case, Coveo customers should also ensure that it aligns with Coveo’s Acceptable Use Policy.

Factors

The QS model uses a combination of factors to suggest queries. The two most important factors are the performance metrics of commonly searched queries and the similarity between candidate queries and the characters typed by the end user in the Coveo search box. Other factors include the end user’s search history, context, and main topic of interest.
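As a rough illustration of how the two main factors could be combined, the following Python sketch blends a performance signal (clickthrough rate) with a simple prefix similarity. The field names, the similarity measure, and the 0.5 weight are all assumptions made for this example. The actual QS scoring isn't described in this document.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    query: str     # a past query found in usage analytics data
    searches: int  # hypothetical: number of times the query was searched
    clicks: int    # hypothetical: number of those searches followed by a click


def prefix_similarity(typed: str, candidate: str) -> float:
    """Fraction of the typed characters matched at the start of the candidate."""
    typed, candidate = typed.lower(), candidate.lower()
    if not typed:
        return 0.0
    matched = 0
    for a, b in zip(typed, candidate):
        if a != b:
            break
        matched += 1
    return matched / len(typed)


def score(typed: str, c: Candidate, weight: float = 0.5) -> float:
    """Blend query performance with similarity (the weight is an assumption)."""
    ctr = c.clicks / c.searches if c.searches else 0.0
    return weight * ctr + (1 - weight) * prefix_similarity(typed, c.query)


def suggest(typed: str, candidates: list, k: int = 3) -> list:
    """Return the k best-scoring candidate queries for the typed characters."""
    ranked = sorted(candidates, key=lambda c: score(typed, c), reverse=True)
    return [c.query for c in ranked[:k]]


candidates = [
    Candidate("printer setup", searches=100, clicks=80),
    Candidate("printer ink", searches=100, clicks=20),
    Candidate("privacy policy", searches=100, clicks=90),
]
print(suggest("prin", candidates))
# ['printer setup', 'privacy policy', 'printer ink']
```

In this toy data, "privacy policy" outranks "printer ink" despite matching fewer typed characters, because its performance is much stronger.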

Training data

The data used to train the QS model is managed by and specific to each Coveo customer’s organization. It contains usage analytics data which reflects the use of Coveo’s hosted services by the customer’s end users. More specifically, the training data is composed of the queries typed by end users and the search results they clicked on.

As the QS model is trained on this data, it learns how queries perform in different user contexts (for example, end user role or end user office) and learns end users' main topics of interest.

The QS model then learns to predict what a user is likely to type next based on the initial few characters entered in the Coveo search box.

The more quality data the QS model is trained on, the better it becomes at assessing the quality of queries in different contexts, thereby providing more accurate and relevant query suggestions.
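To make the shape of this training data more concrete, here is a small Python sketch that aggregates hypothetical usage analytics events per user context, then completes a typed prefix from those aggregates. The event layout, the `train` and `complete` helpers, and the "most clicks first" ordering are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical usage analytics events: (query, clicked_a_result, user_role)
events = [
    ("reset password", True, "employee"),
    ("reset password", True, "employee"),
    ("reset router", False, "employee"),
    ("reset router", True, "technician"),
    ("reset router", True, "technician"),
    ("return policy", True, "employee"),
]


def train(events):
    """Aggregate [searches, clicks] per query, separately for each user role."""
    stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for query, clicked, role in events:
        entry = stats[role][query.lower()]
        entry[0] += 1             # one more search for this query
        entry[1] += int(clicked)  # one more click if a result was opened
    return stats


def complete(stats, role, typed, k=2):
    """Return up to k known queries starting with `typed`, most clicks first."""
    typed = typed.lower()
    matches = [
        (query, counts)
        for query, counts in stats.get(role, {}).items()
        if query.startswith(typed)
    ]
    # Sort by descending click count, then alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1][1], item[0]))
    return [query for query, _ in matches[:k]]


stats = train(events)
print(complete(stats, "employee", "re"))    # ['reset password', 'return policy']
print(complete(stats, "technician", "re"))  # ['reset router']
```

The same prefix yields different completions for each role, which mirrors the idea that query performance is learned per context.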

Performance

The quality and performance of the QS model are assessed using diverse performance metrics, including:

  • Clickthrough rate metrics: The clickthrough rate of the suggested queries, which helps measure the relevance of the suggestions.

  • Search metrics: Search result metrics, such as clickthrough rate and average click rank, compared with and without the use of query suggestions to evaluate performance improvements.

Customers can also test QS model performance by running an A/B test on a specific set of queries, either with and without the QS model or with different model configurations.

The performance of the QS model also improves over time as it gathers more usage analytics data.
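The search metrics and the A/B comparison mentioned in this section can be sketched in Python as follows. The record layout (`click_rank`) and the sample values are invented for the example, and the sketch doesn't use any Coveo reporting API.

```python
from statistics import mean


def clickthrough_rate(searches):
    """Share of searches that were followed by at least one click."""
    if not searches:
        return 0.0
    clicked = sum(1 for s in searches if s["click_rank"] is not None)
    return clicked / len(searches)


def average_click_rank(searches):
    """Mean 1-based rank of the clicked results, or None if nothing was clicked."""
    ranks = [s["click_rank"] for s in searches if s["click_rank"] is not None]
    return mean(ranks) if ranks else None


# Hypothetical A/B samples: click_rank is None when the search had no click.
without_qs = [{"click_rank": 4}, {"click_rank": None},
              {"click_rank": 6}, {"click_rank": None}]
with_qs = [{"click_rank": 1}, {"click_rank": 2},
           {"click_rank": None}, {"click_rank": 3}]

for name, group in (("without QS", without_qs), ("with QS", with_qs)):
    print(name, clickthrough_rate(group), average_click_rank(group))
```

A higher clickthrough rate and a lower average click rank in the group using query suggestions would point to an improvement. A real A/B test would also need enough traffic for the difference to be meaningful.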

Limitations

Some factors might degrade the QS model's performance:

  • Quality of training data: The QS model's effectiveness largely depends on the quality of the training data. If the data is biased, non-representative, irrelevant, incomplete, or inadequate, the model’s performance will be affected. For instance, the QS model might not perform optimally if it isn’t trained with enough analytics data.

  • Language limitations: The QS model can provide suggestions in all languages, but some languages are better supported than others with specific tokenizers and stemmers. The QS model might therefore provide less accurate suggestions in languages for which it isn’t optimized.

  • Adapting to new trends: End-user search behavior and popular queries can change rapidly. While the QS model is designed to learn and adapt to those trends, there might be a short delay in adapting to rapidly evolving trends or shifts in user behavior.

  • No knowledge of the index: The QS model may continue to suggest queries even after the related items are removed from the customer’s index. This may result in queries that return no results or in reduced relevance. The model is frequently retrained to mitigate this limitation.

  • Model system limits: To ensure optimal response time performance, a QS model limits the number of possible suggestions per language to a preset maximum. The limit is enforced after the most relevant query suggestions are identified and ranked. The enforced limit is large enough to not negatively impact the quality of the suggestions. The most relevant suggestions are always recommended to the user, regardless of the enforced limit. The limit, however, may explain why a query that appears as a candidate in your data isn’t suggested for a given user query.
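The effect of the per-language limit can be illustrated with a short Python sketch. The limit value and the input layout are hypothetical, since the actual preset maximum isn't stated here. The point is only that the cut is applied after candidates are ranked.

```python
import heapq
from collections import defaultdict

# Hypothetical value chosen to keep the example small.
MAX_SUGGESTIONS_PER_LANGUAGE = 2


def build_suggestion_table(candidates, limit=MAX_SUGGESTIONS_PER_LANGUAGE):
    """Keep only the `limit` best-scoring candidates for each language.

    candidates: iterable of (language, query, score) tuples
    """
    by_language = defaultdict(list)
    for language, query, score in candidates:
        by_language[language].append((score, query))
    # nlargest ranks by score first, so the cut happens after ranking.
    return {
        language: [query for _, query in heapq.nlargest(limit, items)]
        for language, items in by_language.items()
    }


candidates = [
    ("en", "laptop", 0.9),
    ("en", "laptop bag", 0.7),
    ("en", "lapel pin", 0.1),
    ("fr", "ordinateur", 0.8),
]
print(build_suggestion_table(candidates))
# {'en': ['laptop', 'laptop bag'], 'fr': ['ordinateur']}
```

Here "lapel pin" exists in the data but falls below the cut for English, which is the situation described in the last sentence of the bullet above.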

Best practices

Best practices for the QS model are documented in Leading practices.

The QS model relies on a list of prohibited terms that should never be suggested, such as terms related to sexual content. Each customer can also use Blocklists to provide a custom list of prohibited terms for a specific use case.
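Conceptually, filtering suggestions against prohibited terms could look like the following Python sketch. The whole-word matching rule, the function name, and the placeholder terms are assumptions. How Coveo actually matches blocklisted terms isn't described in this document.

```python
import re


def filter_suggestions(suggestions, default_blocked, custom_blocked=()):
    """Drop any suggestion containing a blocked term as a whole word.

    default_blocked: stands in for the built-in list of prohibited terms
    custom_blocked: stands in for a customer-defined blocklist
    """
    blocked = {t.lower() for t in default_blocked}
    blocked |= {t.lower() for t in custom_blocked}
    kept = []
    for suggestion in suggestions:
        words = set(re.findall(r"\w+", suggestion.lower()))
        if words.isdisjoint(blocked):
            kept.append(suggestion)
    return kept


suggestions = ["badword forum", "acme pricing", "flight status"]
print(filter_suggestions(suggestions, {"badword"}, {"acme"}))
# ['flight status']
```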