Relevance Generative Answering (RGA) model card
What’s a model card?
A model card is a document that provides a summary of key information about a Coveo Machine Learning (Coveo ML) model. It details the model’s purpose, intended use, performance, and limitations.
Model details
The Coveo Relevance Generative Answering (RGA) model provides Coveo customers' end users with generative answers to queries performed in real time. RGA is primarily designed to enhance end user search experience in Coveo-powered search solutions. The RGA model uses text data from a customer’s index to generate answers that are relevant, personalized, and secure.
- Development team: Coveo ML team
- Initial release date: December 14, 2023. Major changes can occur and are communicated via Coveo release notes.
- Activation: The RGA model is created and assigned to query pipelines using the Coveo Administration Console.
Intended use
- Intended purpose: To enhance an end user’s search experience by providing a generated answer to a search query using natural language.
- Intended output: The answer is generated using only the customer’s content that’s indexed to the Coveo Platform. The indexing process is managed by the customer’s administrator.
- Intended users: End users of Coveo customers.
Factors
The RGA model generates an answer through a combination of factors. The first set of factors involves content retrieval. This includes retrieving the right documents using a hybrid approach that integrates lexical and semantic search, business rules, and behavioral analytics, as well as retrieving the appropriate text chunks using semantic search. The second set of factors pertains to generating the answer based on prompt instructions and the retrieved text chunks.
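To make the two stages more concrete, the sketch below shows a minimal retrieve-then-generate flow in Python. Every name in it (score_lexical, cosine, retrieve_chunks, generate_answer, the doc and chunk dictionaries) is an illustrative placeholder rather than a Coveo API, and the hybrid score shown here omits the business rules and behavioral analytics that the actual ranking also applies.

```python
# Minimal sketch of the two factor sets described above (assumed names, not Coveo APIs).
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def score_lexical(query, doc):
    """Toy lexical score: fraction of query terms found in the document text."""
    terms = query.lower().split()
    return sum(t in doc["text"].lower() for t in terms) / len(terms) if terms else 0.0

def retrieve_documents(query, query_vec, docs, k=3):
    """First factor set: hybrid ranking that blends lexical and semantic scores."""
    scored = [(0.5 * score_lexical(query, d) + 0.5 * cosine(query_vec, d["vector"]), d)
              for d in docs]
    return [d for _, d in sorted(scored, key=lambda s: s[0], reverse=True)[:k]]

def retrieve_chunks(query_vec, docs, k=5):
    """First factor set (continued): pick the most semantically similar text chunks."""
    chunks = [c for d in docs for c in d["chunks"]]
    return sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)[:k]

def generate_answer(query, chunks):
    """Second factor set: combine prompt instructions with the retrieved chunks.
    A real system would send this prompt to a large language model."""
    context = "\n".join(c["text"] for c in chunks)
    return f"Answer '{query}' using only the following passages:\n{context}"
```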
Training data
The data used within the RGA model is tailored to each Coveo customer’s organization. Specifically, the RGA model uses text data from selected content within the customer’s index. RGA breaks down each document into text chunks and creates embeddings from these chunks.
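As a rough illustration of that indexing step, the sketch below splits a document into fixed-size chunks and computes a small vector for each chunk. The chunk size and the hashing-based embed() function are assumptions made for the example; they stand in for Coveo’s actual chunking strategy and embedding model.

```python
# Illustrative indexing-time preparation: chunk a document, then embed each chunk.
# The chunker and embed() are toy stand-ins, not Coveo's actual implementation.

def split_into_chunks(text, max_words=120):
    """Split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text, dims=8):
    """Toy embedding: hash words into a small fixed-length vector."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec

def index_document(doc_id, text):
    """Produce the chunk records that retrieval would later search over."""
    return [{"doc_id": doc_id, "text": chunk, "vector": embed(chunk)}
            for chunk in split_into_chunks(text)]
```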
Customers retain ownership of their data. When using the RGA model, customers grant Coveo a non-exclusive license to use, display, and create derivative works based on such content. These datasets may also include personal information or aggregate consumer information, if provided by customers, which Coveo processes on their behalf.
The selected content and its quality directly impact the quality of the answers generated by RGA. Higher quality data results in better embeddings and more relevant answers. For optimal results, it’s crucial to adhere to the requirements specified in Coveo’s documentation and best practices.
RGA’s generated answers, and the text chunks used to generate them, can be inspected in Snowflake.
Evaluation data
To support the offline development and evaluation of the RGA model, Coveo uses a combination of publicly available datasets, such as HotpotQA and BoolQ, and customer datasets.
These datasets help test the RGA model’s ability to handle a wide range of query types, including ambiguous or adversarial questions. They’re typically not modified unless required to improve robustness testing.
For more information on offline evaluation, see Coveo machine learning model development and evaluation.
Data use
Coveo collects and processes the datasets during the term of an active subscription between Coveo and its customers.
Performance
The quality and performance of the RGA model are measured by examining how well the model operates, based on retrieval and generation metrics computed both offline and online. Coveo also uses internal performance metrics, such as response time and average build time, to measure the overall reliability of the RGA model.
- Retrieval metrics: Coveo uses offline retrieval metrics based on publicly available datasets (for example, the MTEB dataset) to assess the effectiveness and relevance of the RGA model under controlled conditions, without the variability of real-time user interaction.
- Generation metrics: Coveo uses generation metrics to evaluate the performance, quality, and accuracy of the RGA model’s outputs:
  - Coveo uses offline generation metrics based on internal or public datasets (for example, the ASQA public dataset) to assess the answering capabilities of the RGA model. In practice, Coveo uses the following:
    - The weighted mean[1] metric to assess when the RGA model should generate an answer and when it shouldn’t, using soft negatives[2] and hard negatives[3].
    - The precision[4], recall[5], and F1 score[6] metrics to evaluate the RGA model’s ability to accurately cite the chunks that are used (illustrated in the sketch after the footnotes).
    - The weighted mean[1] metric to evaluate the repeatability of answers over time for the same end user’s question.
  - Coveo uses aggregated information to assess the average online answer rate of queries performed on the Coveo Platform, based on the weighted mean[1] metric.
1. A measure that calculates the average value of a set of data points, where each data point contributes to the final average in proportion to its assigned weight. This is particularly important when certain data points are considered more important or relevant than others, which allows for a more accurate representation of the overall data.
2. Soft negatives are examples that are somewhat similar to the positive answers, but aren’t exactly correct.
3. Hard negatives are examples that are very similar to the positive answers, and more challenging to distinguish from positive answers.
4. Precision measures the proportion of cited chunks that are actually relevant to the generated answer.
5. Recall measures the proportion of relevant chunks that are cited in the generated answer.
6. The F1 score is the harmonic mean of precision and recall, combining both into a single measure.
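For illustration only, the sketch below shows one way the citation metrics ([4], [5], [6]) and a weighted mean ([1]) could be computed for a single generated answer. The chunk identifiers, relevance labels, and weights are hypothetical evaluation inputs, not Coveo’s internal pipeline.

```python
# Hypothetical evaluation helpers for the metrics listed above.

def citation_metrics(cited_chunks, relevant_chunks):
    """Precision, recall, and F1 score of the chunks cited in one answer."""
    cited, relevant = set(cited_chunks), set(relevant_chunks)
    true_positives = len(cited & relevant)
    precision = true_positives / len(cited) if cited else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def weighted_mean(values, weights):
    """Average in which each data point counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Example: the answer cited chunks c1 and c3, but only c1 and c2 were relevant.
p, r, f1 = citation_metrics(["c1", "c3"], ["c1", "c2"])  # 0.5, 0.5, 0.5
overall = weighted_mean([p, r, f1], [1.0, 1.0, 2.0])     # F1 weighted twice as heavily
```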
Limitations
- Quality of training data: The effectiveness of the RGA model is largely dependent on the quality of the training data. If the customer’s selected documents that form the dataset are biased, non-representative, irrelevant, incomplete, or inadequate, the model’s performance will be affected. For instance, the RGA model’s performance might be sub-optimal if trained on non-factual documents.
- Risk of AI hallucinations: The output of the RGA model is based on a customer’s internal content. Therefore, if a customer’s dataset contains meticulously curated informational content that’s accurate and up-to-date, the risk of AI hallucination is drastically reduced. Conversely, if a customer’s internal content contains false or inaccurate information, the risk of AI hallucination increases.
- Language limitations: The RGA model provides generated answers in multiple languages, including English. By default, only English content is supported. However, Coveo offers beta support for languages other than English. Learn more about multilingual content retrieval and answer generation.
- Indirect feedback loop: The RGA model doesn’t directly take end user feedback (thumbs up/thumbs down) into account when generating the answer. However, all behavioral signals from an end user will be taken into account by other ML models (ART, DNE) that influence the ranking of documents that RGA uses to extract text chunks and generate answers.
Best practices
Best practices for the RGA model are documented in Relevance Generative Answering (RGA) content requirements and best practices.