Relevance Generative Answering (RGA) content requirements and best practices
Relevance Generative Answering (RGA) uses your enterprise content as the raw data from which to generate answers. The content that you choose to use for RGA, and the quality of that content, have a direct impact on the quality of the answers that RGA generates.
An RGA implementation requires you to create an RGA model. When creating the model, you must specify the indexed content that the model will learn from and use to generate answers.
This article describes the requirements and best practices for the content that you choose to use for RGA.
Note
An optimal RGA implementation includes both an RGA model and a Semantic Encoder (SE) model. For best results, both models should be configured to use the same content. See RGA overview for information on how RGA and SE work together in the context of a search session to generate answers.
How RGA uses your content
Before deciding on the content to use for RGA, it’s important to have a basic understanding of how your content is used to generate answers.
When an item is indexed, the item’s content is typically mapped to the body field in the Coveo index.
The RGA model uses a pre-trained sentence transformer language model to convert your indexed content’s body text to mathematical representations (vectors) in a process called embedding.
When a user enters a query, the model references the vector space to retrieve the most relevant content.
This retrieval is based on semantic similarity between embeddings, which are created from your content’s body text.
In summary, the RGA model parses your content’s body text when creating the embeddings, and uses only that content to generate answers. An item’s body data, therefore, should be as clean, focused, and relevant as possible. The better the data, the better the embeddings, and the better the answers. For best results, you should adhere to the requirements and best practices detailed in this article when choosing the content to use for RGA.
For more information on how your content is used to generate answers, see RGA processes.
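The retrieval step described above can be illustrated with a minimal sketch. It is not Coveo’s implementation: the three-dimensional vectors are made up for illustration (a real sentence transformer produces much larger vectors), the item names are hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_vector, indexed_vectors, top_k=2):
    """Return the top_k (item_id, score) pairs, most similar first."""
    scored = [
        (item_id, cosine_similarity(query_vector, vector))
        for item_id, vector in indexed_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy "embeddings" of body text (hypothetical values).
indexed_vectors = {
    "reset-password-article": [0.9, 0.1, 0.0],
    "billing-faq": [0.1, 0.9, 0.1],
    "release-notes": [0.0, 0.2, 0.9],
}

# Toy embedding of a user query about resetting a password.
query_vector = [0.8, 0.2, 0.1]

results = retrieve(query_vector, indexed_vectors)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

Because the ranking depends only on the vectors, any noise in the body text (navigation, footers, unrelated topics) shifts the vectors and can change which items are retrieved.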
The RGA model uses only the content in an item’s |
Note
RGA uses fields such as |
Requirements
-
The content you want to use must be indexed in your Coveo organization before creating the model.
You don’t have to use all the content in your index. In fact, best practices dictate that you should choose a reasonably sized dataset to keep the content focused and relevant. When creating the model, you can choose to use a subset of your indexed content by selecting the sources that contain the items, and then further filtering the source dataset. For more information, see Choose your content.
Note
If the indexed items you want to use aren’t optimized for use with the model, re-index the items with the proper configuration.
-
An indexed item must contain a unique value in the permanentid field in order for the item’s content to be embedded and used by the model.
Note
By default, an item indexed using a standard Coveo source automatically contains a value in its permanentid field that Coveo uses as the item’s unique identifier. However, if you’re using a custom source, such as one fed by the Push API, you must make sure that the items that you want to use for answer generation contain a unique value in the permanentid field. If not, you must map unique metadata to the item’s permanentid field.
To verify if an item contains a unique value in the permanentid field, you can use the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console to check the item’s properties.
-
The indexed item’s language field must be set to English.
Note
To verify an item’s language field, you can use the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console to check the item’s properties.
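For custom sources, the permanentid requirement can be checked before items are pushed. The following is a minimal sketch under stated assumptions: the item dictionaries, the helper names, and the choice of a SHA-256 hash of the item URI as the identifier are all hypothetical and are not taken from the Coveo documentation. Only the permanentid field name comes from this article.

```python
import hashlib
from collections import Counter

def derive_permanentid(uri):
    """Derive a stable identifier from an item URI (hypothetical scheme)."""
    return hashlib.sha256(uri.encode("utf-8")).hexdigest()

def with_permanentid(item):
    """Return a copy of the item with permanentid set if it was missing or empty."""
    updated = dict(item)
    if not updated.get("permanentid"):
        updated["permanentid"] = derive_permanentid(updated["uri"])
    return updated

def duplicated_permanentids(items):
    """Return the permanentid values that appear on more than one item."""
    counts = Counter(item["permanentid"] for item in items)
    return sorted(pid for pid, count in counts.items() if count > 1)

# Hypothetical items destined for a custom source.
items = [
    {"uri": "https://example.com/kb/1", "title": "Reset a password"},
    {"uri": "https://example.com/kb/2", "title": "Update billing details", "permanentid": ""},
]

prepared = [with_permanentid(item) for item in items]
print(duplicated_permanentids(prepared))
```

An empty list from duplicated_permanentids means every prepared item carries a distinct value. Any identifier scheme works as long as the value is unique per item and stable across re-indexing.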
Supported file types
Coveo has tested and supports the following file types for use with the model:
-
HTML
-
PDF
Best practices
This section describes best practices when it comes to choosing the content to use for the model and how to optimize the content for best results. It also covers strategies for presenting the generated answer to enhance the user experience and avoid potential liabilities.
Choose your content
When deciding on the content to use, consider the following:
-
Prioritize content that’s designed to answer questions, such as knowledge base articles, support documents, FAQs, community answers, and product documentation.
-
Prioritize content that’s written in a conversational tone.
-
Prioritize shorter documents that are focused on a single topic.
Note
Avoid using very long documents that cover multiple topics. This may result in text being embedded as semantically similar, even though the context or topic is different.
-
Content should be written using a single language (English).
-
Avoid multiple documents with similar content.
-
Choose a reasonably sized dataset to keep the content focused and current.
Keep the model embedding limits in mind when choosing the content for your model.
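One of the considerations above is to avoid multiple documents with similar content. A rough way to spot candidates before choosing your dataset is to compare the word sets of document pairs. The sketch below is only an illustration: the Jaccard measure, the 0.7 threshold, and the sample documents are assumptions, not part of RGA.

```python
from itertools import combinations

def word_set(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(documents, threshold=0.7):
    """Return (left_id, right_id, score) for pairs at or above the threshold."""
    sets = {doc_id: word_set(text) for doc_id, text in documents.items()}
    pairs = []
    for left, right in combinations(sorted(sets), 2):
        score = jaccard(sets[left], sets[right])
        if score >= threshold:
            pairs.append((left, right, score))
    return pairs

# Hypothetical body text for three items.
documents = {
    "kb-1": "to reset your password open settings and select reset password",
    "kb-2": "to reset your password open settings and choose reset password",
    "kb-3": "invoices are issued on the first day of each month",
}

flagged = near_duplicates(documents)
for left, right, score in flagged:
    print(f"{left} and {right} overlap: {score:.2f}")
```

Flagged pairs are candidates for consolidation or for exclusion when filtering the source dataset. The comparison is pairwise, so for large datasets it is better run on a sample.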
Optimize your content
To optimize your content for the model, follow these best practices:
-
Ensure that boilerplate content, such as headers, footers, and extra navigation elements, is removed from the body data when the items are indexed.
-
Review the body data and source mappings to make sure that the body contains the desired content.
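To make the boilerplate point concrete, here is a minimal sketch of stripping header, footer, and navigation elements from an HTML page before its text is used as body data. It uses only the Python standard library and is independent of whichever Coveo indexing feature you rely on to clean your items. The tag list and the sample page are assumptions.

```python
from html.parser import HTMLParser

# Elements treated as boilerplate in this sketch (an assumption; adjust to your pages).
BOILERPLATE_TAGS = {"header", "footer", "nav", "aside", "script", "style"}

class BodyTextExtractor(HTMLParser):
    """Collects text that is not nested inside a boilerplate element."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)

def extract_body_text(html):
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

page = """
<html><body>
<header>Acme Support | Sign in</header>
<nav><a href="/">Home</a> <a href="/kb">Knowledge base</a></nav>
<main><h1>Reset your password</h1><p>Open Settings, then select Reset password.</p></main>
<footer>Copyright Acme</footer>
</body></html>
"""

clean = extract_body_text(page)
print(clean)
```

Comparing the cleaned text with what the Content Browser shows for an item’s body is a quick way to confirm that only the desired content remains.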
Add a disclaimer
Coveo strongly recommends the inclusion of a disclaimer for the answer generated by RGA, particularly in public-facing use cases.
The purpose of this disclaimer is to prompt users to use the citation links to refer to the original sources of information in your indexed content. This precaution helps mitigate the risk of misinformation resulting from potential inaccuracies in the generated answer.
A disclaimer is included by default in the RGA search interface component when using the hosted search page builder, hosted Insight Panel builder, In-Product Experience (IPX) builder, or the latest versions of the Coveo Atomic or Coveo Quantic libraries.
Coveo advises consulting with your company’s legal counsel to obtain approval of the final wording.
When is an answer not generated?
Adhering to the requirements and best practices outlined in this article greatly improves the relevancy of the answers that are generated by RGA. In certain cases, however, an answer can’t be generated for a given user query. This can be caused either by insufficient relevant content or by results not being sorted by relevance.
Insufficient relevant content
The segments of text (chunks) that are used to generate an answer must meet a minimum similarity threshold with the user query. An initial verification is made by the RGA model when retrieving the most relevant chunks during second-stage content retrieval, and a second verification is made by the generative large language model (LLM) when generating the answer.
-
If the chunks identified during second-stage content retrieval don’t meet RGA's minimum similarity threshold with the user query, an answer isn’t generated because the prompt isn’t created or sent to the generative LLM (see Answer generation for details on this process). In this scenario, the RGA component doesn’t appear at all in the search results.
You can use the chunksSimilarityThreshold model association parameter to modify the semantic similarity threshold that’s used by the RGA model to retrieve relevant chunks.
-
If the chunks identified during second-stage content retrieval meet RGA's minimum similarity threshold with the user query, but the chunks don’t contain enough relevant content for the generative LLM to generate an answer, the RGA search interface component appears with a message informing the user of a lack of relevant content.
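The two verifications above can be summarized as a small decision function. This is a sketch of the logic as described in this section, not Coveo code: the function name, the outcome labels, the sample scores, and the 0.5 threshold are hypothetical, and the second verification made by the generative LLM is reduced to a boolean.

```python
def rga_outcome(chunk_scores, chunks_similarity_threshold, llm_can_answer):
    """Model the two checks that decide whether an answer is shown.

    chunk_scores: similarity of each retrieved chunk with the user query.
    chunks_similarity_threshold: minimum similarity a chunk must meet.
    llm_can_answer: stand-in for the LLM finding enough relevant content.
    """
    retained = [score for score in chunk_scores if score >= chunks_similarity_threshold]
    if not retained:
        # No prompt is created or sent, so the RGA component doesn't appear.
        return "component_hidden"
    if not llm_can_answer:
        # The component appears with a message about the lack of relevant content.
        return "no_relevant_content_message"
    return "answer_generated"

print(rga_outcome([0.2, 0.3], 0.5, True))
print(rga_outcome([0.7], 0.5, False))
print(rga_outcome([0.7, 0.4], 0.5, True))
```

In this model, lowering the threshold lets more chunks through the first check, but it doesn’t guarantee an answer, because the second check still applies.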
Not sorting by relevance
If your search interface includes a sorting option, RGA works best when results are sorted by relevance, which is the default sorting option. Otherwise, an answer may not be generated.
Model embedding limits
The RGA model converts your content’s body text to numerical representations (vectors) in a process called embedding. It does this by breaking the text up into smaller segments called chunks, and each chunk is mapped as a distinct vector. For more information, see Embeddings.
Due to the amount of processing required for embeddings, the model is subject to the following embedding limits:
Note
The same chunking strategy is used for all sources and item types.
-
1 million items or 10 million chunks
Note
Coveo strongly recommends that you add a Semantic Encoder (SE) model as part of your RGA implementation. If you have more than one RGA model in your Coveo organization, for best results each RGA model should use only the items that are used by the SE model.
-
1000 chunks per item
This means that for a given item, there can be a maximum of 1000 chunks. So if an item is extremely long with a lot of text, such as more than 200,000 words or 250 pages, the model will embed the item’s text until the 1000-chunk limit is reached. The remaining text won’t be embedded and therefore won’t be used by the model.
-
250 words per chunk
Note
There can be an overlap of up to 10% between chunks. In other words, the last 10% of the previous chunk can be the first 10% of the next chunk.
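The per-item limits can be explored with a small sketch. It assumes a simple strategy that isn’t confirmed by this article: text is split on word boundaries into 250-word chunks, every chunk overlaps the previous one by the full 10%, and chunking stops at 1000 chunks. The actual chunking strategy may differ. Only the three limits come from this section.

```python
MAX_CHUNKS_PER_ITEM = 1000
WORDS_PER_CHUNK = 250
OVERLAP_PERCENT = 10  # assumed to apply fully to every chunk

def chunk_words(words, size=WORDS_PER_CHUNK, overlap_percent=OVERLAP_PERCENT,
                max_chunks=MAX_CHUNKS_PER_ITEM):
    """Split a list of words into overlapping chunks, stopping at max_chunks."""
    overlap = size * overlap_percent // 100
    step = size - overlap
    chunks = []
    start = 0
    while start < len(words) and len(chunks) < max_chunks:
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last chunk reached the end of the text
        start += step
    return chunks

def embedded_word_count(chunks, size=WORDS_PER_CHUNK, overlap_percent=OVERLAP_PERCENT):
    """Number of distinct source words covered by the chunks."""
    if not chunks:
        return 0
    step = size - size * overlap_percent // 100
    return (len(chunks) - 1) * step + len(chunks[-1])

# A 600-word item yields three chunks of 250, 250, and 150 words.
short_item = [f"w{i}" for i in range(600)]
short_chunks = chunk_words(short_item)
print([len(chunk) for chunk in short_chunks])

# A 300,000-word item hits the 1000-chunk limit before its text is exhausted.
long_item = ["word"] * 300_000
long_chunks = chunk_words(long_item)
print(len(long_chunks), embedded_word_count(long_chunks))
```

Under these assumptions, 1000 chunks cover 225,025 words with full overlap, or 250,000 words with no overlap. Text beyond that point in a single item is never embedded, which is one more reason to prefer shorter, focused documents.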