Smart Snippets deployment overview

To provide relevant snippets of content in a search results list, Coveo Machine Learning (Coveo ML) Smart Snippets models require the content they use to be formatted in a certain way. This article provides best practices for properly scoping, formatting, and testing the content you want the model to use.

Step 1: Scope the content

To optimize the output of a Coveo ML Smart Snippet model, we strongly recommend that you first identify the content that the model must use. This will help you better understand how to configure the model.

Item language value

Coveo Smart Snippet models return snippets of content only for items whose language field value is English.

HTML content

Coveo Smart Snippet models only return snippets of content for items that contain content in HTML format. Therefore, you should ensure that the content that you want to use for creating the model contains HTML elements.

Tip
Leading practice

You can use the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console to verify the items that are available for the model.

For example, you can select the My Source source and the HTML File file type to find the items from the My Source source that the model can use as input.

content browser with field selections to target html items in a given source
Important

When checking the items that could be usable by the model in the Content Browser (platform-ca | platform-eu | platform-au), the number of items that match your requirements may differ from the number of items you’ll see when inspecting the Item count section of your model’s model building statistics.

During the build process, the model can ignore some items because they either contain invalid HTML, or no snippets could be extracted from the parsed HTML.

permanentid field

The items from which you want the model to extract snippets must use the value of the permanentid field as their unique identifier.

Tip
Leading practice

You can use the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console and check an item’s properties to verify whether an item uses the permanentid as its unique identifier.

Document types

When you inspect the items that the model can use (that is, items containing the required HTML tags in a given source), you may notice that many of these items are available, but not all of them are relevant for the model.

When configuring a Coveo ML Smart Snippet model, you can optionally target certain document types that must be used by the model. This allows you to further narrow the content that the model will use as an input.

Example

The content you want the model to extract resides in a Salesforce source. This content is available in items whose documenttype has the Knowledge value. Therefore, you select Knowledge in the Document type dropdown menu so that the model uses only items whose documenttype field value is Knowledge.

In the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console, you can verify the items that would be available for the model by scoping the items that have the Knowledge value for the documenttype field. For example, you can use a field query to scope the documents as follows:

content browser with field selections to target knowledge items in a given source

Content fields

When you inspect the items that the model can use (that is, items of a specific document type containing the required HTML tags in a given source), you may notice that these items contain multiple fields that embed HTML content. The content of some of these fields may not be relevant and you may not want the model to use it.

When configuring a Coveo ML Smart Snippet model, you can optionally target certain fields to be used by the model. This allows you to further narrow the content that the model will use as an input. If you don’t mention specific fields, the model will use the value of the item body field by default.

Example

The content you want the model to extract resides in a Salesforce source. The relevant information, formatted in HTML, is located in a custom field named sf_case_details_c. Therefore, you select the sf_case_details_c field in the Field(s) containing HTML content dropdown menu so that the model uses only the content that appears in the sf_case_details_c field when extracting snippets.

In the Content Browser (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console, you can verify the items that contain the sf_case_details_c field. For example, you can use a field query to scope items of the My Salesforce Source source whose documenttype value is Knowledge and that contain the sf_case_details_c field as follows:

content browser with field selections to target specific items in a given source
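
The field selections described in this step can also be written as a query expression. The following Python sketch assembles such an expression. The build_scope_query helper is hypothetical, and the syntax it emits (@field==value for an exact match, a bare @field to require that the field exists) is an assumption based on the Coveo query syntax, so verify the resulting expression in the Content Browser before relying on it.

```python
def quote(value):
    """Wrap a value in double quotes when it contains whitespace."""
    return f'"{value}"' if any(ch.isspace() for ch in value) else value


def build_scope_query(source, document_type=None, required_field=None):
    """Build a query expression that scopes candidate items (hypothetical helper)."""
    parts = [f"@source=={quote(source)}"]
    if document_type:
        # Keep only the items of the targeted document type.
        parts.append(f"@documenttype=={quote(document_type)}")
    if required_field:
        # A bare field name is assumed to match items in which the field exists.
        parts.append(f"@{required_field}")
    return " ".join(parts)


print(build_scope_query("My Salesforce Source", "Knowledge", "sf_case_details_c"))
# prints: @source=="My Salesforce Source" @documenttype==Knowledge @sf_case_details_c
```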

Step 2: Optimize the content

Now that you’ve targeted the content that must be used by the model, you must ensure that this content is properly configured.

Coveo ML Smart Snippets establish correlations between the headers appearing in result items and user queries. Therefore, your content must be configured accordingly.

For optimal results, we recommend that you use Google structured data in JSON-LD format in the <head> of the HTML items that must be used by the model to extract snippets.

The following code sample shows a simple HTML markup that contains JSON-LD formatted content within the <head>:

<html>
  <head>
    <title>Example Site - Frequently Asked Questions(FAQ)</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "What is Smart Snippets?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "<p>Coveo Smart Snippet models provide users with answers to their queries directly on the results page by displaying a snippet of the most relevant result item. This allows users to quickly find answers without having to open links from the results page.</p>"
        }
      }]
    }
    </script>
  </head>
  <body>
  </body>
</html>
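
When many pages need this markup, generating the JSON-LD from existing question and answer pairs avoids hand-editing mistakes such as unbalanced brackets. The following Python sketch shows one way to do it. The build_faq_jsonld helper and its input format are hypothetical; only the FAQPage structure comes from the sample above.

```python
import json


def build_faq_jsonld(pairs):
    """Return a <script> element holding FAQPage structured data.

    pairs: list of (question, answer_html) tuples (hypothetical input format).
    """
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer_html},
            }
            for question, answer_html in pairs
        ],
    }
    # Escape "</" so that answer HTML can't close the <script> element early.
    # "\/" is a valid JSON escape, so the payload still parses to the same text.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(build_faq_jsonld([("What is Smart Snippets?", "<p>An answer.</p>")]))
```

The returned string is meant to be placed in the <head> of the item, as in the sample above.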

However, Coveo ML Smart Snippets will also work with raw HTML. When using this approach, note that Coveo ML Smart Snippet models don’t exclude navigation menus, featured content, or any other peripheral content in an item. To help the algorithm identify this type of content, you should specify the CSS selectors to exclude.

Notes
  • If your web page doesn’t contain Google structured data, and the questions contained on the web page aren’t formatted using HTML headers (<h> tags), you can use the pre-conversion IPE extension script to specify CSS selectors to identify questions and answers in an HTML item.

  • If the content you want the model to use resides in specific fields, and this content isn’t properly configured to be optimally used by the model (for example, the item doesn’t contain JSON-LD or well-formatted HTML), you can use the post-conversion IPE extension script to specify fields whose content will be identified as questions and answers, and converted to JSON-LD format.

HTML supported tags

In an item, a Smart Snippet model takes the content that appears within the following tags into consideration when they’re attached to the last header (<h> tag) in a header stack:

  • <br>

  • <ol>

    Note

    Smart Snippet models support the start attribute in <ol> tags.

  • <ul>

  • <li>

  • <p>

  • <b>

  • <i>

  • <em>

  • <span>

  • <a>

    Notes
    • For a snippet to correctly render the hyperlink, the URL value used in the href attribute must be absolute and use a secure scheme (that is, https://).

    • When a user clicks an inline link displayed in a snippet, the openSmartSnippetInlineLink click event is logged. This can be useful to report on the usage of links displayed in snippets.
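
The link requirement above (absolute URL with the https scheme) can be checked before indexing. The following Python sketch lists the href values in an HTML fragment that don't meet it. The find_invalid_links helper is hypothetical and uses only the Python standard library; it doesn't reproduce how Coveo renders snippets.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkChecker(HTMLParser):
    """Collect href values that aren't absolute https:// URLs."""

    def __init__(self):
        super().__init__()
        self.invalid_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return  # anchors without an href aren't links
        parsed = urlparse(href)
        # An absolute secure URL has the https scheme and a host name.
        if parsed.scheme != "https" or not parsed.netloc:
            self.invalid_links.append(href)


def find_invalid_links(html):
    """Return the href values that a snippet might not render as links."""
    checker = LinkChecker()
    checker.feed(html)
    checker.close()
    return checker.invalid_links


sample = '<p><a href="https://example.com/sync">ok</a> <a href="/sync">relative</a></p>'
print(find_invalid_links(sample))
```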

Example

Consider a page that is configured as follows:

<body>
<h1>FAQ</h1>
    <h2>Synchronizing Speedbit Watches</h2>
        <p>The procedure differs depending on the device with which you want to synchronize your watch.</p>
            <h3>Synchronizing a Speedbit Watch With a Smartphone</h3>
                <p>Procedure to synchronize your Speedbit watch with your smartphone.</p>
            <h3>Synchronizing a Speedbit Watch With a Computer</h3>
                <p>Procedure to synchronize your Speedbit watch with your computer.</p>
</body>

The model would process the page as follows:

[
  {"headers":  ["FAQ", "Synchronizing Speedbit Watches", "Synchronizing a Speedbit Watch With a Smartphone"], "excerpt": "<p>Procedure to synchronize your Speedbit watch with your smartphone.</p>"},
  {"headers":  ["FAQ", "Synchronizing Speedbit Watches", "Synchronizing a Speedbit Watch With a Computer"], "excerpt": "<p>Procedure to synchronize your Speedbit watch with your computer.</p>"}
]
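
To preview how a page might be split, the header stack logic shown in this example can be approximated in a few lines of Python. This is only a sketch that reproduces the example above using the standard library html.parser module; it isn't Coveo's parser, and the HeaderStackParser class and extract_snippets function are hypothetical names. It keeps content only for headers that aren't followed by a deeper header, and only text nested in the supported tags.

```python
from html.parser import HTMLParser

HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
CONTENT_TAGS = {"ol", "ul", "li", "p", "b", "i", "em", "span", "a"}


class HeaderStackParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []           # (level, text) of the headers currently in scope
        self.sections = []        # one entry per header encountered
        self.current = None       # section that receives content
        self.header_level = None  # set while reading a header's text
        self.header_text = []
        self.depth = 0            # nesting depth inside supported content tags

    def handle_starttag(self, tag, attrs):
        if tag in HEADER_TAGS:
            self.header_level, self.header_text = int(tag[1]), []
        elif self.current is None or self.header_level is not None:
            return
        elif tag == "br":
            self.current["excerpt"] += "<br>"
        elif tag in CONTENT_TAGS:
            self.current["excerpt"] += self.get_starttag_text()
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in HEADER_TAGS and self.header_level is not None:
            self.open_section(self.header_level, "".join(self.header_text).strip())
            self.header_level = None
        elif (self.header_level is None and self.current is not None
              and tag in CONTENT_TAGS and self.depth > 0):
            self.current["excerpt"] += f"</{tag}>"
            self.depth -= 1

    def handle_data(self, data):
        if self.header_level is not None:
            self.header_text.append(data)
        elif self.current is not None and self.depth > 0:
            self.current["excerpt"] += data

    def open_section(self, level, text):
        # A deeper header means the previous one isn't the last of its stack.
        if self.current is not None and level > self.stack[-1][0]:
            self.current["last"] = False
        while self.stack and self.stack[-1][0] >= level:
            self.stack.pop()
        self.stack.append((level, text))
        self.current = {"headers": [t for _, t in self.stack], "excerpt": "", "last": True}
        self.sections.append(self.current)
        self.depth = 0


def extract_snippets(html):
    """Return header stacks with the content attached to their last header."""
    parser = HeaderStackParser()
    parser.feed(html)
    parser.close()
    return [{"headers": s["headers"], "excerpt": s["excerpt"]}
            for s in parser.sections if s["last"] and s["excerpt"]]
```

Calling extract_snippets on the page above yields the two entries listed in the example. The paragraph under the h2 header is dropped because two h3 headers follow it.
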
Important
  • Coveo ML Smart Snippet models ignore the content that appears within the following tags:

    • <script>

    • <style>

    • <form>

    • <table>

    • <img>

    • <input>

  • By default, when a Coveo ML Smart Snippet model finds identical headers in both the JSON-LD and the HTML content, only those found in the JSON-LD are retained. This behavior can be changed by using the parsingMode advanced model JSON parameter.

Step 3: Create the model

Now that you have scoped the content that must be used by the model and your content is properly formatted, you can create your Coveo ML Smart Snippet model and configure it as desired.

See Create a Smart Snippet model for instructions on how to create a Coveo ML Smart Snippet model.

Review the model build information

Now that your model is created and is Active, you can verify whether the model is able to provide snippets of content.

The Get detailed information about a specific model call of the Machine Learning Models API allows you to obtain detailed information about your Coveo ML Smart Snippet model, such as:

  • The number of items it can use to extract snippets of content.

  • The number of HTML headers it can target to find related content.

  • The average length of the snippets it extracted.

  • The total number of snippets that the model can provide.

When performing this API call, you should receive a response that contains a modelBuildingStats object as follows:

"modelBuildingStats": {
    "documentCount": 2688,
    "headerCount": 11239,
    "meanSnippetLength": 104.94903911094738,
    "snippetCount": 16287
}
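
A few ratios derived from these statistics make it easier to judge whether the build used the content you scoped. The following Python sketch computes them from a modelBuildingStats object. The summarize_build_stats helper and the scoped_item_count parameter (the number of matching items you counted in the Content Browser) are hypothetical, and the value 3000 in the usage line is made up for illustration.

```python
def summarize_build_stats(stats, scoped_item_count=None):
    """Derive simple ratios from a modelBuildingStats object (hypothetical helper)."""
    documents = stats["documentCount"]
    summary = {"snippetsPerDocument": 0.0, "headersPerDocument": 0.0}
    if documents:
        summary["snippetsPerDocument"] = round(stats["snippetCount"] / documents, 2)
        summary["headersPerDocument"] = round(stats["headerCount"] / documents, 2)
    if scoped_item_count:
        # Share of the scoped items that the build actually kept.
        summary["retainedItemRatio"] = round(documents / scoped_item_count, 2)
    return summary


stats = {
    "documentCount": 2688,
    "headerCount": 11239,
    "meanSnippetLength": 104.94903911094738,
    "snippetCount": 16287,
}
print(summarize_build_stats(stats, scoped_item_count=3000))
```

A retained ratio well below 1 suggests that many scoped items were ignored during the build, for example because they contain invalid HTML or no extractable snippets.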

Step 4: Associate the model with the desired query pipeline

Now that your model is created, you must associate the model with the query pipeline to which the traffic of the desired search interface is directed.

See Associate a Smart Snippet model with a query pipeline for instructions on how to associate your model with a query pipeline.

Step 5: Configure the search interface

Now that your model is configured, and associated with the query pipeline to which the traffic of the desired search interface is directed, you must configure the search interface to include the components that will allow the model to render its output.

Step 6: Test the model

Now that your model is configured and your search interface is set up to display the model’s output, you can test the model to ensure that everything works as expected.

You can test the model on a search interface that contains the required components, and for which the traffic is directed to the query pipeline that you associated with your Coveo ML Smart Snippet model.

Perform a query that would likely trigger a snippet to appear in the search results.

Example

You scope an item from which your model extracts snippets. The item contains a header that reads When exactly is a model retrained?.

When you perform the query When is a model retrained, you see the following:

Example of a smart snippet model in action | Coveo

You can also inspect the request to the Search API to get detailed information about the model’s output for a given query.

  1. Access a search interface that contains the required components, and for which the traffic is directed to the query pipeline that you associated with your Coveo ML Smart Snippet model.

  2. Access your browser’s network monitoring tool.

  3. In the search box, perform a query that would likely trigger a snippet to appear in the search results.

  4. In the network monitoring tool, under the Name column, select the latest request to the Search API. The request path should contain /rest/search/v2.

  5. Select the Preview tab.

  6. Find and expand the questionAnswer property. It provides detailed information about the model’s output for this specific query. This property won’t appear if the model can’t provide snippets for the current query.

    Search API response for a smart snippet model | Coveo
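
The same inspection can be scripted instead of using the browser. The following Python sketch builds a request to the /rest/search/v2 path and reads the questionAnswer property from the response. Several details are assumptions rather than facts from this article: the host name, the bearer token authentication, the JSON body with a q parameter, and the function names. The request must also be routed to the query pipeline associated with your model for the property to appear.

```python
import json
import urllib.request

# Assumption: default platform host; your organization's region may use another one.
SEARCH_ENDPOINT = "https://platform.cloud.coveo.com/rest/search/v2"


def build_search_request(query, access_token):
    """Build (without sending) a POST request carrying the query."""
    body = json.dumps({"q": query}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_question_answer(query, access_token):
    """Send the query and return the questionAnswer property, or None if absent."""
    request = build_search_request(query, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("questionAnswer")
```

For example, fetch_question_answer("When is a model retrained", token) returns None when the model can't provide snippets for that query, where token is a placeholder for a valid access token.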

Access your browser’s network monitoring tool

  1. Open your web browser’s developer tools.

    Note

    The examples in this article use the Google Chrome developer tools. For browser-specific information, see your browser’s documentation.

  2. Select the Network tab.

Search API response reference

This section provides reference information about the response provided by the Search API for the questionAnswer property.

The following snippet displays a Search API response when snippets are found. The response is divided into two sections: one listing information about the main snippet, and another listing information about the related questions:

{
 "questionAnswer": {
   "answerFound": "boolean",
   "question": "string",
   "answerSnippet": "string",
   "documentId": {
     "contentIdKey": "string",
     "contentIdValue": "string"
   },
   "score": "int32",
   "relatedQuestions": [
     {
       "question": "string",
       "answerSnippet": "string",
       "documentId": {
         "contentIdKey": "string",
         "contentIdValue": "string"
       },
       "score": "int32"
     }
   ]
 }
}

answerFound (boolean)

Whether snippets were found for the query.

This property evaluates to true even if only a main snippet, or only related questions, were found for the query.

Main snippet

question (string)

The text representing the header attached to the content of the main snippet.

answerSnippet (string)

The text corresponding to the main snippet.

documentId (object)

The contentIdKey and contentIdValue of the item containing the main snippet.

contentIdKey (string)

The content identifier key. Typically, permanentid or urihash.

contentIdValue (string)

The content identifier value.

score (integer [int32])

The relevance score computed for the main snippet based on the query.

relatedQuestions (array of objects)

An array in which each object represents the information available for a related question listed for the query.

question (string)

The header attached to the content of the related question.

answerSnippet (string)

The text corresponding to the answer of the corresponding related question.

documentId (object)

The contentIdKey and contentIdValue of the item containing the related question.

contentIdKey (string)

The content identifier key. Typically, permanentid or urihash.

contentIdValue (string)

The content identifier value.

score (integer [int32])

The relevance score computed for the related question based on the query.
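
As an illustration of how these properties fit together, the following Python sketch separates the main snippet from the related questions in a parsed Search API response. The summarize_question_answer helper is hypothetical, and it assumes that answerSnippet is empty when only related questions were found, which this reference doesn't state explicitly.

```python
def summarize_question_answer(response):
    """Return the main snippet and related questions from a parsed response.

    Returns None when the questionAnswer property is missing or answerFound is false.
    """
    qa = response.get("questionAnswer")
    if not qa or not qa.get("answerFound"):
        return None

    def entry(node):
        doc_id = node.get("documentId", {})
        return {
            "question": node.get("question"),
            "snippet": node.get("answerSnippet"),
            "item": (doc_id.get("contentIdKey"), doc_id.get("contentIdValue")),
            "score": node.get("score"),
        }

    # Assumption: answerSnippet is empty when only related questions were found.
    main = entry(qa) if qa.get("answerSnippet") else None
    related = [entry(question) for question in qa.get("relatedQuestions", [])]
    return {"main": main, "related": related}
```

Because answerFound can be true when only one of the two sections is populated, the sketch checks the main snippet and the related questions separately instead of relying on answerFound alone.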