Indexing Commerce Catalog Content

To index your commerce content, you need a Coveo organization. Access to the Coveo Platform lets you create a source, which acts as the bridge to your commerce data.

If you have never used the Cloud Platform before, log in now.

  • Contact your sales representative to enable Coveo for Commerce features in your organization.

  • You can review your organization license limits in the Coveo Platform.

Step 1: Create Your Catalog Source

The recommended way to index a commerce catalog is to stream its data to a Catalog source created within the Coveo Administration Console.

Step 2: Prepare Your Catalog Data

A specific catalog data structure is recommended to optimize the search experience with Coveo. A catalog often combines three types of objects: products, variants, and availabilities.

Every Coveo item is represented by a JSON configuration, which can be inspected in the Administration Console (see Review Item Properties - Item JSON Tab).

Products

Products are searchable items. In a catalog without variants, a product is also a purchasable item. In a catalog with variants, users search for products, and then select a variant to purchase.

Here’s an example of a JSON representation of a product:

 {
   "DocumentId": "product://001-red",
   "FileExtension": ".html",
   "title": "Coveo Soccer Shoe - Red",
   "model": "Authentic",
   "brand": ["Coveo"],
   "description": "<p>The astonishing, the original, and always relevant Coveo style.</p>",
   "color": ["Red"],
   "groupid": "001",
   "productid": "001-red",
   "imagesurl": ["https://myimagegallery?productid"],
   "gender": "Men",
   "price": 28.00,
   "category": "Soccer Shoes",
   "objecttype": "Product"
 }

The above JSON contains generic information about the Coveo Soccer Shoe - Red product, such as its description, image, and price.

The objecttype metadata is important, as it will be used to identify the item as a product in the index.

The productid metadata will be used to establish relationships with variant and availability objects. In your catalog, this metadata may have a different label.

If your catalog doesn’t have variants or availability restrictions, proceed to Step 3: Create Fields.

Variants

Variants are never returned as search results. A variant instead provides additional metadata on a parent product. In a catalog with variants, users search for products, and then select a variant to purchase.

Here’s an example of a possible JSON representation of a variant:

 {
   "DocumentId": "variant://001-red-8_wide",
   "FileExtension": ".html",
   "title": "Coveo Soccer Shoe - Red / Size 8 - Wide",
   "sku": "001-red-8_wide",
   "productsize": "8",
   "width": "wide",
   "productid": "001-red",
   "objecttype": "Variant"
 } 

The above JSON contains information specific to a product for sale (or SKU).

In this example, the Coveo Soccer Shoe product varies in size and width, so a distinct variant would be needed for every possible combination of those.

Observe that the product picture isn’t included in the variant, since in this case, the actual Coveo Soccer Shoe - Red product looks the same regardless of its size and width.

The objecttype metadata is important, as it will be used to identify the item as a variant in the index.

The productid metadata is used to establish a relationship with the parent product. In your catalog this metadata may have a different label.

The sku metadata is the unique identifier used to create a relationship with availability objects. In your catalog this metadata may have a different label. Use values that are standardized throughout your index.

We recommend a simple convention to differentiate these identifiers: use dashes (-) as separators between the groupid, product descriptor(s), and variant descriptor(s), and use underscores (_) in place of spaces within descriptors. For example:

  • groupid: 001, productid: 001-red, sku: 001-red-8_wide

  • groupid: 026, productid: 026-blue_demo, sku: 026-blue_demo-10_slim
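
The naming convention above can be sketched in a few lines of Python. The helper names (slug, build_ids) are hypothetical and only illustrate the dash and underscore rules; adapt them to however your catalog stores its descriptors.

```python
def slug(descriptor: str) -> str:
    """Replace spaces with underscores inside a single descriptor."""
    return "_".join(descriptor.strip().split())


def build_ids(groupid, product_descriptors, variant_descriptors):
    """Join the group ID and descriptors with dashes (hypothetical helper)."""
    productid = "-".join([groupid, *map(slug, product_descriptors)])
    sku = "-".join([productid, *map(slug, variant_descriptors)])
    return {"groupid": groupid, "productid": productid, "sku": sku}


print(build_ids("026", ["blue demo"], ["10 slim"]))
# {'groupid': '026', 'productid': '026-blue_demo', 'sku': '026-blue_demo-10_slim'}
```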

Availabilities

Availabilities determine whether a given user can purchase a given product or variant. An availability can be a store inventory, a price list, or anything that controls which user has access to certain products or variants.

Here’s an example of a possible JSON representation of an availability for a common business-to-consumer (B2C) scenario where a local store contains a finite number of products:

 {
    "DocumentId": "store://s000002",
    "title": "Montreal Store",
    "lat": 45.4975,
    "long": -73.5687,
    "availableskus": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
    "availabilityid": "s000002",
    "objecttype": "Availability"
 }

And here’s another example for a common business-to-business (B2B) scenario where a price list determines who has access to what products:

 {
   "DocumentId": "store://42",
   "title": "Group ID 42",
   "subscription_level": "Gold",
   "availableskus": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
   "availabilityid": "42",
   "objecttype": "Availability"
 }

The objecttype metadata is important, as it’ll be used to identify the item as an availability in the index.

In both scenarios, the availabilityid metadata uniquely identifies each availability channel, while the availableskus metadata defines which variants or products are available through a given channel. In your original catalog, these may have different labels.
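
Because products, variants, and availabilities are linked only through matching identifier values, it may be worth checking those links before indexing. The following is a minimal sketch, assuming the metadata labels used in the examples above (productid, sku, availabilityid, availableskus); the function name find_broken_links is hypothetical.

```python
def find_broken_links(items):
    """Return a list of problems found in the links between catalog objects."""
    products = {i["productid"] for i in items if i.get("objecttype") == "Product"}
    skus = {i["sku"] for i in items if i.get("objecttype") == "Variant"}
    known = products | skus  # an availability may list variants or products
    problems = []
    for item in items:
        kind = item.get("objecttype")
        if kind == "Variant" and item.get("productid") not in products:
            problems.append(
                f"variant {item['sku']} has no parent product {item.get('productid')}"
            )
        elif kind == "Availability":
            for sku in item.get("availableskus", []):
                if sku not in known:
                    problems.append(
                        f"availability {item['availabilityid']} lists unknown sku {sku}"
                    )
    return problems
```

An empty list means every variant points to an existing product and every listed SKU exists in the data being checked.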

When an availability channel contains more than 1000 items and you want to improve the performance of your index, we recommend using the same field name (i.e., availableskus) on both the availability channel and the variant. In both objects, the value must be written as an array.

  • Variant
     {
      "sku": "001-red-8_wide",
      "availableskus": ["001-red-8_wide"]
     }
    
  • Availability channel
     {
      "availableskus": ["001-red-8_wide","001-red-9_wide",...]
     }
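
If your variants don't already carry this array, it can be derived from the sku value before upload. This is a minimal sketch, assuming the sku and availableskus labels from the examples above; mirror_sku_on_variants is a hypothetical helper name.

```python
def mirror_sku_on_variants(items, sku_field="sku", shared_field="availableskus"):
    """Return copies of the items where each variant carries its own SKU in an array."""
    result = []
    for item in items:
        if item.get("objecttype") == "Variant" and sku_field in item:
            # Copy rather than mutate, so the caller's data stays untouched.
            item = {**item, shared_field: [item[sku_field]]}
        result.append(item)
    return result
```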
    

Step 3: Create Your Fields

Coveo organization sources come with a set of standard system fields. However, adding your own fields allows the end user to get additional information in search results and to better target desired content (see Field Uses).

Default fields aren’t available in the field picker of the Administration Console (see Field Origins).

Explore your metadata before you create your fields. When a piece of metadata has the same name as a field, it’s mapped to that field by default, so make sure you name your fields accordingly.
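
One way to explore your metadata is to list the keys that appear on each type of object in your catalog data. The sketch below assumes your items are already loaded as Python dictionaries shaped like the JSON examples in Step 2; metadata_by_objecttype is a hypothetical helper.

```python
from collections import defaultdict


def metadata_by_objecttype(items):
    """Map each objecttype value to the sorted metadata names seen on its items."""
    names = defaultdict(set)
    for item in items:
        # Updating a set with a dict adds the dict's keys.
        names[item.get("objecttype", "unknown")].update(item)
    return {kind: sorted(keys) for kind, keys in names.items()}
```

The resulting names are candidates for field names, since matching names are mapped by default.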

You can create your fields manually through the Administration Console, or programmatically through the Fields API.

Avoid repeating field names that you intend to use as facets on different types of items. For example, if you define the color at the product level, there’s no need to define it at the variant level. If you need to include a field at both levels, prefix it with product and variant (e.g., productcolor and variantcolor).
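
A quick way to spot such repetitions is to look for metadata names that occur on more than one object type. This is a minimal sketch; the ignore list holds names that the examples in this article deliberately share across types and is an assumption to adjust for your own catalog.

```python
SHARED_ON_PURPOSE = frozenset(
    {"DocumentId", "FileExtension", "title", "objecttype", "productid", "availableskus"}
)


def repeated_names(items, ignore=SHARED_ON_PURPOSE):
    """Return metadata names used on more than one objecttype, with those types."""
    seen = {}
    for item in items:
        kind = item.get("objecttype", "unknown")
        for name in item:
            if name not in ignore:
                seen.setdefault(name, set()).add(kind)
    return {name: sorted(kinds) for name, kinds in seen.items() if len(kinds) > 1}
```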

In addition to the fields you create to leverage product metadata such as price, color, and description within your commerce interfaces (search and listing pages, recommendation interfaces, etc.), you must create a set of string-type fields to configure your Coveo commerce catalog (see Add or Edit a Field):

Suggested field name | Field intent | Field settings to enable
"productid" | Uniquely identifies each product | Facet; Use cache for nested queries
"sku" | Uniquely identifies each variant |
"availabilityid" | Uniquely identifies each availability channel |
"availableskus" | Identifies the list of available products/variants in a given availability channel | Multi-value Facet; Use cache for nested queries

When your catalog only contains products (i.e., if products don’t have variants), or if the products in your catalog are offered through a single availability channel (e.g., a single store or product list), you won’t need to configure all of the above fields. Minimally, however, you will always have to configure a field that can uniquely identify products in your catalog.
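
If you create these fields programmatically, the definitions could be prepared as plain data first. The sketch below is an illustration only: the property names (name, type, facet, multiValueFacet, useCacheForNestedQuery) are assumptions about the field model and should be checked against the Fields API reference before use. Settings for sku and availabilityid are left disabled here because none are listed for them above.

```python
import json


def field_definition(name, *, facet=False, multi_value_facet=False, nested_cache=False):
    """Build one string field definition (property names are assumed, not confirmed)."""
    return {
        "name": name,
        "type": "STRING",
        "facet": facet,
        "multiValueFacet": multi_value_facet,
        "useCacheForNestedQuery": nested_cache,
    }


CATALOG_FIELDS = [
    field_definition("productid", facet=True, nested_cache=True),
    field_definition("sku"),
    field_definition("availabilityid"),
    field_definition("availableskus", multi_value_facet=True, nested_cache=True),
]

print(json.dumps(CATALOG_FIELDS, indent=2))
```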

Step 4: Stream Your Catalog Data to Your Source

To send your catalog data to your Catalog source, you must use the Stream API. This process consists of three steps:

  1. Open a stream.
  2. Upload your catalog data into the stream.
  3. Close the stream.

Here are examples of the three API calls to use:

Open a Stream

POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/open
Content-Type: application/json
Accept: application/json
Authorization: Bearer <MyAccessToken>

You will get a response like this one:

{
    "streamId": "1234-5678-9101-1121",
    "uploadUri": "https://s3.amazonaws.com/coveo-nprod-customerdata/[...]",
    "fileId": "b5e8767e-8f0d-4a89-9095-1127915c89c7",
    "requiredHeaders": {
      "x-amz-server-side-encryption": "AES256",
      "Content-Type": "application/octet-stream"
    }
}
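
The uploadUri and requiredHeaders values from this response are what the upload call needs. As a minimal sketch using only the Python standard library, the upload request could be prepared as follows; build_upload_request is a hypothetical helper, and the request is only built here, not sent.

```python
import urllib.request


def build_upload_request(open_response: dict, body: bytes) -> urllib.request.Request:
    """Prepare the PUT request for the upload step from the open-stream response."""
    return urllib.request.Request(
        url=open_response["uploadUri"],
        data=body,
        # Reuse the headers exactly as returned by the open call.
        headers=dict(open_response["requiredHeaders"]),
        method="PUT",
    )

# Sending it would be: urllib.request.urlopen(build_upload_request(response, payload))
```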

Upload Your Catalog Data Into The Stream

Using the uploadUri you received:

PUT {uploadUri}
x-amz-server-side-encryption: AES256
Content-Type: application/octet-stream

The uploadUri is valid for one hour.

When the payload exceeds 256 MB, it must be split into parts of at most 256 MB each.
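
How each part is then sent to the Stream API is not covered here, but the splitting itself can be sketched. The example below assumes that every part is its own {"AddOrUpdate": [...]} JSON document and that 256 MB means 256 × 1024 × 1024 bytes; both are assumptions, and split_into_chunks is a hypothetical helper.

```python
import json

MAX_BYTES = 256 * 1024 * 1024  # assumed interpretation of the 256 MB limit


def split_into_chunks(items, max_bytes=MAX_BYTES):
    """Yield lists of items whose serialized payload stays within max_bytes.

    A single item larger than max_bytes is still yielded on its own.
    """
    wrapper = len(json.dumps({"AddOrUpdate": []}).encode("utf-8"))
    batch, used = [], wrapper
    for item in items:
        # +2 accounts for the ", " separator json.dumps puts between items.
        cost = len(json.dumps(item).encode("utf-8")) + 2
        if batch and used + cost > max_bytes:
            yield batch
            batch, used = [], wrapper
        batch.append(item)
        used += cost
    if batch:
        yield batch
```

The size estimate is a slight overcount (it charges a separator for the first item too), which keeps each part on the safe side of the limit.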

You can now upload your catalog data (JSON file). The following is an example of content payload in the body of the request:

{
    "AddOrUpdate": [
    {
     "DocumentId": "product://001-red",
     "FileExtension": ".html",
     "title": "Coveo Soccer Shoe - Red",
     "model": "Authentic",
     "brand": ["Coveo"],
     "description": "<p>The astonishing, the original, and always relevant Coveo style.</p>",
     "color": ["Red"],
     "groupid": "001",
     "productid": "001-red",
     "imagesurl": ["https://myimagegallery?productid"],
     "gender": "Men",
     "price": 28.00,
     "category": "Soccer Shoes",
     "objecttype": "Product"
   },
   {
     "DocumentId": "variant://001-red-8_wide",
     "FileExtension": ".html",
     "title": "Coveo Soccer Shoe - Red / Size 8 - Wide",
     "sku": "001-red-8_wide",
     "productsize": "8",
     "width": "wide",
     "productid": "001-red",
     "objecttype": "Variant"
   },
   {
      "DocumentId": "store://s000002",
      "title": "Montreal Store",
      "lat": 45.4975,
      "long": -73.5687,
      "availableskus": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
      "availabilityid": "s000002",
      "objecttype": "Availability"
    },
   // ...
    ]
}
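
A request body shaped like the one above can be assembled from separate lists of products, variants, and availabilities. The following is a minimal sketch; build_payload is a hypothetical helper, and the check for DocumentId and objecttype is a precaution chosen for this example rather than a documented validation rule.

```python
import json


def build_payload(products, variants=(), availabilities=()):
    """Serialize catalog objects into an AddOrUpdate request body (UTF-8 bytes)."""
    items = [*products, *variants, *availabilities]
    for item in items:
        missing = [key for key in ("DocumentId", "objecttype") if key not in item]
        if missing:
            raise ValueError(f"item {item.get('DocumentId', '?')} is missing {missing}")
    return json.dumps({"AddOrUpdate": items}, ensure_ascii=False).encode("utf-8")
```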

Close the Stream

POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/{streamId}/close
Authorization: Bearer <MyAccessToken>
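
Putting the three calls above together, the whole sequence might look like the sketch below. The send parameter is a hypothetical callable that you supply (for example, built on urllib or another HTTP client): it takes a method, URL, headers, and body, performs the request, and returns the parsed JSON response as a dictionary. Keeping it injectable makes the sequence easy to test without network access.

```python
BASE = "https://api.cloud.coveo.com/push/v1"


def stream_catalog(org_id, source_id, token, payload, send):
    """Open a stream, upload the payload, close the stream; return the stream ID."""
    root = f"{BASE}/organizations/{org_id}/sources/{source_id}/stream"
    auth = {"Authorization": f"Bearer {token}"}

    # 1. Open a stream.
    opened = send(
        "POST",
        f"{root}/open",
        {**auth, "Content-Type": "application/json", "Accept": "application/json"},
        b"",
    )
    # 2. Upload the catalog data with the headers returned by the open call.
    send("PUT", opened["uploadUri"], dict(opened["requiredHeaders"]), payload)
    # 3. Close the stream.
    send("POST", f"{root}/{opened['streamId']}/close", auth, b"")
    return opened["streamId"]
```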

When you upload a catalog into a source, it completely replaces the previous content of the source. This means that even when you update only some products, you must upload the full content of that source, including the data that hasn’t changed.
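
In practice, this means merging your changes into a complete copy of the catalog before each upload. This is a minimal sketch, assuming you keep the full item list on your side and that DocumentId uniquely identifies each item; merge_catalog is a hypothetical helper.

```python
def merge_catalog(full_items, changed_items):
    """Return the full catalog with changed or new items applied, keyed by DocumentId."""
    by_id = {item["DocumentId"]: item for item in full_items}
    for item in changed_items:
        by_id[item["DocumentId"]] = item  # replaces an existing item or adds a new one
    return list(by_id.values())
```

The merged list, not just the changed items, is what goes into the upload.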

Step 5: Review and Inspect Your Indexed Items

The Content Browser is a basic Coveo Platform search interface that helps you navigate and inspect the content of your organization’s sources.

For instructions on accessing the Content Browser and making use of its many features, see Inspect Items With the Content Browser.

Step 6: Define Your Coveo Commerce Catalog

See Creating a Coveo Commerce Catalog.

Required Privileges

The following table indicates the privileges required to view or edit elements of the Catalogs page and associated panels (see Manage Privileges and Privilege Reference). Note that the Commerce domain is only available in Coveo Cloud commerce organizations.

Action | Service - Domain | Required access level
View catalogs | Commerce - Catalogs | View
Edit catalogs | Commerce - Catalogs | Edit

Step 7: (Optional) Create a Demo Search Page

You have successfully used the Content Browser to filter and view your indexed content. Now create a real, customizable Coveo demo search page in the Cloud Platform (see Manage Hosted Search Pages).

With a demo search page you will get an idea of what you can accomplish using the Coveo JavaScript Search Framework in the next step of the solution implementation.

Indexing Alternatives

Coveo provides many out-of-the-box connectors designed to access and index commerce catalog content. Connectors may be system-specific or generic.

The following list summarizes other connection options for commerce content. See a given connector’s documentation for more details regarding features, content security type support, and instructions on how to create a source.

  • The Push API is another solution for commerce indexing, since it gives you full flexibility on what content to index and when. A new or updated product is searchable in a few minutes, without having to wait for a refresh schedule. You can push content from any system, including, but not limited to, a commerce platform, a product information management (PIM) system, or a static database.

  • Use the Database Connector if you prefer to index the underlying database of your commerce system or PIM system directly. The Database Connector allows for incremental refreshes, which can run every few minutes. It also uses the Coveo On-Premises Crawling Module, which can be installed behind your firewall to avoid having to create firewall rules for Coveo Cloud.

  • Use the Generic REST API to get content from a remote repository exposing its data through a REST API. The Generic REST API source runs on a schedule, so expect some delay between the time content is added or updated and the time it becomes available in search.

  • Use the Sitemap Connector for simple catalogs where all product data is available online and properly discoverable through a Sitemap file or index file. The Sitemap source runs on a schedule, so expect some delay between the time content is added or updated and the time it becomes available in search.

  • Use the Website Connector for simple catalogs where all product data is available online. The Web source runs on a schedule, so expect some delay between the time content is added or updated and the time it becomes available in search.

What’s Next?

Proceed to Integrating a Search Interface into Your Commerce Solution or Website.
