Stream your catalog data to your source
This is for: Developer
To send your catalog data to your Catalog source, you must use the Stream API.
You’ll likely use the Stream API in two different stages of your Coveo for Commerce implementation:
- To push a sample of your catalog data to your source for the first time. This lets you test your catalog data structure, handle field mappings, and inspect content and properties to ensure products are indexed as expected.
- To push your entire catalog data to your source. This is done once you’ve created and configured your catalog entity, making its data available for Coveo Machine Learning models.
Any other update to your catalog data should be done using a full item update or a partial item update. See How to update your catalog data for more information.
WARNING
When your Catalog source is used in a catalog configuration, currently indexed items that aren’t in the catalog data (JSON file) are automatically removed from the Catalog source. To prevent the accidental deletion of a substantial number of items, the delete operation is skipped during the stream (rebuild) process if it would delete all existing items. If you want to delete indexed items, perform a full item update instead.
When your Catalog source isn’t used in a catalog configuration and you open and close a stream with an empty JSON file, all content from your source is deleted.
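Streaming your catalog data to your source via the Stream API involves the following steps:
- Step 1: Open a stream
- Step 2: Upload your catalog data into the stream
- Step 3: Close the stream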
When your catalog data requires an update to a subset of products, see How to update your catalog data.
If you use Java in your project, it’s recommended to interact with the Stream API via the Coveo Push API client library for Java, as it can greatly simplify your implementation. Stream API operations are also available in a C# Platform SDK.
Stream prerequisites
This section outlines the setup required before you can start uploading data to your Catalog source using the Stream API.
Create your Catalog source
The first step is to create a Catalog source that will hold all the products that you want to index.
Once you’ve created your Catalog source, you can push your products to it.
Catalog data structure and configuration setup
A specific catalog data structure is required to optimize the search experience with Coveo. Your catalog data structure can vary depending on your use case, but it’s often a combination of three types of objects: products, variants, and availabilities. This structure is then used to create a catalog configuration in the Coveo Administration Console.
A catalog data structure consists of a JSON file that contains information about your products, variants, and availabilities. For instructions on how to configure items for the different catalog object types, see:
The JSON file must contain an object for each item (product, variant, or availability) that you want to index.
For example, the following catalog data is structured in JSON and has different objects to identify products, variants, and availabilities:
{
"AddOrUpdate": [
{
"documentId": "product://001-red",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red",
"model": "Authentic",
"ec_brand": ["Coveo"],
"ec_description": "<p>The astonishing, the original, and always relevant Coveo style.</p>",
"color": ["Red"],
"ec_item_group_id": "001",
"ec_product_id": "001-red",
"ec_images": ["https://myimagegallery?productid"],
"gender": "Men",
"ec_price": 28.00,
"ec_category": "Soccer Shoes",
"objecttype": "Product"
},
{
"documentId": "variant://001-red-8_wide",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red / Size 8 - Wide",
"ec_variant_id": "001-red-8_wide",
"productsize": "8",
"width": "wide",
"ec_product_id": "001-red",
"objecttype": "Variant"
},
{
"documentId": "store://s000002",
"title": "Montreal Store",
"lat": 45.4975,
"long": -73.5687,
"ec_available_items": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
"ec_availability_id": "s000002",
"objecttype": "Availability"
}
]
}
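Before uploading a file like the one above, it can help to run a quick local consistency check. The following is a minimal sketch, not part of the Stream API: the function name `find_catalog_problems` is hypothetical, and it assumes that every item carries its own `documentId` and that each variant's `ec_product_id` should match a product in the same payload, as in the sample. If your products and variants are split across several files, the second check would need to look at all of them.

```python
from collections import Counter


def find_catalog_problems(payload: dict) -> list[str]:
    """Return human-readable problems found in an AddOrUpdate payload."""
    items = payload.get("AddOrUpdate", [])
    problems = []

    # Assumption: each item is expected to have its own documentId.
    counts = Counter(item.get("documentId") for item in items)
    for document_id, count in counts.items():
        if document_id is None:
            problems.append(f"{count} item(s) have no documentId")
        elif count > 1:
            problems.append(f"duplicate documentId: {document_id}")

    # Assumption: a variant points to its product through ec_product_id.
    product_ids = {
        item.get("ec_product_id")
        for item in items
        if item.get("objecttype") == "Product"
    }
    for item in items:
        if item.get("objecttype") != "Variant":
            continue
        if item.get("ec_product_id") not in product_ids:
            problems.append(
                f"variant {item.get('documentId')} references "
                f"unknown product {item.get('ec_product_id')}"
            )
    return problems
```

An empty list means the two checks passed; it doesn't guarantee that the items will be indexed as expected.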
Limits
The Stream API enforces certain limits on request size and frequency.
These limits differ depending on whether the organization to which data is pushed is a production or non-production organization.
The following table indicates the Stream API limits depending on your organization type:
Organization type | Maximum API requests per day | Burst limit (requests per 5 minutes) | Maximum upload requests per day | Maximum file size | Maximum item size[1] | Maximum items per source[2] |
---|---|---|---|---|---|---|
Production | 15,000 | 250 | 96 | 256 MB | 3 MB | 1,000,000 |
Non-production | 10,000 | 150 | 96 | 256 MB | 3 MB | 1,000,000 |
These limits could change at any time without prior notice. If you need to modify these limits, contact your Coveo representative.
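To stay under the burst limit, a client can track its own request times. The sketch below is one possible client-side guard, assuming a sliding five-minute window; how Coveo actually measures the interval isn't stated here, so treat `BurstLimiter` (a hypothetical name) as a conservative approximation rather than a mirror of the server's behavior.

```python
import time
from collections import deque


class BurstLimiter:
    """Client-side guard for a 'max_requests per window_seconds' limit."""

    def __init__(self, max_requests=250, window_seconds=300.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before another request fits in the window."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def record(self) -> None:
        """Call right after sending a request."""
        self._sent.append(self._clock())
```

A caller would sleep for `wait_time()` seconds before each request and call `record()` after sending it. For a non-production organization, `max_requests` would be 150 according to the table above.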
Catalog data file exceeds 256 MB
The Stream API limits the size of each JSON file to 256 MB.
When your catalog data exceeds this size, you must divide it into smaller JSON files of at most 256 MB each and upload them separately (see Uploading large catalog data files).
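The split can be done while serializing the items. Here is a minimal sketch, assuming the items are already available as Python dictionaries and that each file uses the `AddOrUpdate` wrapper shown earlier. The name `split_items` is hypothetical, and the default limit interprets 256 MB conservatively as 256,000,000 bytes, which is an assumption.

```python
import json

# Assumption: 256 MB read conservatively as decimal megabytes.
MAX_FILE_BYTES = 256 * 1000 * 1000


def split_items(items, max_bytes=MAX_FILE_BYTES):
    """Yield JSON-encoded AddOrUpdate payloads of at most max_bytes each.

    An item that is larger than max_bytes on its own is still emitted,
    alone in its payload, so the caller should check for that case.
    """
    wrapper_bytes = len(json.dumps({"AddOrUpdate": []}).encode("utf-8"))
    batch, size = [], wrapper_bytes
    for item in items:
        # +2 covers the ", " separator json.dumps puts between list entries.
        item_bytes = len(json.dumps(item).encode("utf-8")) + 2
        if batch and size + item_bytes > max_bytes:
            yield json.dumps({"AddOrUpdate": batch}).encode("utf-8")
            batch, size = [], wrapper_bytes
        batch.append(item)
        size += item_bytes
    if batch:
        yield json.dumps({"AddOrUpdate": batch}).encode("utf-8")
```

Each yielded payload is a complete JSON document, so it can be uploaded as one file of the stream.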
Stream API error codes
If a request to the Stream API fails because one of the limits has been exceeded, the API will trigger one of the following response status codes:
Status code | Triggered when |
---|---|
 | The total Stream API request size exceeds 256 MB when pushing a large file container. See Catalog data file exceeds 256 MB. |
 | The total number of Stream API (upload and update) requests exceeds 15,000 per day (10,000 for non-production organizations). The quota is reset at midnight UTC. |
 | The total number of Stream API upload requests exceeds 96 per day (4 per hour). The quota is reset at midnight UTC. |
 | The total number of Stream API requests exceeds 250 (150 for non-production organizations) in an interval of 5 minutes. |
 | Coveo declined your request due to a reduced indexing capacity. |
Step 1: Open a stream
The first step to index your catalog data is to open a stream using the Stream API.
To do so, send the following POST request:
POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/open HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Bearer <MY_ACCESS_TOKEN>
Where you replace:
- {organizationId} with the unique identifier of your organization (see Find your organization ID).
- {sourceId} with the unique identifier of the source to which you want to push content.
- <MY_ACCESS_TOKEN> with an access token, such as an API key that has the required privileges to push content to the source.
If your request is successful, you’ll get the HTTP response code 200 and a response body that looks like this:
{
"streamId": "1234-5678-9101-1121",
"uploadUri": "https://coveo-nprod-customerdata.s3.amazonaws.com/[...]",
"fileId": "b5e8767e-8f0d-4a89-9095-1127915c89c7",
"requiredHeaders": {
"x-amz-server-side-encryption": "AES256",
"Content-Type": "application/octet-stream"
}
}
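As an illustration, the open request can be built with Python's standard library. This is a sketch under assumptions: the helper names `build_open_stream_request` and `open_stream` are hypothetical, and error handling (including the status codes listed earlier) is left out.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.cloud.coveo.com/push/v1"


def build_open_stream_request(organization_id, source_id, access_token):
    """Build (but don't send) the POST request that opens a stream."""
    url = (
        f"{API_BASE}/organizations/{quote(organization_id, safe='')}"
        f"/sources/{quote(source_id, safe='')}/stream/open"
    )
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": f"Bearer {access_token}",
    }
    return urllib.request.Request(url, method="POST", headers=headers)


def open_stream(organization_id, source_id, access_token):
    """Send the request and return the parsed JSON response body."""
    request = build_open_stream_request(organization_id, source_id, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The returned dictionary would hold the `streamId`, `uploadUri`, and `requiredHeaders` values needed in the next steps.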
Step 2: Upload your catalog data into the stream
To upload your catalog data into the stream, you must attach your JSON file to the following Stream API PUT request:
PUT {uploadUri} HTTP/1.1
x-amz-server-side-encryption: AES256
Content-Type: application/octet-stream
Where you replace {uploadUri} with the uploadUri you received when you opened the stream in step 1.
You can now upload your catalog data (JSON file). See Catalog data structure for an example of a catalog data file.
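The upload itself could look like the following sketch, which reuses the `uploadUri` and `requiredHeaders` values from the response to the open request. The helper names are hypothetical, and the whole file is read into memory for simplicity, which may not suit files close to the size limit.

```python
import urllib.request


def build_upload_request(upload_uri, required_headers, body: bytes):
    """Build (but don't send) the PUT request that uploads one JSON file."""
    return urllib.request.Request(
        upload_uri,
        data=body,
        method="PUT",
        headers=dict(required_headers),
    )


def upload_file(upload_uri, required_headers, path):
    """Read the JSON file at 'path' and upload it; return the status code."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = build_upload_request(upload_uri, required_headers, body)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Passing `requiredHeaders` through unchanged avoids hard-coding the header values shown in the sample request.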
Uploading large catalog data files
If your catalog data file (JSON file) exceeds 256 MB, you must upload multiple JSON files.
When you initially open the stream, you receive an uploadUri.
This URI is used to upload your first catalog data file (JSON file).
After uploading the first file, make a POST request to the following endpoint to get a new uploadUri:
POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/{streamId}/chunk HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Bearer <MY_ACCESS_TOKEN>
This request returns a new uploadUri that you can use to upload the second catalog data file.
Next, make a PUT request to the new uploadUri:
PUT {uploadUri} HTTP/1.1
x-amz-server-side-encryption: AES256
Content-Type: application/octet-stream
If your request to upload the JSON data is successful, you’ll receive an HTTP response code of 200.
If you have more catalog data files to upload, repeat this process until all catalog data has been uploaded.
For each file, first obtain a new uploadUri, then upload the file.
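The repeat-until-done process above can be sketched as a small loop. The functions `chunk_url` and `upload_files` are hypothetical names, and the sketch assumes that the response to the chunk request has the same `uploadUri` and `requiredHeaders` keys as the response to the open request. The two network operations are passed in as callables so that the loop stays independent of any HTTP library.

```python
def chunk_url(organization_id, source_id, stream_id) -> str:
    """URL of the endpoint that returns a new uploadUri for the stream."""
    return (
        "https://api.cloud.coveo.com/push/v1"
        f"/organizations/{organization_id}/sources/{source_id}"
        f"/stream/{stream_id}/chunk"
    )


def upload_files(files, first_target, request_chunk, put_file) -> int:
    """Upload each file body in order and return how many were sent.

    first_target  -- dict from the open request (uploadUri, requiredHeaders)
    request_chunk -- callable that POSTs to chunk_url(...) and returns a
                     dict of the same shape (assumed)
    put_file      -- callable(upload_uri, headers, body) that does the PUT
    """
    target = first_target
    sent = 0
    for index, body in enumerate(files):
        if index > 0:
            # Every file after the first needs a fresh uploadUri.
            target = request_chunk()
        put_file(target["uploadUri"], target.get("requiredHeaders", {}), body)
        sent += 1
    return sent
```

The stream is closed only after this loop has finished, so all files are committed together.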
Step 3: Close the stream
Once you’ve uploaded all your catalog data, you must close the stream.
To do so, send the following POST request:
POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/{streamId}/close HTTP/1.1
Authorization: Bearer <MY_ACCESS_TOKEN>
Where you replace:
- {organizationId} with the ID of your organization (see Find your organization ID).
- {sourceId} with the unique identifier of the source to which you want to push content.
- {streamId} with the ID of your stream (see step 1).
- <MY_ACCESS_TOKEN> with an access token, such as an API key that has the required privileges to push content to the source.
If the request to close your stream is successful, you’ll get the HTTP response code 200 Created.
The response body contains an orderingId that indicates the time your request was received, as well as the requestId, which is the unique identifier for your request.
200 Created
{
"orderingId": 1716387965000,
"requestId": "498ef728-1dc2-4b01-be5f-e8f8f1154a99"
}
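A close request and a reading of the response could be sketched as follows. The helper names are hypothetical. Interpreting `orderingId` as a Unix timestamp in milliseconds is an assumption based on the sample value and on the statement that it indicates when the request was received.

```python
import urllib.request
from datetime import datetime, timezone


def build_close_stream_request(organization_id, source_id, stream_id, access_token):
    """Build (but don't send) the POST request that closes a stream."""
    url = (
        "https://api.cloud.coveo.com/push/v1"
        f"/organizations/{organization_id}/sources/{source_id}"
        f"/stream/{stream_id}/close"
    )
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, method="POST", headers=headers)


def ordering_id_to_datetime(ordering_id: int) -> datetime:
    """Convert an orderingId to UTC, assuming epoch milliseconds."""
    return datetime.fromtimestamp(ordering_id / 1000, tz=timezone.utc)
```

Under that assumption, the sample `orderingId` of 1716387965000 corresponds to 2024-05-22 14:26:05 UTC, which can help when looking for the matching entries in the indexing logs.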
When your request is successful, the catalog data you uploaded completely replaces the previous content of the source. Expect a 15-minute delay for the removal of the old items from the index.
After you’ve uploaded all your items, check the Log Browser (platform-ca | platform-eu | platform-au) to ensure that your products were streamed successfully. For more information, see Use the Log Browser to review indexing logs.
Required privileges
The following table indicates the privileges required for your organization’s groups to view or edit elements of the Catalogs (platform-ca | platform-eu | platform-au) page and associated panels (see Manage privileges and Privilege reference). The Commerce domain is, however, only available in Coveo commerce organizations.
Action | Service - Domain | Required access level |
---|---|---|
View catalogs | Commerce - Catalogs | View |
Edit catalogs | Content - Fields | View |
Edit catalogs | Commerce - Catalogs | Edit |
Edit catalogs | Search - Execute Query | Allowed |
What’s next?
- Once you’re done streaming your catalog data, we strongly recommend that you inspect your content and properties to ensure that your content was indexed correctly.
- Once your initial catalog data upload is complete, you can make updates to the catalog data by performing a full item update or by making smaller adjustments to information on single products with a partial item update.