Full catalog data updates
To fully update your catalog data in your source (typically a Catalog source), you have to interact with the Coveo Stream API. It supports two types of operations to perform full updates on your catalog data, each suited to specific use cases:
- Update operations: This operation is typically used to update individual items in your source, but it can also be used to update the entire catalog data. It performs full document replacements: any field not included in the update payload for an item is removed from that item in the index. However, items not included in the payload are left unchanged and remain in the source.
- Load operations: Also known as open and close stream, this operation overwrites the entire catalog data in your source with the provided data. This means that if you don't include in the payload an item that was previously indexed, it's automatically removed from the source.
Prerequisites
To perform the operations listed in this article, you must have:
- Created a Catalog source or equivalent stream-enabled source.
- Created and configured a catalog entity that uses the source containing your catalog data.
Leading practices
- Update operations should be favored over load operations to push or update your catalog data in your source. They provide the same benefits as load operations but are more efficient and have fewer limitations.
  Load operations require uploading and processing all file containers at once, which is resource-intensive and delays data availability until the entire load completes. In contrast, update operations process each container as soon as it's ready, allowing for faster indexing and more up-to-date catalog data throughout the update process.
- Any update to your catalog data should be made using either update operations or partial item update operations.
Catalog data structure
To perform full catalog data updates, you need to prepare a JSON file containing your catalog data. The structure of this file can vary depending on your use case, but it's typically a combination of the catalog objects: products, variants, and availabilities. This structure is then used to create a catalog configuration in the Coveo Administration Console.
The JSON file must contain an object for each item (product, variant, or availability) that you want to index in your source. For instructions on how to configure items for the different catalog object types, see:
The following catalog data (structured in JSON) contains objects that represent products, variants, and availabilities:
{
"AddOrUpdate": [
{
"documentId": "product://001-red",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red",
"model": "Authentic",
"ec_brand": ["Coveo"],
"ec_description": "<p>The astonishing, the original, and always relevant Coveo style.</p>",
"color": ["Red"],
"ec_item_group_id": "001",
"ec_product_id": "001-red",
"ec_images": ["https://myimagegallery?productid"],
"gender": "Men",
"ec_price": 28.00,
"ec_category": "Soccer Shoes",
"objecttype": "Product"
},
{
"documentId": "variant://001-red-8_wide",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red / Size 8 - Wide",
"ec_variant_id": "001-red-8_wide",
"productsize": "8",
"width": "wide",
"ec_product_id": "001-red",
"objecttype": "Variant"
},
{
"documentId": "store://s000002",
"title": "Montreal Store",
"lat": 45.4975,
"long": -73.5687,
"ec_available_items": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
"ec_availability_id": "s000002",
"objecttype": "Availability"
}
]
}
Update operations
Update operations let you build and update your entire catalog data. They don't overwrite the entire catalog data in your source, meaning that if you don't include in the payload an item that was previously indexed, it remains in the source.
Within each item, however, an update performs a full replacement: if certain metadata exists in the source but is missing from the payload, it's removed, meaning the item is fully replaced by the new version. This is ideal when you need to update all fields of an item, rather than just a subset, without affecting other items in the source.
Leading practice
If you only need to update certain metadata in an item (for example, updating a product price), you should use one of the partial catalog data update mechanisms instead.
To perform a full item update, you must interact with the Coveo Stream API. This section guides you through the different actions that must be taken to update your catalog data.
Refer to the Stream API reference for a comprehensive list of required parameters.
Step 1: Create a file container (Update operation)
Make sure that you meet the prerequisites before performing this operation.
To perform a full document update, you must first create an Amazon S3 file container. Use the Create a file container operation to create an Amazon S3 file container for a specific Coveo organization:
Request template
POST https://api.cloud.coveo.com/push/v1/organizations/<MyOrganizationId>/files?useVirtualHostedStyleUrl=<true|false> HTTP/1.1
Accept: application/json
Content-Type: application/json
Authorization: Bearer <MyAccessToken>
In the request path:

- Replace <MyOrganizationId> with the ID of the target Coveo organization (see Retrieve the organization ID).

In the query string:

- Optionally, set useVirtualHostedStyleUrl to true if you want the service to return a virtual hosted-style URL, such as coveo-nprod-customerdata.s3.amazonaws.com/.... The default value is currently false, which means that the service returns path-style URLs, such as s3.amazonaws.com/coveo-nprod-customerdata/....
  The useVirtualHostedStyleUrl query string parameter will soon be deprecated as part of the path-style URL deprecation. From this point onwards, the service will only return virtual hosted-style URLs.

In the Authorization HTTP header:

- Replace <MyAccessToken> with an access token, such as an API key that has the required privileges to push content to the source.
Payload
{}
The body of a successful response contains important information about the temporary, private, and encrypted Amazon S3 file container that you just created:
{
"uploadUri": "<UPLOAD-URI>",
"fileId": "<FILE_ID>",
"requiredHeaders": {
"x-amz-server-side-encryption": "AES256",
"Content-Type": "application/octet-stream"
}
}
- The uploadUri property contains a pre-signed URI to use in the PUT request of step 2.
- The fileId property contains the unique identifier of your file container. You must use this value to send the file container to the source in step 3.
- The requiredHeaders property contains the HTTP headers that are required when sending the PUT request of step 2.
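For reference, here's a minimal Python sketch of this step using the requests library. The organization ID and API key are placeholders you'd adapt to your own setup; error handling is kept to the bare minimum.

```python
import requests

ORG_ID = "<MyOrganizationId>"   # placeholder: your Coveo organization ID
API_KEY = "<MyAccessToken>"     # placeholder: API key with push privileges

def create_file_container() -> dict:
    """Create a temporary Amazon S3 file container and return its details."""
    response = requests.post(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}/files",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        json={},  # the payload is an empty JSON object
    )
    response.raise_for_status()
    container = response.json()
    # container["uploadUri"], container["fileId"], and container["requiredHeaders"]
    # are used in steps 2 and 3.
    return container
```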
Step 2: Upload the full item content into the file container
To upload the content to update into the Amazon S3 file container you got from step 1, perform the following PUT request:
Request template
PUT <MyUploadURI> HTTP/1.1
<HTTPHeaders>
Where you replace:

- <MyUploadURI> with the uploadUri value you got from step 1.
- <HTTPHeaders> with the requiredHeaders you got from step 1 (that is, the x-amz-server-side-encryption and Content-Type headers).

You can now upload your update data (JSON file) in the body of the request. For example, the following update data is structured in JSON and contains items that must be updated and an item that must be deleted:
Payload example
{
"addOrUpdate": [
{
"objecttype": "Product",
"documentId": "product://010",
"ec_name": "Sneaker 010",
"ec_product_id": "010",
"ec_category": "Sneakers",
"gender": "Unisex",
"departement": "Shoes"
},
{
"objecttype": "Product",
"documentId": "product://011",
"ec_name": "Sneaker 011",
"ec_product_id": "011",
"ec_category": "Sneakers",
"gender": "Unisex",
"departement": "Shoes"
},
{
"objecttype": "Variant",
"documentId": "variant://010-blue",
"ec_name": "Sneaker 010 Royal Blue",
"ec_product_id": "010",
"ec_variant_id": "010-blue",
"width": "wide",
"productSize": "9"
}
],
"delete": [
{
"documentId": "store://s000001"
}
]
}
In the request body:

- For each item you include in the addOrUpdate array, specifying a unique documentId value is mandatory. Therefore, make sure that all of your items contain a documentId whose value is a URI that uniquely identifies the item. This value must be a valid URL with a proper URI prefix, such as product://, or any other scheme that fits your catalog data.
- For each item you include in the delete array, specifying a unique documentId value is also mandatory. This value must be a valid URL with a proper URI prefix, such as product://, or any other scheme that fits your catalog data.
A successful response has no content, but indicates that the content update was successfully uploaded to the Amazon S3 file container, as in the following example:
200 OK
{}
When the payload exceeds 256 MB, it must be chunked into 256 MB parts. See Uploading large catalog data files for instructions.
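As an illustration, a minimal Python sketch of this upload, assuming you kept the uploadUri and requiredHeaders returned in step 1 and that your update payload fits in a single file:

```python
import json
import requests

def upload_to_container(upload_uri: str, required_headers: dict, payload: dict) -> None:
    """Upload the JSON update payload to the pre-signed Amazon S3 URI."""
    response = requests.put(
        upload_uri,                # uploadUri from step 1
        headers=required_headers,  # requiredHeaders from step 1
        data=json.dumps(payload),  # the addOrUpdate/delete document batch
    )
    response.raise_for_status()    # a 200 response with no content means success
```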
Step 3: Send the file container to update your source (Update operation)
To push the Amazon S3 file container into your source, use the Update a catalog stream source operation as follows:
Request template
PUT https://api.cloud.coveo.com/push/v1/organizations/<MyOrganizationId>/sources/<MySourceId>/stream/update?fileId=<MyFileId> HTTP/1.1
Content-Type: application/json
Authorization: Bearer <MyAccessToken>
Payload
{}
Where you replace:

- <MyOrganizationId> with the ID of the target Coveo organization (see Retrieve the organization ID).
- <MySourceId> with the ID of the source that contains the catalog data that you want to update.
- <MyFileId> with the fileId you got from step 1.
- <MyAccessToken> with an access token, such as an API key that has the required privileges to push content to the source.
A successful response (202) indicates that the operation was successfully forwarded to the service and that the batch of items is now enqueued to be processed by the Coveo indexing pipeline.
For example:
202 Accepted
{
"orderingId": 1716387965000,
"requestId": "498ef728-1dc2-4b01-be5f-e8f8f1154a99"
}
Where:

- orderingId indicates the time your request was received. You must use this value if you want to delete items that were present in the source before the update.
- requestId is the unique identifier for your request.

The contents of a file container can be pushed to multiple sources in the same Coveo organization. Just update the target source ID (<MySourceId>) in the request path. The file container remains available for 4 days.
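Continuing the Python sketch from the previous steps, sending the file container to the source could look like this (the source ID is a placeholder):

```python
import requests

SOURCE_ID = "<MySourceId>"  # placeholder: ID of your Catalog source

def send_update(file_id: str) -> int:
    """Trigger an update operation with the uploaded file container and return the orderingId."""
    response = requests.put(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}"
        f"/sources/{SOURCE_ID}/stream/update",
        params={"fileId": file_id},  # fileId from step 1
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        json={},
    )
    response.raise_for_status()           # expect 202 Accepted
    return response.json()["orderingId"]  # keep this value for step 4
```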
Step 4: Delete old items
When performing a full item update, you’re either adding catalog data to your source for the first time, or replacing your whole catalog data with newer data. To make sure old items that were previously indexed are removed from your source, you must delete them.
The Delete old documents operation of the Stream API deletes items that are older than a specified date.
To delete old items, you must perform the following POST request:
Request template
POST https://api.cloud.coveo.com/push/v1/organizations/<MyOrganizationId>/sources/<MySourceId>/stream/deleteolderthan/<MyOrderingId> HTTP/1.1
Content-Type: application/json
Authorization: Bearer <MY_ACCESS_TOKEN>
Where you replace:

- <MyOrganizationId> with the unique identifier of your organization (see Find your organization ID).
- <MySourceId> with the unique identifier of the source to which you want to push content.
- <MyOrderingId> with the value of the orderingId you received when you sent the file container to update your source in step 3. If you have to push multiple file containers, use the orderingId of the first file container you sent to update your source.
- <MY_ACCESS_TOKEN> with an access token, such as an API key that has the required privileges to push content to the source.
A successful response produces the 201 Created HTTP response code with no content.
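To round out the Python sketch, the old-item cleanup could look like this, reusing the placeholders from the earlier sketches and the orderingId returned in step 3:

```python
import requests

def delete_old_items(ordering_id: int) -> None:
    """Delete items that were indexed before the given orderingId."""
    response = requests.post(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}"
        f"/sources/{SOURCE_ID}/stream/deleteolderthan/{ordering_id}",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    response.raise_for_status()
```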
Load operations
The load operation (also known as stream) overwrites the entire catalog data in your source. A load operation uses the catalog data you send in the request to completely replace the existing data in the source.
Performing a load operation via the Stream API involves the steps described below.
Limitations
Update operations should be favored over the load operation for both pushing and performing full updates on your catalog data. The load operation has limitations that can affect the performance and reliability of your catalog data updates:
- Content deletion: When using load operations, indexed items that aren't sent in the request are automatically removed from your source.
  To prevent the accidental deletion of a substantial number of items from a source, the delete operation is skipped during the process if all of the existing items were to be deleted. Perform an update operation to intentionally delete indexed items.
  When your source isn't used with a catalog configuration and you open and close a stream with an empty JSON file, all of the content from your source is deleted.
- Delayed data ingestion: When using the load operation to push or fully update your catalog data, the index waits until the entire catalog data is uploaded before starting the ingestion process. This means that there's a delay in the availability of the updated data, causing a mismatch between the data in your system and the data in the Coveo index.
- Lack of batch processing: When using the load operation to update your catalog data, you must push your entire catalog data every time you want to update it.
To avoid these limitations, consider using update operations instead.
Step 1: Open a stream
Make sure that you meet the prerequisites before performing this operation.
The first step is to open a stream using the Stream API.
To achieve this, you must perform the following POST request:
POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/open HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Bearer <MY_ACCESS_TOKEN>
Where you replace:

- {organizationId} with the unique identifier of your organization (see Find your organization ID).
- {sourceId} with the unique identifier of the source to which you want to push content.
- <MY_ACCESS_TOKEN> with an access token, such as an API key that has the required privileges to push content to the source.
If your request is successful, you'll get the 201 HTTP response code along with a response body that looks like this:
{
"streamId": "1234-5678-9101-1121",
"uploadUri": "link:https://coveo-nprod-customerdata.s3.amazonaws.com/[...]",
"fileId": "b5e8767e-8f0d-4a89-9095-1127915c89c7",
"requiredHeaders": {
"x-amz-server-side-encryption": "AES256",
"Content-Type": "application/octet-stream"
}
}
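For illustration, a minimal Python sketch of opening a stream, using the same placeholder organization ID, source ID, and API key as in the update operation sketches above:

```python
import requests

def open_stream() -> dict:
    """Open a stream and return its streamId, uploadUri, fileId, and required headers."""
    response = requests.post(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}"
        f"/sources/{SOURCE_ID}/stream/open",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    response.raise_for_status()  # expect 201
    return response.json()
```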
Step 2: Upload your catalog data into the stream
To upload your catalog data into the stream, you must attach the JSON file containing all of your items to the following Stream API PUT request:
PUT {uploadUri} HTTP/1.1
x-amz-server-side-encryption: AES256
Content-Type: application/octet-stream
Where:

- You replace {uploadUri} with the uploadUri you received when you opened the stream in step 1.
- The x-amz-server-side-encryption and Content-Type headers are required request headers, so they must be included in the request headers section rather than in the body of the request.
You can now upload your catalog data (JSON file).
The following catalog data (structured in JSON) contains objects that represent products, variants, and availabilities:
{
"AddOrUpdate": [
{
"documentId": "product://001-red",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red",
"model": "Authentic",
"ec_brand": ["Coveo"],
"ec_description": "<p>The astonishing, the original, and always relevant Coveo style.</p>",
"color": ["Red"],
"ec_item_group_id": "001",
"ec_product_id": "001-red",
"ec_images": ["https://myimagegallery?productid"],
"gender": "Men",
"ec_price": 28.00,
"ec_category": "Soccer Shoes",
"objecttype": "Product"
},
{
"documentId": "variant://001-red-8_wide",
"FileExtension": ".html",
"ec_name": "Coveo Soccer Shoes - Red / Size 8 - Wide",
"ec_variant_id": "001-red-8_wide",
"productsize": "8",
"width": "wide",
"ec_product_id": "001-red",
"objecttype": "Variant"
},
{
"documentId": "store://s000002",
"title": "Montreal Store",
"lat": 45.4975,
"long": -73.5687,
"ec_available_items": ["001-red-8_wide","001-red-9_wide","001-red-10_wide","001-red-11_wide", "001-blue-8_wide"],
"ec_availability_id": "s000002",
"objecttype": "Availability"
}
]
}
When the payload exceeds 256 MB, it must be chunked into 256 MB parts. See Uploading large catalog data files for instructions.
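A minimal Python sketch of this upload, assuming the stream details returned by the open call in step 1 and your catalog data held as a Python dictionary:

```python
import json
import requests

def upload_to_stream(stream: dict, catalog_data: dict) -> None:
    """Upload the full catalog JSON to the stream's pre-signed URI."""
    response = requests.put(
        stream["uploadUri"],                # uploadUri from step 1
        headers=stream["requiredHeaders"],  # x-amz-server-side-encryption, Content-Type
        data=json.dumps(catalog_data),      # the addOrUpdate document batch
    )
    response.raise_for_status()
```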
Step 3: Close the stream
Once you've uploaded all your catalog data, you must close the stream.
To achieve this, you must perform the following POST request:
POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/{streamId}/close HTTP/1.1
Authorization: Bearer <MY_ACCESS_TOKEN>
Where you replace:

- {organizationId} with the ID of your organization (see Find your organization ID).
- {sourceId} with the unique identifier of the source to which you want to push content.
- {streamId} with the ID of your stream (see step 1).
- <MY_ACCESS_TOKEN> with an access token, such as an API key that has the required privileges to push content to the source.
If the request to close your stream is successful, you'll get the 200 OK HTTP response code.
The response body contains an orderingId that indicates the time your request was received, as well as the requestId, which is the unique identifier for your request.
200 OK
{
"orderingId": 1716387965000,
"requestId": "498ef728-1dc2-4b01-be5f-e8f8f1154a99"
}
If your request is successful, the catalog data you uploaded completely replaces the previous content of the source. Expect a delay of about 15 minutes for the removal of the old items from the index.
After you've uploaded all your items, check the Log Browser (platform-ca | platform-eu | platform-au) to ensure that the streaming of products has been successful. For more information, see Use the Log Browser to review indexing logs.
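To complete the load operation sketch, closing the stream could look like this:

```python
import requests

def close_stream(stream_id: str) -> int:
    """Close the stream and return the orderingId of the load operation."""
    response = requests.post(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}"
        f"/sources/{SOURCE_ID}/stream/{stream_id}/close",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    response.raise_for_status()
    return response.json()["orderingId"]
```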
Stream API limits
The Stream API enforces certain limits on request size and frequency.
These limits differ depending on whether the organization to which data is pushed is a production or non-production organization.
The following table indicates the Stream API limits depending on your organization type:
| Organization type | Maximum API requests per day | Burst limit (requests per 5 minutes) | Maximum upload requests per day | Maximum file size | Maximum item size[1] | Maximum items per source[2] |
|---|---|---|---|---|---|---|
| Production | 15,000 | 250 | 96 | 256 MB | 3 MB | 1,000,000 |
| Non-production | 10,000 | 150 | 96 | 256 MB | 3 MB | 1,000,000 |
These limits could change at any time without prior notice. To modify these limits, contact your Coveo representative.
Stream API error codes
If a request to the Stream API fails because one of the limits has been exceeded, the API will trigger one of the following response status codes:
| Status code | Triggered when |
|---|---|
|  | The total Stream API request size exceeds 256 MB when pushing a large file container. See Uploading large catalog data files. |
|  | The amount of total Stream API (upload and update) requests exceeds 15,000 per day (10,000 for non-production organizations). The quota is reset at midnight UTC. |
|  | The amount of total Stream API upload requests exceeds 96 per day (4 per hour). The quota is reset at midnight UTC. |
|  | The amount of total Stream API requests exceeds 250 (150 for non-production organizations) within a 5-minute period. |
|  | Coveo declined your request due to a reduced indexing capacity. |
Uploading large catalog data files
The Stream API limits the size of your catalog data JSON file to 256 MB. If your catalog data file exceeds the limit, you must upload multiple JSON files.
To upload multiple JSON files:
When you initially open the stream, you receive an uploadUri. This URI is used to upload your first set of metadata (JSON file).
- After uploading the first file, make a POST request to the following endpoint to get a new uploadUri:

  POST https://api.cloud.coveo.com/push/v1/organizations/{organizationId}/sources/{sourceId}/stream/{streamId}/chunk HTTP/1.1
  Content-Type: application/json
  Accept: application/json
  Authorization: Bearer <MY_ACCESS_TOKEN>

  This request returns a new uploadUri that you can use for the next step.

- Make a PUT request using the uploadUri you received in the previous step. The body of the request must contain the catalog data chunk (maximum 256 MB) that you want to upload.

  PUT {uploadUri} HTTP/1.1
  x-amz-server-side-encryption: AES256
  Content-Type: application/octet-stream

  If your request to upload the catalog data is successful, you'll receive a 200 HTTP response code.

- If you have more catalog data files to upload, repeat this process until all of your catalog data has been uploaded. For each file, first obtain a new uploadUri, then upload the file.
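As a rough illustration, here's how the chunked upload could be scripted in Python, reusing the open_stream and close_stream helpers sketched above and assuming your catalog data is already split into JSON strings of at most 256 MB each:

```python
import requests

# Headers required by the PUT upload requests, as shown in the request template above
S3_HEADERS = {
    "x-amz-server-side-encryption": "AES256",
    "Content-Type": "application/octet-stream",
}

def request_new_upload_uri(stream_id: str) -> str:
    """Ask the Stream API for a new uploadUri for the next chunk."""
    response = requests.post(
        f"https://api.cloud.coveo.com/push/v1/organizations/{ORG_ID}"
        f"/sources/{SOURCE_ID}/stream/{stream_id}/chunk",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    response.raise_for_status()
    return response.json()["uploadUri"]

def upload_chunks(chunks: list[str]) -> int:
    """Open a stream, upload each chunk to its own uploadUri, then close the stream."""
    stream = open_stream()
    upload_uri = stream["uploadUri"]  # the first chunk uses the uploadUri from the open call
    for index, chunk in enumerate(chunks):
        requests.put(upload_uri, headers=S3_HEADERS, data=chunk).raise_for_status()
        if index < len(chunks) - 1:
            upload_uri = request_new_upload_uri(stream["streamId"])  # for the next chunk
    return close_stream(stream["streamId"])
```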
Required privileges
The following table indicates the privileges required for your organization's groups to view or edit elements of the Catalogs (platform-ca | platform-eu | platform-au) page and its associated panels (see Manage privileges and Privilege reference). The Commerce domain is only available to organizations in which Coveo for Commerce features are enabled.
| Action | Service - Domain | Required access level |
|---|---|---|
| View catalogs | Commerce - Catalogs | View |
| Edit catalogs | Content - Fields | View |
|  | Commerce - Catalogs | Edit |
|  | Search - Execute Query | Allowed |