Catalog Schema and Ingestion APIs

This is for: Developer
Important

The Catalog Schema and Ingestion APIs are currently in closed beta. Contact your Coveo representative to learn about these APIs and how to get involved.

The Catalog Schema and Ingestion APIs offer a streamlined approach to managing catalog data indexing and updates, simplifying data integration and maintenance within a Coveo organization.

The Catalog Schema API serves as your starting point by letting you define the metadata keys you’ll want to index in your Coveo organization. This schema is used to validate your data as it’s being indexed. Additionally, the Catalog Schema API automates resource creation tasks previously done manually, such as setting up your Catalog source, catalog entity, and catalog configuration.

The Catalog Ingestion API provides an improved alternative to the existing catalog data operations of the Coveo Stream API. It supports full ingestion, partial updates, and data deletion. The Catalog Ingestion API ingests catalog data by validating it against the schemas you’ve defined, ensuring the integrity and consistency of data from the moment it enters the Coveo index.
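
As a rough illustration of what schema-based validation at ingestion time involves, the following Python sketch checks product items against a locally defined schema before they would be submitted. The schema shape (a mapping of metadata key to expected type), the `validate_item` helper, and the required-key set are assumptions made for this example only; they don't describe the actual format or behavior of the Catalog Schema API.

```python
from typing import Any

# Hypothetical schema: metadata key -> expected Python type.
# The real Catalog Schema API format may differ.
SCHEMA: dict[str, type] = {
    "ec_product_id": str,
    "ec_name": str,
    "ec_price": float,
}

# Assumption: only the product identifier is mandatory in this sketch.
REQUIRED_KEYS = {"ec_product_id"}


def validate_item(item: dict[str, Any], schema: dict[str, type] = SCHEMA) -> list[str]:
    """Return a list of validation errors; an empty list means the item is valid."""
    errors = []
    # Report required keys that the item doesn't provide.
    for key in sorted(REQUIRED_KEYS - set(item)):
        errors.append(f"missing required key: {key}")
    # Check every provided key against the schema.
    for key, value in item.items():
        expected = schema.get(key)
        if expected is None:
            errors.append(f"unknown key: {key}")
        elif not isinstance(value, expected):
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors
```

Rejecting an item as soon as `validate_item` returns a non-empty list mirrors the idea that invalid data is stopped before it reaches the index rather than discovered afterward.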

The following table summarizes the key differences between using the new Catalog Schema and Ingestion APIs versus the Stream API:

| Capability | Catalog Schema and Ingestion APIs | Stream API |
| --- | --- | --- |
| Data validation | Schema-based validation at ingestion time prevents invalid data from entering the index. | Manual validation of data structure. This validation happens post-ingestion, so invalid data that could damage the implementation may already be in the index. |
| Organization management | Automated: the Catalog Schema API creates the Catalog source, catalog entity, and catalog configuration for you. | Manual resource creation and configuration required. |
| Unique identifier management | You only need to define the ec_product_id field. Coveo automatically populates the permanentid and documentid fields. | Manual management of item identifier fields and mappings is required. |
| Data submission workflow | A single direct call to the Catalog Ingestion API containing your JSON payload. You no longer need to upload files to Amazon S3 containers. | Multi-step process: (1) Create Amazon S3 file containers. (2) Upload JSON data files to Amazon S3 file containers. (3) Send the file container to the Catalog source. |
| API complexity | Single, unified Ingestion API surface for all operations (full and partial updates). | Multiple API capabilities required (Push API versus Stream API) with complex stream management. |

Limitations

This section outlines the current limitations of using the Catalog Schema and Ingestion APIs to manage your catalog data:

  • The APIs currently support only the ingestion of Product catalog object items. If your catalog data contains items of the Variant or Availability types, you can’t use these APIs yet.

  • The APIs don’t currently support the ingestion of dictionary fields. However, the team is actively working on an approach to handle them.
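
To check whether a data set is affected by these limitations, a pre-flight scan along the following lines could help. This is only a sketch: it assumes that each item carries its catalog object type in an `objecttype` key and that a dictionary field appears as a JSON object (a Python `dict`) value. The `find_unsupported` helper is hypothetical and isn't part of either API.

```python
from typing import Any

# Object types the limitations above name as not yet supported.
UNSUPPORTED_TYPES = {"Variant", "Availability"}


def find_unsupported(items: list[dict[str, Any]]) -> list[str]:
    """Describe every item that the current limitations would rule out."""
    problems = []
    for index, item in enumerate(items):
        # Assumption: the object type is stored under the `objecttype` key.
        object_type = item.get("objecttype")
        if object_type in UNSUPPORTED_TYPES:
            problems.append(f"item {index}: unsupported object type {object_type!r}")
        # Assumption: a dictionary field is any key whose value is a dict.
        for key, value in item.items():
            if isinstance(value, dict):
                problems.append(f"item {index}: dictionary field {key!r}")
    return problems
```

An empty result suggests the data set contains only Product items without dictionary fields, under the assumptions stated above.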

Leading practices

When using the Catalog Schema and Ingestion APIs, consider the following leading practices:

  • Don’t create Catalog sources manually. When creating a schema, the Catalog Schema API automatically creates the necessary Catalog source for you.

  • Always create a schema using the Catalog Schema API before attempting to ingest data.

  • Avoid manually changing the following resources, because these APIs create and configure them automatically:

    • Automatically created fields.

      Important

      Avoid changing the Type and Multi-value facet options for fields that are created automatically. You can modify other field options as needed.

    • Automatically created Catalog entities.

    • Automatically created Catalog configurations.

  • Don’t apply indexing pipeline extensions (IPEs) or modify the source mappings for the sources created by the Catalog Schema API.

  • To perform all operations described in this guide, ensure your API key has the following privileges:

    | Access level | Domain | Action |
    | --- | --- | --- |
    | Edit | Catalog setup | View and modify catalog schemas. |
    | Edit | Field | Add, delete, or modify custom fields in schemas. |
    | Allow | Push items to sources | Ingest data into your catalog. |
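
The privilege requirements listed above can be captured as a small checklist. The sketch below represents each privilege as an (access level, domain) pair, which is an illustrative convention only and not the format in which Coveo stores or returns API key privileges. The `missing_privileges` helper is hypothetical.

```python
# Required privileges as (access level, domain) pairs.
# Assumption: this tuple representation is for illustration only.
REQUIRED_PRIVILEGES = {
    ("Edit", "Catalog setup"),
    ("Edit", "Field"),
    ("Allow", "Push items to sources"),
}


def missing_privileges(granted: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the required privileges absent from `granted`, in sorted order."""
    return sorted(REQUIRED_PRIVILEGES - set(granted))
```

For example, calling `missing_privileges([("Edit", "Field")])` reports the two remaining privileges an API key would still need.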

Working with the APIs

To work with the Catalog Schema and Ingestion APIs