Use the Extensions API

Coveo organization sources can pull content from a variety of systems to make your content searchable for those with the appropriate permissions (see Connector Types).

The indexing pipeline extension (IPE) feature provides a way to execute Python conversion scripts in a securely isolated non-persistent container, allowing developers to customize how items get indexed. Extension scripts can be executed at two different stages of the indexing pipeline: pre-conversion and post-conversion.

Note

You can manage your indexing pipeline extensions from the Coveo Administration Console Extensions page and get more information on indexing pipeline extensions from the Administration Console documentation (see Manage Extensions).

Usage Overview

You can use the Extensions API to run an indexing pipeline extension on every item in one or more of your organization's sources:

  1. On the Administration Console API Keys page, add an API key to which you grant the privilege to edit extensions (i.e., the Edit access level on the Extensions domain) (see Manage API Keys, Manage privileges, and Extensions Domain).

  2. Write your extension script using the document object (see Document Object Python API Reference).

  3. Create your extension (see Creating an Indexing Pipeline Extension With the API).

  4. Add your script to your extension.

  5. Apply your extension to your source(s) (see Apply an Extension to a Source).

  6. Rebuild your source(s) so that your extension takes effect.

  7. Validate that your changes perform as expected.
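
As a rough illustration of step 2, the sketch below shows what a small extension script might look like. It assumes the document object exposes get_meta_data_value and add_meta_data methods as described in the Document Object Python API Reference; check that reference for the exact signatures. The FakeDocument class, the tag_item_language function, and the displaylanguage metadata name are hypothetical and exist only so the sketch can run outside the indexing pipeline.

```python
# Hypothetical stand-in for the pipeline's document object, so the sketch
# can run locally. In a real extension, `document` is provided for you.
class FakeDocument:
    def __init__(self, metadata):
        self._metadata = {key: list(values) for key, values in metadata.items()}

    def get_meta_data_value(self, name):
        # Assumption: returns the list of values stored under `name`.
        return self._metadata.get(name, [])

    def add_meta_data(self, values):
        # Assumption: accepts a dict mapping metadata names to values.
        for key, value in values.items():
            self._metadata[key] = value if isinstance(value, list) else [value]


def tag_item_language(document):
    """Add a hypothetical `displaylanguage` metadata derived from `language`."""
    languages = document.get_meta_data_value("language")
    label = languages[0].title() if languages else "Unknown"
    document.add_meta_data({"displaylanguage": label})


document = FakeDocument({"language": ["english"]})
tag_item_language(document)
print(document.get_meta_data_value("displaylanguage"))
```

In an actual extension, only the body of tag_item_language would be needed, applied to the document object that the pipeline supplies.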

Python version deprecation

Currently, the IPE feature uses Python 3.8. We will update to version 3.11 in the summer of 2023. To see what has been deprecated from 3.8, refer to:

Extensions with deprecation warnings can be seen in the Log Browser (platform-ca | platform-eu | platform-au) as shown below.

[Image: Python deprecation message in the Log Browser]

Execution of extensions using deprecated code may fail once we upgrade to Python 3.11.
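
One possible way to spot deprecated calls before the upgrade is to run your helper functions locally and record any DeprecationWarning they emit. The sketch below uses only the standard warnings module; collect_deprecations and legacy_helper are hypothetical names used for illustration. It only catches warnings: code that has been removed outright in a newer Python version raises an exception instead.

```python
import warnings


def collect_deprecations(func, *args, **kwargs):
    """Run func and return the messages of any DeprecationWarning it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones hidden by default filters.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


def legacy_helper():
    # Hypothetical extension code that relies on a deprecated call.
    warnings.warn("legacy_helper is deprecated", DeprecationWarning)
    return "done"


print(collect_deprecations(legacy_helper))
```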

Note

The most common warning concerns the unescape method, which has moved from the HTMLParser object to the html module.

The following code was deprecated before Python 3.8 and will not be supported in Python 3.11.

from html.parser import HTMLParser
h = HTMLParser()
h.unescape("....")

The code shown above should be replaced with:

from html import unescape
unescape("....")
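
As a quick check of the replacement call, the following sketch decodes a few sample strings with html.unescape from the standard library. The sample values are made up for illustration.

```python
from html import unescape

# Made-up sample strings containing HTML character references.
samples = ["Fish &amp; chips", "&lt;b&gt;bold&lt;/b&gt;", "caf&eacute;"]

decoded = [unescape(raw) for raw in samples]
for text in decoded:
    print(text)
```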