Use the Extensions API

Coveo organization sources can pull content from a variety of systems to make your content searchable for those with the appropriate permissions (see Connector types).

The indexing pipeline extension (IPE) feature provides a way to execute Python conversion scripts in a securely isolated non-persistent container, allowing developers to customize how items get indexed. Extension scripts can be executed at two different stages of the indexing pipeline: pre-conversion and post-conversion.


Usage overview

You can execute an indexing pipeline extension on every item in one or more sources of your organization using the Extensions API:

  1. On the Administration Console API Keys (platform-ca | platform-eu | platform-au) page, add an API key with the privilege to edit extensions, that is, the Edit access level on the Extensions domain (see Manage API keys, Manage privileges, and Extensions domain).

  2. Write your extension script using the document object (see Document object Python API reference).

  3. Create your extension (see Creating an indexing pipeline extension with the API).

  4. Add your script to your extension.

  5. Apply your extension to your source(s) (see Apply an extension to a source).

  6. Rebuild your source(s) so that your extension takes effect.

  7. Validate that your changes perform as expected.

Indexing Pipeline Extensions API versions

When using the Indexing Pipeline Extensions API, you can specify which API version to use.

To do so, add the apiVersion key to the JSON body of your request. Accepted values are v1 and v2. When the key is omitted, v1 is used.
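As a rough sketch, a request body carrying an explicit API version could be assembled as follows. Only the apiVersion key and its accepted values come from this page. The name and content keys, and the build_extension_body helper itself, are illustrative assumptions that should be checked against the Extensions API reference.

```python
import json

# Accepted values as listed on this page.
ACCEPTED_API_VERSIONS = ("v1", "v2")


def build_extension_body(name, script, api_version="v1"):
    """Return a JSON string for an extension request body.

    Hypothetical helper: the "name" and "content" keys are assumptions
    made for illustration; only "apiVersion" is documented in this section.
    """
    if api_version not in ACCEPTED_API_VERSIONS:
        raise ValueError(f"apiVersion must be one of {ACCEPTED_API_VERSIONS}")
    body = {"name": name, "content": script, "apiVersion": api_version}
    return json.dumps(body)


# Build a body that explicitly requests version 2 of the API.
body = build_extension_body("my-extension", "log('hello')", api_version="v2")
```

Validating the value before sending the request is a design choice of this sketch: it surfaces a typo such as "V2" locally instead of relying on the API response.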

The differences between the API versions are as follows:

Aspect: Populating dictionary fields

  • Version 1: Not supported.

  • Version 2: Supported.

Aspect: get_meta_data_value method

  • Version 1: Returns a list of one value when the field contains a single value. Returns an empty list when the field is empty.

  • Version 2: Returns the field value (without any list wrapper) when the field contains a single value. Returns None when the field is empty.
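To make the difference in return shapes concrete, here is a small stand-in that mimics the behavior described above for get_meta_data_value. It is a simulation for illustration only, not Coveo's implementation: the simulated_get_meta_data_value function and the plain dictionary used as metadata are assumptions, and only the single-value and empty-field cases documented on this page are covered.

```python
def simulated_get_meta_data_value(metadata, name, api_version="v1"):
    """Mimic the documented return shapes of get_meta_data_value.

    Hypothetical stand-in: `metadata` is a plain dict in which a missing
    key represents an empty field.
    """
    value = metadata.get(name)  # None when the field is empty
    if api_version == "v1":
        # Version 1 always returns a list: empty, or wrapping the single value.
        return [] if value is None else [value]
    # Version 2 returns the bare value, or None when the field is empty.
    return value


metadata = {"author": "Alice"}

v1_single = simulated_get_meta_data_value(metadata, "author", "v1")
v1_empty = simulated_get_meta_data_value(metadata, "missing", "v1")
v2_single = simulated_get_meta_data_value(metadata, "author", "v2")
v2_empty = simulated_get_meta_data_value(metadata, "missing", "v2")

print(v1_single, v1_empty)  # ['Alice'] []
print(v2_single, v2_empty)  # Alice None
```

A script written for version 1 that indexes the result with [0] would therefore pick the first character of a string under version 2, which is one reason to retest after switching versions.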

By default, the methods listed in the Document Object Python Reference use the version of the IPE API that you used to create your extension. For example, if you created an extension with version 2 of the API, the methods retrieving metadata will support dictionary fields seamlessly, as they will also use version 2 of the API by default.

However, if you created an extension using version 1 of the API and now need to handle dictionary fields, do one of the following to ensure that version 2 is always used in the future:

  • If your extension script calls only one method, you can edit the script to specify the API version in that method call. For example, if the script uses document.get_meta_data(), change it to document_api.v2.get_meta_data().

  • If your extension script calls multiple methods, make the Update extension API call, using {"apiVersion": "v2"} as the request body.

In either case, avoid using both versions in an extension and make sure that your extension still works as expected after switching versions.
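When checking that a script still works after the switch, one possible approach is to normalize whatever the metadata getter returns before using it. The as_list helper below is hypothetical and not part of the document object API. It only assumes the return shapes described on this page: a list in version 1, and a bare value or None in version 2.

```python
def as_list(value):
    """Normalize a metadata value to a list, whatever the API version.

    Hypothetical helper for illustration:
    - None (version 2, empty field)  -> []
    - a list (version 1)             -> returned unchanged
    - any other single value (v2)    -> wrapped in a one-item list
    """
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# Sample values standing in for what a script might receive.
from_v1 = as_list(["Alice"])
from_v2 = as_list("Alice")
from_v2_empty = as_list(None)
```

With this kind of normalization, the rest of the script can loop over the result in the same way before and after the extension is moved to version 2.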

Python version deprecation

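Currently, the IPE feature uses Python 3.10. To see what has been deprecated or removed since Python 3.8, refer to the official Python documentation (What's New in Python 3.9 and What's New in Python 3.10).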

Extensions with deprecation warnings can be seen in the Log Browser (platform-ca | platform-eu | platform-au) as shown below.

[Image: Python deprecation message in the Log Browser]

Execution of extensions using deprecated code may fail following our upgrade to Python 3.10.

Note

The most common warning concerns the unescape method, which has been removed from the HTMLParser object. Use the unescape function of the html module instead.

The following code was deprecated before Python 3.8 and is not supported in Python 3.10:

from html.parser import HTMLParser
h = HTMLParser()
h.unescape("....")

The preceding code should be replaced with:

from html import unescape
unescape("....")
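As a quick illustration of the replacement call, the snippet below runs html.unescape on a sample string. The sample text is made up for this example. In an actual extension, the escaped string would typically come from the item being processed.

```python
from html import unescape

# Made-up sample containing named HTML character references.
escaped = "&lt;p&gt;Caf&eacute; &amp; Bar&lt;/p&gt;"

# html.unescape converts character references to their Unicode characters.
clean = unescape(escaped)

print(clean)  # <p>Café & Bar</p>
```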