Coveo indexing pipeline extensions

This is for: Developer

Items crawled in Adobe Experience Manager are sent to the Coveo Platform document processing manager (DPM) to be processed and indexed.

The DPM provides two indexing pipeline extension (IPE) stages during which you can use Python code to modify your items before they’re indexed. Which stage you use depends on your use case.

Figure: The Coveo indexing pipeline. The indexing process, showing the pre-conversion and post-conversion extension stages that you can use to customize it.
Note

Don’t use indexing pipeline extensions for tasks that the connector itself is designed to handle. For example, when using the Web or Sitemap connector, you shouldn’t use an IPE to remove unwanted sections of web pages. These connectors support web scraping configurations for that very purpose.

Choosing between a pre-conversion and a post-conversion extension

An important stage of the Coveo indexing pipeline is the Processing stage, during which incoming items are converted to an index-ready format. Coveo provides one indexing pipeline extension stage before this conversion (pre-conversion) and one after it (post-conversion).

Examples of use cases for pre-conversion extensions:

  • Rejecting a web page using advanced rules.

  • Formatting values.
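
The two pre-conversion use cases above can be sketched as plain Python helpers. This is only an illustration, not Coveo's implementation: the URI patterns, the `;` delimiter, and the function names `should_reject` and `normalize_tags` are all hypothetical. In a real extension, the inputs would come from the `document` object that the extension runtime provides, and the rejection itself would be done through that object.

```python
import re

# Hypothetical rule set: URI patterns for pages that should not be indexed.
REJECT_PATTERNS = [
    re.compile(r"/content/[^/]+/archive/"),
    re.compile(r"\.test\.html$"),
]


def should_reject(uri):
    """Return True when the item URI matches any rejection rule."""
    return any(pattern.search(uri) for pattern in REJECT_PATTERNS)


def normalize_tags(raw):
    """Format a ';'-delimited tag string into a clean, de-duplicated list."""
    seen = []
    for tag in raw.split(";"):
        tag = tag.strip().lower()
        # Skip empty fragments and tags that were already collected.
        if tag and tag not in seen:
            seen.append(tag)
    return seen
```

Keeping the rules in small functions like these makes them easy to test outside the pipeline before you paste them into an extension.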

Examples of use cases for post-conversion extensions:

  • Modifying the body of a page.

  • Adding or modifying metadata.
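
The post-conversion use cases above can be sketched in the same way. The path convention (the segment after `content` names the site), the metadata keys `site_name` and `path_depth`, and both function names are assumptions made for illustration. In a real extension, the computed values would be handed to the runtime-provided `document` object; check the extension API reference for the exact calls.

```python
from urllib.parse import urlparse


def derive_metadata(uri, existing):
    """Build extra metadata from the item URI without overwriting existing keys."""
    path_parts = [part for part in urlparse(uri).path.split("/") if part]
    derived = {}
    # Hypothetical convention: the path segment after "content" names the site.
    if "content" in path_parts:
        index = path_parts.index("content")
        if index + 1 < len(path_parts):
            derived["site_name"] = path_parts[index + 1]
    derived["path_depth"] = len(path_parts)
    # Only return keys that the item doesn't already carry.
    return {key: value for key, value in derived.items() if key not in existing}


def strip_banner(body_text, banner):
    """Remove every line that consists only of the given banner text."""
    lines = [line for line in body_text.splitlines() if line.strip() != banner]
    return "\n".join(lines)
```

For example, `derive_metadata("https://example.com/content/wknd/us/en.html", {})` yields a `site_name` of `wknd` and a `path_depth` of 4.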

If you’re unsure about the stage to choose, see our decision table.

Once you’ve chosen the appropriate stage, you can create your indexing pipeline extension.