--- title: Manage indexing pipeline extensions slug: '1645' canonical_url: https://docs.coveo.com/en/1645/ collection: index-content source_format: adoc --- # Manage indexing pipeline extensions An [indexing pipeline extension (IPE)](https://docs.coveo.com/en/206/) is a Python script used to customize the way one or more [sources](https://docs.coveo.com/en/246/) [index](https://docs.coveo.com/en/204/) content. For details on how IPEs work, see [Indexing pipeline extension overview](https://docs.coveo.com/en/1556/). This article explains how to use the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page to manage your IPEs. ## Add an indexing pipeline extension Coveo provides [script samples](https://docs.coveo.com/en/111/) to help you get started with extensions. You'll most likely need developer skills to adapt a sample script to your needs, and then test it. > **Note** > > Avoid including a `sys.exit` in your script, as this can cause issues at the [processing stage](https://docs.coveo.com/en/1893#processing) of the Coveo indexing pipeline. Once you're satisfied with your script, follow these steps to add it to your organization, and then apply it to a source. . On the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, click **Add extension**. . In the **Add an extension** panel, enter a name for your extension. > **Leading practice** > > * Use a short, descriptive name. 
> > * Use a developer-friendly name, as it may be used in the extension script to specify a metadata or [data stream](https://docs.coveo.com/en/2891/) indexing pipeline stage `origin`. > > * Prefix your extension name with `pre-` or `post-` to indicate at which [stage of the pipeline](https://docs.coveo.com/en/1556#PreVsPost) the script will apply. . Optionally, enter a description of the extension's purpose. For example, explain what it does to [metadata](https://docs.coveo.com/en/218/) or a [data stream](https://docs.coveo.com/en/2891/), and to which sources it applies. . If applicable, select the types of item data that your extension script needs to access. Select only those you intend to use with [`document.get_data_stream(...)`](https://docs.coveo.com/en/34#get-data-stream-method) to avoid adding overhead to the execution. **Body text**
The _body text_ contains all the text found in an item. It's formatted to be used at the pipeline's [indexing stage](https://docs.coveo.com/en/1893#indexing), which makes the content searchable. Since the body text is created at the [processing stage](https://docs.coveo.com/en/1893#processing), it's available to post-conversion extension scripts only. Select **Body text** when your extension script needs to process the text extracted from your content items. Selecting this option when it's not necessary may degrade indexing performance.

> **Note**
>
> For index size and performance optimization, the body text of an item is limited to 10 MB. For rare items with extensive body text, any text that exceeds the limit won't be indexed and, consequently, won't be searchable in the Coveo-powered search interface.

**Body HTML**
The _body HTML_ of an item is an HTML version of the item. It's used by the [quickview](https://docs.coveo.com/en/3311/) component of a search interface. Since the body HTML is created at the [processing stage](https://docs.coveo.com/en/1893#processing), it's available to post-conversion extension scripts only. Select **Body HTML** only if your script needs the item's quickview content. Selecting this option when it's not necessary may degrade indexing performance.

> **Notes**
>
> * If you can define your desired body HTML content as static HTML markup with metadata placeholders, consider [creating a mapping rule for the `body` field](https://docs.coveo.com/en/1640/) instead of an extension. It's typically simpler and more efficient.
> * For index size and performance optimization, the body HTML of an item is limited to 10 MB. This means that the quickview of items with a larger body HTML will be truncated.

**Body Markdown**
The _body Markdown_ of an item is a Markdown version of the item. It uses Markdown format and contains all the body text found in an item. The Markdown data stream is used only by a [CPR model to create chunks](https://docs.coveo.com/en/p9ub0044#chunking-data-stream) for the [embeddings](https://docs.coveo.com/en/ncc87383/) that are used for semantic content retrieval. The body Markdown [data stream](https://docs.coveo.com/en/2891/) preserves an item's formatting and structure during ingestion, which allows the model to create more coherent and semantically focused chunks. The Markdown is also preserved in the chunk itself, which improves the reasoning capabilities of large language models (LLMs). Since the body Markdown data stream is created at the [processing stage](https://docs.coveo.com/en/1893#processing), it's available to post-conversion extension scripts only. Select **Body Markdown** when your extension script needs to process the Markdown text that's used to create chunks for embeddings.

> **Notes**
>
> * The Markdown data stream is processed for PDF files only. All other file types are processed only with body text and body HTML data streams.
> * A PDF file that's already indexed won't have a Markdown data stream until it's re-indexed. To make sure all of your PDF files are processed to include a Markdown data stream, [rebuild your source](https://docs.coveo.com/en/2039#rebuild).
> * If a Markdown data stream exists for an item, the [CPR](https://docs.coveo.com/en/oaie9196/) [model](https://docs.coveo.com/en/1012/) automatically uses the Markdown data stream to create the chunks. Otherwise, the [CPR](https://docs.coveo.com/en/oaie9196/) [model](https://docs.coveo.com/en/1012/) uses the body text data stream to create the chunks.
> * To optimize indexing performance, the processing time for an item's Markdown data stream is limited to 15 minutes. If the limit is reached, the Markdown data stream will be truncated. In this case, the [CPR](https://docs.coveo.com/en/oaie9196/) [model](https://docs.coveo.com/en/1012/) still uses the truncated body Markdown data stream to create the chunks.

**Thumbnail**
A _thumbnail_ is a small image that typically represents the content of the item, such as a miniature image of the first page of a document. When available, the thumbnail can be included in search results templates to help search users identify the item. Select **Thumbnail** only if your post-conversion extension script needs the thumbnail generated by Coveo. Selecting this option when it's not necessary may degrade indexing performance.
**Original file**
The _original file_ is the actual binary data, or content, of the item. For example, if the item is a PDF file, then the item data is the actual content of this file. Select **Original file** only if your pre-conversion extension script needs the binary data of the item. There's generally no point in feeding the original file to a post-conversion extension, because the [indexing stage](https://docs.coveo.com/en/1893#indexing) doesn't process it.

> **Note**
>
> Getting the original file can significantly degrade indexing performance, as each item's binary data has to be fetched, decompressed, and decrypted.

. Under **Restricted parameters**, select [**Vault parameters**](#vault-parameters) if your extension needs access to any existing vault parameters you created. If there's sensitive information required by your IPE, but you haven't created any vault parameters yet, you can [create them with the API](https://docs.coveo.com/en/l9he0046/).
. (Optional) Use the **Project** selector to associate your extension with one or more [projects](https://docs.coveo.com/en/n7ef0517/).
. Under **Extension script**, enter your Python script.
. On the **Access** tab, specify whether each group (and API key, if applicable) in your [Coveo organization](https://docs.coveo.com/en/185/) can view or edit the current extension. For example, when creating a new extension, you could decide that members of Group A can edit its configuration, while Group B can only view it. For more information, see [Custom access level](https://docs.coveo.com/en/3151#custom-access-level).
. Click **Add extension**.
. [Apply your extension to at least one source](https://docs.coveo.com/en/1936/).

> **Important**
>
> If you edit your IPE in the future, keep in mind that your changes will apply to all sources associated with this IPE.
>
> Test your changes in a [sandbox organization](https://docs.coveo.com/en/2959/) before applying them to your production sources.
> This helps you avoid any unexpected behavior.

### Vault parameters

_Vault parameters_ serve as placeholders for sensitive information required in your IPEs. They're restricted key-value pairs stored by the [Vault API](https://platform.cloud.coveo.com/docs?urls.primaryName=Migration#/Vault) and accessed by your extensions during execution. You can set the value of your vault parameter as a password, API key, or any other secret in your configurations that you want to keep protected.
For example, you set the vault parameter key as `my_password` and the value as your actual password, `1m4$rf83!`. During runtime, `1m4$rf83!` is accessed in a variable, but never appears as text in your scripts. Only when [logging](https://docs.coveo.com/en/34#log-method) the value, wherever `1m4$rf83!` would've been logged, `my_password: sensitive value removed` will be displayed instead. Using vault parameters in your extensions ensures your sensitive information remains hidden and secure. Vault parameters must be created [with the API](https://docs.coveo.com/en/l9he0046/). ## Inspect impacted item logs You can review the logs for the items impacted by an extension. On the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, click the desired extension, and then select **Inspect impacted items** in the **More** menu. You'll be redirected to the [**Log Browser**](https://platform.cloud.coveo.com/admin/#/orgid/logs/browser/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/logs/browser/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/logs/browser/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/logs/browser/)), where only the items modified by the selected extension will be displayed. For more information on the Log Browser, see [Use the Log Browser to review indexing logs](https://docs.coveo.com/en/1864/). ## Delete an extension It's a good practice to delete unused extensions to keep your Coveo organization clean and optimized. Ensure that the extension isn't used by any source . 
On the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, click the extension that you want to delete, and then click **Usage statistics** in the Action bar. . In the **Usage statistics** panel that appears, click **Used by the following sources** to confirm that the extension isn't used by any source. . Close the **Usage statistics** panel. Detach the extension from a source If the extension is used by a source, you must first detach it from the source before you can delete it: . On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click the source that uses the extension you want to delete, and then click **Edit extensions** in the **More** menu. . In the **Edit extensions** panel that opens, click the extension you want to delete, and then click **Delete** in the Action bar. . Click **Delete** to confirm. . Click **Save**. . Repeat the process for all sources that use the extension. Delete the extension . Once your extension is no longer attached to any source, on the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, select the extension, and then click **Delete** in the Action bar. . 
Click **Delete** to confirm. ## Edit or restore an old version of an extension To edit an extension, select it on the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, and then click **Edit** in the Action bar. When you save your changes, Coveo creates a new version of this extension with a unique ID and a timestamp. To view the older versions of an extension, select the extension, and then click **More** > **Manage versions** in the Action bar. Then, you can either restore an old version or create a new version based on an existing one: . In the **Version** panel that opens, select a version, and then click **Restore** in the Action bar to open it. . Optionally, edit the extension, and then click **Save**. The restored or modified version will be saved as a new version, and will become the latest version, that is, the version Coveo uses when indexing your content. > **Note** > > The versioning feature doesn't record changes to the **Name** and **Description** parameters. > If you edit the extension name, for instance, all extension versions will have the new name. ## Copy an extension ID You can copy the ID of an extension to use it in other parts of the Coveo Platform, including the [Coveo Platform API](https://platform.cloud.coveo.com/docs?urls.primaryName=Extension). 
To do so, on the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, click the desired extension, and then click **Copy extension ID to clipboard** in the **More** menu. An extension ID consists of the [organization ID](https://docs.coveo.com/en/n1ce5273/) and the extension's unique identifier, separated by a dash. It identifies the extension in the Coveo Platform, whereas the [version ID](#edit-or-restore-an-old-version-of-an-extension) identifies a specific version of an extension.

## About extension usage statistics

The [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page and the **Usage statistics** panel provide information about the extension's usage, including the number of sources to which it's applied, the number of items processed, and the number of errors encountered. This helps you identify extensions that aren't used or that cause issues.

### Execution time status

The execution time status reflects the performance of an extension by assessing its average execution time. The closer the average execution time is to the maximum limit, the more problematic the extension is. By default, the maximum limit is 5 seconds, but this number can be increased through [Coveo Support](https://connect.coveo.com/s/case/Case/Default) if requested. If the maximum limit is reached, the extension [times out](#timeout-status).
The status considers all items processed by the extension over two monitoring periods:

* The last 24 hours
* The last 5 minutes

An average execution time is calculated for each monitoring period. Out of the two averages, the worse one is chosen and the corresponding execution time status is displayed for your extension. This data helps you identify extensions that take too long to execute. You can then optimize the code efficiency. The possible values are:

| Status | Description |
| --- | --- |
| Good | The execution time is significantly below the maximum limit. |
| Warning | The execution time is getting closer to the maximum limit. |
| Problematic | The execution time is dangerously close to the maximum limit. |

### Timeout status

An extension times out when it takes longer than the maximum time limit to execute. By default, the maximum limit is 5 seconds, but this number can be increased through [Coveo Support](https://connect.coveo.com/s/case/Case/Default) if requested. When a timeout occurs, the indexing pipeline skips its extension stage, that is, the pre-conversion or post-conversion stage. There are no retries, and the item is indexed without the extension's modifications. When an extension has reached too many timeouts, Coveo automatically disables it. You can either create a new extension, or improve the code efficiency of the existing one and re-enable it on the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page afterward. The timeout status is determined by the ratio of timed out executions to the total number of extension executions. The higher the ratio, the more problematic the extension is.
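
As a rough illustration of the two measures described so far, the following Python sketch computes the worse of two average execution times and a timeout ratio. The function names and inputs are hypothetical, and this isn't Coveo's implementation. The thresholds that map these values to a status aren't stated in this article, so the sketch stops short of assigning one.

```python
from statistics import mean
from typing import Sequence


def worse_average(last_24h: Sequence[float], last_5min: Sequence[float]) -> float:
    """Return the higher (worse) of the two average execution times, in seconds.

    Each argument is a hypothetical list of execution durations for one
    monitoring period. A period with no executions is ignored.
    """
    averages = [mean(window) for window in (last_24h, last_5min) if window]
    return max(averages) if averages else 0.0


def timeout_ratio(timeouts: int, executions: int) -> float:
    """Return the ratio of timed out executions to total executions.

    Returns 0.0 when the extension wasn't executed at all.
    """
    return timeouts / executions if executions else 0.0
```

For instance, `worse_average([1.0, 2.0], [4.0])` compares a 24-hour average of 1.5 seconds with a 5-minute average of 4.0 seconds and keeps the latter, since it's closer to the default 5-second limit.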
Coveo assigns a status by checking, every 5 minutes, the number of extension timeouts that occurred over two monitoring periods:

* The last 24 hours
* The last 5 minutes

Out of the two results, the worse one is chosen and the corresponding timeout status is displayed for your extension. This data helps identify extensions that often fail to execute. You can then disable the extension and optimize the code efficiency. The possible values are:

| Status | Description |
| --- | --- |
| Good | The percentage of timeouts is acceptable. |
| Warning | The percentage of timeouts is significant. |
| Problematic | The percentage of timeouts is high. |

Moreover, if the number of timeouts in either monitoring period exceeds 25% of all executions during that period, Coveo disables the extension.

### Daily statistics

The daily statistics provide an overview of the extension's usage over the last 24 hours. The following metrics are displayed:

* **Average duration**: The average number of seconds the script takes to execute.
* **Number of errors**: Number of extension executions for which the script raised an exception.
* **Number of executions**: Total number of executions for all items from all sources to which the extension is applied.
* **Number of skips**: Total number of times the extension wasn't executed, either because the extension condition was evaluated as `false`, the extension timed out, or the extension was disabled.
* **Number of timeouts**: Total number of extension executions that reached the maximum execution time of 5 seconds.

## Review the activity regarding extensions

As part of your duties, you may need to review [activities](https://docs.coveo.com/en/173/) related to extensions for investigation or troubleshooting purposes.
To do so, in the upper-right corner of the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page, click the clock icon.

## Required privileges

The following table indicates the privileges required to view or edit elements of the [**Extensions**](https://platform.cloud.coveo.com/admin/#/orgid/content/extensions/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/extensions/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/extensions/)) page and associated panels. For more information, see [Manage privileges](https://docs.coveo.com/en/3151/) and [Privilege reference](https://docs.coveo.com/en/1707/).

| Action | Service - Domain | Required access level |
| --- | --- | --- |
| View extensions | Content - Extensions; Content - Sources; Organization - Activities; Organization - Organization | View |
| Edit extensions | Organization - Activities; Organization - Organization | View |
| | Content - Extensions; Content - Sources | Edit |

> **Important**
>
> A member with the **View** access level on the **Activities** domain can access the [Activity Browser](https://docs.coveo.com/en/1969/).
> This member can therefore see all activities taking place in the organization, including those from Coveo Administration Console pages that they can't access.