Indexing Pipeline Extension Parameters

Indexing pipeline extension (IPE) parameters are key-value pairs defined in a source's extension JSON configuration and passed as arguments to the target IPE. More precisely, these parameters populate the parameters dictionary available in the target IPE.

You can use IPE parameters to create generic extensions applicable to several sources. Maintaining one generic IPE is easier than maintaining several slightly different IPEs, and there’s a limit to the number of IPEs you can define in your Coveo organization.
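
As a rough illustration of how a target IPE might read these values, here is a minimal sketch. It assumes the parameters object behaves like a plain Python dictionary; the stand-in dictionary and the default value are illustrative assumptions, since a real IPE receives parameters from the indexing pipeline.

```python
# Stand-in for the dictionary the indexing pipeline provides.
# In a real IPE, `parameters` already exists and should not be redefined.
parameters = {'website_part_1': 'product'}

# Read a value the source configuration is expected to supply.
product_keyword = parameters['website_part_1']

# Fall back to a default when a source omits a key
# (assumes dict-like behavior; the default 'store' is illustrative).
store_keyword = parameters.get('website_part_2', 'store')
```

Using a fallback for optional keys lets one generic IPE keep working for sources that only define some of the parameters.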

Example

You have two similar Web sources: one indexes www.myHostNameA.com and the other indexes www.myHostNameB.com. Both display products and store locations. You want to write an IPE that parses the URL and adds a metadata value identifying the site subsection.

However, the URL subsections differ slightly from one website to the other: www.myHostNameA.com/product versus www.myHostNameB.com/item, and www.myHostNameA.com/store versus www.myHostNameB.com/location. Hard-coding the host name and the subsection keywords in your IPE would make it static and applicable to a single source, so you decide to pass them to the IPE as arguments instead.

The source JSON configuration for the IPE in the first source:

[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameA",
      "website_part_1": "product",
      "website_part_2": "store"
    }
  }
]

In the second source:

[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameB",
      "website_part_1": "item",
      "website_part_2": "location"
    }
  }
]

With the following post-conversion IPE:

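import re

# Capture the host name (group 1) and the first path segment after ".com" (group 3).
match = re.search(r'(\w+)\.com(/(\w*))?',
                  document.get_meta_data_value('originaluri')[0])
if match:
    base_hostname = match.group(1)
    site_part = match.group(3)
    if not site_part:
        # No path segment: tag the item with the host name.
        document.add_meta_data({'sub_section': base_hostname})
    elif site_part == parameters['website_part_1']:
        # Source-specific keyword for product pages ("product" or "item").
        document.add_meta_data({'sub_section': 'product'})
    elif site_part == parameters['website_part_2']:
        # Source-specific keyword for store pages ("store" or "location").
        document.add_meta_data({'sub_section': 'store'})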
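
To check the behavior outside the indexing pipeline, the extension logic can be exercised locally. The sketch below is only an approximation: FakeDocument is a hypothetical stand-in that mimics the two document methods used above, tag_sub_section is an illustrative wrapper name, and the sample URLs are assumptions.

```python
import re


class FakeDocument:
    """Hypothetical stand-in for the IPE document object (local testing only)."""

    def __init__(self, uri):
        self.meta = {'originaluri': [uri]}

    def get_meta_data_value(self, name):
        # Return the list of values stored under the given metadata name.
        return self.meta.get(name, [])

    def add_meta_data(self, values):
        # Store each new value as a single-item list.
        for key, value in values.items():
            self.meta[key] = [value]


def tag_sub_section(document, parameters):
    """Same logic as the extension above, wrapped in a function for testing."""
    match = re.search(r'(\w+)\.com(/(\w*))?',
                      document.get_meta_data_value('originaluri')[0])
    if not match:
        return
    base_hostname, site_part = match.group(1), match.group(3)
    if not site_part:
        document.add_meta_data({'sub_section': base_hostname})
    elif site_part == parameters['website_part_1']:
        document.add_meta_data({'sub_section': 'product'})
    elif site_part == parameters['website_part_2']:
        document.add_meta_data({'sub_section': 'store'})


# Parameter sets copied from the two source configurations.
params_a = {'hostname_value': 'myHostNameA',
            'website_part_1': 'product', 'website_part_2': 'store'}
params_b = {'hostname_value': 'myHostNameB',
            'website_part_1': 'item', 'website_part_2': 'location'}

doc_a = FakeDocument('https://www.myHostNameA.com/store')
doc_b = FakeDocument('https://www.myHostNameB.com/location')
doc_home = FakeDocument('https://www.myHostNameB.com')

tag_sub_section(doc_a, params_a)
tag_sub_section(doc_b, params_b)
tag_sub_section(doc_home, params_b)
```

Both store pages end up with the same sub_section value even though their URL keywords differ, while a page with no path segment is tagged with its host name.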

If you've created a sub_section field and mapping, you can now create a standardized sub_section facet that lets users filter results by product or store on your search page. A single IPE handles both sources, even though the keywords identifying the site subsections differ.