Indexing Pipeline Extension Parameters

Indexing pipeline extension (IPE) parameters are key-value pairs that you define in the extension JSON configuration of a source and that are passed as arguments to the target IPE. More precisely, these parameters populate the parameters dictionary available in the target IPE.

You can use IPE parameters to create generic extensions applicable to several sources. Maintaining one generic IPE is easier than maintaining several slightly different IPEs, and there’s a limit to the number of IPEs you can define in your Coveo organization.

For example, suppose you have two similar Web sources: one indexes www.myHostNameA.com and the other indexes www.myHostNameB.com. Both websites display products and store locations. You want to write an IPE that parses the URL of each item and adds a metadata value designating the website subsection.

However, the URL subsections differ slightly from one website to the other: www.myHostNameA.com/product versus www.myHostNameB.com/item, and www.myHostNameA.com/store versus www.myHostNameB.com/location. Hardcoding the host name and the subsection keywords in your IPE would make it static and applicable to a single source only, so you decide to pass them to the IPE as parameters instead.

The source JSON configuration for the IPE in the first source:

[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameA",
      "website_part_1": "product",
      "website_part_2": "store"
    }
  }
]

In the second source:

[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameB",
      "website_part_1": "item",
      "website_part_2": "location"
    }
  }
]
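To make the relationship between these configurations and the parameters dictionary concrete, the following minimal sketch parses both configurations as plain JSON and extracts the object each source would hand to the extension. The extension_parameters helper is hypothetical and only illustrates the shape of the data. It doesn't show how the Coveo platform itself loads source configurations.

```python
import json

# The two source configurations from this example, as JSON text.
SOURCE_A = '''[{"actionOnError": "SKIP_EXTENSION",
  "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
  "parameters": {"hostname_value": "myHostNameA",
                 "website_part_1": "product",
                 "website_part_2": "store"}}]'''
SOURCE_B = '''[{"actionOnError": "SKIP_EXTENSION",
  "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
  "parameters": {"hostname_value": "myHostNameB",
                 "website_part_1": "item",
                 "website_part_2": "location"}}]'''


def extension_parameters(config_text):
    """Hypothetical helper: return the parameters object of the first entry."""
    return json.loads(config_text)[0]["parameters"]


params_a = extension_parameters(SOURCE_A)
params_b = extension_parameters(SOURCE_B)

# Same extension ID and same keys; only the values differ per source.
print(params_a["website_part_1"], params_b["website_part_1"])  # product item
```

Because both sources use identical keys, the extension can look up website_part_1 and website_part_2 without knowing which source triggered it.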

Both sources apply the following post-conversion IPE:

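# Regular expression module
import re

# Use the first originaluri value; do nothing if the metadata is missing
uris = document.get_meta_data_value('originaluri')
# Group 1 is the host name before .com; group 3 is the first path segment
match = re.search(r'(\w+)\.com(/(\w*))?', uris[0]) if uris else None
if match:
    base_hostname = match[1]
    site_part = match[3]
    if not site_part:
        # Home page: no subsection in the URL, so fall back to the host name
        document.add_meta_data({'sub_section': base_hostname})
    elif site_part == parameters['website_part_1']:
        # Source-specific keyword (product or item) maps to a standard value
        document.add_meta_data({'sub_section': 'product'})
    elif site_part == parameters['website_part_2']:
        # Source-specific keyword (store or location) maps to a standard value
        document.add_meta_data({'sub_section': 'store'})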
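If you want to check this branching logic before saving the extension, one option is to run it locally against a stand-in object. The sketch below is an assumption-laden illustration: FakeDocument and tag_sub_section are hypothetical names, and FakeDocument imitates only the two document methods the IPE above calls, not the full object an IPE receives.

```python
import re


class FakeDocument:
    """Hypothetical stand-in exposing only the two methods the IPE calls."""

    def __init__(self, uri):
        self._meta = {'originaluri': [uri]}

    def get_meta_data_value(self, name):
        # Return the list of values for a metadata name (empty if missing)
        return self._meta.get(name, [])

    def add_meta_data(self, values):
        for key, value in values.items():
            self._meta.setdefault(key, []).append(value)


def tag_sub_section(document, parameters):
    """Same branching as the IPE, wrapped in a function for local testing."""
    uris = document.get_meta_data_value('originaluri')
    match = re.search(r'(\w+)\.com(/(\w*))?', uris[0]) if uris else None
    if not match:
        return
    base_hostname, site_part = match.group(1), match.group(3)
    if not site_part:
        document.add_meta_data({'sub_section': base_hostname})
    elif site_part == parameters['website_part_1']:
        document.add_meta_data({'sub_section': 'product'})
    elif site_part == parameters['website_part_2']:
        document.add_meta_data({'sub_section': 'store'})


# Parameters as defined in the second source of this example
params_b = {'hostname_value': 'myHostNameB',
            'website_part_1': 'item',
            'website_part_2': 'location'}

doc = FakeDocument('https://www.myHostNameB.com/item')
tag_sub_section(doc, params_b)
print(doc.get_meta_data_value('sub_section'))  # ['product']
```

With the parameters of the second source, an item under /item receives the standardized value product, which is the behavior the shared extension is meant to provide.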

Provided you have created a sub_section field and a corresponding mapping, you can now add a standardized sub_section facet to your search page. Users can then filter results by product or store, and both sources rely on a single IPE even though the keywords identifying their website subsections differ.
