Indexing Pipeline Extension Parameters
Indexing pipeline extension (IPE) parameters are key-value pairs defined in source extension JSON configurations and passed as arguments to target IPEs.
More precisely, these parameters populate the parameters dictionary available in target IPEs.
You can use IPE parameters to create generic extensions applicable to several sources. Maintaining one generic IPE is easier than maintaining several slightly different IPEs, and there’s a limit to the number of IPEs you can define in your Coveo organization.
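As a minimal sketch of how an extension might read these values, the snippet below uses a plain Python dictionary as a stand-in for the parameters dictionary. In a real IPE the dictionary is provided to the script, so defining it here is only an assumption made to keep the snippet runnable on its own. The key names match the example that follows, and the fallback values are hypothetical.

```python
# Stand-in for the dictionary an IPE receives (assumption: in a real
# extension, `parameters` already exists and you don't define it yourself).
parameters = {
    "hostname_value": "myHostNameB",
    "website_part_1": "item",
}

# dict.get returns the fallback when a source configuration omits a key,
# so a generic extension doesn't raise KeyError on a source that omits a key.
first_keyword = parameters.get("website_part_1", "product")
second_keyword = parameters.get("website_part_2", "store")

print(first_keyword, second_keyword)  # prints "item store"
```

Whether a silent fallback is preferable to failing on a missing key depends on the extension; with actionOnError set to SKIP_EXTENSION, as in the configurations below, letting the lookup fail may be an acceptable alternative.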
You have two similar Web sources: one indexes www.myHostNameA.com and the other indexes www.myHostNameB.com.
Both display products and store locations.
You want to write an IPE that parses the URL and adds a metadata value designating the site subsection.
Now, URL subsections differ slightly from one website to another: www.myHostNameA.com/product versus www.myHostNameB.com/item, and www.myHostNameA.com/store versus www.myHostNameB.com/location.
Hardcoding the host name and the subsection keywords in your IPE would make it static and applicable to a single source, so you decide to pass them to the IPE as arguments instead.
The source JSON configuration for the IPE in the first source:
[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameA",
      "website_part_1": "product",
      "website_part_2": "store"
    }
  }
]
In the second source:
[
  {
    "actionOnError": "SKIP_EXTENSION",
    "extensionId": "myorganization-xc56kss5iazmlq4irhndj52ns4",
    "parameters": {
      "hostname_value": "myHostNameB",
      "website_part_1": "item",
      "website_part_2": "location"
    }
  }
]
With the following post-conversion IPE:
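# regex module
import re

# Capture the host name (group 1) and the first path segment, if any (group 3)
match = re.search(r'(\w+)\.com(\/(\w*))?',
                  document.get_meta_data_value('originaluri')[0])
if match:
    base_hostname = match[1]
    site_part = match[3]
    if not site_part:
        # No path segment: the item is at the site root
        document.add_meta_data({'sub_section': base_hostname})
    elif site_part == parameters['website_part_1']:
        # Source-specific keyword for product pages
        document.add_meta_data({'sub_section': 'product'})
    elif site_part == parameters['website_part_2']:
        # Source-specific keyword for store location pages
        document.add_meta_data({'sub_section': 'store'})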
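To check the behavior outside the indexing pipeline, the extension logic can be exercised locally with both parameter sets. The sketch below is not part of the platform: FakeDocument is a hypothetical stand-in that mimics only the two document methods used above (assuming get_meta_data_value returns a list and add_meta_data accepts a dictionary, as the extension implies), and run_extension simply wraps the same logic in a function. The URLs are made-up examples.

```python
import re


class FakeDocument:
    """Hypothetical stand-in for the IPE `document` object, for local testing only."""

    def __init__(self, uri):
        self._meta = {'originaluri': [uri]}

    def get_meta_data_value(self, name):
        # Returns a list of values, or an empty list if the key is absent.
        return self._meta.get(name, [])

    def add_meta_data(self, values):
        for key, value in values.items():
            self._meta[key] = [value]


def run_extension(document, parameters):
    """Same logic as the extension above, wrapped so it can run once per source."""
    match = re.search(r'(\w+)\.com(\/(\w*))?',
                      document.get_meta_data_value('originaluri')[0])
    if match:
        base_hostname = match[1]
        site_part = match[3]
        if not site_part:
            document.add_meta_data({'sub_section': base_hostname})
        elif site_part == parameters['website_part_1']:
            document.add_meta_data({'sub_section': 'product'})
        elif site_part == parameters['website_part_2']:
            document.add_meta_data({'sub_section': 'store'})
    return document.get_meta_data_value('sub_section')


# Parameter sets copied from the two source configurations.
params_a = {'hostname_value': 'myHostNameA',
            'website_part_1': 'product', 'website_part_2': 'store'}
params_b = {'hostname_value': 'myHostNameB',
            'website_part_1': 'item', 'website_part_2': 'location'}

print(run_extension(FakeDocument('https://www.myHostNameA.com/product/123'), params_a))
print(run_extension(FakeDocument('https://www.myHostNameB.com/item/123'), params_b))
print(run_extension(FakeDocument('https://www.myHostNameB.com/location'), params_b))
print(run_extension(FakeDocument('https://www.myHostNameA.com'), params_a))
```

The first two calls both yield the product subsection and the third yields store, even though the URLs use different keywords. The last call has no path segment, so the host name is used instead. A URL whose first path segment matches neither keyword leaves sub_section unset.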