Indexing Pipeline Extension Testing Strategies and Good Practices

This article compares the available strategies for testing indexing pipeline extensions (IPEs). The most obvious method is to apply the extension to a source, rebuild the source, and check whether the script did what was expected. This method can be very tedious, particularly for a source with a large number of items, since you have to wait for a rebuild after each change.

Simulation alternatives such as the Test an Extension API call and the Coveo Labs pipeline-extension-manager (which relies on that API call) return results much faster, but they come with limitations because the metadata is simulated from index fields. Any metadata that isn’t mapped to a field won’t be available to your extension during the simulation. Similarly, metadata mapped to a field with a different name is available only under the field name, not under its original name.

As a developer, your first choice might be to use the Test an Extension API call, since it makes an automated testing process easier to implement. However, only mapped metadata is available with the API call, as unmapped metadata isn’t indexed. Furthermore, indexed metadata is retrievable only with the mapped field name, not with the original metadata name, unless the two are identical.
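
These two limitations can be pictured with a toy model. The sketch below is not Coveo code: the function name `simulated_metadata`, the sample metadata names, and the idea that a mapping is a plain original-name to field-name pair are all assumptions made for illustration.

```python
def simulated_metadata(original_metadata, mappings):
    """Toy model of what a simulated test can see (illustrative only).

    original_metadata: dict of original metadata name -> value
    mappings: dict of original metadata name -> index field name
    Unmapped metadata is dropped; mapped metadata is keyed by field name.
    """
    visible = {}
    for name, value in original_metadata.items():
        field = mappings.get(name)
        if field is not None:
            visible[field] = value
    return visible


# Hypothetical item with three pieces of metadata, two of them mapped.
original = {'title': 'Coveo guide', 'doc_author': 'Alice', 'internal_id': '42'}
mappings = {'title': 'title', 'doc_author': 'author'}

visible = simulated_metadata(original, mappings)
print(sorted(visible))            # ['author', 'title']
print('doc_author' in visible)    # False: only the field name 'author' works
print('internal_id' in visible)   # False: unmapped, so not available
```

In this model, a script that reads `doc_author` or `internal_id` would find nothing during a simulated test, even though both would be present during a real source rebuild.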

The following table provides an overview of the different methods for testing IPEs.

Method: Using the Coveo Labs pipeline-extension-manager Chrome extension
Testing goal: When you want to quickly test one or a few extension scripts without referring to unmapped metadata.
Advantages:
  • Easiest and quickest way to test an indexing pipeline extension script on a single item.
  • Easier to use than the Test an Extension API call since it doesn't require developer skills.
  • Easily find and select an item to test with the use of a search page.
  • No need to edit the source JSON configuration to test parameters.
Disadvantages:
  • Only indexed metadata is available, and only metadata mapped to a field is indexed.
  • Metadata isn't retrievable with its original metadata name, but rather with the mapped field name.
  • Specifying a metadata origin serves no purpose since unmapped values aren't indexed.
  • May need to map metadata and re-index a whole source to access metadata.
  • Unable to test a past version (versionId) of an extension. Only the current version can be tested.
  • Unable to set actionOnError and condition values when testing.
  • The Coveo Support team doesn't offer assistance with regard to this extension.

Method: Using the Test an Extension API call
Testing goal: When a developer needs to implement automated tests on many extension scripts without referring to unmapped metadata.
Advantages:
  • Easier to implement an automated testing process with the API call.
  • Immediate results when testing an indexing pipeline extension script on a single item.
Disadvantages:
  • Only indexed metadata is available, and only metadata mapped to a field is indexed.
  • Metadata isn't retrievable with its original name, but rather with the mapped field name.
  • Specifying a metadata origin serves no purpose since unmapped values aren't indexed.
  • May need to map metadata and re-index a whole source to access metadata.
  • Requires developer skills to create or retrieve the document model and feed it to the API.

Method: Logging messages from an indexing pipeline extension
Testing goal: When you need to test parts of a single extension script and/or when you need to use unmapped metadata.

In this script, the log messages give details for each step. A developer can therefore validate assigned values and test if-else statements, for example.

# Read each metadata value (returned as a list) and log its first entry.
item_size = document.get_meta_data_value('size')
log('1- Size of item: {}'.format(item_size[0]), 'Detail')
item_type = document.get_meta_data_value('filetype')
log('2- Type of item: {}'.format(item_type[0]), 'Detail')
# Log which branch was taken so the if-else logic can be verified.
if int(item_size[0]) > 2000:
    log('item size is greater than 2000', 'Notification')
elif str(item_type[0]) == 'html':
    log('item type is html', 'Notification')
else:
    log('both conditions failed to match', 'Warning')

Advantages:
  • Easier to test a single line of code with a log message.
  • All metadata and metadata origin are available with their original name.
  • Use of try-except code blocks to manage explicitly specified script errors.
  • You can use the logging messages method jointly with the other three testing methods.
Disadvantages:
  • The log values don't appear immediately in the Log Browser page or in the SourceLogs API.
  • May need to index a whole source to find relevant log results to analyze.

Logging messages while indexing a source with a few chosen items is typically the best strategy, except in the following situations:
  • when you need to implement automated tests.
  • when you don’t need to access unmapped metadata.
  • when you don’t have developer skills.

Method: Testing with a source containing a small number of items
Testing goal: When you need access to unmapped metadata in your extension script.
Advantages:
  • All metadata and metadata origin are available with their original names.
Disadvantages:
  • The rebuild process takes time even with very few items.
  • Can be difficult to find relevant test items to index.
  • Not always possible to index only a few items of a particular source.

Leading Practices

When using a try-except code block in your extension script, you should generally catch explicitly specified errors to manage them, as shown in the following code sample:

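my_title = document.get_meta_data_value('title')

try:
    # The metadata value is returned as a list; use its first entry.
    my_title = my_title[0]
    # Raise inside the try block so the except clause below can handle it.
    if 'Coveo' not in my_title:
        raise ValueError('Coveo not in the title')
    my_title = my_title.upper()
    document.add_meta_data({'caps_title': my_title})

except ValueError as e:
    # Only ValueError is caught and logged.
    log(str(e), 'Error')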

Any error other than ValueError is left uncaught and makes the script fail. This practice helps you identify unexpected errors in your extension script instead of silently hiding them.
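
To observe this behavior without rebuilding a source, the same pattern can be run against simple stand-ins. The sketch below is only an illustration: `FakeDocument`, `fake_log`, and `run_extension` are hypothetical names, and the stubs merely imitate the two `document` methods and the `log` function used in this article. They aren't part of any Coveo library and don't reproduce the real indexing pipeline.

```python
class FakeDocument:
    """Hypothetical stand-in for the `document` object (not the Coveo API)."""

    def __init__(self, metadata):
        self.metadata = dict(metadata)

    def get_meta_data_value(self, name):
        # Assumption: values come back as a list, empty when the name is unknown.
        return list(self.metadata.get(name, []))

    def add_meta_data(self, values):
        # Assumption: each added value is stored as a one-element list.
        for key, value in values.items():
            self.metadata[key] = [value]


def run_extension(document, log):
    """Catch only ValueError; every other error propagates to the caller."""
    try:
        titles = document.get_meta_data_value('title')
        if not titles or 'Coveo' not in titles[0]:
            raise ValueError('Coveo not in the title')
        document.add_meta_data({'caps_title': titles[0].upper()})
    except ValueError as e:
        log(str(e), 'Error')


messages = []


def fake_log(message, severity):
    # Collect log calls so they can be inspected afterwards.
    messages.append((severity, message))


good = FakeDocument({'title': ['Coveo docs']})
run_extension(good, fake_log)      # adds caps_title, logs nothing

bad = FakeDocument({'title': ['Other docs']})
run_extension(bad, fake_log)       # ValueError is caught and logged

broken = FakeDocument({'title': [123]})
try:
    run_extension(broken, fake_log)
    propagated = False
except TypeError:
    propagated = True              # a non-ValueError error is not swallowed
```

Here the explicitly raised ValueError ends up in `messages`, while the TypeError caused by the non-string title escapes `run_extension`, which is the case where a real extension run would fail and be handled according to its error settings.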

You can retrieve any uncaught error messages with the Get specified source document logs SourceLogs API call or in the Administration Console Log Browser page. Furthermore, when binding the extension to a source in the JSON configuration or with the API call, you can manage errors by editing the actionOnError value to SKIP_EXTENSION or REJECT_DOCUMENT.

If your indexing pipeline extension script modifies item permissions, ensure that your code covers every possible use case to prevent disclosing restricted access items to unauthorized users. You should also set actionOnError to REJECT_DOCUMENT to ensure that you never index a document without the proper permissions.
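
The choice of actionOnError can be sketched as a small helper that builds the entry binding an extension to a source. This is a hedged illustration, not the authoritative source JSON schema: `actionOnError`, `condition`, `versionId`, and the two action values come from this article, while the `extensionId` and `parameters` key names, the helper name `build_extension_binding`, and the sample ID are assumptions.

```python
import json

ALLOWED_ACTIONS = ('SKIP_EXTENSION', 'REJECT_DOCUMENT')


def build_extension_binding(extension_id, modifies_permissions=False,
                            action_on_error=None):
    """Build a binding entry (illustrative shape, key names partly assumed)."""
    if action_on_error is None:
        # Reject the item when the extension touches permissions, so an item
        # is never indexed with incomplete permissions after a script error.
        action_on_error = ('REJECT_DOCUMENT' if modifies_permissions
                           else 'SKIP_EXTENSION')
    if action_on_error not in ALLOWED_ACTIONS:
        raise ValueError('Unsupported actionOnError: {}'.format(action_on_error))
    if modifies_permissions and action_on_error != 'REJECT_DOCUMENT':
        raise ValueError('Permission-modifying extensions should use REJECT_DOCUMENT')
    return {
        'extensionId': extension_id,   # assumed key name
        'actionOnError': action_on_error,
        'condition': '',
        'parameters': {},              # assumed key name
        'versionId': '',
    }


# Hypothetical extension ID, for illustration only.
binding = build_extension_binding('my-extension-id', modifies_permissions=True)
print(json.dumps(binding, indent=2, sort_keys=True))
```

Refusing SKIP_EXTENSION for a permission-modifying extension is a deliberately strict design choice in this sketch; it turns the recommendation above into a check that fails early rather than at indexing time.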
