Inspecting All Metadata Values

When debugging a source, you may want to inspect all key/value pairs extracted at the various indexing pipeline stages. The following post-conversion extension script stores this information in a data stream named allmeta, and adds this data stream to each item.

Post-conversion Extension Script Sample

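import json

# Create a data stream named "allmeta" for the current item.
all_meta = document.DataStream('allmeta')
# Write all metadata key/value pairs to the stream as indented JSON.
json.dump(document.get_meta_data(), all_meta, indent=2)
# Attach the data stream to the item.
document.add_data_stream(all_meta)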

Typical usage of this script:

  1. Apply the script to the source for which you want to inspect available metadata and their origin (see Add or Edit an Indexing Pipeline Extension).

  2. Rebuild the source.

  3. On the Content Browser page in the Source facet, select your source.

  4. Double-click an item to open the Properties panel.

  5. In the Fields tab, click on the Item unique ID to copy the uniqueId value of the item.

    Copying the uniqueId of an item

  6. Use the Get item data stream endpoint to view the newly added DataStream.

    Request template

     GET https://platform.cloud.coveo.com/rest/search/v2/datastream?organizationId=<MyOrganizationId>&dataStream=allmeta&uniqueId=<ItemUniqueId> HTTP/1.1
     Accept: application/json
     Authorization: Bearer <MyAccessToken>
    

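    Where <MyOrganizationId> is the unique identifier of your organization, <ItemUniqueId> is the URL-encoded uniqueId value you copied in the previous step, and <MyAccessToken> is an access token valid for that organization.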

    200 OK response body

     [
       {
         "Values": {
           // Metadata key/value pairs...
         },
         "Origin": "crawler"
       },
       {
         "Values": {
           // Metadata key/value pairs...
         },
         "Origin": "converter"
       },
       {
         "Values": {
           // Metadata key/value pairs...
         },
         "Origin": "mapping"
       }
     ]
    
  7. Once you finish debugging, remove the script from the source and rebuild the source.
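
The request in step 6 can also be issued from a script. The following is a minimal sketch using only the Python standard library, not an official client. It builds the request from the template above, and origins_of is a hypothetical helper that reports which pipeline stages supplied a given metadata key. It assumes the response has the shape shown in the sample body; the metadata keys and values in sample are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the request template above.
BASE_URL = "https://platform.cloud.coveo.com/rest/search/v2/datastream"


def build_datastream_request(organization_id, unique_id, access_token,
                             data_stream="allmeta"):
    """Build (but do not send) the GET request for an item data stream."""
    query = urlencode({
        "organizationId": organization_id,
        "dataStream": data_stream,
        "uniqueId": unique_id,  # urlencode escapes characters such as ':' and '/'
    })
    return Request(
        f"{BASE_URL}?{query}",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )


def fetch_entries(request):
    """Send the request and parse the JSON body (requires network access)."""
    with urlopen(request) as response:
        return json.load(response)


def origins_of(key, entries):
    """Return {origin: value} for every entry whose "Values" contain `key`."""
    return {
        entry["Origin"]: entry["Values"][key]
        for entry in entries
        if key in entry.get("Values", {})
    }


# Made-up payload in the same shape as the 200 OK response body.
sample = [
    {"Values": {"title": ["Raw title"]}, "Origin": "crawler"},
    {"Values": {"title": ["Converted title"], "language": ["English"]},
     "Origin": "converter"},
    {"Values": {"language": ["English"]}, "Origin": "mapping"},
]

print(origins_of("title", sample))
```

To query a real item, pass the result of build_datastream_request to fetch_entries, then call origins_of on the returned list with the metadata key you are investigating.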

You can achieve a similar result by storing the information as metadata on each item, as follows:

import json
document.add_meta_data({"allmetadatavalues": json.dumps(document.get_meta_data())})

However, because this new metadata greatly increases the size of each item, it can have a negative impact on indexing performance. You should only apply this script temporarily, to a source containing one or a few representative items.
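
To get a rough sense of the overhead, you can measure the JSON string that the script would store. The following is a minimal sketch that runs outside the indexing pipeline; serialized_size is a hypothetical helper, and the sample entries are made up, assuming the metadata has the same Values/Origin shape as the response body shown earlier.

```python
import json


def serialized_size(meta_entries):
    """Size in bytes of the JSON string stored as a single metadata value."""
    return len(json.dumps(meta_entries).encode("utf-8"))


# Made-up metadata entries for illustration only.
sample_meta = [
    {"Values": {"title": ["Quarterly report"], "author": ["J. Doe"]},
     "Origin": "crawler"},
    {"Values": {"title": ["Quarterly report"], "language": ["English"]},
     "Origin": "converter"},
]

print(serialized_size(sample_meta), "bytes added to the item")
```

Because the stored string repeats every key and value from every origin, the added size grows with the total amount of metadata already on the item.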
