Inspecting All Metadata Values

When debugging a source, you may want to inspect all metadata key/value pairs extracted at each stage of the indexing pipeline. The following post-conversion extension script stores this information in a data stream named allmeta and adds that data stream to each item.

Post-conversion Extension Script Sample:

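import json
# Create a data stream named "allmeta" to hold the metadata dump.
all_meta = document.DataStream('allmeta')
# Write the metadata from every pipeline stage as indented JSON.
json.dump(document.get_meta_data(), all_meta, indent=2)
# Attach the data stream to the item.
document.add_data_stream(all_meta)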

Typical usage of this script:

  1. Apply the script to the source for which you want to inspect available metadata and their origin (see Add or Edit an Indexing Pipeline Extension).

  2. Rebuild the source.

  3. On the Content Browser (platform-eu | platform-au) page, in the Source facet, select your source.

  4. Double-click an item to open the Properties panel.

  5. In the Fields tab, click on the Item unique ID to copy the uniqueId value of the item.

  6. Use the Get item data stream endpoint to view the newly added dataStream.

    Request template

    GET https://platform.cloud.coveo.com/rest/search/v2/datastream?organizationId=<MyOrganizationId>&dataStream=allmeta&uniqueId=<ItemUniqueId> HTTP/1.1
    Accept: application/json
    Authorization: Bearer <MyAccessToken>

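    Where <MyOrganizationId> is the ID of your organization, <ItemUniqueId> is the uniqueId value copied in step 5, and <MyAccessToken> is a valid access token for your organization.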

    200 OK response body:

    [
      {
        "Values": {
          // Metadata key/value pairs...
        },
        "Origin": "crawler"
      },
      {
        "Values": {
          // Metadata key/value pairs...
        },
        "Origin": "converter"
      },
      {
        "Values": {
          // Metadata key/value pairs...
        },
        "Origin": "mapping"
      }
    ]
  7. Once you finish debugging, remove the script from the source and rebuild the source.
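
The request in step 6 can also be sent from a script. The following is a minimal sketch using only the Python standard library, not an official client. The helper names build_datastream_url and fetch_allmeta are hypothetical, the host is the one from the request template (adjust it if your organization is in another region), and the response is assumed to be the JSON body shown above.

```python
import json
import urllib.parse
import urllib.request

# Host taken from the request template; an assumption for other regions.
BASE_URL = "https://platform.cloud.coveo.com/rest/search/v2/datastream"


def build_datastream_url(organization_id, unique_id, data_stream="allmeta"):
    """Build the request URL, percent-encoding each query parameter."""
    query = urllib.parse.urlencode({
        "organizationId": organization_id,
        "dataStream": data_stream,
        "uniqueId": unique_id,
    })
    return f"{BASE_URL}?{query}"


def fetch_allmeta(organization_id, unique_id, access_token):
    """Send the GET request and decode the JSON response body."""
    request = urllib.request.Request(
        build_datastream_url(organization_id, unique_id),
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Encoding the query parameters matters because a uniqueId value can contain characters such as `$`, `:`, or `/`, which would otherwise be misread as part of the URL structure.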

Note

You can achieve a similar result by storing the information as metadata on each item, as follows:

import json
# Serialize all metadata to a JSON string and store it under a single metadata key.
document.add_meta_data({"allmetadatavalues": json.dumps(document.get_meta_data())})

On the desired source, map this metadata to a field in your organization. You can then explore the available metadata by inspecting items from that source and checking the value of this field at query time (e.g., in the Content Browser).

However, because this new metadata greatly increases the size of each item, it can have a negative impact on indexing performance. You should only apply this script temporarily, to a source containing one or a few representative items.
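
With either approach, the result is a JSON list with one entry per origin, as in the response body shown in step 6. The following minimal sketch lists which origin produced which metadata keys. The helper name keys_by_origin and the sample values are hypothetical and only illustrate the shape of the data.

```python
import json


def keys_by_origin(entries):
    """Map each origin to the sorted metadata keys it produced."""
    result = {}
    for entry in entries:
        origin = entry.get("Origin", "unknown")
        result.setdefault(origin, set()).update(entry.get("Values", {}))
    return {origin: sorted(keys) for origin, keys in result.items()}


# Hypothetical field value, shaped like the response body in step 6.
field_value = json.dumps([
    {"Origin": "crawler", "Values": {"title": ["Home"], "author": ["Ann"]}},
    {"Origin": "converter", "Values": {"title": ["Home page"]}},
])

# Decode the JSON string stored in the field, then summarize it.
print(keys_by_origin(json.loads(field_value)))
```

A key listed under more than one origin, such as title in this sample, was set or overwritten at more than one stage.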
