About the permanentid field

Note

Use the primaryid field instead of the permanentid field when you need a stable and unique document identifier. See the primaryid field and how it compares to permanentid and uniqueid.

The permanentid field contains a value that permanently identifies each item within an organization. The field value remains the same, even for an item that’s indexed more than once with different sources. This field allows Coveo Machine Learning (Coveo ML) models to learn user behavior on stable item IDs.

Before the addition of the permanentid field, Coveo JavaScript Search Framework pages and Coveo ML models used the urihash field by default to identify index items.

permanentid field value

How the permanentid field value is derived depends on the repository type:

  • For most standard sources, the permanentid field is:

    • Based on the item URI, because it’s an appropriate unique and permanent identifier.

    • A 60-character hexadecimal hash of the item URI (to optimize index performance).

  • For other repository types, such as Google Drive and YouTube, where one item can have more than one URI, the permanentid field is typically a hash of a string containing:

    (repository type identification) + (unique separator) + (source dependent unique item identifier)

    Example

    In a Box source, the item permanentid field value is a hash of the string:

    "https://www.box.com/" + "@@@" + file_id

    where file_id is the Box unique identifier for each item.

  • For Salesforce sources, the permanentid is automatically set by the Salesforce connector.
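The two derivations described above can be sketched as follows. This is a minimal illustration, not Coveo's implementation: the source does not name the hash function, so a SHA-256 digest truncated to 60 hexadecimal characters stands in for it, and the function names and the sample file_id are hypothetical.

```python
import hashlib

SEPARATOR = "@@@"  # separator taken from the Box example above


def sketch_hash(value: str) -> str:
    """Return a 60-character hexadecimal digest.

    Assumption: truncated SHA-256 is a stand-in; the real hash is not documented here.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:60]


def permanentid_from_uri(uri: str) -> str:
    """Standard sources: hash the item URI."""
    return sketch_hash(uri)


def permanentid_from_repository_id(repository_prefix: str, item_id: str) -> str:
    """Repositories with several URIs per item: hash prefix + separator + item ID."""
    return sketch_hash(repository_prefix + SEPARATOR + item_id)


# Hypothetical Box file_id, used only for illustration.
box_id = permanentid_from_repository_id("https://www.box.com/", "1234567890")
print(len(box_id))
```

Because the second function ignores the item URI entirely, the resulting value doesn't change when the same file becomes reachable through another path.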

Taking advantage of the permanentid field

Using the permanentid field should be mostly transparent, but you may need to perform some tasks to take full advantage of its benefits.

  • Standard source types

    The introduction and usage of the permanentid field are meant to be mostly transparent for standard source types.

  • Custom sources (Push API)

    For custom sources populated through the Push API, when the URI isn’t a unique identifier and you want Coveo ML models to learn from the usage of pushed items, you can:

    1. Push metadata that uniquely and permanently identifies each pushed item.

      The metadata value can be anything, such as a URI or a GUID, as long as it’s unique and never changes.

      Note

      Using an alphanumeric string of at most 60 characters (without spaces) optimizes index performance. However, you can use any value you want. Using a value that isn’t hashed can help with troubleshooting.

    2. Map this metadata to the permanentid field.
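As a rough sketch of step 1, the snippet below builds the body of a pushed item that carries a stable identifier as metadata. It assumes the item body is a JSON object in which metadata are top-level key-value pairs. The metadata name myuniqueid, the record fields, and the helper names are hypothetical. No request is sent, and the mapping of step 2 is configured on the source rather than in this code.

```python
import json

# Hypothetical metadata name; step 2 would map it to the permanentid field.
PERMANENT_ID_METADATA = "myuniqueid"


def is_index_friendly(value: str) -> bool:
    """Check the optional guideline: alphanumeric, no spaces, at most 60 characters."""
    return value.isalnum() and len(value) <= 60


def build_push_item(record: dict) -> dict:
    """Build an item body whose identifier comes from the source system.

    The identifier is read from the record rather than generated at push time,
    so pushing the same record again always yields the same value.
    """
    stable_id = str(record["record_id"])
    return {
        "title": record["title"],
        "data": record["body"],
        PERMANENT_ID_METADATA: stable_id,
    }


# Hypothetical record from the system being indexed.
record = {
    "record_id": "a1b2c3d4",
    "title": "Reset a password",
    "body": "Open the account page.",
}
item = build_push_item(record)
print(json.dumps(item, indent=2))
print(is_index_friendly(item[PERMANENT_ID_METADATA]))  # True
```

The check is advisory only: a value that fails it can still be used, at some cost to index performance.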

About the urihash field

Before the addition of the permanentid field, the urihash field was used to identify index items.

The value of the urihash field for a given item is based on the item's URI value. The item's URI is hashed to create a standardized urihash value.

For some sources, such as a repository that allows items to be shared, an item can have more than one URI, and therefore more than one urihash field value. Coveo ML models can then interpret these values as distinct items when they actually refer to the same item.

Example

In a Box source, the item URI includes the Box user ID, so a shared file or folder has many URIs. The permanentid field value is based on the Box file_id value, which remains the same even when an item is shared and accessible with various paths by various users.
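The effect on usage data can be illustrated with a small sketch. The click records, URIs, and file_id below are hypothetical, and truncated SHA-256 is only a stand-in for the actual hash, which isn't specified here.

```python
from collections import Counter
import hashlib


def sketch_hash(value: str) -> str:
    """Stand-in hash (assumption): SHA-256 truncated to 60 hexadecimal characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:60]


# Three users open the same shared file through user-specific URIs (hypothetical).
clicks = [
    {"uri": "https://box.example/users/111/files/987", "file_id": "987"},
    {"uri": "https://box.example/users/222/files/987", "file_id": "987"},
    {"uri": "https://box.example/users/333/files/987", "file_id": "987"},
]

# Keyed on a URI hash, each click appears to belong to a different item.
by_urihash = Counter(sketch_hash(click["uri"]) for click in clicks)

# Keyed on a hash of the repository prefix, separator, and file_id, they collapse into one.
by_permanentid = Counter(
    sketch_hash("https://www.box.com/" + "@@@" + click["file_id"]) for click in clicks
)

print(len(by_urihash))      # 3
print(len(by_permanentid))  # 1
```

With the URI-based key, the three clicks are spread over three apparent items. With the file_id-based key, all three count toward the same item.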