About the permanentid field
The permanentid field contains a value that uniquely and permanently identifies each item with respect to its original repository.
The field value remains the same, even for an item that’s indexed more than once with different sources.
This field allows Coveo Machine Learning (Coveo ML) models to learn user behavior on stable item IDs.
Before the addition of the permanentid field, Coveo JavaScript Search Framework pages and Coveo ML models used the urihash field by default to identify index items.
permanentid field value
How the permanentid field value is obtained depends on the repository type:
- For most standard sources, the permanentid field is:
- For other repository types, such as Google Drive and YouTube, where one item can have more than one URI, the permanentid field is typically a hash of a string containing:
  (repository type identification) + (unique separator) + (source dependent unique item identifier)
  Example: In a Box source, the item permanentid field value is a hash of the string:
  "https://www.box.com/" + "@@@" + file_id
  where file_id is the Box unique identifier for each item.
- For Salesforce sources, the permanentid is automatically set by the Salesforce connector.
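The string construction described for multi-URI repositories can be sketched as follows. This is only an illustration: the hash function used by the connectors is not stated here, so SHA-256 is a stand-in assumption, and the function names and the sample file_id are hypothetical.

```python
import hashlib

SEPARATOR = "@@@"  # separator shown in the Box example


def permanentid_input(repository_id: str, item_id: str) -> str:
    """Build the string to hash: repository identification + separator + item ID."""
    return f"{repository_id}{SEPARATOR}{item_id}"


def sketch_permanentid(repository_id: str, item_id: str) -> str:
    # Assumption: SHA-256 stands in for the connector's actual hash function,
    # which is not specified in this article.
    raw = permanentid_input(repository_id, item_id)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# "12345" is a made-up Box file_id used only for illustration.
box_input = permanentid_input("https://www.box.com/", "12345")
box_id = sketch_permanentid("https://www.box.com/", "12345")
```

Because the input string depends only on the repository and the item identifier, the resulting value stays the same no matter which URI was used to reach the item.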
Taking advantage of the permanentid field
The permanentid field usage should be mostly transparent, but some setups, such as custom sources, require a few tasks to take full advantage of it.
Standard source types
The introduction and usage of the permanentid field are meant to be mostly transparent for standard source types:
- All standard sources automatically include the permanentid metadata and field mapping.
- Since April 2017, a JavaScript Search Framework search page automatically detects the existence of the permanentid field for each clicked item. It sends this field and its value with each usage analytics (UA) event, in the contentIdKey and contentIdValue metadata fields respectively. The page always sends the urihash field and its value.
- The Coveo ML models use the contentIdKey and contentIdValue metadata passed in UA events to identify each item to learn from, using the urihash as a fallback. The transition from the urihash to the permanentid field is therefore automatic. Within an index or even a source, items may be identified using either field.
  The models can map older UA events, which contain only the urihash field, to newer ones with both the urihash and permanentid fields. This allows each item's click history to be preserved through the transition.
  Following the transition from the urihash to the permanentid field, your users may experience a minor and temporary degradation of Coveo ML Automatic Relevance Tuning (ART) and Content Recommendation (CR) model performance, but only if you push custom Coveo UA click events that don’t include both the urihash and permanentid.
  This is because items whose unique identifier suddenly changes will appear as new items to Coveo ML models. With time, however, new UA events on these items will accumulate, rebuilding their usage history and allowing Coveo ML models to learn from them again.
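The identification and fallback logic described above can be illustrated with a small sketch. It is not Coveo ML's implementation: the flat event dictionaries are a simplified, assumed shape, and the function names are hypothetical. It shows why events carrying both identifiers let older urihash-only clicks be credited to the same item.

```python
from collections import Counter
from typing import Optional, Tuple


def item_key(event: dict) -> Optional[Tuple[str, str]]:
    """Return (field, value) identifying the clicked item, or None."""
    key, value = event.get("contentIdKey"), event.get("contentIdValue")
    if key and value:
        return (key, value)
    if event.get("urihash"):
        return ("urihash", event["urihash"])  # fallback identifier
    return None


def click_counts(events: list) -> Counter:
    # Learn urihash -> permanentid links from events that carry both.
    links = {
        e["urihash"]: e["contentIdValue"]
        for e in events
        if e.get("contentIdKey") == "permanentid"
        and e.get("contentIdValue")
        and e.get("urihash")
    }
    counts: Counter = Counter()
    for e in events:
        ident = item_key(e)
        if ident is None:
            continue
        field, value = ident
        if field == "urihash" and value in links:
            ident = ("permanentid", links[value])  # map older event forward
        counts[ident] += 1
    return counts


# Made-up events for illustration only.
events = [
    {"urihash": "abc"},  # older event: urihash only
    {"urihash": "abc", "contentIdKey": "permanentid", "contentIdValue": "p1"},
    {"contentIdKey": "permanentid", "contentIdValue": "p1"},  # no urihash
    {"urihash": "zzz"},  # never linked to a permanentid
]
counts = click_counts(events)
```

In this sketch, the three events for the first item are counted together, while the item that only ever appears with a urihash keeps that identifier. If no event had carried both fields for "abc", its older click would have remained under a separate identifier, which is the situation behind the temporary degradation mentioned above.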
For custom sources populated through the Push API, when the URI isn’t a unique identifier and you want to allow Coveo ML models to learn usage of pushed items, you can:
- Push metadata that uniquely and permanently identifies each pushed item.
  The content of the metadata can be anything such as a URI or a GUID, as long as it’s unique and never changes.
  Using an alphanumeric string of at most 60 characters (without spaces) optimizes index performance. However, you can use any value you want. Using a value that isn’t hashed can help with troubleshooting.
- Map this metadata to the permanentid field.
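As a rough sketch of the first step, the snippet below builds an item body carrying a stable identifier and checks whether that identifier matches the index-friendly shape mentioned above. The metadata name stableitemid, the helper names, and the simplified body shape (top-level keys treated as metadata) are assumptions for illustration; the Push API reference remains the authority on the actual request format. The mapping to the permanentid field is configured on the source and is not shown.

```python
import json
import re

# Alphanumeric, no spaces, at most 60 characters (the shape said to
# optimize index performance). Other values are still allowed.
INDEX_FRIENDLY = re.compile(r"[A-Za-z0-9]{1,60}")


def is_index_friendly(value: str) -> bool:
    """Advisory check only: a failing value can still be pushed."""
    return INDEX_FRIENDLY.fullmatch(value) is not None


def build_item_body(title: str, stable_id: str) -> str:
    # "stableitemid" is a hypothetical metadata name; it would then be
    # mapped to the permanentid field in the source configuration.
    body = {"title": title, "stableitemid": stable_id}
    return json.dumps(body)


# Made-up item for illustration only.
payload = build_item_body("Quarterly report", "REPORT2024Q1")
```

A readable, unhashed value such as the one above also makes it easier to recognize the item when troubleshooting.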
About the urihash field
The value of the urihash field for a given item is based on the item's URI value.
The item's URI is hashed to create a standardized urihash value.
For some sources, such as repositories that allow items to be shared, an item can have more than one URI, and therefore more than one urihash field value.
This can cause Coveo ML models to interpret these values as distinct items when they actually refer to the same item.
For example, in a Box source, the item URI includes the Box user ID, so a shared file or folder has many URIs.
The permanentid field value is based on the Box file_id value, which remains the same even when an item is shared and accessible through various paths by various users.
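The difference can be seen in a minimal sketch. The two URIs below are invented placeholders (real Box URIs may look different), the file_id is made up, and SHA-256 is an assumed stand-in for whichever hash functions are actually used.

```python
import hashlib


def sketch_hash(text: str) -> str:
    # Assumption: SHA-256 stands in for the real hash functions.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# One shared file seen through two users' paths (placeholder URIs).
shared_file_views = [
    {"uri": "https://box.example/users/111/files/12345", "file_id": "12345"},
    {"uri": "https://box.example/users/222/files/12345", "file_id": "12345"},
]

# Hashing the URI yields one identifier per path.
urihashes = {sketch_hash(view["uri"]) for view in shared_file_views}

# Hashing the repository + separator + file_id string yields a single one.
permanentids = {
    sketch_hash("https://www.box.com/" + "@@@" + view["file_id"])
    for view in shared_file_views
}
```

Here urihashes contains two distinct values while permanentids contains only one, so a model keyed on the latter treats both views as the same item.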