Understanding and Customizing the Binary Data Indexing Process
On-Premises only
The process for indexing Sitecore items in Coveo indexes is slightly different when the items contain binary data. This page explains what happens when binary data is indexed and how you can customize the process.
Understanding the Default Indexing Process of Binary Data
Indexing an item causes these events to occur:
- The Coveo Search Provider fetches the item from the database.
- The Coveo Search Provider configures the item with the required metadata and fields.
- The Coveo Search Provider adds the binary data related properties using the BinaryDataPropertiesWriter specified in the index configuration.
- The Coveo Search Provider pushes the Sitecore Item to the RabbitMQ queue.
- The Queue Crawler fetches the item stored in the RabbitMQ queue.
- The Queue Crawler determines whether it has to retrieve the item’s binary data.
- If the binary data must be retrieved, the Queue Crawler sends a request to the Sitecore Web Service for the data.
- The Sitecore Web Service retrieves the data and sends it to the Queue Crawler.
- The Queue Crawler indexes the item along with the binary data.
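The crawler side of the steps above can be pictured with a toy simulation. This is only a sketch: the `QueuedItem` class, the `has_binary_data` flag, and `fetch_from_web_service` are hypothetical names made up for illustration, and the way the real Queue Crawler decides whether to retrieve binary data is not described on this page.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedItem:
    """Hypothetical stand-in for a Sitecore item pushed to the queue."""
    item_id: str
    metadata: dict
    binary_data: Optional[bytes] = None  # None means "not sent with the message"
    has_binary_data: bool = False        # assumed flag set by the properties writer


def fetch_from_web_service(item_id: str) -> bytes:
    """Placeholder for the request to the Sitecore Web Service."""
    return b"binary content of " + item_id.encode()


def crawl(queue: deque) -> list:
    """Toy Queue Crawler: drains the queue and returns what would be indexed."""
    indexed = []
    while queue:
        item = queue.popleft()
        data = item.binary_data
        if item.has_binary_data and data is None:
            # Binary data exists but was not in the message: ask the web service.
            data = fetch_from_web_service(item.item_id)
        indexed.append((item.item_id, item.metadata, data))
    return indexed


queue = deque([
    QueuedItem("media-1", {"title": "Brochure"}, has_binary_data=True),
    QueuedItem("page-1", {"title": "Home"}),
])
results = crawl(queue)
```

In this model, only the first item triggers a call to the web service; the second item has no binary data and is indexed with its metadata alone.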
Customizing the Default Process
Send the Binary Data to RabbitMQ
The binary data can be sent along with the item’s metadata to the RabbitMQ queue. To do this:
- In your Coveo.SearchProvider.config file, locate and copy the contentSearch node.
- In your Coveo.SearchProvider.Custom.config file, paste the contentSearch node. You may now close your Coveo.SearchProvider.config file.
file. -
Locate the following index configuration, which you just copied in your file.
<index id="Coveo_web_index" type="Coveo.SearchProvider.ProviderIndex, Coveo.SearchProvider"> <param desc="p_Name">$(id)</param> ...
- Add the BinaryDataPropertiesWriter node used to send the binary data to RabbitMQ, as in the example below.
  <index id="Coveo_web_index" type="Coveo.SearchProvider.ProviderIndex, Coveo.SearchProvider">
    <BinaryDataPropertiesWriter type="Coveo.SearchProvider.Documents.BinaryDataPropertiesWriter.BinaryDataInQueuePropertiesWriter, Coveo.SearchProviderBase" />
    <param desc="p_Name">$(id)</param>
    ...
- Save the file.
- The binary data will now be sent directly to RabbitMQ and won’t have to be downloaded by the Queue Crawler.
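After saving the file, it can be useful to check that the index node really contains the new writer. The following is a minimal sketch that uses Python's standard `xml.etree.ElementTree` module. The `SAMPLE` string is a stripped-down stand-in for the index node (a real file contains many more elements), and the helper name `sends_binary_data_in_queue` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Stripped-down stand-in for the index node; a real file contains more elements.
SAMPLE = """
<index id="Coveo_web_index" type="Coveo.SearchProvider.ProviderIndex, Coveo.SearchProvider">
  <BinaryDataPropertiesWriter type="Coveo.SearchProvider.Documents.BinaryDataPropertiesWriter.BinaryDataInQueuePropertiesWriter, Coveo.SearchProviderBase" />
  <param desc="p_Name">$(id)</param>
</index>
"""


def sends_binary_data_in_queue(index_xml: str) -> bool:
    """Return True if the index node declares the in-queue properties writer."""
    index = ET.fromstring(index_xml)
    writer = index.find("BinaryDataPropertiesWriter")
    if writer is None:
        return False
    return "BinaryDataInQueuePropertiesWriter" in writer.get("type", "")


uses_queue = sends_binary_data_in_queue(SAMPLE)
```

To run the same check against a real configuration file, you would first need to extract the relevant `index` node from the full document, since this sketch expects that node as the root element.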
Compress the Binary Data Sent to the RabbitMQ Queue
When you send binary data to the queue, large messages are compressed. By default, messages that include more than 10 MB of binary data are compressed. To change the size threshold beyond which binary data is compressed, edit the QueueCompressionThresholdInBytes setting in the Coveo.SearchProvider.Custom.config file. The value is specified in bytes.
If you can’t find the setting in your Coveo.SearchProvider.Custom.config file, copy it from the Coveo.SearchProvider.config file and paste it into your custom file. To avoid upgrade issues, we recommend that you don’t modify the Coveo.SearchProvider.config file.
<QueueCompressionThresholdInBytes>5000000</QueueCompressionThresholdInBytes>
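The effect of the threshold can be sketched as a simple size comparison. The snippet below is an illustration only: `prepare_payload` is a hypothetical helper, `zlib` is a stand-in because the compression format actually used is not stated on this page, and the strict "greater than" comparison is inferred from the wording "more than".

```python
import zlib

# Example value from the setting above (5,000,000 bytes).
QUEUE_COMPRESSION_THRESHOLD_IN_BYTES = 5_000_000


def prepare_payload(binary_data: bytes,
                    threshold: int = QUEUE_COMPRESSION_THRESHOLD_IN_BYTES):
    """Return (payload, compressed_flag) for a message carrying binary data."""
    if len(binary_data) > threshold:
        # zlib is only a stand-in for whatever compression is really applied.
        return zlib.compress(binary_data), True
    return binary_data, False


small = bytes(1_000_000)  # 1,000,000 zero bytes, below the threshold
large = bytes(6_000_000)  # 6,000,000 zero bytes, above the threshold

_, small_compressed = prepare_payload(small)
payload, large_compressed = prepare_payload(large)
```

With the example threshold, the smaller message is passed through unchanged and only the larger one is compressed. Lowering the threshold trades extra CPU time for smaller messages in the queue.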