Number of Workers
This article applies to the new Crawling Module, which works without Docker. If you still use the Crawling Module with Docker, see Number of Workers (Docker Version) instead. You might also want to read about the advantages of the new Crawling Module.
Versions > 1: new Crawling Module
Versions < 1: Crawling Module with Docker
Workers are the Coveo On-Premises Crawling Module components responsible for executing update tasks requested by the Coveo Platform. For more information on these operations, see Refresh, Rescan, and Rebuild.
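There are two types of workers: content workers, which accomplish content update tasks, and security workers, which crawl content permissions and feed this information to security identity providers.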
A content worker executes a single content update operation at a time. When you have more than one content worker, each works on its own task. For example, with two content workers, a Jira Software source refresh and a File System source rescan can happen simultaneously, while a single content worker would execute these tasks one after the other.
When all content workers are busy, any new content update task is delayed until one of them becomes available. If content update operations repeatedly run behind schedule, you likely have too few content workers, and the search results in your Coveo-powered search interface may not reflect your actual data.
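The queuing behavior described above can be illustrated with a minimal sketch. It is not how the Crawling Module schedules tasks internally; the simulate function, the task durations, and the arrival times are all hypothetical, chosen only to show how a second content worker removes the delay.

```python
import heapq


def simulate(tasks, worker_count):
    """Assign each task to the earliest available worker.

    tasks: list of (name, arrival_minute, duration_minutes) tuples.
    Returns a list of (name, start_minute, delay_minutes) in arrival order.
    All values are illustrative, not measurements from a real deployment.
    """
    # Each heap entry is the minute at which one worker becomes free.
    free_at = [0] * worker_count
    heapq.heapify(free_at)

    schedule = []
    for name, arrival, duration in sorted(tasks, key=lambda t: t[1]):
        earliest_free = heapq.heappop(free_at)
        # A task starts when it has arrived AND a worker is available.
        start = max(arrival, earliest_free)
        heapq.heappush(free_at, start + duration)
        schedule.append((name, start, start - arrival))
    return schedule


# Hypothetical durations: a 30-minute refresh and a 45-minute rescan,
# both requested at minute 0.
tasks = [
    ("Jira Software refresh", 0, 30),
    ("File System rescan", 0, 45),
]

one_worker = simulate(tasks, 1)   # the rescan waits for the refresh to finish
two_workers = simulate(tasks, 2)  # both tasks start immediately
```

With one worker, the rescan in this example is delayed by 30 minutes. With two workers, neither task waits.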
By default, after its deployment, the Crawling Module has one content worker.
As a rule of thumb, we recommend starting with one content worker for every two sources. If some of your daily source update operations are delayed by more than an hour, try adjusting this number to one content worker per source.
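As a rough illustration, the rule of thumb above can be written as a small helper. The function name and its parameters are hypothetical, and rounding up for an odd number of sources is an assumption, since the guideline doesn't specify it.

```python
import math


def recommended_content_workers(source_count, worst_daily_delay_minutes=0):
    """Suggest a number of content workers for a given number of sources.

    worst_daily_delay_minutes is the longest delay you observe on daily
    source update operations (a hypothetical input you measure yourself).
    """
    if source_count < 0:
        raise ValueError("source_count must be non-negative")

    if worst_daily_delay_minutes > 60:
        # Daily updates delayed by more than an hour: one worker per source.
        workers = source_count
    else:
        # Starting point: one worker for every two sources.
        # Rounding up for odd counts is an assumption.
        workers = math.ceil(source_count / 2)

    # The Crawling Module has one content worker by default,
    # so the suggestion never goes below one.
    return max(1, workers)


print(recommended_content_workers(5))      # 3
print(recommended_content_workers(5, 90))  # 5
```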
Each Crawling Module source that indexes permissions created with the new connector version comes with at least one security identity provider that provides Coveo Cloud with the permission model of each retrieved item. Coveo Cloud can then replicate this model in the search interfaces it powers so that each end user can see only the content they’re allowed to access in your original repository. The Security Identities Administration Console page shows a list of your security identity providers.
Security workers are responsible for crawling the content permissions and feeding this information to security identity providers. They also extract members from group security identities and associate users with their email identity. For more information on sources that index permissions and on how Coveo handles these permissions, see Coveo Cloud Management of Security Identities and Item Permissions.
Like a content worker, a security worker can execute only one update operation at a time. It's therefore crucial to have an adequate number of security workers to ensure that Coveo Cloud search results reflect your actual content permissions.
By default, after its deployment, the Crawling Module has one security worker.
As a rule of thumb, we recommend having one security worker for every two security identity providers associated with a Crawling Module source.
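The following sketch applies this guideline to an inventory of sources. The source and provider names are hypothetical. Counting a provider shared by several sources only once and rounding up for odd counts are both assumptions, since the guideline doesn't address either case.

```python
import math

# Hypothetical inventory: Crawling Module source name mapped to the
# security identity providers associated with it.
providers_by_source = {
    "Jira Software": ["Jira Provider"],
    "File System": ["Active Directory", "Email Provider"],
    "Public Wiki": [],  # a source that does not index permissions
}


def recommended_security_workers(providers_by_source):
    """Suggest a number of security workers: one per two providers."""
    # Assumption: a provider shared by several sources is counted once.
    unique_providers = {
        provider
        for providers in providers_by_source.values()
        for provider in providers
    }
    # Rounding up is an assumption; one security worker is the default minimum.
    return max(1, math.ceil(len(unique_providers) / 2))


security_workers = recommended_security_workers(providers_by_source)
print(security_workers)  # 2
```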
Knowing the number of workers you need allows you to adjust the hardware of the server on which you will deploy the Crawling Module.
You can also monitor your workers from the Crawling Module component dashboard.