Crawling Module Requirements

This article applies to the new Crawling Module, which works without Docker. If you still use the Crawling Module with Docker, see Crawling Module Requirements (Docker Version) instead. You might also want to read about the advantages of the new Crawling Module.

To identify which Crawling Module you’re currently using, check the Maestro version reported on the Crawling Modules page of the Coveo Administration Console:

  • Versions > 1: new Crawling Module

  • Versions < 1: Crawling Module with Docker
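The version rule above can be sketched as a small helper. This is only an illustration: it assumes the Maestro version is a dotted numeric string such as "0.9.3", and the function name is hypothetical. The rule doesn't say what a version of exactly 1 means, so the sketch reports that case as undetermined rather than guessing.

```python
def classify_crawling_module(maestro_version: str) -> str:
    """Map a Maestro version string to the Crawling Module it indicates."""
    # Assumption: the reported version is dotted and numeric, e.g. "0.9.3".
    parts = [int(p) for p in maestro_version.strip().split(".")]
    # Drop trailing zeros so that "1.0.0" compares equal to "1".
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    version = tuple(parts)
    if version > (1,):
        return "new Crawling Module"
    if version < (1,):
        return "Crawling Module with Docker"
    # Exactly 1 is not covered by the rule in this article.
    return "undetermined"
```

For example, classify_crawling_module("2.0.1") returns "new Crawling Module", while classify_crawling_module("0.9.3") returns "Crawling Module with Docker".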

Before you deploy the Coveo On-Premises Crawling Module, you must ensure that your Coveo Cloud license, host server, and IP address allowlist meet the following requirements.

Coveo Cloud License

Product Edition

To use the Coveo On-Premises Crawling Module, you need a valid Coveo Cloud Enterprise edition license. Check your license information to confirm your product edition.

Connectors

Your Coveo Cloud license must also include the connectors you want to use. See Supported Content for details on what the Crawling Module can index. These connectors must allow the Crawling Module as a content retrieval method.

To check whether your Coveo Cloud license includes the desired Crawling Module connectors:

  1. Log in to the Coveo Administration Console as a member of a group with the privileges required to create sources in the target Coveo organization.

  2. In the navigation menu, select Sources.

  3. On the Sources page, click Add Source.

  4. In the Add a Source of Supported Content panel, select the desired source. If the source has more than one content retrieval method, a drop-down menu with the available connectors appears, and you must select the Crawling Module option. If the Crawling Module connector is grayed out, your license doesn’t include it. You must then contact the Coveo Sales team to upgrade your license.

    [Image: Unavailable SharePoint sources]

Host Server

We recommend that you install the Coveo On-Premises Crawling Module on a server running Windows Server 2019. Windows Server 2016 is also supported.

Your server must also host the repository to index or have access to the server on which this repository is located.

Hardware

To determine your server hardware requirements, you must estimate the number of workers the Crawling Module should have based on the size and update schedule of your content sources.

By default, a newly deployed Crawling Module has one content worker and one security worker.

CPU and RAM

CPU and RAM requirements are based on the number of workers you need. As a guideline, consider that a Crawling Module instance with 4 to 6 workers typically requires 4 CPUs and 16 GB of RAM.

If your server CPU or RAM is insufficient, the Crawling Module might become unresponsive or crash, which prevents update operations from completing.
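As a rough pre-deployment check, the 4 to 6 worker guideline can be compared against a candidate server. The sketch below is only illustrative: the function and constant names are hypothetical, and it covers only the single data point given in this article (4 CPUs and 16 GB of RAM). It doesn't extrapolate to other worker counts.

```python
# Guideline from this article: an instance with 4 to 6 workers
# typically requires 4 CPUs and 16 GB of RAM.
GUIDELINE_CPUS = 4
GUIDELINE_RAM_GB = 16


def hardware_shortfalls(cpus: int, ram_gb: float) -> list[str]:
    """Return human-readable shortfalls against the guideline (empty if none)."""
    shortfalls = []
    if cpus < GUIDELINE_CPUS:
        shortfalls.append(f"CPU: {cpus} < {GUIDELINE_CPUS}")
    if ram_gb < GUIDELINE_RAM_GB:
        shortfalls.append(f"RAM: {ram_gb} GB < {GUIDELINE_RAM_GB} GB")
    return shortfalls
```

On the server itself, os.cpu_count() from the Python standard library can supply the logical CPU count. The RAM figure has to be entered manually, since the standard library has no portable call for it.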

Disk Space

When you install the Crawling Module on your server, you must select the disk on which you want to deploy it. The required size of this disk depends on the number of items you intend to index, among other things. Coveo recommends at least 100 GB of disk space.

However, your actual required disk space might be higher if you index a large number of items. As a rule of thumb, consider that 10 million items require about 10 GB of State Store storage. This value scales linearly with the number of items.
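The rule of thumb works out to about 1 GB of State Store storage per million items. The sketch below applies that linear estimate and flags item counts whose State Store alone would exceed the 100 GB recommendation. The function names are hypothetical, and the estimate covers State Store storage only, since the article doesn't break down the other factors that affect disk usage.

```python
ITEMS_PER_GB = 1_000_000       # 10 million items ~ 10 GB of State Store storage
RECOMMENDED_MIN_DISK_GB = 100  # minimum disk space recommended in this article


def estimate_state_store_gb(item_count: int) -> float:
    """Linear estimate of State Store storage, in GB, for a given item count."""
    return item_count / ITEMS_PER_GB


def exceeds_recommended_minimum(item_count: int) -> bool:
    """True if the State Store estimate alone is larger than the 100 GB minimum."""
    return estimate_state_store_gb(item_count) > RECOMMENDED_MIN_DISK_GB
```

For instance, 25 million items give an estimate of 25 GB, which fits within the recommended minimum, whereas 150 million items give 150 GB and call for a larger disk.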

IP Address Allowlist

If your environment restricts outgoing communications, make sure to allow the IP addresses that the Crawling Module uses.
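After updating the allowlist, an outbound TCP connection test run from the host server can help confirm that a given endpoint is reachable. The following is a minimal sketch using the Python standard library. The function name is hypothetical, and the host and port to test are assumptions that you must take from the list of addresses the Crawling Module uses. None are given here.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful TCP connection only shows that the firewall lets the traffic through. It doesn't validate credentials or confirm that the Crawling Module itself is configured correctly.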

What’s Next?

Once your environment meets all of the above requirements, you can proceed with the Crawling Module deployment.
