Managing the On-Premises Crawling Module Using the REST API

Maestro is driven using a REST API and listens on port 5000 by default. Since no UI is available yet to manage your workers through Maestro, you must use the Swagger interface at http://localhost:5000/api/swagger/.

Coveo only supports Swagger for Crawling Module management. If you want to use a different tool (e.g., PowerShell), keep in mind that the Coveo Support team offers assistance with Swagger only.

This page lists the calls to use in typical use cases of the Coveo On-Premises Crawling Module. Additional information regarding each API call is provided under Crawling Module REST API Reference.

Initial Configuration

Once you have installed Docker and Maestro (see Installing Docker and Installing Maestro):

  1. Access the Coveo Cloud V2 platform, and then select a Coveo Cloud organization to initiate the OAuth handshake.

  2. Use /api/authorize to link Maestro to the Coveo Cloud platform (see Linking Maestro to the Coveo Cloud V2 Platform).

  3. Use the /api/config PUT call to set a Crawling Module configuration (see Editing your On-Premises Crawling Module Configuration).

  4. Use /api/workers/start to start the workers (see Starting the Workers and the Database).

  5. Use /api/status/workers to review the progress of the Docker image download (see Viewing the Crawling Module Status).

  6. Create a source (see Creating a Crawling Module Source).
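Steps 3 to 5 above can be sketched as a short sequence of HTTP requests. The following Python snippet is only an illustration using the standard library; Coveo supports Swagger only. The base URL comes from the default port mentioned on this page. The PUT verb for /api/config is stated above, but the verbs used for /api/workers/start and /api/status/workers are assumptions, and the content of the configuration payload is not described here, so check both in Swagger before sending anything. The function names are hypothetical.

```python
import json
import urllib.request

# Default Maestro address from this page; change it if Maestro listens elsewhere.
BASE_URL = "http://localhost:5000"


def build_request(path, method="GET", payload=None):
    """Build (but do not send) a request to a Maestro endpoint."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def initial_configuration_requests(config):
    """Return the requests for steps 3 to 5, in order.

    ASSUMPTION: POST for /api/workers/start and GET for /api/status/workers
    are guesses; confirm the verbs shown in Swagger before relying on them.
    """
    return [
        build_request("/api/config", "PUT", config),  # step 3
        build_request("/api/workers/start", "POST"),  # step 4 (verb assumed)
        build_request("/api/status/workers", "GET"),  # step 5 (verb assumed)
    ]


def send(request):
    """Send a request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a configuration dictionary of your own, you could then loop over initial_configuration_requests(config) and pass each request to send(). Step 2 is left out because it relies on the OAuth handshake described in step 1.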

Because a single Crawling Module can retrieve content from several on-premises sources (see Supported Content), only one Crawling Module per Coveo Cloud organization is supported. You must, however, ensure that you have enough workers to crawl your content optimally (see Number of Workers).

Typical Crawling Module Use

Once you have installed and configured the Crawling Module (see Initial Configuration), you can edit the Crawling Module configuration, and then create content sources to crawl.

  1. Use the /api/config GET and /api/config PUT calls to view and edit the Crawling Module configuration if needed (see Viewing your On-Premises Crawling Module Configuration and Editing your On-Premises Crawling Module Configuration).

  2. Use /api/workers/start to ensure the workers are started (see Starting the Workers and the Database).

  3. Create a source (see Creating a Crawling Module Source).
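The view-then-edit flow of step 1 can be sketched as a read, modify, write cycle against /api/config. This is a minimal Python illustration, not a supported tool. It assumes that the GET call returns the configuration as a JSON object and that the PUT call accepts the full, modified object; neither detail is confirmed on this page. The function names and the opener parameter (which lets you swap in a stand-in for testing) are hypothetical.

```python
import json
import urllib.request

CONFIG_URL = "http://localhost:5000/api/config"  # default port from this page


def get_config(opener=urllib.request.urlopen):
    """Fetch the current configuration with the /api/config GET call."""
    request = urllib.request.Request(CONFIG_URL, method="GET")
    with opener(request) as response:
        # ASSUMPTION: the response body is a JSON object.
        return json.loads(response.read().decode("utf-8"))


def put_config(config, opener=urllib.request.urlopen):
    """Send a configuration with the /api/config PUT call."""
    body = json.dumps(config).encode("utf-8")
    request = urllib.request.Request(
        CONFIG_URL,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with opener(request) as response:
        return response.status


def edit_config(changes, opener=urllib.request.urlopen):
    """Read the configuration, apply `changes` on top, and write it back."""
    config = get_config(opener)
    config.update(changes)  # ASSUMPTION: top-level keys can be overwritten
    put_config(config, opener)
    return config
```

Reading the configuration first and sending back the merged result avoids overwriting settings you did not intend to change, which matters if the PUT call replaces the whole configuration.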

Updates

If you are using the Coveo On-Premises Crawling Module 0.2, see Updating the Coveo On-Premises Crawling Module 0.2 to update to version 0.3.

If you are already using version 0.3 or later, updates are automatic. This allows you to benefit from the latest features and bug fixes, and prevents the Crawling Module from becoming incompatible with the most recent Coveo Cloud update. To stay up to date, the Crawling Module periodically polls the Coveo Cloud platform for a new version of its components. If an update is available, it will be downloaded and installed within 24 hours, at the scheduled update time. The default time is 11:00 PM (see Editing your On-Premises Crawling Module Configuration).
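To see why an available update is always installed within 24 hours, consider when the scheduled update time next occurs. The sketch below computes that moment in Python. It assumes the update time is interpreted in the server's local time and uses naive datetime values; the function name is hypothetical and this is not the Crawling Module's actual scheduling code.

```python
from datetime import datetime, time, timedelta

DEFAULT_UPDATE_TIME = time(23, 0)  # 11:00 PM, the default stated above


def next_update(now, update_time=DEFAULT_UPDATE_TIME):
    """Return the next occurrence of the scheduled update time after `now`."""
    candidate = datetime.combine(now.date(), update_time)
    if candidate <= now:
        # Today's slot has already passed, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

For example, an update detected at 9:00 AM would be installed at 11:00 PM the same day, while one detected at 11:30 PM would wait until 11:00 PM the following day. In both cases the wait is at most 24 hours.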

Disconnecting your server from the Internet or shutting it down will result in the Crawling Module not polling Coveo Cloud for updates. Upon the next successful call for an update, if a component is two or more versions behind, the workers will stop until the Crawling Module is up to date again.

The following REST API calls are still available in case you ever need to update manually, for instance if automatic updates fail:

You can also contact the Coveo Support team for assistance.

Troubleshooting

It is recommended to contact Coveo Support when you encounter issues. You may then be instructed to use the following troubleshooting calls:

Miscellaneous

When creating an ODBC source, use /api/odbc/drivers to view a list of the drivers you can specify in the connection string (see Viewing the Available Drivers for an ODBC Source).
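As an illustration of where the driver name goes, the following Python sketch assembles a connection string using the common ODBC keyword=value convention, with the driver name wrapped in braces. The helper name is hypothetical, and the driver name and attributes in the usage note are placeholders: use a driver name returned by /api/odbc/drivers and the keywords your driver documents.

```python
def build_connection_string(driver, **attributes):
    """Assemble an ODBC connection string with the given driver name."""
    # The driver name is wrapped in braces, as is conventional for ODBC.
    parts = ["Driver={" + driver + "}"]
    # Remaining attributes are appended as keyword=value pairs, in order.
    parts += [f"{key}={value}" for key, value in attributes.items()]
    return ";".join(parts) + ";"
```

Calling build_connection_string("Example Driver", Server="db.example.com", Database="catalog") yields Driver={Example Driver};Server=db.example.com;Database=catalog; which you would then adapt to your database.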