Managing the Coveo On-Premises Crawling Module Using the REST API (Docker Version)

This article applies to the old Crawling Module, which works with Docker. If you are using the new Crawling Module, see Managing the Crawling Module Using the REST API instead.

The old Crawling Module will soon reach its end-of-life. We recommend switching to the new Crawling Module, which doesn’t require Docker.

To identify the Crawling Module you’re currently using, on the Crawling Modules page of the Coveo Cloud Administration Console, look at the Maestro reported version:

  • Versions > 1: new Crawling Module

  • Versions < 1: Crawling Module with Docker

Maestro is driven through a REST API and listens on port 5000 by default. Since no UI is available yet to manage your workers through Maestro, you must use the Swagger interface at http://localhost:5000/api/swagger/.

Coveo only supports Swagger for Crawling Module management. If you want to use a different tool (e.g., PowerShell), keep in mind that the Coveo Support team offers help with Swagger only.

This page lists the calls to use in typical use cases of the Coveo On-Premises Crawling Module. Additional information regarding each API call is provided under Crawling Module REST API Reference.

Initial Configuration

Once you have installed Docker and Maestro:

  1. Click here (or here if you have a HIPAA organization) to access the Coveo Platform, and then, in the page that opens, select an organization to link with your Crawling Module instance.

  2. Use the /api/config PUT call to set a Crawling Module configuration.

  3. Use /api/workers/start to start the workers.

  4. Use /api/status/workers to review the progress of the Docker image download.

  5. Create a source.
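Coveo supports only Swagger for these calls, so the following Python sketch is purely illustrative. It prepares steps 2 to 4 above as HTTP requests against the default address. The PUT method for /api/config comes from this article; the POST and GET methods used for the other two calls, the helper names, and the placeholder configuration body are assumptions. Check the Crawling Module REST API Reference for the actual methods and configuration schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # Maestro's default port, per this article


def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request for a Maestro endpoint."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request, timeout=30):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def initial_configuration_requests(config):
    """Return the requests for steps 2 to 4, in order."""
    return [
        build_request("PUT", "/api/config", config),  # step 2: PUT is stated in the article
        build_request("POST", "/api/workers/start"),  # step 3: POST is an assumption
        build_request("GET", "/api/status/workers"),  # step 4: GET is an assumption
    ]


if __name__ == "__main__":
    # Hypothetical placeholder body; the real schema is in the REST API reference.
    placeholder_config = {"placeholder": "see the REST API reference"}
    for request in initial_configuration_requests(placeholder_config):
        print(request.get_method(), request.full_url)
```

To actually perform the calls, pass each prepared request to send(), in order, once Maestro is running.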

Typical Crawling Module Use

Once you have installed and configured the Crawling Module, you can edit its configuration again and/or create additional content sources to crawl.

  1. Use the /api/config GET and /api/config PUT calls to view and edit the Crawling Module configuration if needed.

  2. Use /api/workers/start to ensure that the workers are started.

  3. Create a source.
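As an illustration of the view-then-edit step above, here is a minimal Python sketch that reads the configuration with the /api/config GET call, merges in some changes, and writes the result back with the PUT call only if something changed. It assumes that the GET call returns a JSON object and that the PUT call accepts the same shape; the function names and the transport parameter are hypothetical. Coveo Support covers Swagger only, so treat this as a sketch rather than a supported tool.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # Maestro's default port, per this article


def call(method, path, payload=None, timeout=30):
    """Send one request to Maestro and return the decoded JSON body, if any."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else None


def edit_config(changes, transport=call):
    """Read the current configuration, apply `changes`, and write it back.

    Assumes the configuration is a flat JSON object (an assumption, not
    something this article confirms). The PUT call is skipped when the
    merged configuration equals the current one.
    """
    current = transport("GET", "/api/config") or {}
    merged = {**current, **changes}
    if merged != current:
        transport("PUT", "/api/config", merged)
    return merged
```

Passing a different transport function makes it possible to try the merge logic without a running Maestro instance.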

Updates

Crawling Module updates are automatic. This allows you to benefit from the latest features and bug fixes, and prevents the Crawling Module from becoming incompatible with the most recent Coveo Cloud update. To stay up to date, the Crawling Module periodically polls the Coveo Platform for a new version of its components. If an update is available, it will be downloaded and installed within 24 hours, at the scheduled update time. The default time is 11:00 PM.

Disconnecting your server from the Internet or shutting it down will result in the Crawling Module not polling Coveo Cloud for updates. Upon the next successful call for an update, if a component is two or more versions behind, the workers will stop until the Crawling Module is up to date again.

The following REST API calls are still available in case you ever need to update manually, for example if automatic updates fail:

You can also contact the Coveo Support team for help.

Troubleshooting

We recommend that you contact Coveo Support when you encounter issues. You may then be instructed to use the following troubleshooting calls:

Miscellaneous

When creating an ODBC source, use /api/odbc/drivers to view a list of the drivers you can specify in the connection string.
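The sketch below shows one way a driver name from that list could be placed into an ODBC connection string, using the standard Driver={...} keyword syntax. It assumes that /api/odbc/drivers is a GET call and makes no assumption about the response format, so the body is returned as raw text. The function names, the driver name, and the server and database values are hypothetical.

```python
import urllib.request

# Default Maestro address from this article; GET is an assumption.
DRIVERS_URL = "http://localhost:5000/api/odbc/drivers"


def fetch_drivers_raw(timeout=30):
    """Return the raw response body of the drivers call as text."""
    with urllib.request.urlopen(DRIVERS_URL, timeout=timeout) as response:
        return response.read().decode("utf-8")


def connection_string(driver, **attributes):
    """Build an ODBC connection string that names `driver` explicitly."""
    parts = ["Driver={%s}" % driver]
    parts += ["%s=%s" % (key, value) for key, value in attributes.items()]
    return ";".join(parts) + ";"


if __name__ == "__main__":
    # "Example Driver" stands in for a name copied from the drivers list.
    print(connection_string("Example Driver", Server="db.example.com", Database="sales"))
```

The driver name must match an entry returned by the call exactly, since ODBC resolves the driver by that name.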
