Manage the Coveo On-Premises Crawling Module Using the REST API (Docker Version)
This article applies to the old Crawling Module, which works with Docker. If you are using the new Crawling Module, see Manage the Crawling Module Using the REST API instead.
The old Crawling Module reached its end of life on December 31, 2020. We recommend switching to the new Crawling Module, which doesn't require Docker.
To identify the Crawling Module you’re currently using, on the Crawling Modules page of the Coveo Administration Console, look at the Maestro reported version:
- Versions > 1: new Crawling Module
- Versions < 1: Crawling Module with Docker
Maestro is driven using a REST API and listens on port 5000 by default. Since no UI is available yet to manage your workers through Maestro, you must use the Swagger UI at http://localhost:5000/api/swagger/.
Coveo only supports Swagger for Crawling Module management. If you want to use a different tool (e.g., PowerShell), keep in mind that the Coveo Support team offers help with Swagger only.
This page lists the calls to use in typical use cases of the Coveo On-Premises Crawling Module. Additional information regarding each API call is provided under Crawling Module REST API Reference.
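If you prefer to reach Maestro from a script rather than from the Swagger UI (keeping in mind that only Swagger is supported), the following minimal Python sketch simply checks that Maestro is listening by requesting the Swagger UI page. The hostname and port are assumptions based on a default local installation.

```python
# Minimal sketch: confirm that Maestro is listening on the default port 5000 by
# requesting the Swagger UI page. Run this on the Maestro host, or replace
# "localhost" with that machine's hostname.
import requests

response = requests.get("http://localhost:5000/api/swagger/", timeout=30)
response.raise_for_status()
print("Maestro is reachable; Swagger UI returned HTTP", response.status_code)
```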
Initial Configuration
Once you have installed Docker and Maestro:
- Click here (or here if you have a HIPAA organization) to access the Coveo Platform, and then, in the page that opens, select an organization to link with your Crawling Module instance.
- Use the /api/config PUT call to set a Crawling Module configuration.
- Use /api/workers/start to start the workers.
- Use /api/status/workers to review the progress of the Docker image download (see the sketch after the notes below for a scripted version of these steps).
- To accomplish the above steps, you need the privilege to create API keys.
- The Crawling Module can retrieve content from several on-premises sources. However, you must ensure that you have enough workers to crawl your content optimally.
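For reference, here's a rough Python sketch of the same sequence of calls. The configuration payload below is a placeholder, and the HTTP verbs for /api/workers/start and /api/status/workers are assumptions; verify the exact verbs and schema in the Swagger UI or the Crawling Module REST API Reference.

```python
# Rough sketch of the initial-configuration sequence against a default local
# Maestro installation. The payload fields and the POST/GET verbs for the
# workers calls are assumptions; check the Swagger UI for the exact contract.
import time
import requests

MAESTRO_API = "http://localhost:5000/api"

# Placeholder configuration; the real schema is documented in the
# Crawling Module REST API Reference.
crawling_module_config = {}

# 1. Set the Crawling Module configuration.
requests.put(f"{MAESTRO_API}/config", json=crawling_module_config, timeout=30).raise_for_status()

# 2. Start the workers (verb assumed to be POST).
requests.post(f"{MAESTRO_API}/workers/start", timeout=30).raise_for_status()

# 3. Poll the worker status to follow the Docker image download.
for _ in range(10):
    status = requests.get(f"{MAESTRO_API}/status/workers", timeout=30)
    status.raise_for_status()
    print(status.json())
    time.sleep(60)
```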
Typical Crawling Module Use
Once you have installed and configured the Crawling Module, you can edit its configuration again and/or create additional content sources to crawl.
- Use the /api/config GET and /api/config PUT calls to view and edit the Crawling Module configuration if needed.
- Use /api/workers/start to ensure that the workers are started.
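As a sketch only (same assumptions as above regarding the default port and HTTP verbs), this sequence could be scripted as follows; which configuration fields you would actually change depends on the schema documented in the Crawling Module REST API Reference.

```python
# Sketch: review the current configuration, push it back after any edits, and
# make sure the workers are started. Assumes a default local Maestro
# installation; the /api/workers/start verb (POST) is an assumption.
import requests

MAESTRO_API = "http://localhost:5000/api"

# View the current Crawling Module configuration.
response = requests.get(f"{MAESTRO_API}/config", timeout=30)
response.raise_for_status()
config = response.json()
print(config)

# ...edit `config` as needed, then save it with the PUT call...
requests.put(f"{MAESTRO_API}/config", json=config, timeout=30).raise_for_status()

# Ensure that the workers are started.
requests.post(f"{MAESTRO_API}/workers/start", timeout=30).raise_for_status()
```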
Updates
Crawling Module updates are automatic. This allows you to benefit from the latest features and bug fixes, and prevents the Crawling Module from becoming incompatible with the most recent Coveo Cloud update. To stay up to date, the Crawling Module periodically polls the Coveo Platform for a new version of its components. If an update is available, it will be downloaded and installed within 24 hours, at the scheduled update time. The default time is 11:00 PM.
If your server is disconnected from the Internet or shut down, the Crawling Module can't poll Coveo Cloud for updates. Upon the next successful update check, if a component is two or more versions behind, the workers stop until the Crawling Module is up to date again.
The following REST API calls are still available in case you ever need to update manually, for example if automatic updates fail:
- /api/status/versions to see the Database, Maestro, and Worker versions you currently have.
You can also contact the Coveo Support team for help.
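For example, a Python sketch of the version check (assuming a GET verb and the default port) could look as follows; the exact response fields are those reported in the Swagger UI.

```python
# Sketch: read the component versions reported by Maestro. Assumes a default
# local installation and that /api/status/versions responds to GET.
import requests

response = requests.get("http://localhost:5000/api/status/versions", timeout=30)
response.raise_for_status()
print(response.json())  # Expected to include the Database, Maestro, and Worker versions.
```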
Troubleshooting
We recommend that you contact Coveo Support when you encounter issues. You may then be instructed to use the following troubleshooting calls:
Miscellaneous
When creating an ODBC source, use /api/odbc/drivers to view a list of the drivers you can specify in the connection string.
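For instance, a minimal sketch of that call (assuming a GET verb and the default port) could be:

```python
# Sketch: list the ODBC drivers that can be referenced in a connection string.
# Assumes a default local Maestro installation and a GET verb.
import requests

response = requests.get("http://localhost:5000/api/odbc/drivers", timeout=30)
response.raise_for_status()
print(response.json())
```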