Crawling Module REST API reference
Maestro is driven using a REST API and listens on port 5000 by default.
Since not all Crawling Module management operations are available in the Administration Console yet, you must use Swagger at http://localhost:5000/api/swagger/ to accomplish most of them.
If you decided to use a different service port while installing Maestro, go to the corresponding address instead (for example, http://localhost:5001/api/swagger/ if you chose to use port 5001).
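As a small illustration of how the service port changes every address in this reference, the following Python sketch builds call URLs from a port number. The helper name `maestro_url` and its `host` parameter are hypothetical and not part of the Crawling Module; only the default port 5000 and the paths come from this page.

```python
from urllib.parse import urljoin


def maestro_url(path: str, port: int = 5000, host: str = "localhost") -> str:
    """Build the URL of a Maestro REST API call for a given service port.

    Hypothetical helper: the default port (5000) comes from this reference,
    everything else is an illustrative assumption.
    """
    base = f"http://{host}:{port}/"
    # Strip the leading slash so the path is appended to the base URL.
    return urljoin(base, path.lstrip("/"))


print(maestro_url("/api/swagger/"))
print(maestro_url("/api/swagger/", port=5001))
```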
Note
Coveo only supports managing Crawling Modules using Swagger. If you want to use a different tool (for example, PowerShell), keep in mind that the Coveo Support team only offers help with Swagger.
Authentication
Getting a URL to link the Crawling Module to an organization
The /api/authorize/url call returns the URL at which you should log in with a Coveo account that has the privilege to create API keys.
As you log in, you perform a handshake with Coveo and create an API key for your Crawling Module instance to use when communicating with the Platform.
Request template
GET http://localhost:5000/api/authorize/url HTTP/1.1
Accept: application/json
200 OK response body
{
"https://platform.cloud.coveo.com/oauth/authorize?response_type=token&client_id=CrawlingModule&scope=full&redirect_uri=http://localhost:5000/oauth/receive_token.html"
}
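For readers who script this call outside Swagger (which Coveo Support doesn't cover, as noted above), here is a minimal Python sketch using only the standard library. The function names are hypothetical. Because the sample body above wraps a single quoted URL in braces, the sketch pulls out the first quoted http(s) URL rather than assuming a particular JSON shape; that lenient parsing is an assumption, not documented behavior.

```python
import re
import urllib.request


def build_authorize_url_request(base: str = "http://localhost:5000") -> urllib.request.Request:
    """Prepare the GET /api/authorize/url request from the template above."""
    return urllib.request.Request(
        f"{base}/api/authorize/url",
        headers={"Accept": "application/json"},
        method="GET",
    )


def extract_login_url(body: str) -> str:
    """Return the first quoted http(s) URL found in the response body.

    Assumption: the body contains the login URL as a quoted string,
    as in the sample response above.
    """
    match = re.search(r'"(https?://[^"]+)"', body)
    if match is None:
        raise ValueError("no URL found in response body")
    return match.group(1)


def fetch_login_url(base: str = "http://localhost:5000") -> str:
    """Send the request to a running Maestro and return the login URL."""
    with urllib.request.urlopen(build_authorize_url_request(base), timeout=10) as response:
        return extract_login_url(response.read().decode("utf-8"))
```

Calling `fetch_login_url()` requires a running Maestro; the two other functions can be used without one.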
Linking the Crawling Module to an organization
The /api/authorize
call allows you to send an authorization token to complete the process of linking your Crawling Module instance to a Coveo organization.
Typically, you shouldn’t need to use this call unless the linking process fails.
Before you use this call, ensure that your Maestro settings contain the correct CoveoEnvironment value.
The CoveoEnvironment parameter represents the type of organization you want to link to your Crawling Module instance.
Possible values are Production and Hipaa.
Should the Crawling Module use a proxy to communicate with Coveo, specify its address and credentials as well.
In the request, you must provide the token you previously obtained.
The body of a successful response is an empty JSON object ({}).
Verifying the link between the Crawling Module and your organization
Once you have linked the Crawling Module to a Coveo organization, you can use the /api/authorize/verify
call to confirm the success of the linking process.
Request template
GET http://localhost:5000/api/authorize/verify HTTP/1.1
Accept: application/json
The body of a successful response is an empty JSON object ({}).
Configuration
Getting the Crawling Module configuration
The /api/config
GET call allows you to review your Coveo Crawling Module configuration.
The information it returns is the following:
- ID of the Coveo organization linked to the Crawling Module instance (also available on the component dashboard)
- Crawling Module instance name
- Log retention period in days
- Time at which the Crawling Module installs its updates
- Number of content workers (also available on the component dashboard)
- Number of security workers (also available on the component dashboard)
You can use this call to check that your configuration is adequate, and then edit this configuration if needed.
Request template
GET http://localhost:5000/api/config HTTP/1.1
Accept: application/json
The body of a successful response contains information regarding the Crawling Module configuration.
200 OK response body
{
"OrganizationId": "connectorsteamtestsmf76kcam",
"Name": "MyCompanysCrawlingModule",
"LogRetentionPeriodInDays": 30,
"AutoUpdateTriggerTime": "23:00:00",
"NumberOfCrawlerWorkers": 2,
"NumberOfSecurityWorkers": 1
}
Editing the Crawling Module configuration
Use the /api/config
PUT call to provide new values for your Crawling Module configuration parameters.
Request template
PUT http://localhost:5000/api/config HTTP/1.1
Content-Type: application/json-patch+json
Include the key-value pairs to modify in the request body. Possible pairs are:
- "Name": "<NAME>"
- "LogRetentionPeriodInDays": <NUMBER_OF_DAYS>
- "AutoUpdateTriggerTime": "<TIME>"
- "NumberOfCrawlerWorkers": <NUMBER_OF_CONTENT_WORKERS>
- "NumberOfSecurityWorkers": <NUMBER_OF_SECURITY_WORKERS>
where:
- <NAME> is a name identifying your Crawling Module instance. This value appears on the Crawling Modules (platform-ca | platform-eu | platform-au) page of the Coveo Administration Console, as well as in your Crawling Module source configuration panels. Only alphanumeric characters, dashes and underscores are allowed.
- <NUMBER_OF_DAYS> is an integer value that represents the number of days that logs are kept before being automatically deleted. By default, this retention period is 30 days, which is also the minimum allowed. The maximum is 730 days (2 years).
- <TIME> is the time at which the automatic update process starts, in the format HH:mm:ss. The default is 23:00:00, and the time zone used is that of your server.
- <NUMBER_OF_CONTENT_WORKERS> is an integer value that represents the desired number of content workers.
- <NUMBER_OF_SECURITY_WORKERS> is an integer value that represents the desired number of security workers. Security workers are only required for some Crawling Module sources that index permissions ("sourceVisibility": "SECURED"). See Indexing Secured Content for details.
Only the parameters you want to modify are required in the request payload.
Modifying the name of your Crawling Module instance
{
"Name": "MyCrawlingModule"
}
The body of a successful response is an empty JSON object ({}).
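The constraints listed above for each parameter can be checked before sending the PUT request. The following Python sketch is one possible client-side check, not Maestro's own validation logic: the function names are hypothetical, and treating unknown keys as errors is an assumption. The allowed characters, the 30 to 730 day range, the HH:mm:ss format, and the Content-Type header come from this section.

```python
import json
import re
import urllib.request

ALLOWED_KEYS = {
    "Name",
    "LogRetentionPeriodInDays",
    "AutoUpdateTriggerTime",
    "NumberOfCrawlerWorkers",
    "NumberOfSecurityWorkers",
}


def validate_config_update(payload: dict) -> list:
    """Return a list of problems found in a /api/config PUT payload."""
    errors = []
    for key in payload:
        if key not in ALLOWED_KEYS:
            errors.append(f"unknown parameter: {key}")

    name = payload.get("Name")
    if name is not None and not re.fullmatch(r"[A-Za-z0-9_-]+", str(name)):
        errors.append("Name: only alphanumeric characters, dashes and underscores are allowed")

    days = payload.get("LogRetentionPeriodInDays")
    if days is not None and not (isinstance(days, int) and 30 <= days <= 730):
        errors.append("LogRetentionPeriodInDays: must be an integer between 30 and 730")

    trigger = payload.get("AutoUpdateTriggerTime")
    if trigger is not None and not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d:[0-5]\d", str(trigger)):
        errors.append("AutoUpdateTriggerTime: must use the HH:mm:ss format")

    for key in ("NumberOfCrawlerWorkers", "NumberOfSecurityWorkers"):
        value = payload.get(key)
        if value is not None and not isinstance(value, int):
            errors.append(f"{key}: must be an integer")
    return errors


def build_config_update_request(payload: dict, base: str = "http://localhost:5000") -> urllib.request.Request:
    """Prepare the PUT /api/config request, refusing invalid payloads."""
    errors = validate_config_update(payload)
    if errors:
        raise ValueError("; ".join(errors))
    return urllib.request.Request(
        f"{base}/api/config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-patch+json"},
        method="PUT",
    )
```

For example, `build_config_update_request({"Name": "MyCrawlingModule"})` prepares the request shown above, which can then be sent with `urllib.request.urlopen`.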
Editing sensitive configuration parameters
The /api/config/sensitive
call allows you to change the password of your proxy or database.
See Password update for details on when to use this call.
The body of a successful response is an empty JSON object ({}).
Getting the Crawling Module ID
The /api/config/id
GET call allows you to review the unique identifier of your Crawling Module instance.
This ID is also displayed on the Crawling Modules (platform-ca | platform-eu | platform-au) page of the Administration Console.
Request template
GET http://localhost:5000/api/config/id HTTP/1.1
Accept: application/json
The body of a successful response contains the Crawling Module unique identifier.
200 Success
"coveoorganization-345238a4-298h-8e3v-58467815481d"
Generating a new Crawling Module ID
When you deploy the Crawling Module, a unique identifier is generated for your instance.
However, if you ever need to make a copy of your Crawling Module instance, you will then have two instances with the same identifier.
To prevent communication issues, you must use the /api/config/id
POST call to generate a new unique identifier for one of these instances.
Request template
POST http://localhost:5000/api/config/id HTTP/1.1
Accept: application/json
Payload
{
"coveoorganization-h7d4h6137-w745-5f93-72h95ca314"
}
Logging
Creating a compressed logs archive
The /api/logging/logs
call creates a compressed archive containing all available Crawling Module logs, and then returns the path to the compressed archive in its response body.
The body of a successful response should be similar to the following:
200 OK response body
C:\ProgramData\Coveo\Maestro\CrawlingModuleLogs-20200618103928.zip
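If you script log collection, the returned path can be split into a file name and a creation time. The Python sketch below assumes that the digits in the archive name are a `YYYYMMDDHHMMSS` timestamp; this is inferred from the single example above and isn't documented, so treat the pattern and the function name as hypothetical.

```python
import re
from datetime import datetime
from pathlib import PureWindowsPath


def parse_logs_archive_path(path: str):
    """Return (file name, timestamp) for a logs archive path.

    Assumption: the archive is named CrawlingModuleLogs-<YYYYMMDDHHMMSS>.zip,
    as the sample response suggests.
    """
    # PureWindowsPath parses Windows paths on any operating system.
    file_name = PureWindowsPath(path.strip()).name
    match = re.fullmatch(r"CrawlingModuleLogs-(\d{14})\.zip", file_name)
    if match is None:
        raise ValueError(f"unexpected archive name: {file_name}")
    return file_name, datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


name, created = parse_logs_archive_path(
    r"C:\ProgramData\Coveo\Maestro\CrawlingModuleLogs-20200618103928.zip"
)
print(name, created.isoformat())
```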
Deleting the "Dumps" folder files
The files in the Dumps folder can be useful for troubleshooting, but they take up a lot of space.
If your Crawling Module is running as expected, you can use the /api/logging/purge/dump call to delete them and free up disk space.
A successful request returns a Status of 200 OK.
Status
Getting the workers' status
The /api/status/workers
call returns the status of each worker, along with other details.
This information is also available on the Crawling Module component dashboard.
Request template
GET http://localhost:5000/api/status/workers HTTP/1.1
Accept: application/json
200 OK response body
{
"WorkerStatus": [
{
"IsRunning": true,
"Name": "Crawler-Worker-68a1015f-7f5f-4f91-984c-0f16bd59da4f",
"WorkerType": "Crawler",
"Details": {
"ProcessId": 15212,
"Status": "Running",
"LastStartStopTime": "2020-05-12T14:23:14"
}
},
{
"IsRunning": true,
"Name": "Crawler-Worker-73edb6b3-7465-4253-9340-5707c97bbead",
"WorkerType": "Crawler",
"Details": {
"ProcessId": 31188,
"Status": "Running",
"LastStartStopTime": "2020-05-12T14:23:14"
}
},
{
"IsRunning": true,
"Name": "Security-Worker-c585d778-1d90-444d-bd46-57f8cdabe700",
"WorkerType": "Security",
"Details": {
"ProcessId": 25388,
"Status": "Running",
"LastStartStopTime": "2020-05-12T14:23:18"
}
}
]
}
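A response like the one above can be condensed into a short summary when you monitor several workers. This Python sketch counts running workers by type and lists the names of workers that aren't running. The function name is hypothetical, and the field names are taken from the sample response rather than from a formal schema.

```python
import json
from collections import Counter


def summarize_workers(body: str):
    """Return (running workers per type, names of workers not running)."""
    workers = json.loads(body).get("WorkerStatus", [])
    running = Counter(w["WorkerType"] for w in workers if w.get("IsRunning"))
    stopped = [w["Name"] for w in workers if not w.get("IsRunning")]
    return dict(running), stopped


# Shortened sample shaped like the response body above (names are made up).
sample = json.dumps({
    "WorkerStatus": [
        {"IsRunning": True, "Name": "Crawler-Worker-1", "WorkerType": "Crawler"},
        {"IsRunning": True, "Name": "Crawler-Worker-2", "WorkerType": "Crawler"},
        {"IsRunning": False, "Name": "Security-Worker-1", "WorkerType": "Security"},
    ]
})
print(summarize_workers(sample))
```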
Getting Maestro’s status
The /api/status/maestro
call returns the status of Maestro, along with other information.
Request template
GET http://localhost:5000/api/status/maestro HTTP/1.1
Accept: application/json
200 OK response body
{
"Uptime": "00:06:52",
"IsWorkerServiceRunning": true,
"LinkedOrganization": "myorganizationw376kcrn",
"IsAbleToReachOrganization": true
}
Getting the Maestro version
The /api/status/version
call returns the version of Maestro in the response body.
Alternatively, you can find this information on the Crawling Modules (platform-ca | platform-eu | platform-au) page, which also shows the latest version available, and on the component dashboard.
200 OK response body
{
"MaestroVersion": "1.2.8.0"
}
Troubleshooting
Getting the available drivers for an ODBC source
The /api/troubleshooting/odbc/drivers
call lists the drivers installed and available on your server.
You can then specify one of them in your Database source connection string.
See About ODBC drivers for details.
Alternatively, you can use your server’s ODBC Data Source Administrator to list the drivers installed on your system.
Request template
GET http://localhost:5000/api/troubleshooting/odbc/drivers HTTP/1.1
Accept: application/json
The body of a successful response looks like the example below.
200 OK response body
{
"X64Drivers": [
"SQL Server",
"PostgreSQL ANSI(x64)",
"PostgreSQL Unicode(x64)",
"MySQL ODBC 5.3 ANSI Driver",
"MySQL ODBC 5.3 Unicode Driver",
"Oracle in instantclient_12_2"
]
}
Note
The |
Testing the connection to the Coveo Platform
The /api/troubleshooting/proxy/testconnection call tries to connect to the Coveo Platform using the current proxy configuration.
If communication is unsuccessful, an error message is returned.
Request template
GET http://localhost:5000/api/troubleshooting/proxy/testconnection HTTP/1.1
Accept: application/json
When communication fails, the response body contains an error message similar to the following:
200 OK response body
The proxy tunnel request to proxy 'http://54.236.123.234/' failed with status code '407'.
Proxy-Authenticate: Basic realm="proxy"
Getting required outgoing communication URLs
The /api/troubleshooting/urls
call lists public URLs to which Maestro must connect.
If your server can’t access one of these URLs, you should either relax the firewall rules in place or configure the Crawling Module to communicate with Coveo through a proxy.
Request template
GET http://localhost:5000/api/troubleshooting/urls HTTP/1.1
Accept: application/json
The body of a successful response looks like the example below.
200 OK response body
[
"coveo-nprod-public-resource.s3.amazonaws.com",
"coveo-nprod-customerdata.s3.amazonaws.com",
"api.cloud.coveo.com",
"platform.cloud.coveo.com"
]
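To find out which of these hosts your server can't reach, you could test each one in turn. The Python sketch below attempts a direct TCP connection on port 443. Both the port and the function name are assumptions, and a direct connection bypasses any proxy, so a host reported as unreachable here may still be reachable through your proxy.

```python
import socket


def unreachable_hosts(hosts, can_connect=None, port: int = 443, timeout: float = 5.0):
    """Return the hosts for which a connection attempt fails.

    `can_connect` is an optional callable (host -> bool) used instead of the
    default direct TCP check, for example to plug in a proxy-aware test.
    """
    if can_connect is None:
        def can_connect(host):
            try:
                # Assumption: the hosts accept HTTPS connections on port 443.
                with socket.create_connection((host, port), timeout=timeout):
                    return True
            except OSError:
                return False
    return [host for host in hosts if not can_connect(host)]
```

For example, `unreachable_hosts(["api.cloud.coveo.com", "platform.cloud.coveo.com"])` returns an empty list when both hosts accept a connection.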
Update
Updating Maestro
The Crawling Module updates automatically at the time specified in its configuration, so you shouldn’t need to use the /api/update/maestro
call.
However, should you ever need to update Maestro manually (for example, if instructed to do so by the Coveo Support team), use this call to launch the update process.
Note
During the update process, a copy of your obsolete Crawling Module folder is saved under |
Restarting Maestro
After you edit Maestro settings, you must restart Maestro with the /api/service/restart
call to apply your changes.
Request template
POST http://localhost:5000/api/service/restart HTTP/1.1
The body of a successful response is an empty JSON object ({}).
You can then check Maestro’s status with the /api/status call.
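After a restart, a script may need to wait until Maestro responds again. The Python sketch below polls a status function until Maestro looks healthy. It assumes the /api/status/maestro call described earlier in this reference, and it treats "healthy" as both `IsWorkerServiceRunning` and `IsAbleToReachOrganization` being true; that definition, the retry settings, and all function names are illustrative assumptions.

```python
import json
import time
import urllib.request


def fetch_maestro_status(base: str = "http://localhost:5000") -> dict:
    """Call GET /api/status/maestro and return the parsed response body."""
    request = urllib.request.Request(
        f"{base}/api/status/maestro", headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def is_maestro_healthy(status: dict) -> bool:
    """Assumed health criterion based on the sample status response."""
    return bool(status.get("IsWorkerServiceRunning")) and bool(
        status.get("IsAbleToReachOrganization")
    )


def wait_until_healthy(fetch_status=fetch_maestro_status, attempts: int = 10,
                       delay_seconds: float = 3.0, sleep=time.sleep) -> bool:
    """Poll the status call until Maestro looks healthy or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_maestro_healthy(fetch_status()):
                return True
        except OSError:
            # Connection errors are expected while the service restarts.
            pass
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The `fetch_status` and `sleep` parameters exist so the polling logic can be exercised without a running Maestro.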
Getting the proxy status
If you configured the Crawling Module to communicate with Coveo through a proxy, you can use the /api/troubleshooting/proxy/settings
call to review the proxy status.
Request template
GET http://localhost:5000/api/troubleshooting/proxy/settings HTTP/1.1
200 OK response body
{
"MaestroSettingsProxyUrl": null,
"IsHttpProxyEnvironmentVariablePresent": false,
"IsHttpsProxyEnvironmentVariablePresent": false,
"ProxyUsedForPlatform": "https://platform.cloud.coveo.com/",
"IsDefaultProxyCredentialsPresent": false,
"WinhttpProxyStatus": "winhttp proxy is not set."
}