Create and retrieve data exports
The Usage Analytics Read API exposes /exports endpoints allowing you to manage data exports in an organization (see Exports API).
You would typically use these endpoints to create and manipulate data exports from your own application (that is, without going through the Coveo Administration Console). This article explains how to make the corresponding requests.
Note
Data exports are usually managed through the Coveo Administration Console (see Manage data exports). This article is only relevant if you have a legitimate reason for managing data exports programmatically.
Usage overview
The following steps explain how to create, monitor, and download a data export.
Step 1: Create an export
Use the POST rest/ua/v15/exports operation to create a data export (see Creating an export - query parameters).
Authenticate the request using an access token granting the Analytics - Data exports - Edit privilege in the target organization (see Manage API keys).
In the body of a successful response, retrieve and store the id property of your new data export.
POST https://platform.cloud.coveo.com/rest/ua/v15/exports?from=2019-01-01T00:00:00.000Z&to=2019-03-01T00:00:00.000Z HTTP/1.1
Authorization: Bearer **********-****-****-****-************
201 Created response body
{
"id": "9429fbb7-4217-47b8-8e25-8812ae6c31e5",
"author": "alice@example.com",
"downloadLink": "https://platform.cloud.coveo.com/rest/v15/exports/9429fbb7-4217-47b8-8e25-8812ae6c31e5?redirect=true",
"startDate": 1553198432348,
"from": 1546318800000,
"to": 1551416399999,
"filters": {},
"description": null,
"size": 46153,
"status": "PENDING",
"tables": [
"searches",
"custom_events",
"keywords",
"groups",
"clicks"
],
"downloadable": true,
"dimensions": [],
"replayable": false,
"usingDisplayNames": false,
"scheduleId": null
}
In this example, the org query parameter isn’t included, since it’s extracted from the API key.
If an OAuth2 token was used, the org query parameter would need to be included.
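As a minimal sketch of the point above, the following snippet builds the request URL for Step 1 with and without the org query parameter. The function name build_create_export_url and the myorganizationid value are hypothetical; only the endpoint and the from, to, and org parameters come from this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://platform.cloud.coveo.com/rest/ua/v15/exports"


def build_create_export_url(from_date, to_date, org=None):
    """Return the URL of a POST request that creates an export (hypothetical helper)."""
    params = {"from": from_date, "to": to_date}
    # org is only needed when the access token is an OAuth2 token;
    # with an API key, the organization is extracted from the key.
    if org is not None:
        params["org"] = org
    return BASE_URL + "?" + urlencode(params)


# API key: no org parameter
print(build_create_export_url("2019-01-01T00:00:00.000Z", "2019-03-01T00:00:00.000Z"))
# OAuth2 token: org parameter included (placeholder organization ID)
print(build_create_export_url("2019-01-01T00:00:00.000Z", "2019-03-01T00:00:00.000Z",
                              org="myorganizationid"))
```

Note that urlencode percent-encodes the colons in the timestamps, which is equivalent to the literal form shown in the example request.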
Step 2: Monitor the status of your export
The service needs to build the export before you can download it.
To monitor the status of your export, use the GET rest/ua/v15/exports/{exportsId} operation.
Ensure that the redirect query parameter is set to false (see redirect (Boolean)).
Authenticate the request using an access token granting the Analytics - Data exports - View privilege in the target organization (see Manage API keys).
In the body of a successful response, retrieve the status property of your new data export.
GET https://platform.cloud.coveo.com/rest/ua/v15/exports/9429fbb7-4217-47b8-8e25-8812ae6c31e5 HTTP/1.1
Authorization: Bearer **********-****-****-****-************
200 OK response body
{
"id": "9429fbb7-4217-47b8-8e25-8812ae6c31e5",
"author": "alice@example.com",
"downloadLink": "https://platform.cloud.coveo.com/rest/v15/exports/9429fbb7-4217-47b8-8e25-8812ae6c31e5?redirect=true",
"startDate": 1553198432348,
"from": 1546318800000,
"to": 1551416399999,
"filters": {},
"description": null,
"size": 46153,
"status": "AVAILABLE",
"tables": [
"searches",
"custom_events",
"keywords",
"groups",
"clicks"
],
"downloadable": true,
"dimensions": [],
"replayable": false,
"usingDisplayNames": false,
"scheduleId": null
}
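The monitoring step can be sketched as a polling loop with a capped exponential backoff. In this sketch, fetch_status is a hypothetical callable standing in for the GET request above, and the wait values are arbitrary assumptions. Only the PENDING and AVAILABLE status values appear in this article, so the loop treats any other value as "not ready yet" and gives up after a fixed number of attempts.

```python
import time


def wait_until_available(fetch_status, initial_wait=4, max_wait=60,
                         max_attempts=20, sleep=time.sleep):
    """Poll fetch_status() until it returns "AVAILABLE".

    fetch_status is a hypothetical callable returning the status string
    of the export. Returns the number of calls that were made.
    """
    wait = initial_wait
    for attempt in range(max_attempts):
        if fetch_status() == "AVAILABLE":
            return attempt + 1
        sleep(wait)
        # double the wait after each attempt, but never exceed max_wait
        wait = min(wait * 2, max_wait)
    raise TimeoutError("export still not available after %d attempts" % max_attempts)


# Usage with a stand-in for the service: two PENDING answers, then AVAILABLE.
statuses = iter(["PENDING", "PENDING", "AVAILABLE"])
waits = []  # records the requested waits instead of actually sleeping
attempts = wait_until_available(lambda: next(statuses), sleep=waits.append)
print(attempts, waits)  # 3 [4, 8]
```

Passing the sleep function as a parameter keeps the loop testable without real delays.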
Step 3: Download the export
When the status property of your export is AVAILABLE, you can retrieve the export itself.
Use the GET rest/ua/v15/exports/{exportsId} operation, setting the redirect parameter to true (see redirect (Boolean)).
Authenticate the request using an access token granting the Analytics - Data exports - View privilege in the target organization (see Manage API keys).
The following request retrieves a data export, that is, a zip file containing one CSV file per exported table:
GET https://platform.cloud.coveo.com/rest/ua/v15/exports/9429fbb7-4217-47b8-8e25-8812ae6c31e5?redirect=true HTTP/1.1
Authorization: Bearer **********-****-****-****-************
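Once the response body has been downloaded, it can be read without writing anything to disk. The sketch below assumes that the payload is a zip archive of UTF-8 encoded CSV files; the function name read_export_archive and the sample file contents are hypothetical stand-ins, not actual service output.

```python
from io import BytesIO
from zipfile import ZipFile


def read_export_archive(payload):
    """Map each CSV file name in the archive to its decoded text (UTF-8 assumed)."""
    with ZipFile(BytesIO(payload)) as archive:
        return {
            name: archive.read(name).decode("utf-8")
            for name in archive.namelist()
            if name.endswith(".csv")
        }


# Build a small stand-in archive in memory to exercise the function.
buffer = BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("searches.csv", "visitId,datetime\nv1,2019-01-01 00:00:00\n")
    archive.writestr("clicks.csv", "visitId,datetime\nv1,2019-01-01 00:00:05\n")

tables = read_export_archive(buffer.getvalue())
print(sorted(tables))
```

With a real response, the payload would be the raw bytes of the response body.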
Code sample
The following Python script implements a simplified client that completes the above steps. It then adds a c_eventType column to the clicks, searches, and custom_events tables, joins them into a single table, and sorts the result by visitId and datetime.
import requests  # needs to be installed
import json
import time
from zipfile import ZipFile
from io import BytesIO
import pandas as pd  # needs to be installed


def create_export(token, from_date, to_date, org=None, file_name=None, d=None):
    """Creates an export

    :param token: An access token granting the Analytics - Data exports - Edit privilege in the target organization.
    :param from_date: The date to begin the export at (for example, "2019-03-12T14:20:20.266Z").
    :param to_date: The date to end the export at (for example, "2019-04-12T14:20:20.266Z").
    :param org: The organization ID; only required when authenticating with an OAuth2 token.
    :param file_name: The export name; defaults to the generated export ID.
    :param d: A description of the export.
    :return: The generated export ID, or None if the request failed.
    """
    url = "https://platform.cloud.coveo.com/rest/ua/v15/exports"

    # define export
    querystring = {
        "from": from_date,
        "to": to_date
    }
    if org:
        querystring["org"] = org
    if file_name:
        querystring["filename"] = file_name
    if d:
        querystring["d"] = d

    headers = {
        "Authorization": "Bearer %s" % token
    }

    response = requests.request("POST", url, headers=headers, params=querystring)
    response_json = json.loads(response.text)

    try:
        return response_json["id"]
    except KeyError:
        # the response has no id, so the export was not created
        print("\nThere was an issue with your export creation request. Here is the response from the service:")
        print(" %s\n" % json.dumps(response_json))
        return None


def get_export(export_id, token, org=None):
    """Shows the information of an export

    :param export_id: The export ID.
    :param token: An access token granting the Analytics - Data exports - View privilege in the target organization.
    :param org: The organization ID; only required when authenticating with an OAuth2 token.
    :return: Information about the export, parsed from the JSON response.
    """
    # request to get export information (redirect=false returns JSON, not the file)
    url = "https://platform.cloud.coveo.com/rest/ua/v15/exports/%s" % export_id
    querystring = {
        "redirect": "false"
    }
    if org:
        querystring["org"] = org

    headers = {
        "Authorization": "Bearer %s" % token
    }

    response = requests.request("GET", url, headers=headers, params=querystring)
    response_json = json.loads(response.text)
    return response_json


def download_export(link, token, org=None):
    """Downloads an export and extracts it into the current directory

    :param link: The export link. Contains the id and redirect parameter.
    :param token: An access token granting the Analytics - Data exports - View privilege in the target organization.
    :param org: The organization ID; only required when authenticating with an OAuth2 token.
    """
    if org:
        link += "&org=%s" % org

    headers = {
        "Authorization": "Bearer %s" % token
    }

    # get the export using the link which already contains the appropriate query parameters, and possibly the org
    data = requests.get(link, headers=headers)
    # fail early on an HTTP error instead of trying to unzip an error response
    data.raise_for_status()

    with ZipFile(BytesIO(data.content)) as z:
        z.extractall()


def join_tables():
    """Joins the clicks, searches, and custom_events tables found in the current directory.
    """
    # get click, search, and custom events (read_csv already returns DataFrames)
    df = [pd.read_csv(csv_file) for csv_file in ["clicks.csv", "searches.csv", "custom_events.csv"]]

    # add c_eventType column
    df[0]["c_eventType"] = "Click"
    df[1]["c_eventType"] = "Search"
    df[2]["c_eventType"] = "Custom"

    # join DataFrames into one
    dfAllEvents = pd.concat(df, sort=True)

    # sort first by visitId and then by datetime
    dfAllEvents = dfAllEvents.sort_values(by=["visitId", "datetime"], ascending=True)

    # reset the index column
    dfAllEvents = dfAllEvents.reset_index(drop=True)

    # export to csv
    dfAllEvents.to_csv("allEvents.csv")


if __name__ == "__main__":
    token = "**********-****-****-****-***********"
    from_date = "2018-06-01T05:00:00.000Z"
    to_date = "2019-03-01T04:59:59.999Z"
    org = None

    export_id = create_export(token, from_date, to_date, org=org)
    if export_id is None:
        # the export was not created, so there is nothing to monitor
        raise SystemExit(1)

    # initial time to wait in between queries when checking the status of the export
    waitInterval = 4
    timeCounter = waitInterval
    status = "PENDING"
    export = None

    while status != "AVAILABLE":
        time.sleep(waitInterval)
        print("%s ... %s seconds" % (status, timeCounter))
        # exponential backoff, capped at 60 seconds
        waitInterval = min(2 * waitInterval, 60)
        timeCounter += waitInterval
        export = get_export(export_id, token, org=org)
        try:
            status = export["status"]
        except KeyError:
            print("\nThere was an issue with your export get request. Here is the response from the service:")
            print(" %s\n" % json.dumps(export))
            break

    # only attempt the download when the loop ended because the export is ready
    if status == "AVAILABLE":
        try:
            download_export(export["downloadLink"], token, org=org)
            print("Export complete")
        except Exception as error:
            print("\nThere was an issue when attempting to download your export: %s\n" % error)
        else:
            join_tables()
Reference
Authenticating
The Usage Analytics Read API relies on the bearer HTTP authentication scheme.
All HTTP requests made to the Usage Analytics Read service must include an Authorization header with a valid access token (that is, an API key or OAuth2 token):
Authorization: Bearer <token>
To create a data export, the <token> must grant the Analytics - Data exports - Edit privilege in the target organization (see Create an API key).
To retrieve an export, the Analytics - Data exports - View privilege is sufficient.
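Note
If a Usage Analytics Read API request is authenticated using an OAuth2 token, the org query parameter must be included to identify the target organization. It isn't required with an API key, since the organization is extracted from the key.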
Creating an export - query parameters
When making a POST request to https://platform.cloud.coveo.com/rest/ua/v15/exports to create a data export, the following query parameters apply (see /v15/exports).
from (required, string)
The timestamp from which to begin the export.
Must be in the ISO 8601 format YYYY-MM-DDThh:mm:ss.sssZ (see ISO 8601).
2018-02-05T14:20:20.266Z
to (required, string)
The timestamp at which to end the export.
Must be in the ISO 8601 format YYYY-MM-DDThh:mm:ss.sssZ (see ISO 8601).
2019-02-05T15:20:20.266Z
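The following sketch shows one way to produce from and to values in the expected format from Python datetime objects. It also converts the epoch-millisecond values that the example response bodies in this article report for from and to back into that format. Both function names are hypothetical.

```python
from datetime import datetime, timezone


def to_export_timestamp(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss.sssZ in UTC.

    A naive datetime would be interpreted as local time by astimezone(),
    so pass aware datetimes only.
    """
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (utc.microsecond // 1000)


def from_epoch_millis(millis):
    """Convert epoch milliseconds to an aware UTC datetime using integer math."""
    seconds, remainder = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=remainder * 1000)


print(to_export_timestamp(datetime(2018, 2, 5, 14, 20, 20, 266000, tzinfo=timezone.utc)))
# → 2018-02-05T14:20:20.266Z
print(to_export_timestamp(from_epoch_millis(1551416399999)))
# → 2019-03-01T04:59:59.999Z
```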
f (string)
The filter to apply to all the event dimensions (analogous to the SQL WHERE command).
If you send many f query parameters, the service joins them with the AND operator.
You can define an f argument with the Read filter syntax.
You can also use preexisting named filters by preceding their id by the $ character.
- $EXCLUDE_PAGEVIEW_BOTS AND (city IN ['Québec','Quebec']) where EXCLUDE_PAGEVIEW_BOTS is a preexisting named filter.
- Alternatively, you can send several f query parameters. In other words, appending &f=$EXCLUDE_PAGEVIEW_BOTS AND (city IN ['Québec','Quebec']) to your query string is equivalent to appending &f=$EXCLUDE_PAGEVIEW_BOTS&f=(city IN ['Québec','Quebec']) to your query string.
Note
You can retrieve the exhaustive list of named filters available in your organization by navigating to the Named Filters page of the Coveo Administration Console (see Manage named filters). Alternatively, you can make a call to the
fs (string)
The filter to apply to the search and click event dimensions (analogous to the SQL WHERE command).
If you send many fs query parameters, the service joins them with the AND operator.
You can define an fs argument with the Read filter syntax.
You can also use preexisting named filters by preceding their id by the $ character.
$NO_BLANK_QUERIES AND searchcausev2=='resultsSort'
where NO_BLANK_QUERIES is a preexisting named filter.
Note
If you specify a condition on a custom event dimension in your
fc (string)
The filter to apply to the custom event dimensions (analogous to the SQL WHERE command).
If you send many fc query parameters, the service joins them with the AND operator.
You can define an fc argument with the Read filter syntax.
You can also use preexisting named filters by preceding their id by the $ character.
customeventtype=='getMoreResults'
Note
If you specify a condition on a search or click event dimension in your
filename (string)
A name for the export zip file. Default value is the export ID.
myExportFileName
d (string)
A description for the export. Appears in the response body when retrieving the information of an export.
This is my data export.
dimensions (string)
The name of the dimensions to export.
More precisely, the returnName when using the API, and the API Name when using the Administration Console.
If none is provided, all dimensions are exported.
To specify several dimensions values, send several dimensions query parameters.
To export only the originLevel1 and originLevel2 dimensions, append the following to your query string:
&dimensions=originLevel1&dimensions=originLevel2
Note
You can retrieve the exhaustive list of dimensions available in your organization by navigating to the Dimensions page of the Coveo Administration Console (see Manage dimensions on custom metadata). Alternatively, you can make a request to the
tables (string)
The tables to export (see What’s the database schema of the different tables that I see when doing an export?).
If none is provided, all tables are exported.
To specify several tables values, send several tables query parameters.
Possible values are:
- clicks
- searches
- custom_events
- keywords
- groups
Note
The group names in the
To export only the searches and clicks tables, append the following to your query string:
&tables=searches&tables=clicks
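Repeated parameters such as tables, dimensions, and f can be produced by encoding a list of key-value pairs rather than a dictionary. The sketch below uses only the standard library; the build_query helper is hypothetical, and the sample values are taken from the examples in this section.

```python
from urllib.parse import parse_qs, urlencode


def build_query(from_date, to_date, tables=(), dimensions=(), filters=()):
    """Build a query string in which multi-valued parameters are repeated."""
    pairs = [("from", from_date), ("to", to_date)]
    pairs += [("tables", table) for table in tables]
    pairs += [("dimensions", dimension) for dimension in dimensions]
    pairs += [("f", expression) for expression in filters]
    # urlencode percent-encodes each value and keeps the repeated keys in order
    return urlencode(pairs)


query = build_query(
    "2019-01-01T00:00:00.000Z",
    "2019-03-01T00:00:00.000Z",
    tables=["searches", "clicks"],
    dimensions=["originLevel1", "originLevel2"],
    filters=["$EXCLUDE_PAGEVIEW_BOTS", "(city IN ['Québec','Quebec'])"],
)
print(query)
print(parse_qs(query)["f"])
```

If you use the requests library as in the code sample, passing a list as a value in the params dictionary produces the same repeated keys.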
format (string)
The format of the generated CSV files. Possible values are:
- EXCEL: within strings, " and \ characters are escaped by a " character.
- NO_NEWLINE: no \n nor \r at the end of each line. Also, within strings, " and \ characters are escaped by a \ character.
Default value is EXCEL.
NO_NEWLINE
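The two formats call for different parser settings when you read the files back. The following sketch uses Python's csv module and assumes the escaping behaves as described above: a doubled quote for EXCEL, and a backslash prefix for NO_NEWLINE. The sample rows are made up for illustration, and the case of a backslash inside an EXCEL string isn't covered.

```python
import csv
import io


def parse_excel(text):
    """Parse CSV text in which a quote inside a string is written as two quotes."""
    # this is the default behavior of csv.reader (doublequote=True)
    return list(csv.reader(io.StringIO(text)))


def parse_no_newline(text):
    """Parse CSV text in which quotes and backslashes are escaped by a backslash."""
    return list(csv.reader(io.StringIO(text), escapechar="\\", doublequote=False))


# Hypothetical rows: the second field contains quotes (and a backslash).
excel_sample = 'query,"say ""hi"""\n'
no_newline_sample = 'query,"say \\"hi\\" in C:\\\\tmp"\n'

print(parse_excel(excel_sample))
print(parse_no_newline(no_newline_sample))
```

pandas.read_csv accepts escapechar and doublequote arguments with the same meaning, so the join_tables function of the code sample could be adapted in the same way.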
useDisplayNames (Boolean)
Whether to use the display names in the export table headers.
When set to false, API names are used.
You can compare the display and API names (see What’s the database schema of the different tables that I see when doing an export?).
Default value is false.
- You set useDisplayNames to true. In the searches.csv file of your export, there's a column titled User Name.
- You then perform the same export, but you set useDisplayNames to false. In the searches.csv file of your export, that same column is now titled username.
Retrieving an export - query parameters
When making a GET request to https://platform.cloud.coveo.com/rest/ua/v15/exports/{exportId} to retrieve a data export or its information, the following query parameters apply (see /v15/exports/{exportsId}).
exportId (required, string)
The unique identifier of the export.
8sdfn3n2ns-3042-s9df-s9df-9sdfjsdjf9jd
redirect (Boolean)
Whether to return an HTTP redirect to the actual file.
In other words, returns the export file itself when set to true, and returns information about the export when set to false.
Default value is false.
true
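As a small illustration of the two modes, the sketch below builds the retrieval URL for either case. The build_export_url helper is hypothetical. The lowercase true and false strings mirror the examples in this article; note that str(True) in Python yields "True", so the value is spelled out explicitly here.

```python
from urllib.parse import quote, urlencode

EXPORTS_URL = "https://platform.cloud.coveo.com/rest/ua/v15/exports"


def build_export_url(export_id, redirect=False, org=None):
    """Return the URL that retrieves an export (redirect=True) or its information."""
    # spell out lowercase booleans, as used in the examples of this article
    params = {"redirect": "true" if redirect else "false"}
    if org is not None:
        params["org"] = org
    return "%s/%s?%s" % (EXPORTS_URL, quote(export_id, safe=""), urlencode(params))


export_id = "9429fbb7-4217-47b8-8e25-8812ae6c31e5"
print(build_export_url(export_id))                 # information about the export
print(build_export_url(export_id, redirect=True))  # the export file itself
```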