Use case: Retrieve Coveo UA data with the Python connector
Use case: Retrieve Coveo UA data with the Python connector
In the Coveo Administration Console, the Snowflake reader account gives you direct access to your organization's' data and it enables you to explore that data as well as create customized exports and materialized views. You can use different Business Intelligence (BI) tools for data analysis, such as Tableau™ and Power BI™.
Alternatively, you have the option to use a native Snowflake connector. In this article, we’ll demonstrate how to use Python™ to explore and export your data.
Step 1: Install the connector
To get started, you must first install the Snowflake Connector for Python using the following command:
pip install snowflake-connector-python
For detailed installation steps, see Installing the Python Connector.
Step 2: Install Pandas
Once you’ve installed the connector, you must also install Pandas, which is a data analysis library that works with Python. Use the following command to install:
pip install pandas
For detailed installation steps, see Using Pandas DataFrames with the Python Connector.
Step 3: Connect to Snowflake
Following the installations of Snowflake Connector for Python and Pandas, you must create a script to connect to Snowflake, as shown in the following example:
import snowflake.connector
import pandas as pd
conn = snowflake.connector.connect(
user='<xxxxxx>',
password='<xxxxxxxx>',
account='<account>',
ocsp_fail_open=False,
)
Step 4: Define a SQL query
Define and test the query in a Snowflake worksheet. For example:
set from_date ='<from_date>';
set to_date ='<to_date>';
select * from "COVEO"."ua"."all_events_shared" where "datetime" between $from_date and $to_date;
Once you’re satisfied with the query, you must add it to your Python script. For example:
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."all_events_shared" where "datetime" between %(from_date)s and %(to_date)s'
,conn
,params={"from_date":from_date, "to_date":to_date})
Step 5: Add the data to a DataFrame
Use the following code to read the data into a Pandas DataFrame.
df = pd.DataFrame(sql_query)
Step 6: Export to CSV
As a final step, export the data to a CSV file as follows:
df.to_csv (r'searches.csv', index = False)
Reference
Code sample
The script below is similar to the Python script used for UA data exports, however it’s shorter since it doesn’t require joining the various tables on the CSV file.
import snowflake.connector
import pandas as pd
conn = snowflake.connector.connect(
user='xxxxxx',
password='xxxxxxxx',
account='account',
ocsp_fail_open=False,
)
# No need to define org name, your reader account bound to your data
from_date = "2022-01-01"
to_date = "2022-01-02"
# Searches
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."searches_shared" where "datetime" between %(from_date)s and %(to_date)s limit 10;'
,conn
,params={"from_date":from_date, "to_date": to_date})
df = pd.DataFrame(sql_query)
df.to_csv (r'searches.csv', index = False)
# Clicks
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."clicks_shared" where "datetime" between %(from_date)s and %(to_date)s'
,conn
,params={"from_date":from_date, "to_date": to_date})
df = pd.DataFrame(sql_query)
df.to_csv (r'clicks.csv', index = False)
# Customs
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."custom_events_shared" where "datetime" between %(from_date)s and %(to_date)s'
,conn
,params={"from_date":from_date, "to_date": to_date})
df = pd.DataFrame(sql_query)
df.to_csv (r'customs.csv', index = False)
# Views
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."views_shared" where "datetime" between %(from_date)s and %(to_date)s'
,conn
,params={"from_date":from_date, "to_date": to_date})
df = pd.DataFrame(sql_query)
df.to_csv (r'views.csv', index = False)
# All events combined
sql_query = pd.read_sql_query('select * from "COVEO"."ua"."all_events_shared" where "datetime" between %(from_date)s and %(to_date)s'
,conn
,params={"from_date":from_date, "to_date": to_date})
df = pd.DataFrame(sql_query)
df.to_csv (r'all.csv', index = False)