Query via API

This is for:

Developer

With LiveTap, you have full access to all of your Qubit data via API. This enables the following use cases:

  • Export events into your data warehouse and business intelligence software such as Looker

  • Fuel data science models with structured, quality data

  • Pipe real-time data into analytics dashboards during key sales campaigns

Data access methods

API access enables you to query data from a variety of tools, such as:

Authentication

To connect you’ll need a Google Account or Google Cloud Service account with permissions to access the Google Cloud project your data is stored in and run jobs on BigQuery. Contact your Qubit Customer Success Manager for details.

API libraries

If you are connecting from a language such as Python, take a look at the Google Cloud BigQuery documentation for details on how to install and get started with the various libraries available.

Note

You may also find Google Colab useful—​hosted Juypter workbooks in the cloud, stored on your Google Drive.

Example: connecting via Juypter Notebooks

Note

You’ll find the Google Cloud article on using Juypter notebooks with BigQuery useful if you are new to Juypter.

An excellent place to start experimenting with your data is within Juypter Notebooks. This Python environment enables you to connect via API to your data and write reusable Python functions to query, transform, and model your data.

Getting started

The following example uses Juypter notebooks following the instructions to set up Google Cloud with Python and then installing the google-cloud-bigquery package with pip.

Firstly, follow these steps ("Setting up a local Jupyter environment") for a guide on authenticating yourself with Google Cloud.

Import the packages we require

# https://pandas.pydata.org/
import pandas as pd

# https://cloud.google.com/bigquery/docs/reference/libraries
from google.cloud import bigquery
bq = bigquery.Client()

Run a query

Using the retail data schema, here we write a query to:

  1. Select transactions & total basket revenue for the last 7 days (including today) from event_ecBasketTransactionSummary.

  2. Select the country (ipLocation_country) from event_qubit_session.

  3. Join this data so we have the country of every user who transacted.

  4. Sum to calculate our top 10 country & currency combinations.

# define our query
QUERY= """
with
transactions as
(
  select
    transaction_id,
    context_id,
    basket_total_currency,
    basket_total_value
  from
    `qubit-client-12345.my_store__v2.event_ecBasketTransactionSummary`
  where
    meta_recordDate  > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
),

location as
(
  select
    context_id,
    max(ipLocation_country) as country
  from
    `qubit-client-12345.my_store__v2.event_qubit_session`
  where
    meta_recordDate  > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  group by
    1
)

select
  location.country,
  transactions.basket_total_currency,
  sum(transactions.basket_total_value) as total_basket_revenue

from
  transactions

left join
  location on (transactions.context_id = location.context_id)

group by
  1, 2

order by
  3 desc

limit
  10
"""

# execute our query on BigQuery
job = bq.query(QUERY)

# turn our query result into a Pandas DataFrame
df = job.to_dataframe()

View our data

# take a peek at our data using the Pandas DataFrame head() function
df.head()
country	            basket_total_currency               total_basket_revenue

united kingdom      GBP	                                152,431.14
ireland	            GBP	                                52,000.43
united states	    GBP	                                45,210.10
spain	            GBP	                                2,869.50
hong kong	        GBP	                                1,475.57
...

From here, we can adjust our query and re-run, or use various features of Pandas to analyze our query data. Alternatively, we may wish to create charts or even feed this data into a data science model.

Export result to Excel

If you are running locally on your machine, we can easily export our dataframe to Excel:

writer = pd.ExcelWriter('query.xlsx')
df.to_excel(writer, 'Query')
writer.save()