Analyze the Performance of Pipeline A Versus Pipeline B

Members of the Administrators, Analytics Managers, Analytics Viewers, and Relevance Managers built-in groups can take advantage of the A/B Testing usage analytics report template to evaluate the effectiveness of query pipeline changes and act when necessary.

This article is written for a relevance analyst comparing two query pipelines in an active A/B test created in the Coveo Administration Console (see Manage Query Pipelines and Manage A/B Tests). You can, however, analyze any A/B test associated with any type of item, as long as your search interface sends the right information for the A/B Test Name and A/B Test Version dimensions to the Coveo Analytics service (see A/B Test Name and A/B Test Version).

Analyze the Performance of Pipeline A Versus Pipeline B

  1. On the Reports page, create a dashboard using the A/B Testing template (see Add Usage Analytics Dashboards).

  2. On the A/B Testing page:

    1. Click the date range in the upper-right corner.

    2. In the Report Period dialog, select the Start date of your A/B test (see Set the Period to Review Search Usage Data).

    3. Evaluate the statistical significance of your results (see A/B Tests Leading Practices). To do so, compare the available metric and dimension values for each pipeline:

      • Keep in mind that a small percentage of users don’t accept cookies. They might have disabled cookies in their browser settings, or they might use an extension that blocks cookies. Since the Coveo Search API uses cookies to direct users into pipelines, these users can switch between pipelines A and B during a single session. On average, they shouldn’t impact aggregated metrics, but if you look closely at the details, you could see abnormal values.

      • When you use a proxy, ensure that the cookies used by Coveo are forwarded. If they aren’t, add those cookies to the proxy’s list of forwarded cookies.
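As an illustration, restricting a raw Cookie header to an allow-list of forwarded cookies could look like the following Python sketch. The cookie name used below is a placeholder, not a confirmed Coveo cookie name; check your deployment for the cookies Coveo actually sets.

```python
def filter_forwarded_cookies(cookie_header, forwarded_names):
    """Keep only the name=value pairs whose name is in forwarded_names."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name = pair.split("=", 1)[0]
        if name in forwarded_names:
            kept.append(pair)
    return "; ".join(kept)

# "coveo_visitorId" is a hypothetical name used for illustration only.
FORWARDED = {"coveo_visitorId"}
print(filter_forwarded_cookies("session=abc; coveo_visitorId=123", FORWARDED))
# -> coveo_visitorId=123
```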

      • In the A/B Test Overview section Table card:

        If your item changes are effective, you should see higher Search Event Click-Through, Search Events With Clicks, and Relevance Index values, and a lower Average Click Rank value, for item B than for item A.

        • Search Event Click-Through

          This metric is the percentage of queries with at least one click on search results for each test set (e.g., query pipeline). The value is between 0 and 100. Higher values are better.

        • Average Click Rank

          This metric is the average position of clicked items in the search results for each test set (e.g., query pipeline). The value is greater than zero. Lower values are better. A value of 1 would mean that users always open the first item in the search results list.

        • Search Events With Clicks

          This metric is the number of queries resulting in at least one click by the user for each test set (e.g., query pipeline). Higher values are better.

        • Relevance Index

          This metric is calculated with a formula based on the Search Event Count and Average Click Rank metrics to highlight ranking issues for each test set (e.g., query pipeline). The value is between 0 and 1. Higher values are better. Frequently submitted queries whose clicks land farther down the search results list yield a lower value.
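To make the first three metric definitions concrete, here is a minimal Python sketch that computes them from hypothetical per-query records. The record shape is an assumption for illustration, not the Coveo usage analytics schema, and the Relevance Index is omitted because its exact formula isn’t given here.

```python
# Hypothetical per-query records: "click_ranks" holds the 1-based positions
# of any results clicked after that query (empty list = no clicks).
events = [
    {"pipeline": "A", "click_ranks": [3]},
    {"pipeline": "A", "click_ranks": []},
    {"pipeline": "B", "click_ranks": [1, 2]},
    {"pipeline": "B", "click_ranks": [1]},
]

def overview_metrics(events, pipeline):
    rows = [e for e in events if e["pipeline"] == pipeline]
    with_clicks = [e for e in rows if e["click_ranks"]]
    ranks = [r for e in with_clicks for r in e["click_ranks"]]
    return {
        # Number of queries with at least one click.
        "Search Events With Clicks": len(with_clicks),
        # Percentage of queries with at least one click (0-100).
        "Search Event Click-Through": 100.0 * len(with_clicks) / len(rows),
        # Mean 1-based position of clicked items (lower is better).
        "Average Click Rank": sum(ranks) / len(ranks) if ranks else None,
    }

print(overview_metrics(events, "A"))
print(overview_metrics(events, "B"))
```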

      • In the Metrics Comparison tab, in the item A and B sections:

        When comparing two query pipelines, for a similar number of visits (Visit Count), you want pipeline B to have:

        • Higher Search Events With Clicks, Search Events With Results, and Search Event Click-Through values than pipeline A.

        • A lower Average Click Rank value than pipeline A over time.

        • Visit Count

          This metric is the number of visits (regardless of unique IP addresses and user names) in the selected report period.

        • Search Event Count

          This metric is the number of search events that were performed in the selected report period.

        • Click Event Count

          This metric is the number of clicks that were performed in the selected report period.

        • Average Click Rank

          See the metric definition above.

        • Search Event Click-Through

          See the metric definition above.

        • Average Click per Visit

          This metric is the number of clicks (Click Event Count) divided by the number of visits (Visit Count) in the selected report period.

        • Search Events With Clicks

          See the metric definition above.

        • Queries With Results

          This pie chart shows the percentages of queries that returned results and queries that didn’t return results using the Has Results dimension and the Search Event Count metric in the selected report period.

        • Source Name (Click Event Count per Source titled card)

          This bar chart shows the number of clicks for each source containing the items that users clicked in the selected report period.

        • Click Ranking Modifier (Opened Results Suggested By titled card)

          (Only when the tested pipelines contain a Coveo ML ART model) This pie chart shows the relative importance, in percentage, of each ranking modifier (query pipeline component) that affected the ranking of the items users clicked, using the Click Ranking Modifier dimension and the Click Event Count metric, in the selected report period.
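As a quick numeric sketch of the derived values defined above, the following Python snippet computes Average Click per Visit and the Queries With Results split from assumed input counts (the numbers are illustrative, not real data):

```python
def derived_comparison_metrics(visit_count, search_event_count,
                               click_event_count, searches_with_results):
    """Derived comparison values from the raw counts (illustrative only)."""
    return {
        # Click Event Count divided by Visit Count.
        "Average Click per Visit": click_event_count / visit_count,
        # Share of queries that returned at least one result; the
        # "Queries With Results" pie chart shows this split.
        "Queries With Results (%)": 100.0 * searches_with_results / search_event_count,
    }

print(derived_comparison_metrics(200, 500, 300, 450))
# {'Average Click per Visit': 1.5, 'Queries With Results (%)': 90.0}
```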

      • In the Comparison Over Time tab, in the item A and B sections:

        When comparing two query pipelines, for a similar number of visits, you want pipeline B to have:

        • Lower Search Events Without Results and Search Events Without Clicks values than pipeline A.

        • A higher Average Search Event Click-Through value and a lower Average Click Rank value than pipeline A over time.

        • Average Click Rank over time

          This time series graph shows the average position of clicked items in the search results per interval (hour, day, etc.) based on the selected report time period. The value is greater than zero. Lower values are better. A value of 1 would mean that users always open the first item in the search results list. The upper right numerical values are the average of all interval averages and the peak for the selected period.

          The average of interval averages can be under 1 because days with no clicks contribute a value of 0 to the calculation.
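A small arithmetic sketch of why this happens, using assumed daily values:

```python
# Daily Average Click Rank values over four days; days with no clicks
# contribute 0, which can pull the average of interval averages below 1
# even though any clicked rank is at least 1.
daily_averages = [1.5, 0.0, 0.0, 1.5]  # two days had no clicks
overall = sum(daily_averages) / len(daily_averages)
print(overall)  # 0.75
```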

        • Average Search Event Click-Through over time

          This time series graph shows the percentage of queries with at least one click on search results per interval (hour, day, etc.) based on the selected report time period. The value is between 0 and 100. Higher values are better. The numerical values on the right side are the average of all interval averages and the peak for the selected period.

        • Search Events Without Results over time

          This time series graph shows the number of queries that didn’t return results per interval (hour, day, etc.) based on the selected report time period. Lower values are better, meaning that search results are more often returned following user queries. The numerical values on the right side are the total number of search events without results, the average of all interval averages, and the peak for the selected period.

        • Search Events Without Clicks over time

          This time series graph shows the number of queries that were not followed by at least one click on search results per interval (hour, day, etc.) based on the selected report time period. Lower values are better, meaning that search results are more often clicked following user queries. The numerical values on the right side are the total number of search events without clicks, the average of all interval averages, and the peak for the selected period.
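To evaluate the statistical significance of the difference between the two pipelines, one common approach (an assumption here, not a built-in Coveo feature) is a two-proportion z-test on the Search Event Click-Through rates:

```python
import math

def click_through_z_test(clicks_a, searches_a, clicks_b, searches_b):
    """Two-proportion z-test comparing Search Event Click-Through rates.

    Returns (z, p) where p is the two-sided p-value; p < 0.05 is a
    common (but not universal) significance threshold.
    """
    p_a = clicks_a / searches_a
    p_b = clicks_b / searches_b
    pooled = (clicks_a + clicks_b) / (searches_a + searches_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / searches_a + 1 / searches_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p_value

# Hypothetical counts: 420/1000 queries with clicks for pipeline A,
# 480/1000 for pipeline B.
z, p = click_through_z_test(420, 1000, 480, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```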

  3. In the People to Exclude tab, in the Average Click Rank over time graph, if you see extreme values or spikes, exclude the users in the table that skew the metric (if any) by adding a global filter using the User Id or Visitor Id dimension (see Add Global Dimension Filters).

  4. (When comparing two query pipelines only) When your results are positive and statistically significant, copy the changed rule from the test pipeline (B) to the production pipeline (A) to make the changes effective in your search page (see Copy a Rule to Another Pipeline).

  5. Deactivate and then delete your A/B test (see Manage A/B Tests).

  6. If you were comparing two query pipelines, delete the test pipeline (B) (see Delete a Query Pipeline).

Required Privileges

The following table indicates the required privileges to view and edit dashboards from the Reports page and associated panels (see Manage Privileges and Privilege Reference).

Action          Service - Domain            Required access level
--------------  --------------------------  ---------------------
View dashboard  Analytics - Analytics data  View
                Analytics - Dimensions      View
                Analytics - Named filters   View
                Analytics - Reports         View
Edit dashboard  Analytics - Analytics data  View
                Analytics - Dimensions      View
                Analytics - Named filters   View
                Analytics - Reports         Edit
                Analytics - Administrate    Allowed