Manage A/B tests

The A/B Test tab of a query pipeline configuration allows members with the required privileges to evaluate the impact of query pipeline rule changes by pairing query pipelines in one or more A/B tests.

An A/B test allows you to compare two versions of a query pipeline by splitting the web traffic between both versions. You can then measure which version provides the best results based on metrics that are significant to your Coveo organization.

About A/B tests

How A/B tests are split

Once you activate an A/B test in your search interface and a query is performed by an end user, the Search API assigns the user to either the Original pipeline or its Test scenario version. This assignment is based on a consistent grouping method that uses the client ID provided in the API request. The assignment also respects the configured A/B test ratio. For example, you could set the ratio to route 60% of your traffic to the Original pipeline and 40% to its Test scenario version.

Note

The client ID is typically stored in either local storage or a first-party cookie, and it persists across sessions. Therefore, an end user can have multiple visits but still be consistently routed to the same query pipeline for the duration of the A/B test.

For an A/B test to apply, the Original pipeline's routing condition must be met (see Routing rules). If the query matches the condition, the Search API routes the end user to the appropriate query pipeline following the A/B test ratio.

Note

In the context of an A/B test, the condition applied to the Test scenario version of the pipeline is ignored.

Example

You create an A/B test with:

  • 80% of your traffic directed to the Original pipeline.

  • 20% directed to its Test scenario.

You add the following routing condition to the Original pipeline:

Search hub is communitySearch.

An end user sends a query from the community search interface. The query matches the condition and is therefore considered for A/B testing.

At this point, the query has a 20% chance of being routed to the Test scenario of the query pipeline, and an 80% chance of being routed to the Original pipeline.
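The consistent grouping described above can be illustrated with a minimal sketch. This isn't Coveo's actual assignment algorithm; it's a hypothetical implementation that hashes the client ID into a stable bucket and applies the configured ratio:

```python
import hashlib

def assign_pipeline(client_id: str, test_percentage: int) -> str:
    """Deterministically map a client ID to a bucket in [0, 100),
    then route the query according to the configured traffic split."""
    digest = hashlib.sha256(client_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "test" if bucket < test_percentage else "original"

# The same client ID always lands in the same bucket, so a returning
# visitor is routed to the same pipeline for the duration of the test.
first_visit = assign_pipeline("visitor-123", 20)
second_visit = assign_pipeline("visitor-123", 20)
```

Over many distinct client IDs, the share routed to the Test scenario converges to the configured percentage (20% in this sketch), while each individual user stays on one pipeline across sessions.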

Common use cases

You’ll typically perform A/B tests to evaluate the impact of query pipeline rule changes.

During a test, you compare two versions of a pipeline by splitting the web traffic between both versions. After the test, you can measure which version provides the best results based on metrics that are significant to your Coveo organization. You can then set the winning pipeline as the effective one for your search interface.

You may want to use the A/B Test tab of a query pipeline configuration in the following scenarios:

Explore potential solutions

Consider the hypothesis you want to test before creating your A/B test. Begin by identifying a problem in your organization's search usage, and then create an A/B test to experiment with potential solutions on a segment of your web traffic.

The test results then allow you to compare the performance of your original pipeline in its actual state against its Test scenario version, in which you tweaked a query pipeline rule to observe its effects on your implementation.

Example

By analyzing the performance of your relevance metrics, you realize that you must improve your Average Click Rank to reach your relevance objectives.

You then decide to create a new query pipeline rule, hoping this modification will improve your Average Click Rank metric.

You therefore create an A/B test on your query pipeline and add the new rule in the Test scenario version of the pipeline.

To test the effectiveness of that new rule, you check the values listed in the Key Performance Indicators section of the A/B Test tab to compare the metrics generated by both pipelines (the Original pipeline vs. the Test scenario).

After reviewing the results of the A/B test, you notice that the new rule has improved your Average Click Rank metric.

You then set the Test scenario version of the pipeline as the pipeline to which all of the traffic that meets the condition is redirected.

Adjust the traffic ratio

The A/B Test tab allows you to adjust the traffic ratio you want to send to the Test scenario version of the pipeline, even after the A/B test has started. You may want to use this tool to speed up or slow down data gathering when experimenting with query pipeline rule modifications.

Example

You create an A/B test that sends 50% of your traffic to your Original pipeline and the other 50% to the Test scenario version of the pipeline.

After a moment, you realize that the Test scenario version of the pipeline isn’t collecting enough data to determine whether the modifications significantly improved your search interface relevance.

You therefore modify the traffic ratio to send 70% of your traffic to the Test scenario version of the pipeline.
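To see why raising the ratio speeds up data gathering, here's a back-of-the-envelope sketch. The query volumes and sample target are hypothetical, not values from the example above:

```python
import math

def days_to_collect(target_queries: int, daily_queries: int, test_share: float) -> int:
    """Days needed for the Test scenario arm to accumulate target_queries,
    given total daily query volume and the share routed to that arm."""
    return math.ceil(target_queries / (daily_queries * test_share))

# Hypothetical numbers: 10,000 queries needed per arm, 2,000 queries/day overall.
at_half = days_to_collect(10_000, 2_000, 0.5)     # with a 50% split: 10 days
at_seventy = days_to_collect(10_000, 2_000, 0.7)  # after raising the ratio to 70%: 8 days
```

Raising the Test scenario share from 50% to 70% shortens the collection window at the cost of exposing more of your traffic to the untested configuration.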

Create an A/B test

  1. On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to create an A/B test, and then click Edit components in the Action bar.

  2. On the A/B Test tab, click Configure A/B Test.

  3. In the Configuration section, under Traffic Split, move the slider to choose the traffic ratio you want to send to the Original pipeline versus its Test scenario version.

    Tip

    When you want to deploy a query pipeline change in the interface to which your main traffic is directed, start by assigning a very small percentage of your audience to the Test scenario version of the pipeline to test the change's behavior. If the change behaves as expected, you can then gradually increase the percentage routed to the Test scenario version.

  4. In the Test scenario section, click Edit to manage query pipeline components of the Test scenario version of the pipeline.

  5. On the Test Scenario page that opens, add, edit, or delete query pipeline rules to build your A/B test.

  6. Once you’re done adding or editing rules, click A/B test at the top of the page to return to the A/B Test tab.

  7. On the A/B Test tab, click Start to activate the A/B test.

You can pause or stop your A/B test at any moment.

Note

When you create an A/B test, a mirror of the original pipeline is created and is used to store the configuration of the Test scenario version of the pipeline. This mirror pipeline isn’t visible on the Query Pipelines page, but you can find it on the Content Browser (platform-ca | platform-eu | platform-au) page, where you can use it to test the pipeline and investigate search relevance issues.


The mirror pipeline remains visible in the Content Browser when the A/B test is active or paused, and is automatically deleted when you stop the test.

Edit an A/B test

  1. On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you want to edit an A/B test, and then click Edit components in the Action bar.

  2. On the A/B Test tab, in the Configuration section, click Edit A/B Test.

  3. In the Edit A/B Test Configuration panel that opens, in the Configuration section, under Traffic Split, move the slider to choose the traffic ratio you want to send to the Original pipeline and its Test scenario version.

  4. In the Test scenario section, click Edit to manage query pipeline components of the Test scenario version of the pipeline.

  5. On the Test Scenario page that opens, add, edit, or delete query pipeline rules to build your A/B test.

  6. Once you’re done adding or editing rules, click A/B test at the top of the page to return to the A/B Test tab.

  7. On the A/B Test tab, click Start to activate the A/B test.

Pause an A/B test

You may want to temporarily stop sending traffic to the Test scenario version of your original pipeline.

  1. On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you have an active A/B test, and then click Edit components in the Action bar.

  2. On the A/B Test tab, click Pause.

You can restart the A/B test at any moment by clicking Start.

Stop an A/B test

After running an A/B test, you’ll be either satisfied or unsatisfied with the changes made to the Test scenario version of the pipeline. Whatever the case, end the A/B test to stop sending traffic to the Test scenario version of the original pipeline.

  1. On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you have an active A/B test, and then click Edit components in the Action bar.

  2. On the A/B Test tab, do one of the following:

    • Click Stop A/B test and keep original configuration to keep applying the rules of the Original pipeline.

    • Click Stop A/B test and use test scenario to apply the rules of the Test scenario version of the pipeline from now on.

    • Click Stop A/B test and extract test scenario to create a new query pipeline with the configuration of the Test scenario. This new query pipeline will appear on the Query Pipelines (platform-ca | platform-eu | platform-au) page with the following name: [name of the original pipeline]-A/B-test-mirror.

  3. Click Confirm.

The A/B test is now stopped. No more traffic will be sent to the Test scenario version of the pipeline.

Review A/B tests performance

You have two avenues for reviewing the performance of your A/B tests: the A/B Test tab and the built-in A/B Testing dashboard report.

Note

The A/B Test tab provides a quick snapshot, as it analyzes a maximum of 60 days' worth of data. For a more comprehensive analysis, use the built-in A/B Testing dashboard report available in the Analytics section of the Coveo Administration Console. This report allows for a more granular view of your metrics and doesn’t have the 60-day data analysis limitation.

"A/B Test" tab

The A/B Test tab displays key metrics that allow you to analyze the performance of the Original pipeline against that of the Test scenario. These metrics include Average Result Rank, Search Clickthrough, Average Number of Results, and Queries Without Results.

  1. On the Query Pipelines (platform-ca | platform-eu | platform-au) page, click the query pipeline in which you have an active A/B test, and then click Edit components in the Action bar.

  2. On the A/B Test tab, if an A/B test is running, in the Key Performance Indicators section, review the available metrics.

Based on the metric results shown for your Original pipeline and the Test scenario, you can determine which pipeline provided the best search relevance.

"A/B Testing" dashboard report

The A/B Testing dashboard report provides a detailed analysis of your A/B tests. This report allows you to compare the performance of the Original pipeline against that of the Test scenario over a period of time.

  1. On the Reports (platform-ca | platform-eu | platform-au) page of the Administration Console, click Add, and then click Dashboard from template.

  2. In the Select a Template panel, click A/B Testing, and then click Select Template.

  3. In the Add A/B Testing Report panel that opens, under Select A/B Test, click the dropdown menu and select the A/B test you want to analyze.

    Note

    Your Coveo organization must contain at least two query pipelines associated with an active A/B Test, and usage analytics data must be available for at least one of the two pipelines. Otherwise, you’ll get the Missing Required Usage Analytics Data error message when adding the report.

  4. Under Select item A and Select item B, click the dropdown menus and select either the items or pipelines you want to compare.

  5. Click Add report.

  6. On the upper-right corner, click Save.

    You can now review the performance of your A/B test in the newly created dashboard report.

    • On the A/B Test Overview tab, the Performance metrics table card gives you an overview of the performance of the Original pipeline and the Test scenario.

      Tip

      If the difference between the Original pipeline and the Test scenario isn’t as significant as you expected, you may want to consider running the A/B test for a longer period of time to gather more data. If you let the test run for two to three weeks and the results are still unclear, you can stop the test and wait for a month before retesting.

    • On the Metrics Comparison tab, the various metric cards give you a breakdown of metric performance for the two pipelines. It’s recommended to review the Average Result Rank, Search Clickthrough, Average Number of Results, and Queries Without Results metrics to determine which pipeline provides the best search relevance.

    • On the Comparison Over Time tab, you can view the dimension time series cards to determine the progression of certain metrics over time. You can adjust the date range on the cards by hovering over the card and clicking the desired period.

      Tip

      If you notice a spike in the Average Click Rank dimension time series card, this may indicate that user behavior is distorting your metrics. This typically occurs when a small number of users are responsible for a large number of clicks, for example, internal users who are testing the search interface. You can view these users on the People to Exclude tab.

    • On the People to Exclude tab, in the Extreme Users to Exclude from the A/B Test table card, you can view the user IDs of the users who are causing spikes in your metrics.

      Tip
      • To exclude a user from the A/B Testing report, click filter-add-2 to create a filter, for example, User Id is not 9e9f35b-59bd-ab.

      • To exclude multiple user IDs, adhere to the following best practice to ease the process of creating a filter with more than one value:

        1. On the upper-right side of the Extreme Users to Exclude from the A/B Test table card, click Explore-Data.

        2. In the Data Explorer panel, in the bottom-right corner, click download. A CSV file containing the user IDs will be downloaded to your device.

        3. Back in the A/B Testing report, click filter-add-2.

          1. Select the User Id dimension.

          2. Select the is not operator.

          3. In the value field, copy and paste the user IDs from the CSV file.

          4. Once you’re done, click Add filter.

Leading practices

When managing A/B tests, consider the following recommendations and tips:

  • Your A/B tests should only contain subtle changes between the Original pipeline and its Test scenario so that you always know the reason for a positive outcome.

  • Set yourself achievable goals in order to improve your relevance metrics.

    Example

    Improve the Search Event Clickthrough (%) ratio of the 10 queries with the lowest Relevance Index by 10%.

  • Your test results must be statistically significant for you to conclude whether the Test scenario version of the pipeline is better than the Original pipeline. You can use tools like SurveyMonkey to calculate the statistical significance of an A/B test.

  • Keep in mind that A/B tests results can often be surprising. You should therefore not reject an A/B test result based on your arbitrary judgment.

  • The difference between the Original pipeline and its Test scenario, as well as the sample size, are two key factors to look into before drawing any conclusion on an A/B test.

    Example

    You want to test the effect on your clickthrough ratio of a couple of changes you made to your default pipeline. You get a clickthrough ratio of 30% with the Original pipeline and 35% with its Test scenario. Each of your two samples contains 1,000 queries. With these numbers, your result is statistically significant 19 times out of 20 (95% confidence level), so you make the Test scenario version of the pipeline effective on your search page.

  • Make your A/B test consistent across your entire search page.

    Example

    When a user searches for char and you enter a thesaurus rule that replaces char with characters, this behavior should be the same across every search box of your search interface.

  • Perform multiple A/B tests and compile the positive results to improve your overall outcome.
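The significance check in the clickthrough example above can be reproduced with a standard two-proportion z-test. This is generic statistics, not a Coveo feature, and the click counts (300 and 350 out of 1,000 queries each) simply restate the example's percentages:

```python
import math

def two_proportion_z(clicks_a: int, n_a: int, clicks_b: int, n_b: int) -> float:
    """z statistic for the difference between two clickthrough proportions,
    using the pooled standard error."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# 30% clickthrough (300/1000) vs. 35% (350/1000), as in the example above.
z = two_proportion_z(300, 1000, 350, 1000)
significant = z > 1.96  # |z| > 1.96 corresponds to 95% confidence (two-tailed)
```

Here z is roughly 2.4, which exceeds the 1.96 threshold, confirming that the difference in the example is statistically significant at the 95% confidence level.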

"Key Performance Indicators" section

This section allows you to review the performance of the Original pipeline compared to that of the Test scenario. Available metrics are:

  • Average result rank: The average position of opened items from the search results.

  • Search clickthrough: The percentage of search events followed by a click.

  • Average number of results: The average number of results for a query.

  • Queries without results: The percentage of queries that returned no results.
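Given the definitions above, these metrics could be computed from raw search events roughly as follows. The event structure here is hypothetical for illustration, not Coveo's actual usage analytics schema:

```python
def kpi_summary(events: list[dict]) -> dict:
    """Compute the four KPIs from hypothetical search-event records.
    Each event has 'result_count' (int) and 'click_ranks'
    (1-based ranks of items opened from the results, possibly empty)."""
    queries = len(events)
    ranks = [rank for e in events for rank in e["click_ranks"]]
    return {
        "average_result_rank": sum(ranks) / len(ranks) if ranks else None,
        "search_clickthrough": sum(1 for e in events if e["click_ranks"]) / queries,
        "average_number_of_results": sum(e["result_count"] for e in events) / queries,
        "queries_without_results": sum(1 for e in events if e["result_count"] == 0) / queries,
    }

events = [
    {"result_count": 10, "click_ranks": [1, 3]},  # two items opened
    {"result_count": 0, "click_ranks": []},       # no results, no click
    {"result_count": 5, "click_ranks": [2]},
]
summary = kpi_summary(events)
```

With this sample data, the average result rank is 2.0, two of the three queries were followed by a click, and one of the three returned no results.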

Required privileges

The following table indicates the privileges required to view or edit A/B tests. Learn more about the Privilege reference or how to manage privileges.

View A/B tests

  • Organization - Organization: View

  • Search - Query pipelines: View

  • Analytics - Analytics data: View

Edit A/B tests

  • Organization - Organization: View

  • Analytics - Analytics data: View

  • Search - Query pipelines: Edit

  • Search - Execute queries: Allowed