
Excluding Sitecore Items From Your Coveo Index

Most of the time, Coveo for Sitecore indexes more items than needed.

It is highly recommended that you keep content that brings no value to your visitors out of the index as much as possible.

This article highlights the Coveo for Sitecore tools which can help you manage your content.

Index Crawling Root

Changing the crawling root is the preferred way to exclude items from the Coveo index (see Changing the Crawling Root of an Index). With this method, the excluded items are never analyzed, which saves time and resources when you rebuild your index.
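The difference between narrowing the crawling root and filtering items can be illustrated with a small, language-neutral model. The sketch below is written in Python and is not Coveo or Sitecore code: the `Item` class, the `find` and `crawl` functions, and the item paths are all hypothetical. Its only purpose is to count how many items are analyzed with each approach.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Item:
    """Hypothetical stand-in for an item in the content tree."""
    path: str
    children: List["Item"] = field(default_factory=list)


def find(root: Item, path: str) -> Optional[Item]:
    """Return the item at `path`, or None if it is not under `root`."""
    if root.path == path:
        return root
    for child in root.children:
        found = find(child, path)
        if found is not None:
            return found
    return None


def crawl(root: Item, keep: Callable[[Item], bool] = lambda item: True):
    """Visit every item under `root`; return (indexed paths, analyzed count)."""
    indexed, analyzed = [], 0
    stack = [root]
    while stack:
        item = stack.pop()
        analyzed += 1  # every visited item costs analysis time
        if keep(item):
            indexed.append(item.path)
        stack.extend(item.children)
    return sorted(indexed), analyzed


tree = Item("/sitecore", [
    Item("/sitecore/content", [
        Item("/sitecore/content/home", [
            Item("/sitecore/content/home/products"),
        ]),
    ]),
    Item("/sitecore/system", [Item("/sitecore/system/settings")]),
    Item("/sitecore/templates", [Item("/sitecore/templates/page")]),
])

# Approach 1: crawl the whole tree and filter items as they are analyzed.
filtered, filtered_cost = crawl(
    tree, keep=lambda item: item.path.startswith("/sitecore/content")
)

# Approach 2: start the crawl at a narrower root; nothing else is visited.
content_root = find(tree, "/sitecore/content")
assert content_root is not None
rooted, rooted_cost = crawl(content_root)
```

In this model both approaches index the same three items, but the filtered crawl analyzes all eight items in the tree while the rooted crawl analyzes only three.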

Available Pipelines

If you cannot exclude those items by changing the crawling root in your setup, you can use pipelines to filter out the items as they are analyzed. Review the following pipeline descriptions to decide which one best suits your needs.

You can use the inbound indexing pipelines to prevent items from being indexed in Coveo. To do so, you can either use the default Sitecore pipelines or the additional Coveo pipelines that come with Coveo for Sitecore (see Sitecore 7 Inbound and Outbound Filter Pipelines and Understanding the Indexing and Search Pipelines).

Use the Coveo pipelines to avoid affecting other indexes used by Sitecore.

If excluded items were previously indexed, you must rebuild the index to delete them.

coveoInboundFilterPipeline

This is the only pipeline that runs exclusively for Coveo indexes, which makes it the ideal place to keep items out of Coveo indexes while leaving Lucene indexes untouched. Its processors take a different argument type than the processors of the Sitecore indexing.filterIndex.inbound pipeline. Thus, Sitecore processors cannot be used directly in this pipeline without adapting their code.

A default Coveo.SearchProvider.InboundFilters.ApplySitecoreInboundFilterProcessor processor is included in the pipeline to run the Sitecore indexing.filterIndex.inbound pipeline. Thus, all the Sitecore inbound pipeline processors are also run for Coveo indexes. This processor can be removed if desired (see Creating a Custom Coveo Inbound Filter).
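The role of this bridging processor can be pictured with a small conceptual model. The following Python sketch does not use the Coveo or Sitecore APIs: `SitecoreFilterArgs`, `CoveoFilterArgs`, `apply_sitecore_filters`, the index name, and the item paths are hypothetical names chosen for illustration. It assumes only what is described above: the two pipelines take different argument types, and one processor in the Coveo pipeline runs the Sitecore pipeline and reports its verdict.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SitecoreFilterArgs:
    """Hypothetical stand-in for the Sitecore pipeline argument type."""
    item_path: str
    is_excluded: bool = False


@dataclass
class CoveoFilterArgs:
    """Hypothetical stand-in for the Coveo pipeline argument type."""
    item_path: str
    index_name: str
    is_excluded: bool = False


SitecoreProcessor = Callable[[SitecoreFilterArgs], None]
CoveoProcessor = Callable[[CoveoFilterArgs], None]


def exclude_system_items(args: SitecoreFilterArgs) -> None:
    """Example processor registered in the Sitecore inbound pipeline."""
    if args.item_path.startswith("/sitecore/system"):
        args.is_excluded = True


def apply_sitecore_filters(sitecore_pipeline: List[SitecoreProcessor]) -> CoveoProcessor:
    """Build a Coveo processor that runs the Sitecore pipeline and copies its verdict."""
    def processor(args: CoveoFilterArgs) -> None:
        # Convert between the two argument types, run, then copy the result back.
        inner = SitecoreFilterArgs(args.item_path, args.is_excluded)
        for step in sitecore_pipeline:
            step(inner)
        args.is_excluded = inner.is_excluded
    return processor


def is_excluded(pipeline: List[CoveoProcessor], item_path: str) -> bool:
    """Run a Coveo pipeline for one item of a hypothetical Coveo index."""
    args = CoveoFilterArgs(item_path, index_name="hypothetical_coveo_index")
    for step in pipeline:
        step(args)
    return args.is_excluded


sitecore_pipeline = [exclude_system_items]
with_bridge = [apply_sitecore_filters(sitecore_pipeline)]
without_bridge: List[CoveoProcessor] = []  # bridging processor removed
```

With the bridging processor in place, the Sitecore exclusion also applies to the Coveo index. Once it is removed, Sitecore inbound processors no longer influence what the Coveo index contains.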

Coveo for Sitecore (December 2016)

You can exclude from the index all items that do not have a layout, using the Coveo.SearchProvider.InboundFilters.HasLayoutInboundFilter processor (see Excluding Items Without Layouts From Being Indexed).

coveoSitecoreInboundFilterPipeline

This pipeline uses the same type of arguments as the Sitecore indexing.filterIndex.inbound pipeline and works in conjunction with the Coveo.SearchProvider.InboundFilters.InvokeSitecoreInboundFilterPipeline processor in the coveoInboundFilterPipeline pipeline. It was created so that older Sitecore indexing.filterIndex.inbound processors can be reused for Coveo indexes only, without adjusting their code.

indexing.filterIndex.inbound

This pipeline is available out of the box with Sitecore. It runs its processors for every Sitecore search index: Lucene (or Solr) and Coveo. This is a problem because Coveo for Sitecore runs side by side with the default Lucene indexes, and you probably do not want to exclude items from those indexes. Thus, using this pipeline to exclude items from your Coveo index is not recommended. Use the coveoInboundFilterPipeline provided with Coveo for Sitecore instead, unless you are sure that you want to exclude the items from all indexes, including Lucene, Solr, or Azure Search (see Creating a Custom Sitecore Inbound Filter).
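The scope difference between a shared pipeline and a Coveo-only pipeline can be sketched as follows. This is a conceptual Python model, not Coveo or Sitecore code: `build_indexes`, `is_media`, the index names, and the item paths are hypothetical. It assumes only that shared filters run for every index and that Coveo-only filters run for the Coveo index alone.

```python
from typing import Callable, Dict, List, Set

# A filter returns True when the item must be excluded.
Filter = Callable[[str], bool]


def build_indexes(items: List[str],
                  shared_filters: List[Filter],
                  coveo_only_filters: List[Filter]) -> Dict[str, Set[str]]:
    """Return the items that end up in each (hypothetical) index."""
    indexes: Dict[str, Set[str]] = {"lucene": set(), "coveo": set()}
    for name, content in indexes.items():
        # Shared filters run for every index; Coveo-only filters run for one.
        filters = list(shared_filters)
        if name == "coveo":
            filters += coveo_only_filters
        for item in items:
            if not any(f(item) for f in filters):
                content.add(item)
    return indexes


def is_media(item: str) -> bool:
    """Example filter that excludes media library items."""
    return item.startswith("/sitecore/media library")


items = ["/sitecore/content/home", "/sitecore/media library/logo"]

# Filter registered in the shared pipeline: excluded from every index.
shared = build_indexes(items, shared_filters=[is_media], coveo_only_filters=[])

# Filter registered in the Coveo-only pipeline: the other index is untouched.
scoped = build_indexes(items, shared_filters=[], coveo_only_filters=[is_media])
```

In the `shared` result the media item disappears from both indexes, whereas in the `scoped` result it is missing from the Coveo index only.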

What’s Next?

Once the crawling scope is adjusted and the right filters are in place, you need to decide which content of each item gets indexed (see Creating an HTML Representation of Page Content).
