Excluding Sitecore Items From Your Coveo Index
Most of the time, Coveo for Sitecore indexes more items than you need.
You should exclude from the index as much as possible of the content that brings no value to your visitors.
This article highlights the Coveo for Sitecore tools which can help you manage your content.
Index Crawling Root
Changing the crawling root is the preferred way to exclude items from the Coveo index (see Changing the Crawling Root of an Index). With this method, the excluded items are not even analyzed, which saves time and resources when you rebuild your index.
If you cannot exclude those items by changing the crawling root, you can use pipelines to filter them out as they are analyzed. Review the following pipeline descriptions to decide which one best suits your needs.
You can use the inbound indexing pipelines to prevent items from being indexed in Coveo. To do so, you can either use the default Sitecore pipelines or the additional Coveo pipelines that come with Coveo for Sitecore (see Sitecore 7 Inbound and Outbound Filter Pipelines and Understanding the Indexing and Search Pipelines).
Use the Coveo pipelines to avoid affecting other indexes used by Sitecore.
If excluded items were previously indexed, you must rebuild the index to delete them.
The coveoInboundFilterPipeline pipeline is the only one that runs exclusively for Coveo indexes. This makes it the ideal place to prevent items from being indexed in Coveo indexes while leaving Lucene indexes untouched. The processors for this pipeline require a different type of argument, so processors from the Sitecore indexing.filterIndex.inbound pipeline cannot be used directly in this pipeline without adapting their code.
The Coveo.SearchProvider.InboundFilters.ApplySitecoreInboundFilterProcessor processor is included in the pipeline to run the Sitecore indexing.filterIndex.inbound pipeline. As a result, all the Sitecore inbound pipeline processors also run for Coveo indexes. You can remove this processor if desired (see Creating a Custom Coveo Inbound Filter).
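As a rough illustration of removing that processor, a Sitecore include (patch) file along the following lines could be used. This is only a sketch under stated assumptions: it assumes the processor is registered under the coveoInboundFilterPipeline element, and the assembly name Coveo.SearchProviderBase is an assumption. The type attribute must match the one in your installed Coveo configuration exactly, otherwise the patch will not apply.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <coveoInboundFilterPipeline>
        <!-- Assumption: the assembly name (Coveo.SearchProviderBase) may differ
             in your version; copy the exact type string from your Coveo config. -->
        <processor type="Coveo.SearchProvider.InboundFilters.ApplySitecoreInboundFilterProcessor, Coveo.SearchProviderBase">
          <!-- Removes the processor so that Sitecore inbound filters
               are no longer applied to Coveo indexes. -->
          <patch:delete />
        </processor>
      </coveoInboundFilterPipeline>
    </pipelines>
  </sitecore>
</configuration>
```

With this processor removed, filters registered in the Sitecore indexing.filterIndex.inbound pipeline would no longer affect Coveo indexes, so only Coveo-specific filters would decide what is excluded.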
You can exclude from the index all items that do not have a layout, using the Coveo.SearchProvider.InboundFilters.HasLayoutInboundFilter processor (see Excluding Items Without Layouts From Being Indexed).
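A minimal sketch of enabling this filter through a Sitecore include (patch) file might look like the following. The processor type comes from this article; the assembly name Coveo.SearchProviderBase and the placement directly under coveoInboundFilterPipeline are assumptions to verify against your installed Coveo configuration.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <coveoInboundFilterPipeline>
        <!-- Assumption: assembly name Coveo.SearchProviderBase; check your version.
             Adds the filter that excludes items without a layout. -->
        <processor type="Coveo.SearchProvider.InboundFilters.HasLayoutInboundFilter, Coveo.SearchProviderBase" />
      </coveoInboundFilterPipeline>
    </pipelines>
  </sitecore>
</configuration>
```

If items without a layout were already indexed before the filter was added, a rebuild of the index is needed to delete them.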
This pipeline uses the same type of arguments as the Sitecore indexing.filterIndex.inbound pipeline and works in conjunction with the Coveo.SearchProvider.InboundFilters.InvokeSitecoreInboundFilterPipeline processor in the coveoInboundFilterPipeline pipeline. It was created so that older Sitecore indexing.filterIndex.inbound pipeline processors can be used for Coveo indexes only, without adjusting their code.
The indexing.filterIndex.inbound pipeline is available out of the box with Sitecore. It runs its processors for all Sitecore search indexes: Lucene (or Solr) and Coveo. This is a problem because Coveo for Sitecore runs side by side with the default Lucene indexes, and you probably do not want to exclude items from the Lucene index. It is therefore not recommended to use this pipeline to exclude items from your Coveo index. Use the coveoInboundFilterPipeline provided with Coveo for Sitecore instead, unless you are sure that you want to exclude the items from all indexes, including Lucene, Solr, or Azure Search (see Creating a Custom Sitecore Inbound Filter).
Once the crawling scope is properly adjusted and the right filters are in place, you need to decide what content is indexed for each item (see Creating an HTML Representation of Page Content).