Refresh, rescan, and rebuild

Source update operations ensure that the content displayed in the search results of your Coveo-powered search interface closely matches your original repository content. For a great user experience, changes in your repository must be searchable and displayed in your search interface shortly after they’ve been saved.

To keep a source up to date with your content or to apply a source configuration change, you must run a source update operation. This can be done either manually on the Sources (platform-ca | platform-eu | platform-au) page or automatically with a schedule.

For most sources, Coveo offers three update operations:

  • Refresh

  • Rescan

  • Rebuild

When you run a source update operation, Coveo looks for and indexes changes in your content, such as modified or deleted items and updated permissions.

For more information on the indexing process, see Indexing Pipeline and Keep an Index Up to Date.

A source update (often a rebuild) may be required to apply some changes made to your source configuration. If so, you will be prompted to launch the required operation on the Sources (platform-ca | platform-eu | platform-au) page.

Refresh

During a refresh operation, Coveo crawls the items and permission models that have been identified by your content system as modified since the last source update. Then, Coveo retrieves the changes and updates your index.

A refresh can be scheduled or launched manually on the Sources (platform-ca | platform-eu | platform-au) page. It’s the update operation with the lowest impact on server resources and performance.

Notes
  • Some sources don’t support refresh operations. Most notably, Web sources perform rescans instead.

  • Some sources do support refresh operations, but have limitations. For instance, a refresh may not retrieve all types of changes, such as deleted items or permission changes. For details on any limitation, check the Source Key Characteristics table in the documentation page of the relevant source.
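
The refresh behavior and its limitations can be pictured with a small in-memory model. This is only an illustrative sketch, not Coveo's implementation: the `Item` class, the `refresh` function, the integer timestamps, and the dictionary standing in for the index are all hypothetical. It assumes the content system reports only modified items, so a deletion goes unnoticed.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    body: str
    modified_at: int  # hypothetical modification time, in arbitrary ticks


def refresh(repository, index, last_update):
    """Re-index only the items reported as modified since last_update."""
    changed = [item for item in repository.values() if item.modified_at > last_update]
    for item in changed:
        index[item.item_id] = item.body
    return [item.item_id for item in changed]


# Item "c" was deleted from the repository, but the index still holds it.
repository = {
    "a": Item("a", "first draft", modified_at=1),
    "b": Item("b", "edited text", modified_at=7),
}
index = {"a": "first draft", "b": "old text", "c": "deleted upstream"}

updated = refresh(repository, index, last_update=5)
print(updated)        # ids of the re-indexed items
print(sorted(index))  # ids still present in the index after the refresh
```

In this model, only item "b" is re-indexed, and the stale entry "c" stays in the index until an operation that crawls every item runs.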

Rescan

During a rescan operation, Coveo crawls all items in your content system. Then, Coveo re-indexes only the items and permissions that have been modified since the last source update. A rescan is typically used as a safeguard, that is, to index any change that refresh operations haven’t caught due to their limitations.

A rescan also deletes items from your index that either:

  • Have been deleted in your content system and therefore haven’t been crawled. If you rarely or never delete your content, see Forbid item deletion during a rescan.

  • Have been marked as deleted in your content system.

A rescan can be scheduled or launched manually on the Sources (platform-ca | platform-eu | platform-au) page. It has a moderate impact on server resources and performance.
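
The rescan logic described above, including both deletion cases, can be sketched in the same spirit. Again, this is a simplified model under stated assumptions rather than Coveo's actual algorithm: the `rescan` function, the `marked_deleted` flag, and the dictionary-based repository and index are hypothetical names chosen for illustration.

```python
def rescan(repository, index, last_update):
    """Crawl every item, re-index the modified ones, and drop the deleted ones."""
    reindexed, live_ids = [], set()
    for item_id, item in repository.items():  # every item is crawled
        if item["marked_deleted"]:
            continue
        live_ids.add(item_id)
        if item["modified_at"] > last_update:
            index[item_id] = item["body"]
            reindexed.append(item_id)
    # Indexed items that were not crawled, or were marked as deleted, are removed.
    deleted = [item_id for item_id in index if item_id not in live_ids]
    for item_id in deleted:
        del index[item_id]
    return reindexed, deleted


repository = {
    "a": {"body": "unchanged", "modified_at": 1, "marked_deleted": False},
    "b": {"body": "edited text", "modified_at": 7, "marked_deleted": False},
    "d": {"body": "obsolete", "modified_at": 2, "marked_deleted": True},
}
# "c" no longer exists in the repository; "d" is only marked as deleted.
index = {"a": "unchanged", "b": "old text", "c": "removed upstream", "d": "obsolete"}

reindexed, deleted = rescan(repository, index, last_update=5)
print(reindexed, deleted, index)
```

Because every item is visited, the model notices both the missing item "c" and the flagged item "d", which a refresh relying only on reported modifications could miss.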

Rebuild

During a rebuild operation, Coveo crawls all items in your content system. It then re-indexes all these items.

Typically, a rebuild is only necessary following a source configuration change that affects the indexed content.

You can launch a rebuild manually on the Sources (platform-ca | platform-eu | platform-au) page. The process can take several hours or days and has the greatest impact on server resources and performance since it crawls and re-indexes a large number of items. Therefore, for optimal efficiency, you should only launch a rebuild once you’ve made all the desired source or content changes.

Example

You changed the name of a space in your Confluence Cloud instance. A rebuild of your Confluence Cloud source is now necessary because a rescan would only detect the change in pages that are created or modified after the name change.
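
To see why the rescan falls short in this example, consider the following toy model. It is a sketch under assumptions, not a description of how the Confluence Cloud connector works: the page dictionaries, the `index_page`, `rescan`, and `rebuild` functions, the space names, and the tick-based timestamps are all hypothetical.

```python
def index_page(page, space_name):
    """Build the indexed form of a page, including the name of its space."""
    return {"title": page["title"], "space": space_name}


def rescan(pages, index, space_name, last_update):
    """Crawl all pages, but re-index only those modified since last_update."""
    for page_id, page in pages.items():
        if page["modified_at"] > last_update:
            index[page_id] = index_page(page, space_name)


def rebuild(pages, index, space_name):
    """Crawl all pages and re-index every one of them."""
    for page_id, page in pages.items():
        index[page_id] = index_page(page, space_name)


pages = {
    "p1": {"title": "Onboarding", "modified_at": 3},
    "p2": {"title": "Release notes", "modified_at": 12},
}
# Both pages were indexed while the space was still named "Docs".
index = {page_id: index_page(page, "Docs") for page_id, page in pages.items()}

# The space is renamed to "Handbook" at tick 10; only "p2" is edited afterwards.
after_rescan = dict(index)
rescan(pages, after_rescan, "Handbook", last_update=10)

after_rebuild = dict(index)
rebuild(pages, after_rebuild, "Handbook")

print(after_rescan["p1"]["space"], after_rebuild["p1"]["space"])
```

In this model, the untouched page "p1" keeps the old space name after the rescan, while the rebuild brings every page in line with the new name.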

About the update process

During a source update operation, Coveo retrieves the latest version of each item it crawls.

For example, let’s say you launch the rescan of a source of 100 items at 12:00 AM. At 12:01 AM, while Coveo is crawling the 24th item, you edit the 81st item. At 12:02 AM, Coveo crawls the 81st item and indexes the change you’ve made.

Conversely, changes made to items that Coveo has already crawled will only be picked up and indexed by the next update operation.

For example, let’s say you launch the rescan of a source of 100 items at 12:00 AM. At 12:02 AM, while Coveo is crawling the 74th item, you edit the 10th item. At 12:03 AM, Coveo indexes the 100th item, and the rescan operation ends. Later that day, you launch a refresh. The 10th item, which is marked as modified at 12:02 AM, is then crawled and your item change is indexed.
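
Both scenarios can be replayed with a short simulation. This is a minimal sketch that assumes one item is crawled per tick, in order, and that an edit scheduled for a tick lands just before the crawl step of that tick. The `crawl` function and the `edits` mapping are hypothetical, and the example uses 5 items instead of 100 for brevity.

```python
def crawl(items, edits):
    """Crawl items in order, one per tick, indexing the latest version of each.

    edits maps a tick to (position, new_body); the edit is applied to the
    repository just before the crawl step of that tick.
    """
    indexed = []
    for tick in range(len(items)):
        if tick in edits:
            position, new_body = edits[tick]
            items[position] = new_body
        indexed.append(items[tick])  # the crawler reads the current version
    return indexed


items = ["v1", "v1", "v1", "v1", "v1"]
edits = {
    1: (3, "v2"),  # item 3 is edited before it is crawled: picked up
    3: (0, "v2"),  # item 0 is edited after it was crawled: missed
}

indexed = crawl(items, edits)
print(indexed)
print(items)
```

After the run, the indexed copy of item 3 reflects the edit, while the indexed copy of item 0 is stale until the next update operation crawls it again.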