Indexing Website Content

To index your website content, you need a Coveo organization. Access to the Coveo Platform lets you create a source, which connects your organization to your website data.

Step 1: Getting Started With Coveo Cloud

  1. If you are a first-time user, learn how to log in to the Coveo Platform.

  2. Get an OAuth2 token. You will use it later in this tutorial to authenticate some of the REST API calls you’re going to make.

  3. Using the access token you got in step 2, create a trial organization.

  4. Using the identity registered in step 1, log in to the Coveo Platform.

You can review your organization license limits in the Coveo Platform (see Review Organization Settings and Limits).
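As a minimal sketch of how the access token from step 2 is attached to a REST call such as the one in step 3, the following Python snippet builds (but does not send) an authenticated request. The `Authorization: Bearer` header is the standard way to present an OAuth2 token. The endpoint URL, the `name` query parameter, and the function name `build_create_org_request` are assumptions made for illustration, so confirm the actual path and parameters in the Coveo Platform API reference before using them.

```python
import urllib.parse
import urllib.request

# Assumed endpoint for illustration only; verify it in the Platform API reference.
ORGANIZATIONS_URL = "https://platform.cloud.coveo.com/rest/organizations"


def build_create_org_request(access_token: str, org_name: str) -> urllib.request.Request:
    """Build a POST request that carries the OAuth2 token as a Bearer credential."""
    # The "name" query parameter is an assumption, not a confirmed API detail.
    query = urllib.parse.urlencode({"name": org_name})
    return urllib.request.Request(
        url=f"{ORGANIZATIONS_URL}?{query}",
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_create_org_request("<your-access-token>", "My Trial Org")
    # Inspect the request before sending it with urllib.request.urlopen(request).
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to check the URL and headers first, and keeps the token out of any logged query string.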

Step 2: Creating and Configuring a Source to Index a Website

Coveo provides many out-of-the-box connectors designed to access and index website content. Connectors may be system-specific or generic.

A corporate website may be generated and managed through a Content Management System (CMS). Coveo provides CMS-specific connectors. However, you can also select a more generic connector when creating a source to index content from a website, or from its underlying CMS database (e.g., a Web, Sitemap, or Database connector). For a comprehensive list of the available connection options, see Connector Types.

The following table summarizes Coveo connection options for website content. See the documentation for a given connector for details on its features, including supported content security types and instructions on how to create a source.

System used | Available connectors
Sitecore | Coveo has a system-specific integration for Sitecore websites (see Coveo for Sitecore).
Other systems (Adobe Experience Manager, WordPress, Acquia, Episerver, etc.) | Use the Sitemap Source if your website includes a sitemap file or a sitemap index file.
 | Use the Website Source to crawl your website the way search engines such as Google do.
 | Use the Database Source if you can access the underlying database of your CMS and you know its schema.
 | Use the Generic REST API Source to get content from a remote repository that exposes its data through a REST API.
 | Use the Push Source when a developer must create a custom crawler and push the collected content to your Coveo organization (e.g., for an on-premises content management system developed in-house).
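When deciding whether the Sitemap Source applies, it can help to check what your site actually publishes. The sketch below is not part of Coveo; it is a small standalone helper (the name `read_sitemap` is hypothetical) that uses the public sitemaps.org XML format to tell a sitemap file (`urlset` root element) from a sitemap index file (`sitemapindex` root element) and to list the locations it declares.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def read_sitemap(xml_text: str) -> tuple[str, list[str]]:
    """Return ("sitemap" or "index", list of <loc> values) for a sitemap document."""
    root = ET.fromstring(xml_text)
    if root.tag == SITEMAP_NS + "sitemapindex":
        kind, entry = "index", "sitemap"
    elif root.tag == SITEMAP_NS + "urlset":
        kind, entry = "sitemap", "url"
    else:
        raise ValueError(f"not a sitemap document: {root.tag}")
    # Each entry (<url> or <sitemap>) holds its address in a <loc> child.
    locations = [
        loc.text.strip()
        for loc in root.iterfind(f"{SITEMAP_NS}{entry}/{SITEMAP_NS}loc")
        if loc.text
    ]
    return kind, locations


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/products</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(read_sitemap(SAMPLE))
```

If the helper reports an index, each listed location is itself a sitemap file that declares page URLs.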

Advanced Indexing Options - Adding an Indexing Pipeline Extension

The Coveo Cloud indexing pipeline is the process each item goes through when it is indexed. At this stage, you might want to explore how you can customize this process by adding an extension (see Indexing Pipeline Extension Overview).
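To give an idea of the kind of per-item logic an extension might carry, here is a minimal sketch in Python that derives a `section` metadata value from a page URL. It is written as plain functions over a dictionary so it can run anywhere; the names `derive_section`, `enrich`, and the `section` key are hypothetical. In an actual extension, reading and writing metadata goes through the document object provided by the pipeline, so take the exact calls from the extension reference rather than from this sketch.

```python
from urllib.parse import urlparse


def derive_section(uri: str) -> str:
    """Return the first path segment of a page URL in lowercase, or "home" for the root."""
    segments = [segment for segment in urlparse(uri).path.split("/") if segment]
    return segments[0].lower() if segments else "home"


def enrich(metadata: dict[str, str], uri: str) -> dict[str, str]:
    """Return a copy of the item metadata with a hypothetical "section" value added."""
    enriched = dict(metadata)  # leave the caller's dictionary untouched
    enriched["section"] = derive_section(uri)
    return enriched


if __name__ == "__main__":
    item = {"title": "Release notes"}
    print(enrich(item, "https://www.example.com/Blog/2020/release-notes.html"))
```

Keeping the transformation in a pure function like `derive_section` makes it easy to test outside the pipeline before wiring it into an extension.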

Step 3: Review and Inspect Your Indexed Content

The Content Browser is a basic, non-configurable demo search interface in the Coveo Platform that helps you navigate and inspect the content of your organization's sources.

For instructions on accessing the Content Browser and making use of its many features, see Inspect Items With the Content Browser.

Step 4: Adding Fields and Mapping Metadata

Coveo organization sources come with a set of standard system fields. However, adding your own fields allows the end user to obtain additional information in search results and to better target desired content (see Field Uses).

To add a field and its associated mapping for your source:

  1. Add a field.

  2. Add a mapping to associate item metadata or text with the field you just created.

  3. After the index rebuild, return to the Content Browser page to review the changes to your indexed items.
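To reason about what a mapping will put into a field, the following sketch emulates a mapping rule locally. It assumes rules reference item metadata with the `%[name]` placeholder form and may mix placeholders with literal text; the function `apply_mapping` is hypothetical and is not Coveo's mapping engine, so check the mapping rule syntax reference for the exact behavior (for example, how missing metadata is handled).

```python
import re

# Matches placeholders of the assumed form %[metadata_name].
PLACEHOLDER = re.compile(r"%\[([^\]]+)\]")


def apply_mapping(rule: str, metadata: dict[str, str]) -> str:
    """Replace each %[name] placeholder with the matching metadata value.

    Missing metadata resolves to an empty string in this emulation.
    """
    return PLACEHOLDER.sub(lambda match: metadata.get(match.group(1), ""), rule)


if __name__ == "__main__":
    item_metadata = {"og:site_name": "Example Docs", "author": "Jane Doe"}
    # A rule for a hypothetical "byline" field.
    print(apply_mapping("%[author], %[og:site_name]", item_metadata))
```

Running a few representative items through such an emulation before the rebuild can make it easier to spot a misspelled metadata name when you review the results in the Content Browser.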

Step 5: (Optional) Creating a Customizable Demo Search Page

So far, you have used the Content Browser to filter and view your indexed content. You can now create a real, customizable Coveo demo search page in the Coveo Platform (see Manage Hosted Search Pages). This will give you an idea of what you will be able to accomplish using the Coveo JavaScript Search Framework in the next step of the solution implementation.

What’s Next?

You should now proceed to Integrating a Search Interface into Your Website.
