Creating an HTML Representation of Page Content

By default, each document is indexed in a Coveo index using only the field information retrieved from Sitecore.

However, some pages have more information than what’s defined in the fields.

This section describes the options available for creating an HTML representation of the page content, and their impact on relevance and performance.

How Indexing the Content as HTML Affects Results

The index treats the HTML representation of the page content as free-text searchable content. Free-text content and free-text fields can be queried by entering search terms directly in the search box.

In the Coveo JavaScript Search Framework, HTML content also enables the CoveoQuickview component on search results.

In the Coveo Cloud Content Browser or the Coveo On-Premises Index Browser, items with an HTML representation have a File Type attribute equal to HTML, while all other items have a File Type of sitecoreitem.

Creating an HTML Representation of the Page Content

An HTML representation of the page content can be created with or without executing an HTTP request to get the complete page content.

Without an HTTP Request

HTTP requests can be taxing during an indexing process and should be avoided if possible.

Use the Basic HTML Content Processor to create an HTML representation of the item without sending an HTTP request.
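
To illustrate the general idea of building an HTML body from field values alone, here is a minimal TypeScript sketch. It is not the Basic HTML Content Processor's implementation or configuration: the ItemFields type and the escapeHtml and buildHtmlFromFields functions are hypothetical names introduced only for this example, and the markup it produces is an assumption.

```typescript
// Hypothetical shape of an item's field values; not a Coveo or Sitecore type.
type ItemFields = Record<string, string>;

// Escape characters that have a special meaning in HTML.
// The ampersand is replaced first so later entities are not double-escaped.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an HTML document from field values only.
// No HTTP request is sent: everything comes from data already in memory.
function buildHtmlFromFields(title: string, fields: ItemFields): string {
  const sections = Object.keys(fields)
    .map((fieldName) =>
      `<h2>${escapeHtml(fieldName)}</h2><p>${escapeHtml(fields[fieldName])}</p>`)
    .join("");
  return `<html><head><title>${escapeHtml(title)}</title></head>` +
    `<body>${sections}</body></html>`;
}

// Example usage with made-up field values.
const pageHtml = buildHtmlFromFields("About Us", {
  Summary: "Tools & services",
  Body: "We build search pages.",
});
console.log(pageHtml);
```

Because the output depends only on the item's fields, this approach stays cheap during indexing, but it can't include content that exists only once the page is rendered.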

With an HTTP Request

Although sending an HTTP request is more demanding during the indexing process, it’s sometimes the only way to retrieve related content that’s only available when the page is rendered in a browser.

To do this, two processors are available: