Creating an HTML Representation of Page Content
By default, each document is indexed in a Coveo index using only the field information retrieved in Sitecore.
However, some pages have more information than what’s defined in the fields.
This section describes the options available for indexing that additional content and their impact on relevance and performance.
How Indexing the Content as HTML Affects Results
The index treats the HTML representation of the page content as free-text searchable content. Free-text content and free-text fields can be queried by entering search terms directly in the search box.
In the Coveo JavaScript Search Framework, HTML content also enables the CoveoQuickview component on search results.
In the Coveo Cloud Content Browser or the Coveo On-Premises Index Browser, the items with HTML rendering will have a File Type attribute equal to HTML while the rest of the content will be sitecoreitem.
Creating an HTML Representation of the Page Content
An HTML representation of a page's content can be created with or without executing an HTTP request to get the complete page content.
Without an HTTP Request
HTTP requests can be taxing during an indexing process and should be avoided if possible.
Use the Basic HTML Content Processor to create an HTML representation of the item without sending an HTTP request.
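To make the idea concrete, here is a minimal Python sketch of building an HTML body purely from field values that are already in memory, so no page request is involved. It is a conceptual illustration only and not Coveo's processor; the function name `build_basic_html` and the field names are hypothetical.

```python
from html import escape
from typing import Mapping


def build_basic_html(fields: Mapping[str, str]) -> str:
    """Build a minimal HTML document from already-retrieved field values.

    Hypothetical helper for illustration only; it performs no HTTP request.
    """
    # Escape values so that field text cannot break the generated markup.
    title = escape(fields.get("title", ""))
    # Render every other non-empty field as its own paragraph.
    paragraphs = "".join(
        f"<p>{escape(value)}</p>"
        for name, value in fields.items()
        if name != "title" and value
    )
    return f"<html><head><title>{title}</title></head><body>{paragraphs}</body></html>"
```

Calling `build_basic_html({"title": "Home", "summary": "Welcome"})` returns a small document whose body contains only what the fields hold, which is why this approach is cheap but cannot capture content that appears only when the page is rendered.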
With an HTTP Request
Although sending an HTTP request is more demanding during the indexing process, it's sometimes the only way to retrieve related content that is only available when the page is rendered in a browser.
To do this, two processors are available:
- Starting with the December 2017 release of Coveo for Sitecore 4.1, the FetchPageContent processor is the most advanced and flexible solution.
- For older versions, the HTMLContentInBodyWithRequests processor can be used.
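For comparison with the request-free approach, the following Python sketch shows what retrieving the rendered page over HTTP involves in general terms. It is not the implementation of FetchPageContent or HTMLContentInBodyWithRequests; the function name `fetch_rendered_html` and the default timeout are assumptions made for illustration.

```python
import urllib.request


def fetch_rendered_html(url: str, timeout: float = 10.0) -> str:
    """Request a page and return its rendered body as text.

    Hypothetical helper for illustration only. Each call is a full
    round trip to the server, which is what makes this approach
    costly when repeated for every indexed item.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset announced by the server, defaulting to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Because every item triggers its own request, the total indexing time grows with the response time of the site, which is the cost the request-free option avoids.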