Understanding How Coveo for Sitecore Resolves the Site for the Clickable URI

Each Coveo for Sitecore search result typically contains a link to open the item in your browser. We call this link the clickable URI.

When you’re indexing items in a multi-site/scaled Sitecore setup, you might struggle to understand how Coveo for Sitecore resolves the clickable URI of items. In particular, you could be concerned to see Sitecore items indexed with the site host name of your Content Management (CM) server when you expected Coveo for Sitecore to set the clickable URI to the host name of your Content Delivery (CD) server.

Coveo for Sitecore initially determines the clickable URI of a Sitecore item at indexing time, but the site part of the clickable URI is recomputed automatically at query time.

This article explains how Coveo for Sitecore determines the site in the clickable URI at indexing and at query time. It also provides guidelines to troubleshoot issues when you get unexpected host names in your computed clickable URIs.

How the Site is Resolved at Indexing Time

At indexing time, Coveo for Sitecore collects the following information to resolve the site for an item:

  • The Sitecore content tree item structure:

    You can break down the Content section of the Sitecore content tree by adding items that serve as site delimiters.

  • The Sitecore <sites> element configuration:

    In versions up to and including Sitecore 8.0, you can add <site> elements directly in the Web.config file. For Sitecore 8.1 and later, Sitecore provides the App_Config\Include\SiteDefinition.config.example file, which serves as a template for adding new content site definitions ahead of the <site name="website" ...> element of the Sitecore.config file. You can add site definitions in the SiteDefinition.config.example file and enable it by removing the .example extension.

    Coveo for Sitecore tries to match each item it indexes with a <site> element defined in the Sitecore configuration. When Coveo finds a matching <site> element, it uses the hostName attribute value (or targetHostName value, if one is specified) of that site for the clickable URI.

    When you set multiple hostName attribute values for a site, using pipes or wildcards, always provide a value for the targetHostName attribute as well, because Coveo needs a single, specific host name for the clickable URI of the item.

    For example, Coveo matches the English version of an item called page1 with the following site.

     <site name="site1" hostName="sample.*" targetHostName="sample.net" ... />
    

    In the Coveo index, the URI value for this item would be http://sample.net/en/page1.

    Certain <site> element attribute values you specify are of particular importance when Coveo matches an item with a site, namely:

    • the rootPath
    • the startItem
    • the language

The Coveo for Sitecore ResolveItemSiteProcessor processor in the coveoResolveItemSite pipeline applies the following logic to match an item with a <site> element:

  1. The processor excludes non-content <site> elements as potential item hosts (e.g., the shell, login, admin, service sites).

  2. For each remaining <site> element of the Sitecore configuration, the Sitecore item location in the content tree is compared to the concatenation of the rootPath and startItem site attribute values. If the Sitecore item isn’t along that path, the <site> element is eliminated as a potential host.

  3. For each remaining potential host <site>, and in the order these <site> elements are declared in the configuration, the Sitecore item language is compared to the language site attribute value. If the item and site languages match, or if no language was specified in the site definition, the site is selected as the host.

  4. If no site was selected in step 3, the first remaining potential host <site>, in the order these <site> elements are declared in the configuration, is selected (even though the item and site languages don’t match).

  5. If no site was selected in step 4 and you have set a value for the <serverUrl> element in the Coveo.SearchProvider.Custom.config file, the <serverUrl> element value is used.

  6. If no site was selected in step 4 and you have not set a value for the <serverUrl> element, a default host, created at Sitecore installation, is used.

Example: In the Sitecore content tree, the page1, page2, and page3 items are located under the site1, site2, and site3 items respectively, and each page includes versions in English and French (Canada).

Your compiled configuration is the following:

<sites>
  <site name="coveo_website" virtualFolder="/sitecore modules/Web/Coveo" physicalFolder="/sitecore modules/Web/Coveo" rootPath="/sitecore/content" startItem="/home" language="en" database="web" domain="extranet" allowDebug="true" cacheHtml="true" loginPage="/sitecore/login" patch:source="Coveo.SearchProvider.config"/>
  <site name="coveoanalytics" virtualFolder="/coveo/rest/v6/analytics" enableTracking="true" database="web" domain="extranet" patch:source="Coveo.SearchProvider.config"/>
  <site name="coveorest" virtualFolder="/coveo/rest" physicalFolder="/coveo/rest" enableTracking="false" database="web" domain="extranet" patch:source="Coveo.SearchProvider.config"/>
  <site name="shell" virtualFolder="/sitecore/shell" physicalFolder="/sitecore/shell" rootPath="/sitecore/content" startItem="/home" language="en" database="core" domain="sitecore" loginPage="/sitecore/login" content="master" contentStartItem="/Home" enableWorkflow="true" enableTracking="false" analyticsDefinitions="content" xmlControlPage="/sitecore/shell/default.aspx" browserTitle="Sitecore" htmlCacheSize="10MB" registryCacheSize="15MB" viewStateCacheSize="1MB" xslCacheSize="25MB" disableBrowserCaching="true" contentLanguage="en" patch:source="Sitecore.Speak.ItemWebApi.config" enableItemLanguageFallback="false" enableFieldLanguageFallback="false" itemwebapi.mode="StandardSecurity" itemwebapi.access="ReadWrite" itemwebapi.allowanonymousaccess="false"/>
  ...
  ...
  <site name="modules_website" virtualFolder="/sitecore modules/web" physicalFolder="/sitecore modules/web" rootPath="/sitecore/content" startItem="/home" language="en" database="web" domain="extranet" allowDebug="true" cacheHtml="true"/>
  <site name="site1" hostName="www.mysite1.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site1" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site2" hostName="www.mysite2.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site2" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site3" hostName="www.mysite3.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site3" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site4" hostName="www.mysite4.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site1" language="fr-CA" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site5" hostName="www.mysite5.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site2" language="fr-CA" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site6" hostName="www.mysite6.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="website" enableTracking="true" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/home" database="web" domain="extranet" allowDebug="true" ... />
  <site name="scheduler" enableTracking="false" domain="sitecore"/>
  ...
  ...
</sites>

Given this information, the item versions would be resolved to a hostName as follows.

Item name   Item version      hostName
page1       English           www.mysite1.ca
page1       French (Canada)   www.mysite4.ca
page2       English           www.mysite2.ca
page2       French (Canada)   www.mysite5.ca
page3       English           www.mysite3.ca
page3       French (Canada)   www.mysite3.ca (see note 1)

1: Neither site3 nor site6 matches the French (Canada) language, so step 4 applies: site3 is selected over site6 because it’s declared before site6 in the configuration.
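
The matching steps and the resolution table above can be reproduced with a short sketch. It is written in Python purely for illustration and is not the ResolveItemSiteProcessor implementation. The Site class, the NON_CONTENT_SITES set, the resolve_host function, the default_host value, and the item paths (page1 under site1, and so on) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical exclusion list; the article only names these as examples.
NON_CONTENT_SITES = {"shell", "login", "admin", "service"}


@dataclass(frozen=True)
class Site:
    name: str
    host_name: Optional[str] = None
    target_host_name: Optional[str] = None
    root_path: str = ""
    start_item: str = ""
    language: Optional[str] = None


def _is_under_site(site: Site, item_path: str) -> bool:
    """Step 2: is the item located along rootPath + startItem?"""
    if not site.root_path:
        return False  # assumption: sites without a rootPath never match
    base = (site.root_path + site.start_item).rstrip("/").lower()
    path = item_path.rstrip("/").lower()
    return path == base or path.startswith(base + "/")


def resolve_host(item_path: str, item_language: str, sites: Sequence[Site],
                 server_url: Optional[str] = None,
                 default_host: str = "default-host") -> Optional[str]:
    # Step 1: drop non-content sites. Step 2: keep sites whose path contains the item.
    candidates = [s for s in sites
                  if s.name not in NON_CONTENT_SITES and _is_under_site(s, item_path)]
    # Step 3: first candidate, in declaration order, with a matching or unspecified language.
    selected = next((s for s in candidates
                     if s.language is None or s.language.lower() == item_language.lower()),
                    None)
    # Step 4: otherwise, the first remaining candidate, whatever its language.
    if selected is None and candidates:
        selected = candidates[0]
    if selected is not None:
        # targetHostName wins over hostName; may be None if the site defines neither.
        return selected.target_host_name or selected.host_name
    # Steps 5 and 6: <serverUrl> if set, else the default host.
    return server_url or default_host


CONTENT = "/sitecore/content"
SITES = [
    Site("coveo_website", root_path=CONTENT, start_item="/home", language="en"),
    Site("shell", root_path=CONTENT, start_item="/home", language="en"),
    Site("site1", "www.mysite1.ca", root_path=CONTENT, start_item="/site1", language="en"),
    Site("site2", "www.mysite2.ca", root_path=CONTENT, start_item="/site2", language="en"),
    Site("site3", "www.mysite3.ca", root_path=CONTENT, start_item="/site3", language="en"),
    Site("site4", "www.mysite4.ca", root_path=CONTENT, start_item="/site1", language="fr-CA"),
    Site("site5", "www.mysite5.ca", root_path=CONTENT, start_item="/site2", language="fr-CA"),
    Site("site6", "www.mysite6.ca", root_path=CONTENT, start_item="/", language="en"),
    Site("website", root_path=CONTENT, start_item="/home"),
]

for page, parent in (("page1", "site1"), ("page2", "site2"), ("page3", "site3")):
    for language in ("en", "fr-CA"):
        path = f"{CONTENT}/{parent}/{page}"
        print(page, language, resolve_host(path, language, SITES))
```

Declaration order matters in this sketch for the same reason it does in the steps above: both the language match and the fallback take the first candidate in the list.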

Automatic Recomputation of the Clickable URI at Query Time

Coveo for Sitecore automatically recomputes clickable URIs at query time for the following reasons:

  • In scaled environments, items are indexed on a Content Management (CM) server but are accessed by visitors through requests to a Content Delivery (CD) server. Therefore, the site part of the clickable URI must reflect the intended Content Delivery site.

  • The Sitecore site definitions may have been updated recently, making a site under which items were indexed invalid.

When a website visitor accesses a Coveo-powered search page, each call the search page makes to the Search API returns a response through the Coveo REST endpoint. Each response triggers the ResolveResultClickableUriProcessor processor in the coveoProcessParsedRestResponse pipeline.

The ResolveResultClickableUriProcessor processor simply calls the Sitecore LinkManager GetItemUrl method with the UrlOptions.Site property set to the current site (see Use the Sitecore LinkManager to Resolve URIs).
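
To picture the effect of this recomputation, consider the minimal Python sketch below. It does not reproduce what the processor does (the processor delegates to the Sitecore LinkManager); it only mimics the visible outcome, where the site part of an indexed URI is replaced with the host of the site serving the query. The recompute_site_part function and the cm.example.com and www.example.com host names are hypothetical.

```python
from urllib.parse import urlsplit


def recompute_site_part(indexed_uri: str, current_site_host: str) -> str:
    """Swap the host of an indexed clickable URI for the current site's host."""
    parts = urlsplit(indexed_uri)
    # Only the network location changes; scheme, path, and query are kept as indexed.
    return parts._replace(netloc=current_site_host).geturl()


# An item indexed on the CM server, queried from a CD site (hypothetical hosts).
print(recompute_site_part("http://cm.example.com/en/page1", "www.example.com"))
```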

Troubleshooting

This section draws on the support cases we have handled over time and covers the most common pitfalls behind malformed or unexpected clickable URIs.

The use of custom link providers accounts for the greatest number of clickable URI cases Coveo Support receives.

Both at index time and at query time, Coveo for Sitecore calls the Sitecore LinkManager GetItemUrl method with the default urlOptions, but with AlwaysIncludeServerUrl set to true, to get the clickable URIs it needs to render in search results. This out-of-the-box behavior successfully handles most Sitecore environments.

Before attempting anything else, isolate the effect of any custom link provider logic and related configuration on the clickable URIs you obtain, and rule it out as the source of the problem.

Use of Multiple hostName Values Without a targetHostName in a Configuration

When a <site> element contains multiple hostName values (using wildcards (*) or pipes (|)), make sure of the following:

  1. The <site> element also contains a targetHostName value. Otherwise, Sitecore won’t be able to apply a host name to incoming item URLs which match one of the hostName patterns in your <site> element. As a result, when considering the Coveo for Sitecore 6-point site selection sequence explained at the end of the How the Site is Resolved at Indexing Time section, item URLs will use a fallback <site> hostName (step 4), the <serverUrl> value (step 5), or a default host (step 6).

  2. The targetHostName must not contain a wildcard (*). Otherwise, once again, Sitecore won’t be able to apply a host name to incoming item URLs which match one of the hostName patterns in your <site> element. Instead, Sitecore will use a fallback host name from step 4, 5, or 6 of the site selection sequence.

  3. The Rendering.SiteResolving setting must be set to true.
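
The first two checks above can be automated against an exported <sites> configuration. The following Python sketch is one possible way to do so; the check_host_names function and the sample site names are hypothetical, and the sketch doesn't verify the Rendering.SiteResolving setting. It also assumes the XML has no patch: prefixed attributes, which would require a namespace declaration to parse.

```python
import xml.etree.ElementTree as ET
from typing import List


def check_host_names(sites_xml: str) -> List[str]:
    """Return one warning per risky hostName/targetHostName combination."""
    warnings = []
    for site in ET.fromstring(sites_xml).iter("site"):
        name = site.get("name", "(unnamed)")
        host = site.get("hostName", "")
        target = site.get("targetHostName", "")
        # A wildcard or a pipe means the site answers to several host names.
        if ("*" in host or "|" in host) and not target:
            warnings.append(f"{name}: multiple hostName values but no targetHostName")
        if "*" in target:
            warnings.append(f"{name}: targetHostName contains a wildcard")
    return warnings


# Hypothetical site definitions used only to exercise the checks.
SAMPLE = """
<sites>
  <site name="ok" hostName="sample.*" targetHostName="sample.net" />
  <site name="no_target" hostName="a.example|b.example" />
  <site name="wild_target" hostName="sample.*" targetHostName="*.sample.net" />
</sites>
"""

for warning in check_host_names(SAMPLE):
    print(warning)
```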

SXA Site Item Clickable URIs

In SXA, you configure site definitions under <SXA_SITE_NAME>/Settings/Site Grouping rather than in configuration files. Nonetheless, Coveo for Sitecore handles SXA site item clickable URI computations at indexing and query time the same way it does with non-SXA sites. The common pitfalls mentioned above therefore apply to SXA sites as well.

You can use the switchableLinkProvider default link provider or set the Link Provider name site definition field to the sitecore provider. The clickable URIs computed at indexing and at query time are the same with either provider.

Make sure you have selected a Start Item in your SXA site definition (as you would do by specifying <site> element rootPath and startItem attribute values in non-SXA sites). Also refer to the SXA Site Manager to ensure you don’t have any site conflicts.
