How the site for the clickable URI is resolved

Each Coveo for Sitecore search result typically contains a link to open the item in your browser. We call this link the clickable URI.

When you’re indexing items in a multi-site/scaled Sitecore setup, you might struggle to understand how Coveo for Sitecore resolves the clickable URI of items. In particular, you could be concerned to see Sitecore items indexed with the site host name of your content management (CM) server when you expected Coveo for Sitecore to set the clickable URI to the host name of your content delivery (CD) server.

Coveo for Sitecore initially determines the clickable URI of a Sitecore item at indexing time, but the site part of the clickable URI is recomputed automatically at query time.

This article explains how Coveo for Sitecore determines the site in the clickable URI at indexing and at query time. It also provides guidelines for troubleshooting issues when you get unexpected host names in your computed clickable URIs.

How the site is resolved at indexing time

At indexing time, Coveo for Sitecore collects the following information to resolve the site for an item:

  • The Sitecore content tree item structure:

    You can break down the Content section of the Sitecore content tree by adding items that serve as site delimiters.

    Image of a multisite indexing content tree setup

  • The Sitecore <sites> element configuration:

    Sitecore provides the <SITECORE_INSTANCE_ROOT>\App_Config\Include\SiteDefinition.config.example file, which is a blueprint for adding new content site definitions ahead of the <site name="website" ...> element of the Sitecore.config file. You can add site definitions in the SiteDefinition.config.example file and enable this file by removing the .example extension.

Coveo for Sitecore tries to match each item it indexes with a <site> element defined in the Sitecore configuration. When Coveo finds a matching <site> element, it uses the hostName attribute value (or targetHostName value, if one is specified) of that site for the clickable URI.

Important

When you set multiple hostName attribute values for a site, using pipes or wildcards, always provide a value for the targetHostName attribute as well, because Coveo needs a single, specific host name for the clickable URI of the item.

For example, Coveo matches the English version of an item called page1 with the following site.

 <site name="site1" hostName="sample.*" targetHostName="sample.net" ... />

In the Coveo index, the URI value for this item would be http://sample.net/en/page1.

Tip
Leading practice

Set the log level on Coveo rebuild logs to DEBUG to see which <site> an item is matched with at indexing time.

Image of custom Coveo rebuild logs on the Diagnostic Page log viewer
Figure 1. Setting the log level to DEBUG gives you the ability to search for a Sitecore Item ID and better follow how the item is handled during the rebuild.

Certain <site> element attribute values you specify are of particular importance when Coveo matches an item with a site:

  • rootPath

  • startItem

  • language

The Coveo for Sitecore ResolveItemSiteProcessor processor in the coveoResolveItemSite pipeline applies the following logic to match an item with a <site> element:

  1. The processor excludes non-content <site> elements as potential item hosts (for example, the shell, login, admin, service sites).

  2. For each remaining <site> element of the Sitecore configuration, the Sitecore item location in the content tree is compared to the concatenation of the rootPath and startItem site attribute values. If the Sitecore item isn’t along that path, the <site> element is eliminated as a potential host.

    Note

    The Coveo for Sitecore ResolveItemSiteProcessor processor used to give precedence to the site definition ContentStartItem attribute over the StartItem attribute when resolving clickable URIs at indexing time. Coveo for Sitecore now disregards the ContentStartItem. If you have ContentStartItem attribute values in your site definitions, the changes in this release might affect the clickable URIs of your items in the index.

  3. For each remaining potential host <site>, and in the order these <site> elements are declared in the configuration, the Sitecore item language is compared to the language site attribute value. If the item and site languages match, or if no language was specified in the site definition, the site is selected as the host.

  4. If no site was selected in step 3, the first remaining potential host <site>, in the order these <site> elements are declared in the configuration, is selected (even though the item and site languages don’t match).

  5. If no site was selected in step 4 and you have set a value for the <serverUrl> element in the Coveo.SearchProvider.Custom.config file, the <serverUrl> element value is used.

  6. If no site was selected in step 4 and you haven’t set a value for the <serverUrl> element, a default host, created at Sitecore installation, is used.

Example

You have the following structure in the Sitecore content tree and the page1, page2, and page3 items each include versions in English and French (Canada).

Image of the example content tree used for site resolution

Your compiled configuration is the following:

<sites>
  <site name="coveo_website" virtualFolder="/sitecore modules/Web/Coveo" physicalFolder="/sitecore modules/Web/Coveo" rootPath="/sitecore/content" startItem="/home" language="en" database="web" domain="extranet" allowDebug="true" cacheHtml="true" loginPage="/sitecore/login" patch:source="Coveo.SearchProvider.config"/>
  <site name="coveoanalytics" virtualFolder="/coveo/rest/v6/analytics" enableTracking="true" database="web" domain="extranet" patch:source="Coveo.SearchProvider.config"/>
  <site name="coveorest" virtualFolder="/coveo/rest" physicalFolder="/coveo/rest" enableTracking="false" database="web" domain="extranet" patch:source="Coveo.SearchProvider.config"/>
  <site name="shell" virtualFolder="/sitecore/shell" physicalFolder="/sitecore/shell" rootPath="/sitecore/content" startItem="/home" language="en" database="core" domain="sitecore" loginPage="/sitecore/login" content="master" contentStartItem="/Home" enableWorkflow="true" enableTracking="false" analyticsDefinitions="content" xmlControlPage="/sitecore/shell/default.aspx" browserTitle="Sitecore" htmlCacheSize="10MB" registryCacheSize="15MB" viewStateCacheSize="1MB" xslCacheSize="25MB" disableBrowserCaching="true" contentLanguage="en" patch:source="Sitecore.Speak.ItemWebApi.config" enableItemLanguageFallback="false" enableFieldLanguageFallback="false" itemwebapi.mode="StandardSecurity" itemwebapi.access="ReadWrite" itemwebapi.allowanonymousaccess="false"/>
  ...
  ...
  <site name="modules_website" virtualFolder="/sitecore modules/web" physicalFolder="/sitecore modules/web" rootPath="/sitecore/content" startItem="/home" language="en" database="web" domain="extranet" allowDebug="true" cacheHtml="true"/>
  <site name="site1" hostName="www.mysite1.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site1" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site2" hostName="www.mysite2.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site2" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site3" hostName="www.mysite3.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site3" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site4" hostName="www.mysite4.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site1" language="fr-CA" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site5" hostName="www.mysite5.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/site2" language="fr-CA" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="site6" hostName="www.mysite6.ca" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/" language="en" database="web" domain="extranet" enableWebEdit="true" patch:source="SiteDefinition.config"/>
  <site name="website" enableTracking="true" virtualFolder="/" physicalFolder="/" rootPath="/sitecore/content" startItem="/home" database="web" domain="extranet" allowDebug="true" ... />
  <site name="scheduler" enableTracking="false" domain="sitecore"/>
  ...
  ...
</sites>

Given this information, the item versions would be resolved to a hostName as follows.

Item name   Item version      hostName
page1       English           www.mysite1.ca
page1       French (Canada)   www.mysite4.ca
page2       English           www.mysite2.ca
page2       French (Canada)   www.mysite5.ca
page3       English           www.mysite3.ca
page3       French (Canada)   www.mysite3.ca [1]

[1] The <site3> site is selected over <site6> because it’s declared before <site6> in the configuration.
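
The six-step selection sequence can be sketched in Python against the example configuration. This is a simplified model for illustration, not the actual ResolveItemSiteProcessor code. The Site class, the resolve_host function, the NON_CONTENT_SITES set, the case-insensitive path comparison, the default host value, and the item paths (page1 under /sitecore/content/site1, and so on) are all assumptions. Only a subset of the example's <site> elements is included.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: site names treated as non-content in step 1; the real list may differ.
NON_CONTENT_SITES = {"shell", "login", "admin", "service"}


@dataclass
class Site:
    name: str
    root_path: str
    start_item: str
    language: Optional[str] = None
    host_name: Optional[str] = None
    target_host_name: Optional[str] = None

    def host(self) -> Optional[str]:
        # targetHostName is used instead of hostName when one is specified.
        return self.target_host_name or self.host_name


def is_along_path(item_path: str, site: Site) -> bool:
    # Step 2: the item must sit along rootPath + startItem.
    # Assumption: the comparison ignores case.
    base = (site.root_path + site.start_item).rstrip("/").lower()
    path = item_path.lower()
    return path == base or path.startswith(base + "/")


def resolve_host(item_path: str, item_language: str, sites: List[Site],
                 server_url: Optional[str] = None,
                 default_host: str = "default-host") -> Optional[str]:
    # Steps 1 and 2: drop non-content sites and sites whose path doesn't contain the item.
    candidates = [s for s in sites
                  if s.name not in NON_CONTENT_SITES and is_along_path(item_path, s)]
    # Step 3: first candidate, in declaration order, with a matching or unspecified language.
    for site in candidates:
        if site.language is None or site.language.lower() == item_language.lower():
            return site.host()
    # Step 4: otherwise, the first remaining candidate, despite the language mismatch.
    if candidates:
        return candidates[0].host()
    # Steps 5 and 6: the <serverUrl> value if set, otherwise a default host.
    return server_url or default_host


CONTENT = "/sitecore/content"
SITES = [
    Site("shell", CONTENT, "/home", "en"),
    Site("site1", CONTENT, "/site1", "en", "www.mysite1.ca"),
    Site("site2", CONTENT, "/site2", "en", "www.mysite2.ca"),
    Site("site3", CONTENT, "/site3", "en", "www.mysite3.ca"),
    Site("site4", CONTENT, "/site1", "fr-CA", "www.mysite4.ca"),
    Site("site5", CONTENT, "/site2", "fr-CA", "www.mysite5.ca"),
    Site("site6", CONTENT, "/", "en", "www.mysite6.ca"),
]

print(resolve_host("/sitecore/content/site1/page1", "fr-CA", SITES))  # www.mysite4.ca
print(resolve_host("/sitecore/content/site3/page3", "fr-CA", SITES))  # www.mysite3.ca
```

Under these assumptions, the French (Canada) version of page3 has no language match in step 3, so step 4 picks site3 because it's the first remaining candidate in declaration order.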

Automatic recomputation of the clickable URI at query time

Note

The information that follows doesn’t apply to search interfaces that use the Coveo Hosted Search Page rendering or when the reverse proxy is disabled.

Coveo for Sitecore automatically recomputes clickable URIs at query time for the following reasons:

  • In scaled environments, items are indexed on a content management (CM) server but are accessed by visitors through requests to a content delivery (CD) server. Therefore, the site part of the clickable URI must reflect the intended content delivery site.

  • The Sitecore site definitions may have been updated recently, making a site under which items were indexed invalid.

When a website visitor accesses a Coveo-powered search page, every call the search page makes to the Search API returns its response through the Coveo for Sitecore REST endpoint proxy. Each response triggers the ResolveResultClickableUriProcessor processor in the coveoProcessParsedRestResponse pipeline.

For normal items, the ResolveResultClickableUriProcessor processor simply calls the Sitecore LinkManager.GetItemUrl method with the UrlOptions.Site property set to the current site (see Use the Sitecore LinkManager to resolve URIs). For media items, the ResolveResultClickableUriProcessor processor calls the Sitecore MediaManager.GetMediaUrl method instead.

Note

Sometimes (for example, when the Sitecore instance is hosted in Azure), the MediaManager.GetMediaUrl method doesn’t return an appropriate clickable URI host name. In these situations, you should:

  1. Add a <serverUrl> element as a child of the <defaultIndexConfiguration> element in the Coveo.SearchProvider.Custom.config file to specify the host name to be used in your clickable URIs.

    <defaultIndexConfiguration>
      <serverUrl>https://oursite.com</serverUrl>
    </defaultIndexConfiguration>

  2. Add the following <setting> element as a child of the <settings> element in the Coveo.SearchProvider.Custom.config file.

    <settings>
      <setting name="Coveo.Url.UseServerUrlFromConfiguration" value="true" />
    </settings>

Troubleshooting

This section covers the most common causes of malformed or unexpected clickable URIs, based on the client cases Coveo Support has handled over time.

The use of custom link providers accounts for the greatest number of clickable URI cases Coveo Support receives.

Both at indexing time and at query time, Coveo for Sitecore calls the Sitecore LinkManager GetItemUrl method with the default urlOptions, but with AlwaysIncludeServerUrl set to true, to get the clickable URIs it needs to render in search results. This out-of-the-box behavior successfully handles most Sitecore environments.

Before attempting anything else, isolate the effect of your custom link provider logic and related configurations on the clickable URIs you obtain, and confirm that they aren’t the source of the problem.

Unawareness of single-level inheritance in site definitions

When you define a site in Sitecore, you can inherit from another site by specifying the inherits attribute in the <site> element. However, the site definition inheritance in Sitecore only applies one level down.

For example, consider the following site definitions.

<site name="site2" inherits="site1" hostName="test.com" targetHostName="test.com" ... />
<site name="site1" inherits="website" rootPath="/sitecore/content" ... />
<site name="website" hostName="example.com" targetHostName="example.com" rootPath="/sitecore/content" startItem="/home" ... />

In this example, site2 inherits the rootPath value from site1. However, even though site1 inherits from website, site2 doesn’t inherit the startItem value from website, because website isn’t the immediate parent of site2.

With such a configuration, an item under /sitecore/content/home resolves to <site1>, not <site2>, and its clickable URI uses targetHostName example.com.
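
The single-level inheritance rule can be modeled with a short Python sketch. It's an illustration only, not Sitecore code: the effective_attributes function and the dictionary representation are hypothetical, and the sketch assumes that a site receives only the attributes its direct parent declares explicitly. Only the attributes shown in the example are included.

```python
from typing import Dict

# The three site definitions from the example, limited to the attributes shown.
SITES: Dict[str, Dict[str, str]] = {
    "site2": {"inherits": "site1", "hostName": "test.com", "targetHostName": "test.com"},
    "site1": {"inherits": "website", "rootPath": "/sitecore/content"},
    "website": {"hostName": "example.com", "targetHostName": "example.com",
                "rootPath": "/sitecore/content", "startItem": "/home"},
}


def effective_attributes(name: str, sites: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    # Start with the attributes the site declares itself.
    own = {k: v for k, v in sites[name].items() if k != "inherits"}
    # Assumption: only the direct parent's explicitly declared attributes are
    # inherited; the parent's own parent is never consulted.
    parent = sites.get(sites[name].get("inherits", ""), {})
    for key, value in parent.items():
        if key != "inherits":
            own.setdefault(key, value)
    return own


print(effective_attributes("site2", SITES).get("rootPath"))   # /sitecore/content
print("startItem" in effective_attributes("site2", SITES))    # False
print(effective_attributes("site1", SITES).get("startItem"))  # /home
```

In this model, site1 picks up startItem, hostName, and targetHostName from website, while site2 only gains rootPath from site1.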

Use of multiple hostName values without a single targetHostName in a site configuration

When configuring <site> elements, make sure of the following when a <site> element contains multiple hostName values (using wildcards (*) or pipes (|)):

  1. The <site> element also contains a targetHostName value. Otherwise, Sitecore won’t be able to apply a host name to incoming item URLs that match one of the hostName patterns in your <site> element. As a result, in the six-step site selection sequence explained at the end of the How the site is resolved at indexing time section, item URLs will use a fallback <site> hostName (step 4), the <serverUrl> value (step 5), or a default host (step 6).

  2. The site has a single targetHostName (that is, targetHostName must not contain a wildcard (*) or pipe (|)). Otherwise, once again, Sitecore won’t be able to apply a host name to incoming item URLs which match one of the hostName patterns in your <site> element. Instead, Sitecore will use a fallback host name, somewhere between steps 4 and 6 in the site selection sequence.

    Note

    Coveo for Sitecore logs a warning when a site definition contains multiple targetHostName values.

    Image of the warning Coveo for Sitecore logs for sites with multiple targetHostName values

  3. The Rendering.SiteResolving setting is set to true.
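
The first two checks can be automated with a small script run against an export of your site definitions. The following Python sketch is a hypothetical helper, not a Coveo or Sitecore tool. It assumes the input is a well-formed <sites> fragment in which namespace-prefixed attributes such as patch:source have been removed (or their namespace declared). It doesn't verify the Rendering.SiteResolving setting.

```python
import xml.etree.ElementTree as ET
from typing import List


def find_host_name_issues(sites_xml: str) -> List[str]:
    """Report <site> elements whose hostName/targetHostName pairing looks risky."""
    issues = []
    for site in ET.fromstring(sites_xml).iter("site"):
        name = site.get("name", "?")
        host = site.get("hostName", "")
        target = site.get("targetHostName", "")
        # Check 1: wildcard or pipe-separated hostName values need a targetHostName.
        if ("*" in host or "|" in host) and not target:
            issues.append(f"{name}: multiple hostName values but no targetHostName")
        # Check 2: targetHostName must be a single host name.
        if "*" in target or "|" in target:
            issues.append(f"{name}: targetHostName must be a single host name")
    return issues


example = """
<sites>
  <site name="site1" hostName="sample.*" targetHostName="sample.net" />
  <site name="site2" hostName="a.example.com|b.example.com" />
  <site name="site3" hostName="c.*" targetHostName="c.com|c.net" />
</sites>
"""

for issue in find_host_name_issues(example):
    print(issue)
# site2: multiple hostName values but no targetHostName
# site3: targetHostName must be a single host name
```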

SXA site item clickable URIs

In SXA, you configure site definitions under <SXA_SITE_NAME>/Settings/Site Grouping rather than in configuration files. Nonetheless, Coveo for Sitecore handles SXA site item clickable URI computations at indexing and query time the same way it does with non-SXA sites. The common pitfalls mentioned previously therefore apply to SXA sites as well.

You can use the switchableLinkProvider default link provider or set the Link Provider name site definition field to the sitecore provider. The computed clickable URIs at indexing and at query time will be the same using both providers.

Make sure you’ve selected a Start Item in your SXA site definition (as you would do by specifying <site> element rootPath and startItem attribute values in non-SXA sites). Also refer to the Manage multiple sites with the SXA Site Manager article to ensure you don’t have any site conflicts.