Add or Edit a SharePoint Online Source

Members of the Administrators and Content Managers built-in groups can index SharePoint Online content and make it searchable. In a Coveo-powered search interface, the source content is accessible to everyone, to the source creator only, or to specific users, as determined by source permissions (see Content Security).

  • To retrieve SharePoint on-premises content, you must create a SharePoint Server source.

  • The item modifications that a source rescan can retrieve are determined by the options selected in the Additional Content section when adding or editing the SharePoint Online source.

  • Following a refresh operation, deleted discussion lists are excluded from your Coveo SharePoint Online source content, but replies to the original discussion message will only be excluded following the next rescan operation. This is a known issue caused by a limitation of Microsoft SharePoint Online.

  • To decrease indexing times, you can configure your SharePoint Online source so that it doesn’t re-index a list folder when a change is detected in that folder during indexing.

Source Key Characteristics

Features Supported Additional information
SharePoint Online version Latest cloud version  
Searchable content types Sites, sub-sites, user profiles, personal websites, lists, list items, list item attachments, document libraries, document sets, documents, web parts, and microblog posts and replies.
Content update operations Refresh

Takes place every hour by default. A rescan or rebuild is required to take account of deleted user profiles.


Takes place every week by default. Extracts all of the data and indexes the following: modified permissions, new items, existing items with a modified date later than the date in the index, and existing items with a computed entity tag (1) different from the one in the index.

Content security options Determined by source permissions  
Source creator  

1: The entity tag is the version identifier of an item and is calculated using the item metadata.
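The item selection described above can be sketched as follows. This is only an illustration, not Coveo's implementation: the `entity_tag` hash, the `IndexedItem` record, and the `needs_reindex` function are hypothetical names, and permission changes (which the operation also picks up) are left out for brevity.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def entity_tag(metadata: dict) -> str:
    """Hypothetical entity tag: a stable hash computed from the item metadata."""
    canonical = json.dumps(metadata, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class IndexedItem:
    """What the index is assumed to remember about an item."""
    modified: datetime
    etag: str


def needs_reindex(modified: datetime, metadata: dict,
                  indexed: Optional[IndexedItem]) -> bool:
    """Return True when the item should be (re)indexed."""
    if indexed is None:                  # new item
        return True
    if modified > indexed.modified:      # modified date later than the indexed one
        return True
    return entity_tag(metadata) != indexed.etag  # metadata changed


old = IndexedItem(datetime(2024, 1, 1), entity_tag({"title": "Notes"}))
print(needs_reindex(datetime(2024, 1, 1), {"title": "Notes"}, old))     # False
print(needs_reindex(datetime(2024, 1, 1), {"title": "Notes v2"}, old))  # True
```

The entity tag check catches items whose metadata changed even though their modified date did not.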


SharePoint Online Account With Appropriate Roles and Permissions

To index SharePoint Online content, you must create a dedicated SharePoint Online account that has access to the content you want to make searchable and that is used only for the source. If you let Coveo retrieve your content through your personal account instead, you must also update the source access token each time you change the account password to prevent authentication errors (see Update Access Token).

  1. Access your Azure Portal with an administrator account.

  2. In Azure, create an account with the following roles:

    Role Justification
    Application Administrator

    This role allows the user to consent to give Coveo's Azure Active Directory Application the admin permissions it needs (see Azure Application Permissions).

    If you don't want the crawling account to have that role, a user that has the Global Admin role must consent once before you log in with the crawling account (see Admin Consent).

    SharePoint Administrator

    This role is needed for the autodiscovery of site URLs, which is used when you select All sites for your SharePoint Online source.

    If you don't want the crawling account to have that role, you must select Specific items for your SharePoint Online source.

  3. Access your SharePoint Online tenant with an account that has the SharePoint Administrator role, and then grant appropriate SharePoint Online permissions to the account you created before to ensure it has access to all the content that you want to index.

    The following table presents the minimal required permissions that the account must have to perform the specified action.

    If you specified sites to crawl and you didn’t grant the minimal permissions, the crawler will stop. If you selected “All sites”, it will skip sites that the crawling account can’t see.

    Action to perform Minimal required permission
    Content (without security indexing)

    We recommend that the crawling account be a site administrator for every site you want to crawl to avoid permission misconfiguration. If you don't want the crawling account to be a site admin, it requires the following permission levels on every site (see Understanding permission levels in SharePoint):

    • Site permissions:

      • View Pages - View pages in a Web site.

      • Open - Allows users to open a Web site, list, or folder in order to access items inside that container.

    • List Permissions:

      • View Items - View items in lists and documents in document libraries.

      • Open Items - View the source of documents with server-side file handlers.

      • View Versions - View past versions of a list item or document.

    Content (with security indexing)

    Site admin for all SharePoint Online sites that need to be crawled (see Manage site admins in SharePoint Online).

    Personal site and user profile

    Owner of all personal sites (see Adding the Personal Sites Owner Permissions for SharePoint Online).
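As a rough sketch of the behavior described in step 3 (content without security indexing), the check below compares the permissions granted on each site with the minimal set listed in the table. The `plan_crawl` function, the `CrawlStopped` exception, and the input shape are hypothetical. As a simplification, a site with insufficient permissions is treated like a site that the crawling account can't see.

```python
from typing import Dict, List, Set

# Minimal permissions listed in the table above.
REQUIRED_PERMISSIONS = frozenset(
    {"View Pages", "Open", "View Items", "Open Items", "View Versions"}
)


class CrawlStopped(Exception):
    """Raised when an explicitly listed site lacks the minimal permissions."""


def plan_crawl(sites: Dict[str, Set[str]], all_sites: bool) -> List[str]:
    """Return the site URLs that can be crawled.

    sites maps each site URL to the permissions granted to the crawling account.
    """
    crawlable = []
    for url, granted in sites.items():
        missing = REQUIRED_PERMISSIONS - granted
        if not missing:
            crawlable.append(url)
        elif all_sites:
            continue  # "All sites": the site is skipped
        else:
            # Specific sites: the crawler stops
            raise CrawlStopped(f"{url} is missing: {sorted(missing)}")
    return crawlable
```

Running such a check before saving the source makes it easier to spot a site where the crawling account is missing a permission level.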

DNS Records Configuration for Office 365

  1. Log in to Office 365 admin center with an administrator account.

  2. In the navigation bar on the left, select Domains.

  3. On the Manage domains page:

    1. Under Domain Name, select your corporate domain (not the check box).

    2. Next to the Action column, under the domain name, click Domain settings.

  4. On the domain page, in the DNS records section, take note of the DNS records.

  5. Configure these DNS records in your DNS host provider (see Create DNS records for Office 365 when you manage your DNS records).

  6. On the domain page, in the DNS records section, click the Troubleshoot domain link to ensure that the DNS records were correctly configured.

Azure Application Permissions

A SharePoint Online source uses the OAuth 2.0 authorization protocol. To work with Microsoft APIs (CSOM and REST), Coveo must authenticate via an Azure Active Directory Application.

When you create a SharePoint Online source, an Azure application is created in your Azure tenant (see Understand user and admin consent), and you must grant permissions to this application. The access token is then limited to these permissions, which are necessary to successfully crawl SharePoint Online. All of the access token permissions listed below need admin consent. As a result, for a user to authenticate through the Coveo Azure Active Directory application, they must have the Application Administrator role, or a user with the Global Admin role must have given consent beforehand (see Admin Consent).

Coveo is a verified publisher for the Azure application.

The permissions automatically granted to the application are the following:

Required permission Justification
Have full control of all sites (AllSites.FullControl)

Coveo requires this permission to retrieve permissions of crawled items (see AllSites.FullControl Permission for more information). Microsoft doesn't offer enough granularity for Coveo to use a permission with fewer privileges.

Some API calls require Coveo to have the AllSites.Read permission to fetch list items, sites and subsites, and document content data, but since AllSites.FullControl is required too, AllSites.Read doesn't appear in the list of required permissions.

Read user profiles (User.Read.All)

Coveo requires this permission mainly to retrieve user profiles and index them as items.

Read directory data (Directory.Read.All)

Coveo requires this permission to fetch:

  • The Directory Role and Directory Role Members (see List Members).

  • All users in Office 365, which is necessary to determine which users are in built-in groups such as Everyone (see List Users and Coveo Cloud Management of Security Identities and Item Permissions).

    The Azure documentation shows that the least privileged permission to retrieve the list of users in a group is actually User.ReadBasic.All, but since Directory.Read.All is already required for other operations, User.ReadBasic.All doesn't appear in the list of required permissions.

Read all groups (Group.Read.All)

Coveo uses this permission to obtain the ID of a group, and then a list of the group members (see Get Group and List members). Here, a group is an Azure Active Directory (Azure AD) group, which can be an Office 365 group or a security group.
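To verify that consent covered every permission in the table above, one option is to compare the granted scopes with the required ones. The sketch below assumes the granted scopes are available as a space-delimited string, as in an OAuth 2.0 token response. The `missing_scopes` helper is hypothetical, and the sketch ignores the fact that scopes for different Microsoft APIs may be carried by separate tokens.

```python
from typing import Set

# Permissions listed in the table above.
REQUIRED_SCOPES = frozenset({
    "AllSites.FullControl",
    "User.Read.All",
    "Directory.Read.All",
    "Group.Read.All",
})


def missing_scopes(scope_string: str) -> Set[str]:
    """Return the required scopes absent from a space-delimited scope string."""
    granted = set(scope_string.split())
    return set(REQUIRED_SCOPES - granted)


print(missing_scopes("AllSites.FullControl User.Read.All Group.Read.All"))
# {'Directory.Read.All'}
```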

AllSites.FullControl Permission

At the API level, the AllSites.FullControl permission is required to retrieve permissions of crawled items. However, because the Azure application in the SharePoint Online connector uses delegated permissions, Coveo will never have more privileges than the logged-in user.

When a source is created or the access token is updated, Coveo initiates an Azure OAuth2 handshake and you need to specify a user. Then, Coveo receives an access token referring to the specific user’s permissions. If the access token permissions are higher than those of the logged-in user, the logged-in user’s permissions take precedence. Therefore, Coveo would only have the complete set of AllSites.FullControl privileges if the logged-in user also has the same level of privileges.
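The rule above can be pictured as taking the lower of two privilege levels. The levels below are hypothetical and only illustrate the ordering. They aren't actual SharePoint or Azure permission names.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical, ordered privilege levels used for illustration only."""
    NONE = 0
    READ = 1
    EDIT = 2
    FULL_CONTROL = 3


def effective_level(token_level: Level, user_level: Level) -> Level:
    """With delegated permissions, the lower of the two levels applies."""
    return min(token_level, user_level)


# The token allows full control, but the logged-in user can only read.
print(effective_level(Level.FULL_CONTROL, Level.READ).name)          # READ
# Full control is only effective when the user has it too.
print(effective_level(Level.FULL_CONTROL, Level.FULL_CONTROL).name)  # FULL_CONTROL
```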

Admin Consent

  1. Follow steps 1 to 3 in Add or Edit a SharePoint Online Source with a user that has the Global Admin role. This is the only role that you can use to provide consent (see Common consent scenarios).

  2. Check Consent on behalf of your organization.

  3. Click Accept.

  4. The Add/Edit a SharePoint Online Source panel opens. Close the panel and perform the steps in Add or Edit a SharePoint Online Source again, but this time using your crawling account credentials to create your SharePoint Online source.

Add or Edit a SharePoint Online Source

Before you start, ensure that your SharePoint Online instance meets the source requirements.

When adding or editing a SharePoint Online source, follow the instructions below.

“Sign in to SharePoint Online” Window

  1. Enter your SharePoint Online tenant name or tenant address, and then click Sign In.

    • SharePoint Online tenant name: MyCompany

    • SharePoint Online tenant address:

  2. Enter the Email and Password of the limited administrator account that you created earlier and that has access to the desired SharePoint Online content, and then click Sign in.

    As of March 25, 2019, when you create two SharePoint Online sources retrieving content with the same tenant, they share their security providers, which increases the speed of the security identities refresh operation. You must, however, use the same limited administrator credentials for both sources.

  3. Click Accept to grant the required permissions to the Coveo application.

“Configuration” Tab

On the Add/Edit a SharePoint Online Source subpage, the Configuration tab is selected by default. It contains your source general and content information, as well as other parameters.

General Information

Source Name

Enter a name for your source.

A source name can’t be modified once it’s saved, so be sure to use a short, descriptive name made of letters, numbers, hyphens (-), and underscores (_). Avoid spaces and other special characters.
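Since the name can't be changed later, it can be worth checking a candidate name against this recommendation first. The helper below is a hypothetical sketch of that check. It doesn't enforce a length limit because the text above doesn't define one.

```python
import re

# Letters, numbers, hyphens, and underscores only (no spaces or special characters).
_SOURCE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_recommended_source_name(name: str) -> bool:
    """Return True when the name only uses the recommended characters."""
    return _SOURCE_NAME.fullmatch(name) is not None


print(is_recommended_source_name("sharepoint-online_support"))  # True
print(is_recommended_source_name("SharePoint Online (support)"))  # False
```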

Optical Character Recognition (OCR)

If you want Coveo Cloud to extract text from image files or PDF files containing images, check the appropriate box. OCR-extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick View. See Enable Optical Character Recognition for details on this feature.


When adding a source, if you have more than one logical (non-Elasticsearch) index in your organization, select the index in which the retrieved content will be stored (see Leverage Many Coveo Indexes). If your organization only has one index, this drop-down menu isn’t visible and you have no decision to make.

  • To add a source that stores content in an index other than the default one, you need the View access level on the Logical Index domain (see Manage Privileges and Logical Indexes Domain).

  • Once the source is added, you can’t switch to a different index.

“Content to Include” Section

  1. Select whether you want to index SharePoint Online or OneDrive content.

    You can configure your source to ignore specific SharePoint Online list template types when indexing items.

  2. Specify the corresponding options.

    • For OneDrive, select Folders to index list folders and document sets.

    • For SharePoint Online:

      1. Select the content to retrieve:

        • All sites

          All sites that the crawling account is allowed to access will be searchable.

        • Specific items

          If you choose to make only certain items searchable, see SharePoint Online Account With Appropriate Roles and Permissions to set the required permissions on the crawling account. In the URL box, enter URLs corresponding to the desired sites, lists, websites, and subsites. Each URL must include the protocol and tenant name.

          • For a specific site: https://site:8080/sites/support

          • For a specific website: https://site:8080/sites/support/subsite

          • For a specific list: https://site:8080/sites/support/lists/contacts/allItems.aspx

            A specific folder in a list isn’t supported.

        • Personal sites

        • User profiles

      2. If you selected All sites, Specific items, or Personal sites, select whether to index additional content:

        • Folders

          Select this option to index list folders and document sets.

        • Unapproved items

          Select this option to retrieve unapproved items, which are items with a Draft or Pending approval status, from lists where moderation is activated. If an unapproved version exists for an item that is already Approved, your source indexes the unapproved item instead of the approved item. As a result, the unapproved item appears in Coveo search results. If this option is disabled, your source indexes only Approved items.

          In a list where moderation is active, a document named Meeting Notes is Approved and indexed by Coveo. This document version is 1.0. However, a coworker edits Meeting Notes, thereby creating version 1.1, and the document status becomes Draft. Then, your SharePoint Online source is rescanned. If Unapproved items is enabled in your source, version 1.0 is deleted from the Coveo index and is replaced by the draft version 1.1. If Unapproved items is disabled in your source, Coveo indexes version 1.0 as version 1.1 is not yet Approved.

          In lists where moderation is deactivated, Coveo indexes the latest version of an item, be it Approved, Draft, or Pending. In this case, this option doesn’t apply.

          For SharePoint lists that require documents to be checked out before editing, Coveo doesn’t index a document while it’s checked out regardless of the Unapproved items option and the list moderation setting in SharePoint. If a checked out item is checked in and its status changes to Draft or Pending, the unapproved item is indexed only if the Unapproved items option is enabled in your source or if moderation is deactivated for the list.
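The rules for the Unapproved items option can be summarized as a small decision function. This is a sketch of the behavior described above, not the connector's code. The `Version` record and the `version_to_index` function are hypothetical, and versions are assumed to be ordered from oldest to newest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    label: str
    status: str  # "Approved", "Draft", or "Pending"


def version_to_index(versions: List[Version], moderation_enabled: bool,
                     index_unapproved: bool,
                     checked_out: bool = False) -> Optional[Version]:
    """Return the version picked up at indexing time, or None if there is none."""
    if checked_out or not versions:
        return None  # a checked-out document isn't indexed
    if not moderation_enabled or index_unapproved:
        return versions[-1]  # latest version, whatever its status
    approved = [v for v in versions if v.status == "Approved"]
    return approved[-1] if approved else None  # latest Approved version only


# The Meeting Notes example: 1.0 is Approved, 1.1 is a Draft.
notes = [Version("1.0", "Approved"), Version("1.1", "Draft")]
print(version_to_index(notes, moderation_enabled=True, index_unapproved=True).label)   # 1.1
print(version_to_index(notes, moderation_enabled=True, index_unapproved=False).label)  # 1.0
```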

“Filters” Section

Use this section to include or exclude content from specific pages based on URL expressions.

You can view your URL expressions in the addressPatterns attribute of your source JSON configuration panel.

Inclusion Filters

Your source indexes only the pages that match a URL expression specified in this section.

When you specify an inclusion filter, the index URL(s) for your source must be part of the inclusion filter scope, otherwise the corresponding content won’t be indexed. For example, if you entered https://site:8080/sites/support for the Specific items SharePoint Online option, that URL must match one of your filter expressions to index the corresponding content. If a source URL redirects to another URL, both URLs must be part of the inclusion filter scope.

  1. Enter a URL expression to apply as the inclusion filter.

  2. Select whether the URL expression uses a Wildcard or a Regex (regular expression) pattern.

  • You can test your regexes to ensure that they match the desired URLs with tools such as Regex101.

  • You can customize regexes to meet your use case focusing on aspects such as:

    • Case insensitivity

    • Capturing groups

    • Trailing slash inclusion

    • File extension

    For example, you want to index HTML pages on your company staging and dev websites without taking the case sensitivity or the trailing slash (/) into account, so you use the following regex:


    The regex matches the following URLs:

    • http://company-dev/important/document.html/

    • http://ComPanY-DeV/important/document.html/ (because of (?i), the case insensitive flag)

    • http://company-dev/important/document.html (with or without trailing / because of .?)

    • http://company-staging/important/document.html/ (because of dev|staging)

    but doesn’t match the following ones:

    • http://besttech-dev/important/document.html/ (besttech isn’t included in the regex)

    • http://company-dev/important/document.pdf/ (only html files are included)

    • http://company-prod/important/document.html/ (prod isn’t included in the regex)
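A regex can also be checked locally against the URLs listed above before it's saved in the source. The pattern in this sketch is a hypothetical one written to be consistent with the listed matches and non-matches (it uses `(?i)`, `dev|staging`, and `.?`). It isn't necessarily the exact expression from the example, and Python's `re` module may differ in small ways from the regex engine that the source uses.

```python
import re

# Hypothetical pattern consistent with the example above.
PATTERN = re.compile(r"(?i)^https?://company-(dev|staging)/.*\.html.?$")

should_match = [
    "http://company-dev/important/document.html/",
    "http://ComPanY-DeV/important/document.html/",
    "http://company-dev/important/document.html",
    "http://company-staging/important/document.html/",
]
should_not_match = [
    "http://besttech-dev/important/document.html/",
    "http://company-dev/important/document.pdf/",
    "http://company-prod/important/document.html/",
]

for url in should_match:
    assert PATTERN.search(url), f"expected a match: {url}"
for url in should_not_match:
    assert not PATTERN.search(url), f"expected no match: {url}"
print("all URLs behave as expected")
```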

The website you crawl contains versions in several languages and you want to have one source per language. For the US English source, if the source URL is, the inclusion filter would be*.

Exclusion Filters

Your source ignores content from pages that match a URL expression specified in this section.

When you specify an exclusion filter, the index URL(s) for your source must not be part of the exclusion filter scope, otherwise the corresponding content won’t be indexed. For example, if you entered https://site:8080/sites/support for the Specific items SharePoint Online option, and that URL matches one of your filter expressions, the corresponding content will not be indexed. If a source URL redirects to another URL, both URLs must not be part of the exclusion filter scope.

  1. Enter a URL expression to apply as the exclusion filter.

  2. Select whether the URL expression uses a Wildcard or a Regex (regular expression) pattern.

  • Exclusion filters also apply to shortened and redirected URLs.

  • By default, if pages are only accessible via excluded pages, those pages will also be excluded.

  • There’s no point in indexing the search page of your website, so you exclude its URL:

  • You don’t want to index ZIP files that are linked from website pages:*.zip
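The combined effect of inclusion and exclusion filters can be sketched as shown below. This is an approximation under stated assumptions: `UrlFilter` and `is_indexed` are hypothetical names, wildcards are evaluated with Python's `fnmatch` (whose semantics may differ from Coveo's), and an empty inclusion list is assumed to mean that everything is included.

```python
import fnmatch
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class UrlFilter:
    expression: str
    kind: str  # "wildcard" or "regex"

    def matches(self, url: str) -> bool:
        if self.kind == "wildcard":
            return fnmatch.fnmatchcase(url, self.expression)
        if self.kind == "regex":
            return re.search(self.expression, url) is not None
        raise ValueError(f"unknown filter kind: {self.kind}")


def is_indexed(urls: Iterable[str], inclusions: List[UrlFilter],
               exclusions: List[UrlFilter]) -> bool:
    """urls holds the page URL plus any URL it redirects to."""
    urls = list(urls)
    # Every URL in the redirect chain must be in the inclusion scope...
    if inclusions and not all(any(f.matches(u) for f in inclusions) for u in urls):
        return False
    # ...and none of them may be in the exclusion scope.
    return not any(f.matches(u) for f in exclusions for u in urls)


inclusions = [UrlFilter("https://site:8080/sites/support*", "wildcard")]
exclusions = [UrlFilter("*.zip", "wildcard")]
print(is_indexed(["https://site:8080/sites/support"], inclusions, exclusions))  # True
print(is_indexed(["https://site:8080/sites/support/files/archive.zip"],
                 inclusions, exclusions))                                        # False
```

Passing the whole redirect chain reflects the note that a source URL and the URL it redirects to are both evaluated against the filters.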

“Content Security” Tab

Select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content Security.

“Access” Tab

In the Access tab, determine whether each group and API key can view or edit the source configuration (see Resource Access):

  1. In the Access Level column, select View or Edit for each available group.

  2. On the left-hand side of the tab, if available, click Groups or API Keys to switch lists.


  1. Finish adding or editing your source:

    • When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click Add Source/Save.

      On the Sources page, you must click Launch build or Launch rebuild in the source Status column to add the source content or to make your changes effective, respectively.

    • When you’re done editing the source and want to make changes effective, click Add and Build Source/Save and Rebuild Source.

      Back on the Sources page, you can review the progress of your source addition or modification.

    Once the source is built or rebuilt, you can review its content in the Content Browser.

    If you selected Specific Items and User Profiles in the Content to Include section, some additional items will appear in the Content Browser. To retrieve user profiles, Coveo must dig through your SharePoint Online instance, including your My Site host site collection and the documents it contains. The items it encounters in the process are retrieved as well and therefore appear in the Content Browser.

  2. Optionally, consider editing or adding mappings.

    You can only manage mapping rules once you build the source (see Refresh, Rescan, or Rebuild Sources).

What’s Next?

Adapt the source update schedule to your needs.
