Add or Edit a SharePoint Online Legacy Source

Members of the Administrators and Content Managers built-in groups can include SharePoint Online content and make it searchable. In a Coveo-powered search interface, the source content is accessible to everyone, to the source creator only, or to specific users as determined by source permissions (see Content Security). By default, a SharePoint Online source starts a refresh every hour to retrieve SharePoint Online item changes (additions, modifications, or deletions). A source rescan or rebuild is necessary to capture deleted user profiles.

Source Features Summary

Features | Supported | Additional information
SharePoint Online version | Latest cloud version |
Searchable content types | Sites, sub-sites, user profiles, personal websites, lists, list items, list item attachments, document libraries, document sets, documents, web parts, and microblog posts and replies. |
Content update | Refresh | Rescan or rebuild is needed to retrieve deleted user profiles.
 | Rescan |
 | Rebuild |
Content security options | Determined by source permissions |
 | Source creator |
 | Everyone |

Requirements

DNS Records Configuration for Office 365

  1. Log in to Office 365 admin center with an administrator account.

  2. In the navigation bar on the left, select Domains.

  3. On the Manage domains page:

    1. Under Domain Name, select the check box of your corporate domain (not company.onmicrosoft.com).

    2. Next to the Action column, under the [domain name], click Domain settings.

  4. On the [domain name] page, in the DNS records section, take note of the DNS records.

  5. Configure these DNS records in your DNS host provider (see Create DNS records for Office 365 when you manage your DNS records).

  6. On the [domain name] page, in the DNS records section, click the Troubleshoot domain link to ensure that the DNS records were correctly configured.

SharePoint Online Account With Appropriate Permissions

When you want to include SharePoint Online content, you must create a dedicated SharePoint Online account that is used only by the source. If you use a shared account instead, you must also change the source Password value each time the account password changes to prevent authentication errors.

  1. Access your SharePoint Online tenant with an administrator account.

  2. On your SharePoint Online tenant:

    1. Select or create a user that the source will use to retrieve your SharePoint Online content. See the following table to identify the required type of user for your web application enabled authentication.

      SharePoint web application enabled authentication | Type of user | User format
      Native | Native Office 365 account | username@domain.onmicrosoft.com
      SSO with ADFS, SSO with Okta | Single Sign-On Office 365 account | username@domain.com

    2. Grant appropriate SharePoint permissions to the SharePoint Online account you selected to ensure it has access to all the content that you want to include.

      The following table presents the minimal required permissions that the source credentials must have to perform the specified action.

      Action to perform Minimal required permission
      Content and security indexing, incremental refresh, and site collection discovery

      Personal site, user profile and social tags indexing

      When including personal sites or user profiles, the source credentials (indexing account) must not have a personal site on the SharePoint server being included. Otherwise, the connector can fail when it attempts to retrieve the list of personal sites.

      Owner of all personal sites collections (see Adding the Personal Sites Collections Owner Permissions for SharePoint Online).
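
The user formats in the table above can be checked before you enter the account in the source. The following Python sketch is illustrative only: the function name and constants are hypothetical and not part of any Coveo or Microsoft tool, and it assumes that an SSO with Okta account follows the same format as an SSO with ADFS account.

```python
import re

# Hypothetical helper, not part of Coveo or Microsoft tooling.
# Authentication labels follow the table in this section.
NATIVE = "Native"
SSO_TYPES = {"SSO with ADFS", "SSO with Okta"}  # assumption: Okta behaves like ADFS

# Very loose shape check: something@host.tld, no spaces.
_USER_RE = re.compile(r"^[^@\s]+@([^@\s]+\.[^@\s]+)$")


def check_user_format(auth_type: str, username: str) -> bool:
    """Return True when username has the format expected for auth_type."""
    match = _USER_RE.match(username)
    if match is None:
        return False
    domain = match.group(1).lower()
    is_native_domain = domain.endswith(".onmicrosoft.com")
    if auth_type == NATIVE:
        # Native accounts use username@domain.onmicrosoft.com
        return is_native_domain
    if auth_type in SSO_TYPES:
        # SSO accounts use the corporate domain, e.g. username@domain.com
        return not is_native_domain
    raise ValueError(f"Unknown authentication type: {auth_type!r}")
```

For example, check_user_format("Native", "crawler@acme.com") returns False because a native account is expected to end with .onmicrosoft.com. The account name crawler@acme.com is made up for illustration.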

Add or Edit a SharePoint Online Source

  1. Ensure that your SharePoint Online instance meets the source requirements (see Requirements).

  2. If not already in the Add/Edit a SharePoint Online Legacy Source panel, access the panel:

    • To add a source, in the main menu, under Content, select Sources > Add source button > SharePoint > SharePoint Online legacy.

      OR

    • To edit a source, in the main menu, under Content, select Sources > source row > Edit in the Action bar.

  3. In the Configuration tab, enter appropriate values for the available parameters:

    • Source name

      A descriptive name for your source under 255 characters (not already in use for another source in this organization).

      Example: SharePoint-Online-Intranet

    • URL

      One or more URLs, including the protocol (http:// or https://), of the SharePoint Online site sections that you want to make searchable.

      • For the whole SharePoint Online site: https://domain.sharepoint.com

      • For a specific Web Application: https://site:8080/

      • For a specific site collection: https://site:8080/sites/support

      • For a specific website: https://site:8080/sites/support/subsite

      • For a specific document library: https://site:8080/documentLibrary

      • For a specific list: https://site:8080/sites/support/lists/contacts/allItems.aspx

        A specific folder in a list isn’t supported.

    • Scope

      Click the drop-down menu, and then select the option for the content type that you want to include in relation to the source URL that you specified (see above).

      By default, Web application is selected. It's the highest element type in the SharePoint Online site hierarchy, so everything under it is included.

      Value | Content to crawl
      Web application | All site collections of the specified web application
      Site collection | All web sites of the specified site collection
      Web And sub webs | Only the specified web site and its sub webs (also known as subsites)
      List | Only the specified list or document library

    • Optical character recognition (OCR)

      Check this box if you want Coveo Cloud to extract text from image files or PDF files containing images (see Enable Optical Character Recognition). OCR-extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick View (see Search Result Quick View).

      Since the OCR feature is available at an extra charge, you must first contact Coveo Sales to add this feature to your organization license. You can then enable it for your source.

    • Index

      When adding a source, if you have more than one logical (non-Elasticsearch) index in your organization, select the index in which the retrieved content will be stored (see Leverage Many Coveo Indexes). If your organization only has one index, this drop-down menu isn’t visible and you have no decision to make.

      • To add a source storing content in an index other than the default one, you need the View access level on the Logical Index domain (see Privilege Management and Logical Indexes Domain).

      • Once the source is added, you can’t switch to a different index.

  4. In the Authentication section, you must provide authentication information so that Coveo can access the content you want to make searchable (see Content Security).

    1. In the drop-down menu, select the identity provider you use to manage identities in your SharePoint site. Options are:

      • Native

      • Federated

    2. Depending on the option you choose in the drop-down menu, you must fill some or all of the following boxes:

      • Username

        The username of a dedicated SharePoint Online administrator account that has access to all the content you want to include (see SharePoint Online instructions in Granting SharePoint Permission to the Crawling Account).

        Starting March 25, 2019, when you create two SharePoint Online Legacy sources retrieving content from the same tenant, they share their security providers, which increases the speed of the security identities refresh operation (see Refresh a Security Identity Provider). You must, however, use the same administrator credentials for both sources.

      • Password

        The corresponding password.

      • Identity provider URL

        Depending on the provider your users use to log in to SharePoint:

        • When using SSO Office 365 authentication, enter the URL of the identity provider server used in SharePoint Online to authenticate users.

        • When authenticating via ADFS, you can edit the Identity provider URL in the ADFS settings (see Finding and Enabling the ADFS Service Endpoint URL Path).

        • When authenticating via Okta, the URL should have the following format: https://acme.okta.com/app/office365/{applicationId}/sso/wsfed/active

        • When using native authentication, leave this field blank.

      • SharePoint trust identifier

        Depending on the provider your users use to log in to SharePoint:

        • When using SSO Office 365 authentication, enter the Relying Party Trust identifier for the SharePoint Online identity provider server. Unless you use a different or modified SharePoint Online identity provider, use the default urn:federation:MicrosoftOnline value.

        • When using native authentication, you may leave the default value, as it will be ignored.

  5. In the Content to Include section, change the default values when you want to fine-tune how your SharePoint Online site is crawled.

    By default, user profiles and personal sites aren’t included.

    • User profiles: Select to include SharePoint Online users.

      Including user profiles can take a significant amount of time depending on their number. Moreover, each additional source that includes user profiles creates another set of duplicates in your Coveo Cloud organization index. It's therefore recommended to include your user profiles in only one of your SharePoint Online sources:

      • When you configure your first SharePoint Online source, select the User profiles check box. For all your other SharePoint sources, ensure this parameter check box is cleared.

      • When you already have other configured SharePoint Online sources, select the User profiles check box in the source that covers your smallest web application, and clear it in all your other SharePoint Online sources.

    • Personal sites: (When the scope is Web application) select to include SharePoint Online personal sites.

  6. In the Content Security tab, select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content Security.

  7. In the Access tab, determine whether each group and API key can view or edit the source configuration (see Understanding Resource Access):

    1. In the Access Level column, select View or Edit for each available group.

    2. On the left-hand side of the tab, if available, click Groups or API Keys to switch lists.

    If you remove the Edit access level from all the groups of which you’re a member, you won’t be able to edit the source again after saving. Only administrators and members of other groups that have Edit access on this resource will be able to do so. To keep your ability to edit this resource, you must grant the Edit access level to at least one of your groups.

  8. Optionally, consider editing or adding mappings (see Adding and Managing Source Mappings).

    You can only manage mapping rules once you build the source (see Refresh, Rescan, or Rebuild Sources).

  9. Complete your source addition or edition:

    • Click Add Source/Save when you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon.

      On the Sources page, you must click Start initial build or Start required rebuild in the source Status column to add the source content or make your changes effective, respectively.

      OR

    • Click Add and Build Source/Save and Rebuild Source when you’re done editing the source and want to make changes effective.

      Back on the Sources page, you can review the progress of your SharePoint Online source addition or modification (see Adding and Managing Sources).

    Once the source is built or rebuilt, you can review its content in the Content Browser (see Inspect Items With the Content Browser).
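
The parameter rules from steps 3 and 4 can be sketched as a pre-flight check that you run on your planned values before typing them into the panel. This is a minimal Python sketch, not a Coveo API: SourceDraft, validate, and the field names are hypothetical, and the check only covers rules stated in this article (name length, URL protocol, scope values, and which authentication boxes apply to each identity provider).

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

# All names below are hypothetical; this is not a Coveo API.
DEFAULT_TRUST_ID = "urn:federation:MicrosoftOnline"  # default value from step 4
SCOPES = ("Web application", "Site collection", "Web And sub webs", "List")


@dataclass
class SourceDraft:
    name: str
    urls: List[str]
    scope: str = "Web application"   # default scope per step 3
    provider: str = "Native"         # "Native" or "Federated"
    identity_provider_url: str = ""
    trust_identifier: str = DEFAULT_TRUST_ID


def validate(draft: SourceDraft) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not draft.name or len(draft.name) >= 255:
        problems.append("Source name must be non-empty and under 255 characters.")
    if not draft.urls:
        problems.append("At least one URL is required.")
    for url in draft.urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"URL must include http:// or https://: {url}")
    if draft.scope not in SCOPES:
        problems.append(f"Unknown scope: {draft.scope}")
    if draft.provider == "Native":
        # The trust identifier is ignored for native authentication.
        if draft.identity_provider_url:
            problems.append("Leave the identity provider URL blank for native authentication.")
    elif draft.provider == "Federated":
        if not draft.identity_provider_url:
            problems.append("Federated authentication needs an identity provider URL.")
        if not draft.trust_identifier:
            problems.append("Federated authentication needs a SharePoint trust identifier.")
    else:
        problems.append(f"Unknown identity provider option: {draft.provider}")
    return problems
```

For instance, validate(SourceDraft(name="SharePoint-Online-Intranet", urls=["https://domain.sharepoint.com"])) returns an empty list, while a URL entered without its protocol is reported as a problem. The check doesn't contact SharePoint Online, so it can't confirm that the URLs exist or that the credentials work.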

What’s Next?

Review your source update schedule and optionally change it so that it better fits your needs (see Edit a Source Schedule). By default, your content is refreshed every hour.
