Add or Edit a File System Source

A File System source lets members of the Administrators and Content Managers built-in groups use the Coveo On-Premises Crawling Module to retrieve the content of files shared over a network and make it searchable.

Example

Your company has a shared network drive on which letter, PowerPoint presentation, and email signature templates are available to all employees. You decide to index the whole drive to make its content searchable via your Coveo-powered search page.

When you have the required privileges, you can add files shared over a network to a Coveo organization.

Tip
Leading practice

The number of items that a source processes per hour (crawling speed) depends on various factors, such as network bandwidth and source configuration. See About Crawling Speed for information on what can impact crawling speed, as well as possible solutions.

Source Key Characteristics

Add or Edit a File System Source

Before you start, ensure that the content to index and make searchable is shared over a network.

Also ensure that the Coveo On-Premises Crawling Module is installed on a server that has access to the file system whose content you want to retrieve.

Then, to add a source, in the Add a source of content panel, click the Crawling Module tab, and then click File system.

To edit a source, on the Sources page, click the desired source, and then click Edit in the Action bar.

"Configuration" Tab

On the Add/Edit a File System Source subpage, the Configuration tab is selected by default. It contains your source’s general and authentication information, as well as other parameters.

If you haven’t already installed the Coveo On-Premises Crawling Module on a server that has access to the file system whose content you want to retrieve, click Download Crawling Module to do so.

General Information

Source Name

Enter a name for your source.

Tip
Leading practice

A source name can’t be modified once it’s saved. Choose a short, descriptive name that uses only letters, numbers, hyphens (-), and underscores (_). Avoid spaces and other special characters.
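The naming recommendation above can be checked before you save a source. The following is a minimal sketch, not an official Coveo validation rule: the function name is hypothetical, and it assumes that "letters" means unaccented ASCII letters.

```python
import re

# Assumption: the recommended character set is ASCII letters, digits,
# hyphens, and underscores. This is an illustration, not Coveo's own check.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_recommended_source_name(name: str) -> bool:
    """Return True if the name uses only the recommended characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("shared-drive_templates", "Shared Drive (2024)"):
        print(candidate, is_recommended_source_name(candidate))
```

Because the name can't be changed after saving, running a check like this on a proposed name first may avoid having to recreate the source.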

Path

Enter the network path to a file system folder or a file.

Examples
  • For content located on the server where the Crawling Module is installed: C:\Users\adminuser\Documents

  • For content located on a different server than the one where the Crawling Module is installed, but on the same network: file:\\my-server\Users\adminuser\Documents

To exclude certain folders or files, first configure and save your source with a broad path. Then, see Refine the Content to Index.

Paired Crawling Module

If your source is a Crawling Module source, and if you have more than one Crawling Module linked to this organization, select the one with which you want to pair your source. If you change the Crawling Module instance paired with your source, a successful rebuild is required for your change to apply.

Optical Character Recognition (OCR)

If you want Coveo to extract text from image files or PDF files containing images, check the appropriate box. OCR-extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick View. See Enable Optical Character Recognition for details on this feature.

Note

Contact Coveo Sales to add this feature to your organization license.

"Authentication" Section

The File System source supports the following authentication methods.

Regardless of the method you choose, you must enter the Username and Password of a dedicated administrator account that has access to the content you want to index. See Source Credentials Leading Practices. If you selected Active Directory on-premises, fill in the additional boxes that appear. If you selected Native, skip to the "Content to Include" section.

Active Directory Username and Active Directory Password

Enter credentials to grant Coveo access to your Active Directory.

Expand Well-Known SIDs

Select this option if you want the users that are included in your Active Directory well-known security identifiers to be granted access to the indexed content. Expect an increase in the duration of the security identity provider refresh operation. Supported well-known SIDs are: Everyone, Authenticated Users, Domain Admins, Domain Users, and Anonymous Users.

Tip
Leading practice

If all of your content is secured with the Everyone or Authenticated Users well-known SID, it’s more resource-efficient to index it with a source whose content is accessible to everyone than to expand the well-known SID with a source that indexes permissions.

Enable TLS

Select this option to use a TLS protocol to retrieve your security identities. If you do, we strongly recommend selecting StartTLS whenever possible. Since LDAPS is a much older protocol, you should only select this value if StartTLS is incompatible with your environment.

Email Attributes

By default, Coveo retrieves the email address associated with each security identity from the mail attribute. Optionally, you can specify additional or different attributes to check. Should an attribute contain more than one value, Coveo uses the first one.
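The lookup described above can be pictured with the following sketch. It isn't Coveo's implementation: the function name and the dictionary shape are hypothetical, and it assumes that attributes are checked in the order in which they're listed, which this page doesn't state.

```python
from typing import Mapping, Optional, Sequence


def pick_email(
    identity: Mapping[str, Sequence[str]],
    attributes: Sequence[str] = ("mail",),
) -> Optional[str]:
    """Return the first value of the first listed attribute that has one."""
    for attribute in attributes:
        values = identity.get(attribute) or []
        if values:
            # An attribute may hold several values; only the first is used.
            return values[0]
    return None


if __name__ == "__main__":
    # Hypothetical identity whose mail attribute is empty.
    identity = {"mail": [], "proxyAddresses": ["a@example.com", "b@example.com"]}
    print(pick_email(identity, ("mail", "proxyAddresses")))
```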

"Content to Include" Section

By default, when you select the Same users and groups as in your content system option in the Content Security tab, only NTFS permission entries are indexed and replicated in your search interface. Check the Share permissions box if you also want to index and enforce share permissions. The NTFS and share permission systems are combined, and therefore each end user must be allowed to access an item in both permission models to see this item in their search results.

For further information on share and NTFS permissions, see Share and NTFS Permissions on a File Server.

For more information on sources that index permissions and on how Coveo handles these permissions, see Coveo Management of Security Identities and Item Permissions.
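As an illustration of how the two permission models combine, here is a deliberately simplified sketch. The function and parameter names are hypothetical, and the model ignores group membership and deny entries, both of which a real file server takes into account.

```python
from typing import AbstractSet


def can_see_item(
    user: str,
    ntfs_allowed: AbstractSet[str],
    share_allowed: AbstractSet[str],
    share_permissions_indexed: bool,
) -> bool:
    """Simplified check: should the item appear in this user's results?"""
    if not share_permissions_indexed:
        # Default behavior: only NTFS permission entries are considered.
        return user in ntfs_allowed
    # With the Share permissions box checked, both models must allow access.
    return user in ntfs_allowed and user in share_allowed


if __name__ == "__main__":
    ntfs = {"alice", "bob"}
    share = {"alice"}
    print(can_see_item("bob", ntfs, share, share_permissions_indexed=False))
    print(can_see_item("bob", ntfs, share, share_permissions_indexed=True))
```

In this sketch, checking the Share permissions box can only narrow the set of users who see an item, never widen it.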

"Content Security" Tab

Select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content Security.

"Access" Tab

In the Access tab, set whether each group and API key can view or edit the source configuration (see Resource Access):

  1. If available, in the left pane, click Groups or API Keys to select the appropriate list.

  2. In the Access Level column for groups or API keys with access to source content, select View or Edit.

Completion

  1. Finish adding or editing your source:

    • When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click Add Source/Save.

      Note

      On the Sources page, you must click Launch build or Start required rebuild in the source Status column to add the source content or to make your changes effective, respectively.

    • When you’re done editing the source and want to make changes effective, click Add and Build Source/Save and Rebuild Source.

      Back on the Sources page, you can review the progress of your source addition or modification.

      Once the source is built or rebuilt, you can review its content in the Content Browser.

  2. Optionally, consider editing or adding mappings once your source is done building or rebuilding.

Refine the Content to Index

You may want to avoid indexing certain subfolders, or to index only a few of them. To do so:

  1. If not already done, create and save your source with a broad path.

  2. In your source JSON configuration, enter an address filter to refine the targeted content.

    Important

    Your path must match one of your inclusion addressPatterns and not match any of your exclusion addressPatterns.

  3. Build or rebuild your source.
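To reason about the inclusion and exclusion rule before rebuilding, you can model it locally. The sketch below is only an approximation under stated assumptions: the `expression` and `allowed` keys, the wildcard syntax, the case-sensitive matching, and all paths are illustrative, and the exact shape of addressPatterns in your source JSON configuration may differ.

```python
from fnmatch import fnmatchcase
from typing import Mapping, Sequence, Union

Pattern = Mapping[str, Union[str, bool]]


def is_path_indexed(path: str, patterns: Sequence[Pattern]) -> bool:
    """True if path matches an inclusion pattern and no exclusion pattern."""
    included = any(
        fnmatchcase(path, str(p["expression"])) for p in patterns if p["allowed"]
    )
    excluded = any(
        fnmatchcase(path, str(p["expression"])) for p in patterns if not p["allowed"]
    )
    return included and not excluded


# Hypothetical patterns: index the Documents folder, except Archive subfolders.
EXAMPLE_PATTERNS = [
    {"expression": r"\\my-server\Users\adminuser\Documents\*", "allowed": True},
    {"expression": r"*\Archive\*", "allowed": False},
]

if __name__ == "__main__":
    for candidate in (
        r"\\my-server\Users\adminuser\Documents\Templates\letter.docx",
        r"\\my-server\Users\adminuser\Documents\Archive\old.docx",
    ):
        print(candidate, is_path_indexed(candidate, EXAMPLE_PATTERNS))
```

Under these assumptions, the source path itself must also pass this check. Otherwise, nothing under it would be indexed.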

What’s Next?