Add a Box Business source

Members with the required privileges can add the content of Box Business and Enterprise plan user accounts to a Coveo organization.

Tip
Leading practice

The number of items that a source processes per hour (crawling speed) depends on various factors, such as network bandwidth and source configuration. See About crawling speed for information on what can impact crawling speed, as well as possible solutions.

Source key characteristics

The following table presents the main characteristics of a Box Business source.

| Feature | Supported | Additional information |
| --- | --- | --- |
| Box version | Latest cloud version | |
| Indexable content | Files, folders, enterprises, users, and web links | |
| Content update operations: refresh | Yes | Takes place every three hours by default. A rescan or rebuild is required to remove deleted users in Box and to update the subitems of a renamed folder. |
| Content update operations: rescan | Yes | |
| Content update operations: rebuild | Yes | |
| Content security options: Same users and groups as in your content system | Yes [1] | |
| Content security options: Specific users and groups | No | |
| Content security options: Everyone | Yes | |
| Metadata indexing for search: Automatic mapping of metadata to fields that have the same name | | This setting is disabled by default and not recommended for this source type. |
| Metadata indexing for search: Automatically indexed metadata | | Examples of auto-populated default fields (no user-defined metadata required): clickableuri, boxitemownedbyname, boxitemstatus, filetype, title. After a content update, inspect your item field values in the Content Browser. |
| Metadata indexing for search: Extracted but not indexed metadata | | The Box Business source extracts core system-defined attributes as metadata. After a rebuild, review the View and map metadata subpage for the list of indexed metadata, and index additional metadata. |

Prerequisite

A Box Business source uses a Box application to connect to Box and index content. For a Box Business source to index all the content of a folder and its subfolders, you must create a Box application that meets the following requirements:

App Access Level: App Access + Enterprise Access

Application Scopes:

Advanced Features: Perform actions as users and Generate User Access Tokens

Add and Manage Public Keys: Have a valid keypair

Add a Box Business source

Follow the instructions below to add a Box Business source.

  1. On the Sources (platform-ca | platform-eu | platform-au) page, click Add source.

  2. In the Add a source of content panel, click the Box Business source tile.

  3. Configure your source.

Tip
Leading practice

It’s best to create or edit your source in your sandbox organization first. Once you’ve confirmed that it indexes the desired content, you can copy your source configuration to your production organization, either with a snapshot or manually.

See About non-production organizations for more information and best practices regarding sandbox organizations.

"Configuration" tab

In the Add a Box Business source panel, the Configuration tab is selected by default. It contains your source’s general and authentication information, as well as other parameters.

Name

Enter a name for your source.

Tip
Leading practice

A source name can’t be modified once it’s saved, so use a short, descriptive name made up of letters, numbers, hyphens (-), and underscores (_). Avoid spaces and other special characters.

Project

Use the Project selector to associate your source with one or more Coveo projects.

Authentication

Provide the Box application credentials in JSON format, either by pasting the configuration in the box or by clicking Select a JSON file to upload the file.

Note

If you still have a Box Business Legacy source, an enterprise ID is required instead. In the Box Enterprise ID box, enter the unique identifier that’s displayed in your Box Enterprise Admin Console, on the Account & Billing tab.

Content to index

Choose whether to index the content of all your managed Box accounts or only specific accounts. If you select Specific users only, enter the user email addresses corresponding to the Box accounts you want to index and make searchable.

Notes
  • By default, Coveo indexes all items that are owned by the Box accounts that you specify. However, you can add path filters to index only certain items or ignore unwanted items.

  • By default, Coveo doesn’t index users as source items. To index users, once you’ve created the source, edit the source JSON configuration and set the IndexUsers parameter to true.

    "IndexUsers": {
      "sensitive": false,
      "value": "true"
    }

    User source items contain metadata about the user such as their email address, phone number, and timezone.
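
The IndexUsers change above can be sketched in Python if you script your configuration edits. This is only an illustration: `set_index_users` is a hypothetical helper, it assumes you have already loaded the object that holds the source parameters into a dictionary, and writing the disabled state as the string "false" is an assumption mirrored from the "true" value shown above.

```python
import json


def set_index_users(parameters, enabled=True):
    """Return a copy of a source parameters mapping with IndexUsers set.

    Hypothetical helper: `parameters` is assumed to be the dictionary that
    holds the source parameters from the JSON configuration.
    """
    updated = dict(parameters)
    updated["IndexUsers"] = {
        "sensitive": False,
        # The documented value is the string "true", not a JSON boolean.
        # Using "false" for the disabled state is an assumption.
        "value": "true" if enabled else "false",
    }
    return updated


if __name__ == "__main__":
    print(json.dumps(set_index_users({}), indent=2))
```

The point of the sketch is the value type: the parameter is written as a quoted string rather than a bare boolean.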

"Items" tab

On the Items tab, you can specify how the source handles items based on their file type or content type.

File types

File types let you define how the source handles items based on their file extension or content type. For each file type, you can specify whether to index the item content and metadata, only the item metadata, or neither.

You should fine-tune the file type configurations with the objective of indexing only the content that’s relevant to your users.

Example

Your repository contains .pdf files, but you don’t want them to appear in search results. You click Extensions and then, for the .pdf extension, you change the Default action and Action on error values to Ignore item.

For more details about this feature, see File type handling.

Content and images

If you want Coveo to extract text from image files or PDF files containing images, enable the appropriate option. The extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick view.

Note

When optical character recognition (OCR) is enabled, ensure that the source’s relevant file type configurations index the item content. Indexing only the item’s metadata or ignoring the item prevents OCR from being applied.

See Enable optical character recognition for details on this feature.

"Content security" tab

Select who will be able to access the source items through a Coveo-powered search interface. For details on the content security options, see Content security.

Important

When using the Everyone content security option, see Safely apply content filtering for information on how to ensure that your source content is safely filtered and only accessible by intended users.

"Access" tab

On the Access tab, specify whether each group (and API key, if applicable) in your Coveo organization can view or edit the current source.

For example, when creating a new source, you could decide that members of Group A can edit its configuration, while Group B can only view it.

For more information, see Custom access level.

Build the source

  1. Finish adding or editing your source:

    • When you’re done editing the source and want to make your changes effective, click Add and build source/Save and rebuild source.

    • When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click Add source/Save. On the Sources (platform-ca | platform-eu | platform-au) page, click Launch build or Start required rebuild when you’re ready to make your changes effective and index your content.

  2. On the Sources (platform-ca | platform-eu | platform-au) page, follow the progress of your source addition or modification.

  3. Once the source is built or rebuilt, review its content in the Content Browser.

Index metadata

To use metadata values in search interface facets or result templates, the metadata must be mapped to fields. Coveo automatically maps only a subset of the metadata it extracts. You must map any additional metadata to fields manually.

Note

Not clear on the purpose of indexing metadata? Watch this video.

  1. On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > View and map metadata in the Action bar.

  2. Review the default metadata that your source is extracting from your content.

  3. Map to fields any metadata that isn’t currently indexed and that you want to use in facets or result templates.

    1. Click the metadata and then, at the top right, click Add to Index.

    2. In the Apply a mapping on all item types of a source panel, select the field you want to map the metadata to, or add a new field if none of the existing fields are appropriate.

      Note

      For advanced mapping configurations, like applying a mapping to a specific item type, see Manage mappings.

    3. Click Apply mapping.

  4. Return to the Sources (platform-ca | platform-eu | platform-au) page.

  5. To reindex your source with your new mappings, click your source, and then click More > Rebuild in the Action bar.

  6. Once the source is rebuilt, review your item field values. They should now include the values of the metadata you selected to index.

    1. On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > Open in Content Browser in the Action bar.

    2. Select the card of the item for which you want to inspect properties, and then click Properties in the Action bar.

    3. In the panel that appears, select the Fields tab.

Add path filters

By default, Coveo indexes all items in the Box accounts that you specified in the Content to index section. To index only certain items or ignore unwanted items, you can add path filters to your source JSON configuration.

Note

You can also filter items by file type on the Items tab.

  1. On the Sources (platform-ca | platform-eu | platform-au) page, click your Box source, and then click More > Edit configuration with JSON in the Action bar.

  2. Add your filters in the addressPatterns array object, while adhering to the following:

    • You must use the printableuri path for the expression parameter.

    • For your Box Business source to be able to locate an item specified in a printableuri path, the addressPatterns array must include separate expression parameter entries (inclusion filters) for each of the hierarchical elements of the item’s URL structure.

    Example

    You want to index all the items in User1’s Content folder and subfolders, except for the Draft subfolder and its subfolders.

    The printableuri for the Content folder is https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files/Content.

    The printableuri for the Draft folder is https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files/Content/Draft.

    In order for your Box Business source to be able to locate the Content and Draft folders, a separate inclusion filter is required for each of the folders' hierarchical elements.

    Your addressPatterns array would be as follows:

    "addressPatterns": [
      {
        "allowed": true,
        "expression": "https://www.box.com/Speedbit",
        "patternType": "Wildcard"
      },
      {
        "allowed": true,
        "expression": "https://www.box.com/Speedbit/User1 (user1@speedbit.com)",
        "patternType": "Wildcard"
      },
      {
        "allowed": true,
        "expression": "https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files",
        "patternType": "Wildcard"
      },
      {
        "allowed": true,
        "expression": "https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files/Content*",
        "patternType": "Wildcard"
      },
      {
        "allowed": false,
        "expression": "https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files/Content/Draft*",
        "patternType": "Wildcard"
      }
    ]

  3. Once you’ve added all your source filters, click Save and rebuild source.
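
Writing one inclusion filter per hierarchical element by hand is error-prone, so the pattern list can be generated from the printableuri values. The following Python sketch is an illustration only: `build_address_patterns` is a hypothetical helper, not part of Coveo, and it assumes that folder names contain no forward slash.

```python
import json

BOX_ROOT = "https://www.box.com/"


def build_address_patterns(include_uri, exclude_uris=()):
    """Build an addressPatterns list for one included folder.

    Emits an inclusion filter for every ancestor of `include_uri`, a
    wildcard inclusion for the folder itself, and a wildcard exclusion
    for each entry of `exclude_uris`.
    """
    if not include_uri.startswith(BOX_ROOT):
        raise ValueError("Expected a printableuri starting with " + BOX_ROOT)

    segments = include_uri[len(BOX_ROOT):].strip("/").split("/")
    patterns = []

    # One inclusion filter per ancestor level (company, user, All Files, ...).
    for depth in range(1, len(segments)):
        patterns.append({
            "allowed": True,
            "expression": BOX_ROOT + "/".join(segments[:depth]),
            "patternType": "Wildcard",
        })

    # The target folder and everything under it.
    patterns.append({
        "allowed": True,
        "expression": BOX_ROOT + "/".join(segments) + "*",
        "patternType": "Wildcard",
    })

    # Exclusions for unwanted subfolders and their content.
    for uri in exclude_uris:
        patterns.append({
            "allowed": False,
            "expression": uri.rstrip("/") + "*",
            "patternType": "Wildcard",
        })
    return patterns


if __name__ == "__main__":
    base = "https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files"
    result = build_address_patterns(base + "/Content", [base + "/Content/Draft"])
    print(json.dumps({"addressPatterns": result}, indent=2))
```

Run with the Content and Draft folders from the example, the helper yields the same five entries as the addressPatterns array shown above. Review the generated expressions before pasting them into the source JSON configuration.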

Get the printable URI

The Box Business connector is designed to use the printableuri when applying path filters. This means that you must use the printableuri item field value when specifying the expression parameter in the addressPatterns array object.

The easiest way to get the printableuri of an item is to build your source, and then inspect the item in the Content Browser.

The printable URI format

The printable URI format for an item in Box Business (for example, a file or folder) is https://www.box.com/<COMPANY_NAME>/<ACCOUNT_USERNAME> (<ACCOUNT_EMAIL>)/All Files/<FOLDER>/<FILENAME>, where:

  • <COMPANY_NAME> is the name of the company account in Box Business.

  • <ACCOUNT_USERNAME> is the name of the user that owns the item or folder.

  • <ACCOUNT_EMAIL> is the email address of the user that owns the item or folder.

  • <FOLDER> is the name of the folder.

  • <FILENAME> is the item filename.

Examples
  • The printable URI of the following folder would be https://www.box.com/Speedbit/User1 (user1@speedbit.com)/All Files/Content:

    • Company name is Speedbit

    • Account username is User1

    • Account email address is user1@speedbit.com

    • Folder where the items are located is named Content

  • The printable URI of the following item would be https://www.box.com/Speedbit/User2 (user2@speedbit.com)/All Files/Tasks/Task_List.pdf:

    • Company name is Speedbit

    • Account username is User2

    • Account email address is user2@speedbit.com

    • Folder where the item is located is named Tasks

    • Item filename is Task_List.pdf
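
The format described above can be expressed as a small Python function, which is handy for drafting path filters before the source is built. This is a sketch with a hypothetical helper name (`box_printable_uri`). The printableuri field value shown in the Content Browser remains the authoritative reference.

```python
def box_printable_uri(company, username, email, *path_parts):
    """Compose a printable URI from its documented parts.

    `path_parts` holds the folder names and, optionally, a trailing
    filename located under the account's "All Files" root.
    """
    owner = f"{username} ({email})"
    return "/".join(["https://www.box.com", company, owner, "All Files", *path_parts])


if __name__ == "__main__":
    print(box_printable_uri("Speedbit", "User1", "user1@speedbit.com", "Content"))
    print(box_printable_uri("Speedbit", "User2", "user2@speedbit.com", "Tasks", "Task_List.pdf"))
```

The two calls reproduce the folder and item URIs from the examples above.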

Get the printable URI of an indexed item

If you’ve already built the source and indexed items, you can get the printableuri of a given item by inspecting its properties in the Content Browser.

  1. On the Content Browser (platform-ca | platform-eu | platform-au) page, use the search box and facets to locate the desired item.

  2. Click the item card, and then click Properties in the Action bar.

  3. On the Fields tab, use the Filter box to search for the printableuri property value for the item.

Safely apply content filtering

The best way to ensure that your indexed content is seen only by the intended users is to enforce content security by selecting the Same users and groups as in your content system option. Should this option be unavailable, select Specific users and groups instead.

However, if you need to configure your source so that the indexed source content is accessible to Everyone, you should adhere to the following leading practices. These practices ensure that your source content is safely filtered and only accessible by the appropriate users:

  • Configure query filters

  • Use condition-based query pipeline routing

  • Configure the search token

Following the above leading practices results in a workflow whereby the user query is authenticated server side via a search token that enforces the search hub from which the query originates. Therefore, the query can’t be modified by users or client-side code. The query then passes through a specific query pipeline based on a search hub condition, and the query results are filtered using the filter rules.

Configure query filters

Filter rules allow you to enter hidden query expressions to be added to all queries going through a given query pipeline. They’re typically used to add a field-based expression to the constant query expression (cq).

Example

You apply the @objectType=="Solution" query filter to the pipeline to which the traffic of your public support portal is directed. As a result, the @objectType=="Solution" query expression is added to any query sent via this support portal.

Therefore, if a user types Speedbit watch wristband in the search box, the items returned are those that match these keywords and whose objectType has the Solution value. Items matching these keywords but having a different objectType value aren’t returned in the user’s search results.

To learn how to configure query pipeline filter rules, see Manage filter rules.

Note

You can also enforce a filter expression directly in the search token.
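
The effect of a filter rule on a query can be modeled in a few lines of Python. This is a conceptual sketch only, not Coveo’s implementation: `apply_filter_rule` is a hypothetical function, and the query is represented as a plain dictionary with the basic expression under `q` and the constant expression under `cq`.

```python
def apply_filter_rule(query, filter_expression, parameter="cq"):
    """Return a copy of `query` with a hidden filter expression added.

    If the target parameter already holds an expression, both are kept
    and combined so that results must match each of them.
    """
    updated = dict(query)
    existing = updated.get(parameter, "").strip()
    if existing:
        updated[parameter] = f"({existing}) AND ({filter_expression})"
    else:
        updated[parameter] = filter_expression
    return updated


if __name__ == "__main__":
    user_query = {"q": "Speedbit watch wristband"}
    print(apply_filter_rule(user_query, '@objectType=="Solution"'))
```

The user’s keywords stay untouched in `q`, while the filter travels separately in `cq`, which is why the user never sees it in the search box.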

Use condition-based query pipeline routing

Condition-based routing is the recommended and most flexible query pipeline routing mechanism.

When using this routing mechanism, you ensure that search requests are routed to a specific query pipeline according to the search interface from which they originate, and the authentication is done server side.

To accomplish this:

  1. Apply a condition to a query pipeline based on a search hub value, such as Search Hub is Community Search or Search Hub is Agent Panel. This condition ensures that all queries that originate from a specific search hub go through that query pipeline.

  2. Authenticate user queries via a search token that’s generated server side and that contains the search hub value that you specified in the query pipeline condition.

Configure the search token

When using query filters to secure content, the safest way to enforce content security is to authenticate user queries using a search token that’s generated server side. For instance, when using this approach, you can enforce a search hub value in the search token. This makes every authenticated request that originates from a component use the specified search hub, and therefore be routed to the proper query pipeline. Because this configuration is stored server side and encrypted in the search token, it can’t be modified by users or client-side code.

Implementing search token authentication requires you to add server-side logic to your website or application. Therefore, the actual implementation details will vary from one project to another.

The following procedure provides general guidelines:

Note

If you’re using the Coveo In-Product Experience (IPX) feature, see Implement advanced search token authentication.

  1. Authenticate the user.

  2. Call a service exposed through Coveo to request a search token for the authenticated user.

  3. Specify the userIDs for the search token, and enforce a searchHub parameter in the search token.

Note

You can specify other parameters in the search token, such as a query filter.

For more information and examples, see Search token authentication.
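
As a rough server-side illustration of the guidelines above, the following Python sketch prepares a search token request without sending it. Several details are assumptions to verify against the Search token authentication article: the endpoint URL, the `userIds`, `searchHub`, and `filter` property names, the Email Security Provider identity provider, and the use of an API key that’s allowed to create search tokens. The function name is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for the default region; verify it for your organization.
TOKEN_ENDPOINT = "https://platform.cloud.coveo.com/rest/search/v2/token"


def build_search_token_request(api_key, user_email, search_hub, query_filter=None):
    """Prepare (without sending) a request for a search token.

    Call this only after your own application has authenticated the user.
    """
    payload = {
        "userIds": [
            {
                "name": user_email,
                "provider": "Email Security Provider",  # assumed provider name
                "type": "User",
            }
        ],
        # Enforced server side, so client-side code can't change the hub.
        "searchHub": search_hub,
    }
    if query_filter:
        payload["filter"] = query_filter

    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_token_request(
        "<API_KEY>", "user1@speedbit.com", "Community Search"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

A server would pass the prepared request to `urllib.request.urlopen` and hand the token from the response to the search page. Keep the API key on the server, and never embed it in client-side code.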

Required privileges

You can assign privileges to allow access to specific tools in the Coveo Administration Console. The following table indicates the privileges required to view or edit elements of the Sources (platform-ca | platform-eu | platform-au) page and associated panels. See Manage privileges and Privilege reference for more information.

Note

The Edit all privilege isn’t required to create sources. When granting privileges for the Sources domain, you can grant a group or API key the View all or Custom access level, instead of Edit all, and then select the Can Create checkbox to allow users to create sources. See Can Create ability dependence for more information.

| Actions | Service | Domain | Required access level |
| --- | --- | --- | --- |
| View sources, view source update schedules, and subscribe to source notifications | Content | Fields | View |
| | Content | Sources | View |
| | Organization | Organization | View |
| Edit sources, edit source update schedules, and edit source mappings | Organization | Organization | View |
| | Content | Fields | Edit |
| | Content | Sources | Edit |
| View and map metadata | Content | Source metadata | View |
| | Content | Fields | View |
| | Organization | Organization | View |
| | Content | Sources | Edit |


1. By default, shared link permissions are ignored, meaning that an item shared only through a link is visible only to its owner in your Coveo-powered search interface. Contact Coveo Support for help on how to take shared link permissions into account.