Add or Edit a Confluence Self-Hosted Source

Members of the Administrators and Content Managers built-in groups can add the content of a Confluence instance to a Coveo Cloud organization. The source can be shared, private, or secured (see Content Security).

The Coveo Confluence Plugin, available for on-premises (server) Confluence instances, allows the Coveo Cloud platform to manage the permissions associated with each Confluence item, so that in Coveo search results users only see the Confluence content to which they normally have access in Confluence itself. The plugin is also needed to perform refreshes of the source content.

By default, a Confluence source starts a rescan every day to retrieve Confluence item changes (addition, modification, or deletion).

  • If you change the name of a space in Confluence, the rescan detects the change only for pages created or modified following the change. You must rebuild the source to get the space name change on all space pages (see Refresh, Rescan, or Rebuild Sources).

  • For clients migrating from Coveo Enterprise Search 7.0 (CES 7), this source has the same specifications as the Confluence V2 connector (see Atlassian Confluence V2 Connector).

Source Features Summary

Features                  Supported    Additional information
Confluence version        6.7 to 7.1
Searchable content types  Spaces, pages (such as Wiki pages), blog posts, comments on pages and blog posts (included as metadata), and attachments (in pages, blog posts, and comments)
Content update            Refresh      Requires the Coveo plugin to retrieve deleted, restored, and moved items, and items with modified comments or permissions.
                          Rescan
                          Rebuild
Content security options  Secured      Requires the Coveo plugin.
                          Private
                          Shared

Requirements

Supported Confluence Versions

The source supports on-premises installations of Confluence 6.7 to 7.1, using the Confluence REST API and Search REST API.

Confluence Data Center is supported.
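
As a quick illustration of the supported range, the following minimal sketch checks whether a Confluence version string falls between 6.7 and 7.1. The function names are hypothetical, and the sketch assumes that any patch release of a supported major.minor version (for example 7.1.2) counts as supported; confirm this against the Supported Versions information shown in the source configuration panel.

```python
# Hypothetical helper: not part of Coveo or Confluence.
MIN_SUPPORTED = (6, 7)  # lowest supported major.minor, from this page
MAX_SUPPORTED = (7, 1)  # highest supported major.minor, from this page


def parse_version(text):
    """Turn a version string such as "7.0.3" into a (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def is_supported(version):
    """Return True when the major.minor version is within 6.7 to 7.1.

    Assumption: the patch number is ignored.
    """
    return MIN_SUPPORTED <= parse_version(version) <= MAX_SUPPORTED


if __name__ == "__main__":
    for candidate in ["6.6.3", "6.7.0", "6.15.9", "7.1.2", "7.2.0"]:
        print(candidate, is_supported(candidate))
```

Tuples are compared element by element, so 6.15 is correctly treated as later than 6.7, which a plain string or float comparison would get wrong.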

Atlassian Confluence Server Accessible to Coveo Cloud

When access to the communication ports between Coveo Cloud and the Confluence server is restricted, the appropriate ports must be opened in the network infrastructure, such as in firewalls, to allow Coveo Cloud to access the content.

Confluence Administrator Account

When you want to include Confluence permissions, you must create a dedicated Confluence administrator account that is used only for the source. Otherwise, you must also update the source Password value each time the account password changes to prevent authentication errors.

When configuring the source, you must use the credentials of a native Confluence user. Users managed by other identity providers such as Google are not supported.

Enabling the Confluence SOAP Remote API (Web Service)

Due to a Confluence REST API limitation, the connector must use the SOAP Remote API to retrieve content permissions. When you want your Confluence source to be secured, a Confluence system administrator must enable the remote API on your Confluence instance.

Add or Edit a Confluence Self-Hosted Source

  1. If not already in the Add/Edit a Confluence Self-Hosted Source panel, go to the panel:

    • To add a source, in the main menu, under Content, select Sources > Add source button > Confluence > Confluence Self-Hosted source.

      OR

    • To edit a source, in the main menu, under Content, select Sources > source row > Edit in the Action bar.

  2. In the Configuration tab:

    1. Under Supported Versions, check if your version of Confluence is supported.

    2. Under Content Update and Security requirement:

      1. If not already done, click Download Plugin to download the Coveo plugin for Confluence, and then install the plugin (see Installing the Coveo Plugin for Atlassian Confluence).

      2. Enter the appropriate values for the following parameters:

        • Source name

          A descriptive name for your source under 255 characters (not already in use for another source in this organization).

          Example: Confluence-CorporateWiki

        • Instance URL

          The addresses, including the protocol (http:// or https://), of one or more Confluence Wiki sites and spaces that you want to make searchable.

          Depending on your use case, use one of the following URL formats:

          • To index a complete Confluence (on-premises) site, add the Confluence server root URL:

            http://MyConfluenceServer:8090/

          • To index specific on-premises spaces, add their URL:

            • http://MyConfluenceServer:8090/display/space1

            • http://MyConfluenceServer:8090/display/space2

          • To be able to include item permissions, all your URLs must be located on a single Confluence site. Create separate sources for separate sites.

          • You can enter specific space addresses for deployments where Confluence is not installed at the server root, respecting the following format:

            http://server/MyConfluence/display/spacename

        • Paired Crawling Module

          If your source is a Crawling Module source and if you have more than one Crawling Module linked to this organization, select the one with which you want to pair your source (see Deploying Multiple Crawling Modules). If you change the Crawling Module instance with which your source is paired, a successful rebuild is required for your change to apply.

        • Optical character recognition (OCR)

          Check this box if you want Coveo Cloud to extract text from image files or PDF files containing images (see Enable Optical Character Recognition). OCR-extracted text is processed as item data, meaning that it is fully searchable and will appear in the item Quick View (see Search Result Quick View).

          Since the OCR feature is available at an extra charge, you must first contact Coveo Sales to add this feature to your organization license. You can then enable it for your source.

        • Index

          When adding a source, if you have more than one logical (non-Elasticsearch) index in your organization, select the index in which the retrieved content will be stored (see Leverage Many Coveo Indexes). If your organization only has one index, this drop-down menu is not visible and you have no decision to make.

          • To add a source storing content in an index other than the default one, you need the View access level on the Logical Index domain (see Privilege Management and Logical Indexes Domain).

          • Once the source is added, you cannot switch to a different index.

        • Content security

          Select a content security option to determine who can see items from this source in a search interface.

          If you want to create a secured source to use with the Coveo On-Premises Crawling Module, contact the Coveo Support team (see Coveo On-Premises Crawling Module).
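
The Instance URL rules above (each address includes the protocol, and all addresses are on a single Confluence site when permissions are included) can be checked before you paste the values into the source. This is a minimal sketch: `check_instance_urls` is a hypothetical helper, not a Coveo API, and it assumes that a "site" can be identified by its scheme, host, and port, which may not hold when several Confluence instances share one host under different paths.

```python
from urllib.parse import urlsplit


def check_instance_urls(urls):
    """Return a list of problems found in candidate Instance URL values."""
    problems = []
    sites = set()
    for url in urls:
        parts = urlsplit(url)
        # The address must include the protocol (http:// or https://).
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"{url}: missing http:// or https:// protocol")
            continue
        # Assumption: scheme + host + port identifies one Confluence site.
        sites.add((parts.scheme, parts.netloc.lower()))
    if len(sites) > 1:
        problems.append("URLs point to more than one Confluence site")
    return problems


if __name__ == "__main__":
    candidate_urls = [
        "http://MyConfluenceServer:8090/display/space1",
        "http://MyConfluenceServer:8090/display/space2",
    ]
    print(check_instance_urls(candidate_urls))
```

An empty list means that no problem was detected by these two checks; it does not prove that Coveo Cloud can reach the server.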

  3. In the Content to Include section, consider changing the default parameter values when you want to fine-tune how your Confluence site is crawled:

    • Space type

      Select which spaces you want to index. By default, only global space content is included; personal space content is excluded. Options are:

      • Global

      • Personal

      • Both

    • Space status

      Select which spaces should be included, depending on their status. Options are:

      • Current (only non-archived spaces are retrieved)

      • Archived (only archived spaces are retrieved)

    • Space filter

      The regex to use to filter spaces when you want to include only a subset of a Confluence site.

      This parameter is useful when you have many spaces to include that have an element in common in their space keys.

      For example, to include all spaces whose keys start with an uppercase letter followed by a digit, enter the following regex:

      ^[A-Z][0-9].*$

    • Options

      Select which items you want to include:

      • Attachments (binary files attached to a page, blog post, or comment)

      • Comments (on blog posts and pages)

        Comments are included as metadata of the page, not as items.
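
To preview which spaces the example Space filter regex would keep, you can try it against a list of space keys. This is a minimal sketch that uses Python's `re` module; the space keys are invented for illustration, and it assumes that the regex engine used by the source treats this simple pattern the same way Python does.

```python
import re

# The example pattern from the Space filter parameter above.
SPACE_FILTER = re.compile(r"^[A-Z][0-9].*$")


def filter_space_keys(keys):
    """Keep only the space keys that match the Space filter pattern."""
    return [key for key in keys if SPACE_FILTER.match(key)]


if __name__ == "__main__":
    # Hypothetical space keys, for illustration only.
    sample_keys = ["A1DOCS", "B2", "HR", "a1docs", "9X"]
    print(filter_space_keys(sample_keys))  # prints ['A1DOCS', 'B2']
```

Keys such as HR (no digit in second position) and a1docs (lowercase first letter) are rejected, which is a quick way to confirm that the pattern is as strict as you intend.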

  4. In the Authentication section, when you want to include secured Confluence content or include permissions, you must fill in the following parameters:

    The source supports Okta integration. To use it, you must set the UseRequestParametersAuth parameter to true in both the source and the security identity provider JSON configurations (see Edit a Source JSON Configuration and Manage Advanced Security Identity Provider Parameters).

    • Username

      The username of a dedicated Confluence administrator account that has access to all the content that you want to include.

    • Password

      The corresponding password.

    Depending on the content to index in your source, the Confluence user must have either the Space Administrator or the Confluence Administrator permission level. The following table indicates the minimum permission level required depending on the content you want to index:

    Content type                                                   Minimum permission level
    Item permissions only                                          Confluence Administrator
    Item changes following incremental refreshes                   Space Administrator on all indexed spaces
    Item permissions and changes following incremental refreshes   Confluence Administrator

    For further information on Confluence permission levels, see Confluence Admin Permission Levels Explained.
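
The permission table above reduces to a simple rule: indexing item permissions always calls for Confluence Administrator, while retrieving item changes through incremental refreshes alone only calls for Space Administrator on all indexed spaces. The sketch below encodes that rule; `minimum_permission` and its parameter names are hypothetical and exist only to illustrate the table.

```python
def minimum_permission(index_permissions, track_refresh_changes):
    """Return the minimum permission level listed in the table above.

    index_permissions: True when item permissions must be indexed.
    track_refresh_changes: True when item changes must be retrieved
    following incremental refreshes.
    """
    if index_permissions:
        # Both rows that include item permissions require this level.
        return "Confluence Administrator"
    if track_refresh_changes:
        return "Space Administrator on all indexed spaces"
    # The table does not list a level for this combination.
    return None


if __name__ == "__main__":
    print(minimum_permission(True, False))
    print(minimum_permission(False, True))
    print(minimum_permission(True, True))
```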

  5. In the Access tab, determine whether each group and API key can view or edit the source configuration (see Understanding Resource Access):
    1. In the Access Level column, select View or Edit for each available group.
    2. On the left-hand side of the tab, if available, click Groups or API Keys to switch lists.

    If you remove the Edit access level from all the groups of which you are a member, you will not be able to edit the source again after saving. Only administrators and members of other groups that have Edit access on this resource will be able to do so. To keep your ability to edit this resource, you must grant the Edit access level to at least one of your groups.

  6. Optionally, consider editing or adding mappings (see Adding and Managing Source Mappings).

    You can only manage mapping rules once you build the source (see Refresh, Rescan, or Rebuild Sources).

  7. Finish adding or editing your source:

    • Click Add Source/Save when you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon.

      On the Sources page, you must click Start initial build or Start required rebuild in the source Status column to add the source content or make your changes effective, respectively.

      OR

    • Click Add and Build Source/Save and Rebuild Source when you are done editing the source and want to make changes effective.

      Back on the Sources page, you can review the progress of your Confluence source addition or modification (see Adding and Managing Sources).

    Once the source is built or rebuilt, you can review its content in the Content Browser (see Inspect Items With the Content Browser).

What’s Next?

Review your source update schedule and optionally change it so that it better fits your needs (see Edit a Source Schedule). By default, your content is rescanned every day.