Add a Confluence Cloud source
Confluence is a cloud-based knowledge sharing tool that enables users to create and share content. Members of a Coveo organization with the required privileges can add the source to index the content of their Confluence Cloud instance.
Leading practice
The number of items that a source processes per hour (crawling speed) depends on various factors, such as network bandwidth and source configuration. See About crawling speed for information on what can impact crawling speed, as well as possible solutions.
Source key characteristics
The following table presents the main characteristics of a Confluence Cloud source.
Features | Supported | Additional information |
---|---|---|
Confluence Cloud version | Latest cloud version | |
Indexable content | Spaces, pages (such as Wiki pages), blog posts, page and blog post comments (indexed as metadata), and attachments (in pages, blog posts, and comments). | |
Refresh | | A refresh doesn’t take into account deleted, restored, or moved items, or items with modified comments or permissions. A rescan or a rebuild is therefore recommended. |
Rescan | | Takes place every day by default. If you change the name of a space in Confluence Cloud, the rescan detects the change only for pages created or modified following the change. You must therefore rebuild the source to update the space name on all pages of the space. |
Content security options | | Requires installing the Coveo User Sync app in your instance. See About the Coveo User Sync App for details. |
Note
Q&A in Confluence is an external plugin that must be installed on the instance, and isn’t indexed by the Confluence connector. To index Q&A, you must use a REST API cloud source.
Add a Confluence Cloud source
A Confluence Cloud source indexes cloud content. If you want to retrieve on-premises (server) content, see Add a Confluence Data Center source instead.
Leading practice
It’s best to create or edit your source in your sandbox organization first. Once you’ve confirmed that it indexes the desired content, you can copy your source configuration to your production organization, either with a snapshot or manually. See About non-production organizations for more information and best practices regarding sandbox organizations.
- On the Sources (platform-ca | platform-eu | platform-au) page, click Add source.
- In the Add a source of content panel, click the Confluence Cloud source tile.
- In the Add a new Confluence Cloud source panel, provide the following information:
  - Name: The source name can’t be modified once it’s saved. Therefore, make sure to use a short and descriptive name, using letters, numbers, hyphens, and underscores. Avoid spaces and other special characters.
  - Confluence address: The root URL of your Confluence site. It often ends with /wiki/.
  - Authentication: How Coveo should log in to your Confluence site to index your content.
    If you select User delegated access using OAuth 2.0
    - Click Authorize account, and then sign in to Confluence with an account that has the necessary permissions to access all the content that you want to index.
    - Click Accept to grant Coveo access to this Confluence account.
    - Click Add source.
    The Coveo OAuth 2.0 application requires read scopes only.
    If you select Atlassian account
    - Create an Atlassian account dedicated to the source. This account must have access to all the content that you want to index. See Source credentials leading practices for other leading practices to follow.
    - With this account, create an API token. This token should also be dedicated to your source.
    - Provide Coveo with the email address and API token corresponding to the source’s Atlassian account.
    - Install the Coveo User Sync app in your Confluence site to synchronize users and groups with Coveo.
    - Click Add source.
    If you select No login
    - Click Next.
    - Select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content security.
    - Click Add source.
  - Project: Use the Project selector to associate your source with one or more Coveo projects.
    Note: After source creation, you can update your Coveo project selection under the Identification subtab.
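The naming guidance for the Name field can be checked before you create a source. The following Python sketch is illustrative only: the pattern and the `is_recommended_source_name` helper are assumptions derived from that guidance (letters, numbers, hyphens, and underscores), not Coveo's actual validation rules.

```python
import re

# Assumed pattern: letters, numbers, hyphens, and underscores only.
# This mirrors the naming guidance, not Coveo's own validation logic.
RECOMMENDED_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_recommended_source_name(name: str) -> bool:
    """Return True if the whole name uses only the recommended characters."""
    return RECOMMENDED_NAME.fullmatch(name) is not None


print(is_recommended_source_name("confluence-cloud_docs"))  # True
print(is_recommended_source_name("Confluence Cloud docs"))  # False (contains spaces)
```

Since the name can't be changed after it's saved, a quick check like this may help catch spaces or special characters early.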
"Configuration" tab
When configuring or editing your Confluence Cloud source, the Configuration tab is selected by default. It contains your source’s general and authentication information, as well as other parameters that let you specify the content to index.
"Content to index" subtab
The Content to index subtab lets you define the content that you want to make available as search results.
Spaces
Specify whether you want to index global or personal spaces, or both. Then, specify whether you want to index archived pages.
If you want to index specific spaces only, enter a regex representing the desired space keys. When crawling your Confluence instance, Coveo will include or exclude the spaces whose space key matches your regex.
For example, to index all spaces with keys starting with an uppercase letter followed by a number, you would enter the following regex:
^[A-Z][0-9].*$
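To preview which space keys such a regex would select, you can test it locally first. This is a minimal sketch using Python's `re` module; the space keys are hypothetical, and Coveo's regex engine may differ from Python's in edge cases.

```python
import re

# Pattern taken from the example above.
SPACE_KEY_PATTERN = re.compile(r"^[A-Z][0-9].*$")

# Hypothetical space keys, for illustration only.
space_keys = ["A1DOCS", "B2", "ENG", "a1docs", "9LIVES"]

# Keep only the keys that start with an uppercase letter followed by a digit.
matching = [key for key in space_keys if SPACE_KEY_PATTERN.match(key)]
print(matching)  # ['A1DOCS', 'B2']
```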
Additional content
Optionally, you can index the files attached to the indexed pages, blog posts, and comments.
You can also index comments posted on pages and blog posts. These comments will be indexed as metadata of this content.
"Advanced settings" subtab
The Advanced settings subtab lets you customize the Coveo crawler behavior. All advanced settings have default values that are adequate in most use cases.
Content and images
If you want Coveo to extract text from image files or PDF files containing images, enable the appropriate option.
The extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick view. See Enable optical character recognition for details on this feature.
"Authentication" subtab
The Authentication subtab contains settings used by the source crawler to emulate the behavior of a user authenticating to access restricted Confluence Cloud content. You provided authentication information when you created the source.
Confluence address: The root URL of your Confluence site. It often ends with /wiki/.
Authentication: How Coveo should log in to your Confluence Cloud site to index your content.
If you select User delegated access using OAuth 2.0
- Click Authorize account, and then sign in to Confluence Cloud with an account that has the necessary permissions to access all the content that you want to index.
- Click Accept to grant Coveo access to this Confluence Cloud account.
The Coveo OAuth 2.0 application requires read scopes only.
If you select Atlassian account
- Create an Atlassian account dedicated to the source. This account must have access to all the content that you want to index. See Source credentials leading practices for other leading practices to follow.
- With this account, create an API token. This token should also be dedicated to your source.
- Provide Coveo with the email address and API token corresponding to the source’s Atlassian account.
- Install the Coveo User Sync app in your Confluence Cloud site to synchronize users and groups with Coveo.
If you select No login
Go to the Content security tab, and then select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content security.
"Identification" subtab
The Identification subtab contains general information about the source.
Name
The source name. It can’t be modified once it’s saved.
Project
Use the Project selector to associate your source with one or more Coveo projects.
"Content security" tab
Select who will be able to access the source items through a Coveo-powered search interface. For details on this parameter, see Content security.
Note
The Same users and groups as in your content system option requires you to install the Coveo User Sync app in your instance. Install the version of the app that corresponds to the region of your Coveo organization, or install the HIPAA version for Coveo HIPAA environments. See About the Coveo User Sync App for details.
"Access" tab
In the Access tab, specify whether each group (and API key, if applicable) in your Coveo organization can view or edit the current source.
For example, when creating a new source, you could decide that members of Group A can edit its configuration, while Group B can only view it.
For more information, see Custom access level.
Completion
- Finish adding or editing your source:
  - When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click Add source/Save.
  - When you’re done editing the source and want to make changes effective, click Add and build source/Save and rebuild source.
  Note: On the Sources (platform-ca | platform-eu | platform-au) page, you must click Launch build or Start required rebuild in the source Status column to add the source content or to make your changes effective, respectively.
  Back on the Sources (platform-ca | platform-eu | platform-au) page, you can follow the progress of your source addition or modification.
  Once the source is built or rebuilt, you can review its content in the Content Browser.
- Once your source is done building or rebuilding, review the metadata Coveo is retrieving from your content.
  - On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > View and map metadata in the Action bar.
  - If you want to use metadata that isn’t currently indexed in a facet or result template, map it to a field.
    - Click the metadata and then, at the top right, click Add to Index.
    - In the Apply a mapping on all item types of a source panel, select the field you want to map the metadata to, or add a new field if none of the existing fields are appropriate.
      Notes:
      - For details on configuring a new field, see Add or edit a field.
      - For advanced mapping configurations, like applying a mapping to a specific item type, see Manage mappings.
    - Click Apply mapping.
  - Depending on the source type you use, you may be able to extract additional metadata from your content. You can then map that metadata to a field, just like you did for the default metadata.
    More on custom metadata extraction and indexing
    Some source types let you define rules to extract metadata beyond the default metadata Coveo discovers during the initial source build.
    For example:
    Source type | Custom metadata extraction methods |
    ---|---|
     | Define metadata key-value pairs in the addOrUpdate section of the PUT request payload used to upload push operations to an Amazon S3 file container. |
    REST API and GraphQL API | In the JSON configuration (REST API, GraphQL API) of the source, define metadata names (REST API, GraphQL API) and specify where to locate the metadata values in the JSON API response Coveo receives. |
     | Add <CustomField> elements in the XML configuration. Each element defines a metadata name and the database field used to populate it. |
     | Configure web scraping configurations that contain metadata extraction rules using CSS or XPath selectors. Extract metadata from JSON-LD <script> tags. |
     | Configure web scraping configurations that contain metadata extraction rules using CSS or XPath selectors. Extract JSON-LD <script> tag metadata. Extract <meta> tag content using the IndexHtmlMetadata JSON parameter. |
    Some source types automatically map metadata to default or user-created fields, making the mapping process unnecessary. Some source types automatically create mappings and fields for you when you configure metadata extraction.
    See your source type documentation for more details.
  - When you’re done reviewing and mapping metadata, return to the Sources (platform-ca | platform-eu | platform-au) page.
  - To reindex your source with your new mappings, click Launch rebuild in the source Status column.
  - Once the source is rebuilt, you can review its content in the Content Browser.
- To ensure that new items are indexed with the next refresh operation, edit the JSON configuration of your source so that the source uses the same time zone as your Confluence Cloud instance:
  - On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > Edit configuration with JSON in the Action bar.
  - In the parameters object, add the following object:
    "LocalServerTimeOffsetForIncrementalRefresh": { "sensitive": false, "value": "<TIME_ZONE_OFFSET>" }
  - Replace <TIME_ZONE_OFFSET> with the time offset from UTC required to match the time zone of your Confluence Cloud instance. For example, if your Confluence instance uses UTC-04:00 time, enter -04:00.
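The offset value described in the last step can be derived programmatically. The sketch below, in Python, formats an offset from UTC and builds the object to add to the parameters object. The helper names are hypothetical, and the `+` prefix for positive offsets is an assumption, since only a negative example (-04:00) is documented here.

```python
import json
from datetime import timedelta


def format_utc_offset(offset: timedelta) -> str:
    """Format an offset from UTC as a signed HH:MM string, such as -04:00."""
    total_minutes = int(offset.total_seconds() // 60)
    # Assumption: positive offsets take a "+" prefix.
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"


def build_offset_parameter(offset_text: str) -> dict:
    """Build the object to add to the parameters object of the source JSON."""
    return {
        "LocalServerTimeOffsetForIncrementalRefresh": {
            "sensitive": False,
            "value": offset_text,
        }
    }


# Example: a Confluence Cloud instance that uses UTC-04:00 time.
offset_text = format_utc_offset(timedelta(hours=-4))
parameter = build_offset_parameter(offset_text)
print(json.dumps(parameter, indent=2))
```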
Source update best practice
You can get the CONFLUENCE_UNREACHABLE_SERVER error message when your Confluence Cloud source rebuilds or is scheduled to perform a rescan during the daily Atlassian Cloud maintenance window (1 AM to 3 AM, in your server’s time zone).
During this period, Atlassian may block access to the API while performing maintenance tasks.
If possible, schedule your source’s rescans so that they’re completed outside of the daily maintenance window. If that isn’t possible, ignore the errors. The next scheduled rescan outside the maintenance window should complete normally.
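When choosing a rescan schedule, it can help to check whether a planned run overlaps the maintenance window. The following Python sketch is illustrative: the `overlaps_maintenance_window` helper is hypothetical, and it assumes that the start and estimated end times fall on the same day and are expressed in the time zone of your Confluence Cloud instance.

```python
from datetime import time

# Daily Atlassian Cloud maintenance window, in the server's time zone.
MAINTENANCE_START = time(1, 0)
MAINTENANCE_END = time(3, 0)


def overlaps_maintenance_window(start: time, end: time) -> bool:
    """Return True if a run from start to end overlaps the 1 AM to 3 AM window.

    Assumes start <= end on the same day (no run spanning midnight).
    """
    return start < MAINTENANCE_END and end > MAINTENANCE_START


print(overlaps_maintenance_window(time(0, 30), time(1, 30)))  # True
print(overlaps_maintenance_window(time(3, 0), time(4, 0)))    # False
```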
Indexing page properties
By default, Coveo doesn’t index page or blog post properties (metadata.properties).
To index them, you must edit your source’s JSON configuration to specify the desired page properties.
In the Configuration tab of the Edit configuration with JSON panel, add "MetadataPropertiesToExpand": "<VALUES>", where <VALUES> are the properties you want to index, separated by commas.
Example: "MetadataPropertiesToExpand": "owner,status"
To refer to a property nested within another, concatenate their names with a dot (.) separator.
Example: "MetadataPropertiesToExpand": "owner.lastname,status"
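The value format described above (dots for nesting, commas between properties) can be assembled from a list of property paths. This is a small Python sketch; the `build_properties_to_expand` helper is hypothetical, and the property names are the ones used in the examples above.

```python
import json


def build_properties_to_expand(property_paths):
    """Join nested property names with dots, then join properties with commas."""
    return ",".join(".".join(path) for path in property_paths)


# Each inner list is one property path, from the outermost name to the innermost.
value = build_properties_to_expand([["owner", "lastname"], ["status"]])
print(json.dumps({"MetadataPropertiesToExpand": value}))
# {"MetadataPropertiesToExpand": "owner.lastname,status"}
```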
About the Coveo User Sync app
Installing Coveo’s User Sync app in your Atlassian instance is required to replicate your instance’s content access permissions in your search interface. This lets users see in their Coveo search results the content that their role allows them to see in your Atlassian instance.
To replicate your instance’s permission system, Coveo must associate user email addresses with user roles. Atlassian’s API doesn’t provide this information, but provides the roles assigned to each user account ID. So, Coveo built the User Sync app to retrieve the email address corresponding to each user account ID. It can then combine this information with the roles and account IDs provided by Atlassian’s API.
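Conceptually, the combination described above is a join on the account ID. The following Python sketch only illustrates that idea. The data, the role names, and the `roles_by_email` helper are all hypothetical, and this isn't Coveo's actual implementation.

```python
# Hypothetical data: roles per account ID, as Atlassian's API would provide.
roles_by_account_id = {
    "acc-1": ["space-admin"],
    "acc-2": ["viewer"],
    "acc-3": ["viewer"],
}

# Hypothetical data: email per account ID, as the User Sync app would provide.
emails_by_account_id = {
    "acc-1": "alice@example.com",
    "acc-2": "bob@example.com",
}


def roles_by_email(roles_by_id, emails_by_id):
    """Associate each known email address with the roles of its account ID."""
    result = {}
    for account_id, roles in roles_by_id.items():
        email = emails_by_id.get(account_id)
        if email is None:
            # No email is known for this account ID, so it can't be matched.
            continue
        result[email] = list(roles)
    return result


print(roles_by_email(roles_by_account_id, emails_by_account_id))
```

In this sketch, an account ID without a known email address is skipped, which shows why the email lookup is needed to associate roles with users.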
Should you ever switch your source’s content security setting from Everyone to Same users and groups as in your content system, you’ll need to refresh the security identity provider after installing the app.
For more information on sources that index permissions and on how Coveo handles these permissions, see Coveo management of security identities and item permissions.
Coveo doesn’t support the global permission allowing Jira Service Management (JSM) users to use Confluence (Settings > Global permissions > JSM access). As a result, JSM users who have access to your Confluence content through this global permission can’t access this content in their Coveo search results.
Why isn’t the app Cloud Fortified?
Atlassian’s Cloud Fortified Apps Program designates apps that meet the highest standards for security, reliability, and support, making them suitable for enterprise customers with critical business needs.
The Coveo User Sync app isn’t part of the Cloud Fortified Apps Program, but it meets all the requirements except for one: participation in Atlassian’s Marketplace Security Bug Bounty Program. Coveo isn’t currently planning to join this program because it already has its own bug bounty program in place.
For more information about vulnerability management at Coveo, including penetration testing, see Vulnerability management. You may also want to read on other security-related topics.
OAuth 2.0 scopes
The Coveo OAuth 2.0 application requires the following scopes:
- read:content:confluence
- read:content-details:confluence
- read:attachment:confluence
- read:group:confluence
- read:user:confluence
- read:configuration:confluence
- read:space:confluence
- read:permission:confluence
- read:content.permission:confluence
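If you keep a record of the scopes granted to an authorization, you can compare it against this list. The Python sketch below is an illustration: the `missing_scopes` helper and the way the granted scopes are obtained are assumptions, and only the scope names come from the list above.

```python
# Scopes required by the Coveo OAuth 2.0 application, as listed above.
REQUIRED_SCOPES = [
    "read:content:confluence",
    "read:content-details:confluence",
    "read:attachment:confluence",
    "read:group:confluence",
    "read:user:confluence",
    "read:configuration:confluence",
    "read:space:confluence",
    "read:permission:confluence",
    "read:content.permission:confluence",
]


def missing_scopes(granted):
    """Return the required scopes that are absent from the granted scopes."""
    granted_set = set(granted)
    return [scope for scope in REQUIRED_SCOPES if scope not in granted_set]


# Every required scope is a read scope.
all_read_only = all(scope.startswith("read:") for scope in REQUIRED_SCOPES)
print(all_read_only)  # True

# Hypothetical grant that lacks the last required scope.
print(missing_scopes(REQUIRED_SCOPES[:-1]))  # ['read:content.permission:confluence']
```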
Required privileges
You can assign privileges to allow access to specific tools in the Coveo Administration Console. The following table indicates the privileges required to view or edit elements of the Sources (platform-ca | platform-eu | platform-au) page and associated panels. See Manage privileges and Privilege reference for more information.
Note
The Edit all privilege isn’t required to create sources. When granting privileges for the Sources domain, you can grant a group or API key the View all or Custom access level, instead of Edit all, and then select the Can Create checkbox to allow users to create sources. See Can Create ability dependence for more information.
Actions | Service | Domain | Required access level |
---|---|---|---|
View sources, view source update schedules, and subscribe to source notifications | Content | Fields | View |
 | Content | Sources | View |
 | Organization | Organization | View |
Edit sources, edit source update schedules, and edit source mappings | Organization | Organization | View |
 | Content | Fields | Edit |
 | Content | Sources | Edit |
View and map metadata | Content | Source metadata | View |
 | Content | Fields | View |
 | Organization | Organization | View |
 | Content | Sources | Edit |