---
title: Add an Amazon S3 source
slug: '1616'
canonical_url: https://docs.coveo.com/en/1616/
collection: index-content
source_format: adoc
---
# Add an Amazon S3 source
Amazon Simple Storage Service (S3) is a cloud-based data storage service designed to store, manage, and distribute large quantities of data worldwide.
Members with the [required privileges](https://docs.coveo.com/en/3151/) can add the content of Amazon S3 buckets to a [Coveo organization](https://docs.coveo.com/en/185/).
Coveo [indexes](https://docs.coveo.com/en/204/) Amazon S3 files to make them searchable.
> **Leading practice**
>
> The number of [items](https://docs.coveo.com/en/210/) that a source processes per hour (crawling speed) depends on various factors, such as network bandwidth and source configuration.
> See [About crawling speed](https://docs.coveo.com/en/2078/) for information on what can impact crawling speed, as well as possible solutions.
## Source key characteristics
The following table presents the main characteristics of an Amazon S3 source.
[%header,cols="2,3,2,3"]
|===
2+|Features
^|Supported
|Additional information
2+|Amazon S3 version
^|Latest cloud version
a|The source supports storage providers that implement S3 APIs compatible with the latest Amazon S3 version.
2+|Indexable content (An access key is needed to connect to the Amazon Web Services (AWS) service through the software development kit (SDK).
The access key is a way to authenticate from the SDK as an Identity and Access Management (IAM) account.
The number of requests is unlimited, but you're charged for every request to your [Amazon S3 buckets](https://aws.amazon.com/s3/pricing/#Request_Pricing).)
|Buckets (Amazon S3 Requester Pays buckets aren't supported.) and objects (folders and files)
|
.3+|[Content update operations](https://docs.coveo.com/en/2039/)
|[refresh](https://docs.coveo.com/en/2710/)
^|[x]
|
|[rescan](https://docs.coveo.com/en/2711/)
^|[check]
|[Takes place every day by default](https://docs.coveo.com/en/1933/)
|[rebuild](https://docs.coveo.com/en/2712/)
^|[check]
|
.3+|[Content security](https://docs.coveo.com/en/1779/) options
|[Same users and groups as in your content system](https://docs.coveo.com/en/1779#same-users-and-groups-as-in-your-content-system)
^|[x]
|
|[Specific users and groups](https://docs.coveo.com/en/1779#specific-users-and-groups)
^|[check]
|
|[Everyone](https://docs.coveo.com/en/1779#everyone)
^|[check]
|
.4+|[Metadata indexing for search](#index-metadata)
|Automatic mapping of [metadata](https://docs.coveo.com/en/218/) to [fields](https://docs.coveo.com/en/200/) that have the same name
2+a|This setting is disabled by default and [not recommended for this source type](https://docs.coveo.com/en/1640#about-the-performfieldmappingusingallorigins-setting).
|Automatically indexed [metadata](https://docs.coveo.com/en/218/)
2+a|Examples of [auto-populated default fields](https://docs.coveo.com/en/1833#field-origin) (no user-defined metadata required):
* `clickableuri`
* `filename`
* `filetype`
* `language` (auto-detected from item content)
* `s3modifieddate`
* `title`
After a content update, [inspect your item field values](https://docs.coveo.com/en/2053#inspect-search-results) in the **Content Browser**.
|Extracted but not indexed metadata
2+a|The Amazon S3 source extracts object metadata that the S3 API makes available.
After a rebuild, review the [**View and map metadata**](https://docs.coveo.com/en/m9ti0339#view-and-map-metadata-subpage) subpage for the list of indexed metadata, and [index additional metadata](https://docs.coveo.com/en/m9ti0339#index-metadata).
|Custom metadata extraction
2+a|AWS lets you set user-defined metadata on objects.
The Amazon S3 source automatically extracts user-defined metadata whose names are prefixed with `x-amz-meta-` during content update operations.
|===
## Add an Amazon S3 source
Follow the instructions below to add an Amazon S3 source.
. On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click **Add source**.
. In the **Add a source of content** panel, click the **Amazon S3** source tile.
. Configure your source.
> **Leading practice**
>
> It's best to create or edit your source in your sandbox organization first.
> Once you've confirmed that it indexes the desired content, you can copy your source configuration to your production organization, either [with a snapshot](https://docs.coveo.com/en/3239/) or manually.
>
> See [About non-production organizations](https://docs.coveo.com/en/2959/) for more information and best practices regarding sandbox organizations.
### "Configuration" tab
In the **Add an Amazon S3 source** panel, the **Configuration** tab is selected by default.
It contains your source's general and authentication information, as well as other parameters.
#### General information
##### Name
Enter a name for your source.
> **Leading practice**
>
> A source name can't be modified once it's saved, so choose a short, descriptive name made up of letters, numbers, hyphens (`-`), and underscores (`_`). Avoid spaces and other special characters.
##### Amazon S3 bucket URL
Enter the address of one or more Amazon S3 buckets using one of the following formats:
**Virtual-host style (recommended)**
**Examples**
* `http://<BUCKET_NAME>.s3.amazonaws.com/`
* `http://<BUCKET_NAME>.s3.<REGION>.amazonaws.com/`
Replace `<BUCKET_NAME>` with the name of your bucket, and `<REGION>` with your region code.
**Path style**
**Examples**
* `http://s3.amazonaws.com/<BUCKET_NAME>`
* `http://s3.<REGION>.amazonaws.com/<BUCKET_NAME>`
Replace `<BUCKET_NAME>` with the name of your bucket, and `<REGION>` with your region code.
The source also supports content hosted by non-Amazon S3 providers, like [Wasabi](https://wasabi.com/).
You must then set the `ServiceUrl` JSON parameter value accordingly.
**Example**
To index content under `https://s3.wasabisys.com/mybucket`, you would configure your source as follows:
. Enter `https://s3.wasabisys.com/mybucket` in the **Amazon S3 bucket URL** box, and then click **Save**.
. On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click the source you just created, and then click **More** > **Edit configuration with JSON**.
. Click anywhere in the **JSON configuration** box.
. Hit `Ctrl+F` (Windows) or `Command+F` (Mac).
. In the **Search** field that appears, type `ServiceUrl`, and then hit `Enter`.
. Set the `value` of the `ServiceUrl` parameter to the Wasabi endpoint URL.
```json
"ServiceUrl": {
  "sensitive": false,
  "value": "https://s3.wasabisys.com"
}
```
. Click **Save**.
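The same edit can be rehearsed against a copy of the source JSON configuration before you paste it back into the **JSON configuration** box. The following is a minimal Python sketch, not a Coveo API: `set_service_url` is a hypothetical helper, and the sample configuration (including the `parameters` key and the source name) is invented for illustration. Because the exact location of `ServiceUrl` isn't assumed, the helper searches the whole document, much like the `Ctrl+F` step above.

```python
import json


def set_service_url(config, endpoint):
    """Set the `value` of every `ServiceUrl` entry found in a parsed
    source JSON configuration. Returns the number of entries updated."""
    updated = 0
    if isinstance(config, dict):
        for key, child in config.items():
            if key == "ServiceUrl" and isinstance(child, dict):
                child["value"] = endpoint
                updated += 1
            else:
                updated += set_service_url(child, endpoint)
    elif isinstance(config, list):
        for child in config:
            updated += set_service_url(child, endpoint)
    return updated


# Hypothetical, heavily trimmed configuration used only for this example.
sample_config = json.loads("""
{
  "name": "my-wasabi-source",
  "parameters": {
    "ServiceUrl": {"sensitive": false, "value": ""}
  }
}
""")

changed = set_service_url(sample_config, "https://s3.wasabisys.com")
print(changed)
print(json.dumps(sample_config["parameters"]["ServiceUrl"], indent=2))
```

A return value of `0` would indicate that no `ServiceUrl` entry was found, which is worth checking before saving.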
> **Notes**
>
> * To exclude certain subfolders, first configure and save your source with a broad URL.
> Then, see [Refine the Content to Index](#refine-the-content-to-index).
>
> * If a region isn't specified in the URL, the source uses the US Standard (`us-east-1`) region endpoint by default.
>
> * When the URL points to a folder inside a bucket, only keys starting with that prefix will be crawled.
>
> * Replace all spaces in the bucket name with `%20`, if any.
> For example, replace `http://s3.<REGION>.amazonaws.com/doc example bucket` with `http://s3.<REGION>.amazonaws.com/doc%20example%20bucket`.
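
The two URL formats and the space-encoding note can be sketched as small Python helpers. This is only an illustration: `virtual_host_url` and `path_style_url` are hypothetical names, and the bucket, region, and folder values in the usage lines are made up.

```python
from urllib.parse import quote


def _host(region=None):
    """Return the S3 endpoint host, with the region code when one is given."""
    return f"s3.{region}.amazonaws.com" if region else "s3.amazonaws.com"


def virtual_host_url(bucket, region=None):
    """Build a virtual-host style bucket URL (bucket name in the host)."""
    return f"http://{bucket}.{_host(region)}/"


def path_style_url(bucket, region=None, prefix=""):
    """Build a path-style bucket URL (bucket name in the path), optionally
    scoped to a folder prefix. Spaces are percent-encoded as %20."""
    path = quote(f"{bucket}/{prefix}" if prefix else bucket, safe="/")
    return f"http://{_host(region)}/{path}"


print(virtual_host_url("mybucket"))
print(virtual_host_url("mybucket", "eu-west-1"))
print(path_style_url("doc example bucket", "us-west-2"))
print(path_style_url("mybucket", prefix="reports/2024/"))
```

The last call produces a URL that points to a folder, so only keys starting with `reports/2024/` would be crawled.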
##### Project
Use the **Project** selector to associate your source with one or more Coveo [projects](https://docs.coveo.com/en/n7ef0517/).
#### "Authentication" section
Select the authentication type that applies.
The options are:
**No login - Content to index is available to all**
Select this option if your bucket content is public, meaning anonymous users can access the content.
Ensure you've granted the **Everyone (public access): Objects - List** access on your bucket.
See the **Using the S3 console to set ACL permissions for buckets** section in AWS's [Configuring ACLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/managing-acls.html) page for more details.
If your bucket permissions aren't properly set, you'll encounter an authentication error similar to the one below when attempting a [content update operation](https://docs.coveo.com/en/2039/):
```text
Source credentials do not have sufficient privileges to access the specified Amazon S3 bucket and consequently, Coveo cannot perform any action regarding your source.
Edit the configuration to review the provided AWS Access Key ID and AWS Secret Access Key ID.
```
> **Tip**
>
> Before building the source, open your bucket URL (without a path) in a browser, and validate that it returns an XML file listing the bucket content (keys).
> If you get a short `Access denied` XML error instead, the source will return an authentication error.
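
The browser check can also be scripted. The sketch below, in Python, only classifies an XML body you've already retrieved; it makes no network call. `summarize_bucket_response` is a hypothetical helper, and both sample documents are trimmed, hand-written approximations of what S3 returns for a public listing and for an access-denied error.

```python
import xml.etree.ElementTree as ET


def _local_name(tag):
    """Drop the XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_bucket_response(xml_text):
    """Return ("listing", [keys]) for a bucket listing,
    ("error", code) for an error document, or ("unknown", root_tag)."""
    root = ET.fromstring(xml_text)
    kind = _local_name(root.tag)
    if kind == "ListBucketResult":
        keys = [el.text for el in root.iter() if _local_name(el.tag) == "Key"]
        return "listing", keys
    if kind == "Error":
        code = next(
            (el.text for el in root.iter() if _local_name(el.tag) == "Code"),
            None,
        )
        return "error", code
    return "unknown", kind


# Hand-written sample bodies, for illustration only.
listing = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>mybucket</Name>
  <Contents><Key>docs/guide.pdf</Key></Contents>
  <Contents><Key>docs/faq.html</Key></Contents>
</ListBucketResult>"""

denied = """<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
</Error>"""

print(summarize_bucket_response(listing))
print(summarize_bucket_response(denied))
```

A `"listing"` result suggests anonymous list access is in place; an `"error"` result with `AccessDenied` corresponds to the case where the source would fail to authenticate.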
**Amazon S3 access key**
Select this option if your S3 bucket content is secured, meaning not accessible to anonymous users.
Then, enter the **Access key ID** and **Secret access key** values provided by your AWS Identity and Access Management (IAM) account, as detailed in [Access key and Secret key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
The IAM account must have at least the `read` permission on the bucket content to index.
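As an illustration of what a read-only grant might look like, the Python sketch below builds an IAM policy document for a single bucket. This page doesn't list the exact S3 actions the source needs, so treating "read" as `s3:ListBucket` on the bucket plus `s3:GetObject` on its objects is an assumption, and `read_only_bucket_policy` and the `mybucket` name are hypothetical. Validate the resulting policy against your own AWS setup before relying on it.

```python
import json


def read_only_bucket_policy(bucket):
    """Build a policy document granting list access on the bucket and
    read access on its objects (assumed meaning of "read" here)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Downloading content is an object-level action.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = read_only_bucket_policy("mybucket")
print(json.dumps(policy, indent=2))
```

Splitting the statement in two reflects that the bucket ARN and the object ARN (`/*`) are different resources in IAM.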
### "Items" tab
On the **Items** tab, you can specify how the source handles items based on their file type or content type.
#### File types
File types let you define how the source handles [items](https://docs.coveo.com/en/210/) based on their file extension or content type.
For each file type, you can specify whether to index the item content and [metadata](https://docs.coveo.com/en/218/), only the item metadata, or neither.
You should fine-tune the file type configurations with the objective of indexing only the content that's relevant to your users.
**Example**
Your repository contains `.pdf` files, but you don't want them to appear in search results.
You click **Extensions** and then, for the `.pdf` extension, you change the **Default action** and **Action on error** values to `Ignore item`.
For more details about this feature, see [File type handling](https://docs.coveo.com/en/l3qg9275/).
#### Content and images
If you want Coveo to extract text from image files or PDF files containing images, enable the appropriate option.
The extracted text is processed as item data, meaning that it's fully searchable and will appear in the item [Quick view](https://docs.coveo.com/en/2760#search-result-quick-view).
> **Note**
>
> When OCR is enabled, ensure the source's relevant [file type configurations](https://docs.coveo.com/en/l3qg9275/) index the item content.
> Indexing the item's metadata only or ignoring the item will prevent OCR from being applied.
See [Enable optical character recognition](https://docs.coveo.com/en/2937/) for details on this feature.
### "Content security" tab
Select who will be able to access the source items through a Coveo-powered [search interface](https://docs.coveo.com/en/2741/).
For details on the content security options, see [Content security](https://docs.coveo.com/en/1779/).
### "Access" tab
On the **Access** tab, specify whether each group (and API key, if applicable) in your [Coveo organization](https://docs.coveo.com/en/185/) can view or edit the current source.
For example, when creating a new source, you could decide that members of Group A can edit its configuration, while Group B can only view it.
For more information, see [Custom access level](https://docs.coveo.com/en/3151#custom-access-level).
### Build the source
. Finish adding or editing your source:
** When you're done editing the source and want to make your changes effective, click **Add and build source**/**Save and rebuild source**.
** When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click **Add source**/**Save**.
On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click **Launch build** or **Start required rebuild** when you're ready to make your changes effective and index your content.
> **Leading practice**
>
> By default, an Amazon S3 source indexes all the content found under the specified bucket URL.
> To index only certain subfolders, click **Save**, and then specify the desired address patterns in your [source JSON configuration](https://docs.coveo.com/en/1685/) before launching the initial build.
> See [Add source filters](https://docs.coveo.com/en/2006#add-source-filters) for further information.
. On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, follow the progress of your source addition or modification.
. Once the source is built or rebuilt, [review its content in the Content Browser](https://docs.coveo.com/en/2053/).
. Optionally, consider [editing or adding mappings](https://docs.coveo.com/en/1640/).
### Index metadata
To use [metadata](https://docs.coveo.com/en/218/) values in [search interface](https://docs.coveo.com/en/2741/) [facets](https://docs.coveo.com/en/198/) or result templates, the metadata must be [mapped](https://docs.coveo.com/en/217/) to [fields](https://docs.coveo.com/en/200/).
Coveo automatically [maps](https://docs.coveo.com/en/217/) only a subset of the metadata it extracts.
You must map any additional metadata to fields manually.
> **Note**
>
> Not clear on the purpose of indexing metadata?
> Watch [this video](https://www.youtube.com/watch?v=BmmmVJ3AWi0).
. On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click your source, and then click **More** > **View and map metadata** in the Action bar.
. Review the default [metadata](https://docs.coveo.com/en/218/) that your source is extracting from your content.
. Map any currently _not indexed_ metadata that you want to use in facets or result templates to fields.
> **Important**
>
> Amazon S3 is [no longer returning the `DisplayName` metadata](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Owner.html) of the `Owner` object in API calls.
> The Coveo Amazon S3 source previously retrieved this S3 metadata as the `ObjectOwnerDisplayName` [Coveo Platform](https://docs.coveo.com/en/186/) metadata.
> If you're indexing this [Coveo Platform](https://docs.coveo.com/en/186/) metadata and using the related [field](https://docs.coveo.com/en/200/) in your search interfaces, update your implementation accordingly.
.. Click the metadata and then, at the top right, click **Add to Index**.
.. In the **Apply a mapping on all item types of a source** panel, select the field you want to map the metadata to, or [add a new field](https://docs.coveo.com/en/1833#add-a-field) if none of the existing fields are appropriate.
> **Note**
>
> For advanced mapping configurations, like applying a mapping to a specific item type, see [Manage mappings](https://docs.coveo.com/en/1640#manage-mappings).
.. Click **Apply mapping**.
. Return to the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page.
. To reindex your source with your new mappings, click your source, and then click **More** > **Rebuild** in the Action bar.
. Once the source is rebuilt, review your item field values.
They should now include the values of the metadata you selected to index.
.. On the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page, click your source, and then click **More** > **Open in Content Browser** in the Action bar.
.. Select the card of the item for which you want to inspect properties, and then click **Properties** in the Action bar.
.. In the panel that appears, select the **Fields** tab.
. If needed, extract and map additional metadata.
**More on custom metadata extraction**
AWS lets you [set user-defined metadata](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html#upload-objects-procedure) when you upload objects to your buckets, whether using the S3 console, AWS CLI, REST API, or SDKs.
You can also [edit object metadata](https://docs.aws.amazon.com/AmazonS3/latest/userguide/add-object-metadata.html) by copying objects to the same destination.
The Amazon S3 source automatically extracts user-defined metadata with the same `x-amz-meta-` prefixed name as in S3.
After uploading objects or editing metadata, rebuild the source, and then map each newly extracted custom metadata to a field, as you did for the default metadata.
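To preview which metadata names to expect on the **View and map metadata** subpage, you can filter an object's HTTP response headers by the `x-amz-meta-` prefix. The Python sketch below is illustrative only: `user_defined_metadata` is a hypothetical helper, and the sample headers are invented.

```python
PREFIX = "x-amz-meta-"


def user_defined_metadata(headers):
    """Return the user-defined metadata found in S3 response headers,
    keyed by the full prefixed name. Header names are compared
    case-insensitively and returned in lowercase."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(PREFIX)
    }


# Invented response headers for a single object.
sample_headers = {
    "Content-Type": "application/pdf",
    "ETag": '"abc123"',
    "x-amz-meta-department": "legal",
    "X-Amz-Meta-Review-Date": "2024-05-01",
}

print(user_defined_metadata(sample_headers))
```

System headers such as `Content-Type` and `ETag` are left out, since only names carrying the prefix are user-defined.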
## Refine the content to index
You may want to avoid indexing certain subfolders, or to index only a few of them.
To do so:
. If not already done, create and save your source with a broad [bucket URL](#amazon-s3-bucket-url).
. In your [source JSON configuration](https://docs.coveo.com/en/1685/), enter an [address filter](https://docs.coveo.com/en/2006#add-source-filters) to refine the targeted content.
> **Important**
>
> Your [bucket URL](#amazon-s3-bucket-url) must match one of your inclusion [`addressPatterns`](https://docs.coveo.com/en/2006#addresspatterns-array-required) and not match any of your exclusion `addressPatterns`.
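
The rule above can be checked before saving a filter. The following Python sketch approximates wildcard address patterns with the standard library's `fnmatchcase`, which may not match Coveo's own wildcard semantics exactly. The `expression`, `patternType`, and `allowed` keys are assumed to mirror the `addressPatterns` entries described in the linked filter documentation, and `is_indexed`, the bucket name, and the folder names are hypothetical.

```python
from fnmatch import fnmatchcase


def is_indexed(url, address_patterns):
    """Return True when the URL matches at least one inclusion pattern
    and no exclusion pattern."""
    included = any(
        fnmatchcase(url, p["expression"])
        for p in address_patterns
        if p["allowed"]
    )
    excluded = any(
        fnmatchcase(url, p["expression"])
        for p in address_patterns
        if not p["allowed"]
    )
    return included and not excluded


# Hypothetical filter: index the bucket, but skip its archive folder.
patterns = [
    {
        "expression": "http://s3.amazonaws.com/mybucket/*",
        "patternType": "Wildcard",
        "allowed": True,
    },
    {
        "expression": "http://s3.amazonaws.com/mybucket/archive/*",
        "patternType": "Wildcard",
        "allowed": False,
    },
]

bucket_url = "http://s3.amazonaws.com/mybucket/"
print(is_indexed(bucket_url, patterns))
print(is_indexed("http://s3.amazonaws.com/mybucket/archive/2019.zip", patterns))
```

Running the check on the bucket URL itself is the important case: if it returns `False`, the crawl has no valid starting point under this filter.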
## Required privileges
You can assign privileges to allow access to specific tools in the [Coveo Administration Console](https://docs.coveo.com/en/183/).
The following table indicates the privileges required to view or edit elements of the [**Sources**](https://platform.cloud.coveo.com/admin/#/orgid/content/sources/) ([platform-ca](https://platform-ca.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-eu](https://platform-eu.cloud.coveo.com/admin/#/orgid/content/sources/) | [platform-au](https://platform-au.cloud.coveo.com/admin/#/orgid/content/sources/)) page and associated panels.
See [Manage privileges](https://docs.coveo.com/en/3151/) and [Privilege reference](https://docs.coveo.com/en/1707/) for more information.
> **Note**
>
> The **Edit all** privilege isn't required to create sources.
> When granting privileges for the [Sources](https://docs.coveo.com/en/1707#sources-domain) domain, you can grant a group or API key the **View all** or [**Custom**](https://docs.coveo.com/en/3151#custom-access-level) access level, instead of **Edit all**, and then select the **Can Create** checkbox to allow users to create sources.
> See [Can Create ability dependence](https://docs.coveo.com/en/3151#can-create-ability-dependence) for more information.
## What's next?
* [Schedule source updates](https://docs.coveo.com/en/1933/).