Add an RSS source

A Really Simple Syndication (RSS) feed is a file that describes the content a site has published, which lets users keep track of updates to that site. An RSS source allows members with the required privileges to add the content of an RSS feed to a Coveo organization.

Which old and new RSS feed items are available depends on how the RSS feed is configured in its XML file.

Notes
  • If the XML feed file contains OpenSearch information, your RSS source uses it to include all available items in the feed as far in the past as possible.

  • If the XML feed file doesn’t contain OpenSearch information, the RSS source doesn’t retrieve old feed items, only new ones that are published.

  • The indexing process of RSS sources keeps items in your source even when they’re filtered out of the RSS feed as new items are published. However, the filtered-out items are deleted from the source when you rebuild or rescan it.
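To check ahead of time whether a feed file carries OpenSearch information, a minimal sketch such as the following may help. It assumes that "OpenSearch information" means elements in the OpenSearch 1.1 namespace; the exact elements the RSS source reads aren't specified here, and `has_opensearch_info` and `sample_feed` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# OpenSearch 1.1 namespace. Assumption: this is what "OpenSearch information" refers to.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def has_opensearch_info(feed_xml: str) -> bool:
    """Return True if any element in the feed belongs to the OpenSearch namespace."""
    root = ET.fromstring(feed_xml)
    prefix = "{" + OPENSEARCH_NS + "}"
    return any(element.tag.startswith(prefix) for element in root.iter())


# Hypothetical feed content, for illustration only.
sample_feed = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example feed</title>
    <opensearch:totalResults>250</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>100</opensearch:itemsPerPage>
    <item><title>First article</title></item>
  </channel>
</rss>"""

print(has_opensearch_info(sample_feed))
```

A feed for which this check returns False would fall under the second note: only newly published items are retrieved.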

Example

A technology website’s RSS feed is configured to provide only the last 100 articles, so earlier articles aren’t accessible. When you create, rebuild, or rescan your RSS source, you get the last 100 articles available on the site at that time. The next day, five articles are published, and your source ends up containing 105 articles.

Six months later, the source may contain a thousand articles, provided you haven’t rebuilt or rescanned it. If you do rebuild or rescan the source, it will contain only the last 100 articles.
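The arithmetic in this example can be reproduced with a small simulation. This is only a conceptual model of the behavior described above, not Coveo code: the function names are hypothetical, articles are represented by integer IDs, and a refresh is modeled as a set union while a rescan or rebuild is modeled as a reset to the current feed window.

```python
FEED_WINDOW = 100  # the feed exposes only its latest 100 articles (from the example)


def feed_window(published: list[int]) -> set[int]:
    """Articles currently exposed by the feed: the most recent FEED_WINDOW."""
    return set(published[-FEED_WINDOW:])


def refresh(source: set[int], published: list[int]) -> set[int]:
    """Model of a refresh: add what the feed exposes, keep items already indexed."""
    return source | feed_window(published)


def rescan_or_rebuild(published: list[int]) -> set[int]:
    """Model of a rescan or rebuild: keep only what the feed exposes right now."""
    return feed_window(published)


published = list(range(1, 501))               # hypothetical article IDs 1..500
after_build = rescan_or_rebuild(published)    # initial build
published += [501, 502, 503, 504, 505]        # five articles published the next day
after_refresh = refresh(after_build, published)
after_rescan = rescan_or_rebuild(published)

print(len(after_build), len(after_refresh), len(after_rescan))  # 100 105 100
```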

Tip
Leading practice

The number of items that a source processes per hour (crawling speed) depends on various factors, such as network bandwidth and source configuration. See About crawling speed for information on what can impact crawling speed, as well as possible solutions.

Source key characteristics

The following table presents the main characteristics of an RSS source.

| Features | Supported | Additional information |
| RSS feed formats | RSS 1.0, RSS 2.0, and Atom 1.0 | |
| Indexable content | RSS feeds (or channels) and RSS items. | |
| Optical Character Recognition (OCR) | Yes | Available for an additional charge. Contact Coveo Sales to add this feature to your Coveo organization license. |
| Content update operations: refresh | Yes | Takes place every hour by default. A rescan or rebuild is required to account for deleted items. The last update time of each RSS feed item must be available.[1] |
| Content update operations: rescan | Yes | |
| Content update operations: rebuild | Yes | |
| Content security options: Same users and groups as in your content system | No | |
| Content security options: Specific users and groups | Yes | |
| Content security options: Everyone | Yes | |
| Metadata indexing for search: Automatic mapping of metadata to fields that have the same name | | This setting is disabled by default and not recommended for this source type. |
| Metadata indexing for search: Automatically indexed metadata | | Examples of auto-populated default fields (no user-defined metadata required): clickableuri, language (auto-detected from item content), rsspublishdate, rsscategories, title. After a content update, inspect your item field values in the Content Browser. |
| Metadata indexing for search: Extracted but not indexed metadata | | The RSS source extracts metadata from RSS feed elements such as author, description, category, and publication date. After a rebuild, review the View and map metadata subpage for the list of indexed metadata, and index additional metadata. |

Limitations

  • The source requires a last update time to support refreshes. To that end, the source uses one of the following item properties:

    • Atom 1.0: <updated>

    • RSS 2.0: <a10:updated>

    Without a last update time, the source uses an arbitrary date far in the past, making a rescan or rebuild necessary to retrieve changes to items.
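To see whether the items of a feed expose a last update time, a sketch along these lines can be run against the feed XML. It assumes that the a10 prefix is bound to the Atom 1.0 namespace (http://www.w3.org/2005/Atom), which is the usual convention but isn't stated on this page; `last_update_time` and `rss_feed` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: <updated> (Atom 1.0) and <a10:updated> (RSS 2.0) both live in this namespace.
ATOM_NS = "http://www.w3.org/2005/Atom"


def last_update_time(item: ET.Element) -> Optional[str]:
    """Return the text of the item's updated element, or None when it's missing."""
    updated = item.find("{" + ATOM_NS + "}updated")
    if updated is not None and updated.text:
        return updated.text.strip()
    return None


# Hypothetical RSS 2.0 feed with one dated item and one undated item.
rss_feed = """<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <item><title>With date</title><a10:updated>2024-01-15T10:00:00Z</a10:updated></item>
    <item><title>Without date</title></item>
  </channel>
</rss>"""

results = {
    item.findtext("title"): last_update_time(item)
    for item in ET.fromstring(rss_feed).iter("item")
}
print(results)
```

Items that map to None here are the ones for which a refresh can't detect changes, so a rescan or rebuild would be needed.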

Add an RSS source

Follow the instructions below to add an RSS source.

  1. On the Sources (platform-ca | platform-eu | platform-au) page, click Add source.

  2. In the Add a source of content panel, click the RSS source tile.

  3. Configure your source.

Tip
Leading practice

It’s best to create or edit your source in your sandbox organization first. Once you’ve confirmed that it indexes the desired content, you can copy your source configuration to your production organization, either with a snapshot or manually.

See About non-production organizations for more information and best practices regarding sandbox organizations.

"Configuration" tab

On the Add an RSS Source page, the Configuration tab is selected by default. It contains your source’s general and content information, as well as other parameters.

General information

Name

Enter a name for your source.

Tip
Leading practice

A source name can’t be modified once it’s saved, so use a short, descriptive name made up of letters, numbers, hyphens (-), and underscores (_). Avoid spaces and other special characters.

Example

Corporate-RSS-Feeds
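If source names are generated by a script, the naming recommendation can be checked before a source is created. The sketch below is an illustration, not a Coveo rule set: it assumes "letters" means ASCII letters, and `is_recommended_source_name` is a hypothetical helper.

```python
import re

# Letters, digits, hyphens, and underscores only (assumption: ASCII letters).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_recommended_source_name(name: str) -> bool:
    """Return True if the name uses only the recommended characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_recommended_source_name("Corporate-RSS-Feeds"))   # True
print(is_recommended_source_name("Corporate RSS Feeds"))   # False (contains spaces)
```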

Feed URL

Enter the web addresses of the RSS feeds that you want to index, in the http:// or https:// format.

Example

http://rss.cnn.com/rss/cnn_tech.rss
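A quick way to confirm that an address is in the expected format before entering it is sketched below. The check only looks at the scheme and host; it doesn't verify that the address actually serves an RSS feed, and `is_valid_feed_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def is_valid_feed_url(url: str) -> bool:
    """Return True if the URL uses http or https and includes a host."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_valid_feed_url("http://rss.cnn.com/rss/cnn_tech.rss"))  # True
print(is_valid_feed_url("rss.cnn.com/rss/cnn_tech.rss"))         # False (no scheme)
```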

Project

Use the Project selector to associate your source with one or more Coveo projects.

"Authentication" section

If basic authentication is required, enter the Username and Password of the RSS feed website account that has access to the RSS feed content you want to include. See Source credentials leading practices.
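Before saving credentials in the source, it can be useful to confirm that they work against the feed with basic authentication. The following sketch only builds the request using the Python standard library; the URL and credentials are placeholders, and `build_feed_request` is a hypothetical helper.

```python
import base64
import urllib.request


def build_feed_request(feed_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP basic authentication header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(feed_url, headers={"Authorization": f"Basic {token}"})


# Placeholder values, for illustration only.
request = build_feed_request("https://example.com/feed.xml", "user", "pass")
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would then send it; an HTTP 401 response would suggest that the account can't access the feed.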

"Items" tab

On the Items tab, you can enable or disable optical character recognition (OCR) on your content.

Content and images

If you want Coveo to extract text from image files or PDF files containing images, enable the appropriate option. The extracted text is processed as item data, meaning that it’s fully searchable and will appear in the item Quick view.

Note

When OCR is enabled, ensure the source’s relevant file type configurations index the item content. Indexing the item’s metadata only or ignoring the item will prevent OCR from being applied.

See Enable optical character recognition for details on this feature.

"Content security" tab

Select who will be able to access the source items through a Coveo-powered search interface. For details on the content security options, see Content security.

"Access" tab

On the Access tab, specify whether each group (and API key, if applicable) in your Coveo organization can view or edit the current source.

For example, when creating a new source, you could decide that members of Group A can edit its configuration, while Group B can only view it.

For more information, see Custom access level.

Build the source

  1. Finish adding or editing your source:

    • When you’re done editing the source and want to make your changes effective, click Add and build source/Save and rebuild source.

    • When you want to save your source configuration changes without starting a build/rebuild, such as when you know you want to make other changes soon, click Add source/Save. On the Sources (platform-ca | platform-eu | platform-au) page, click Launch build or Start required rebuild when you’re ready to make your changes effective and index your content.

  2. On the Sources (platform-ca | platform-eu | platform-au) page, follow the progress of your source addition or modification.

  3. Once the source is built or rebuilt, review its content in the Content Browser.

Index metadata

To use metadata values in search interface facets or result templates, the metadata must be mapped to fields. Coveo automatically maps only a subset of the metadata it extracts. You must map any additional metadata to fields manually.
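As a conceptual illustration only, the relationship between extracted metadata, mappings, and indexed fields can be pictured as follows. This isn't how Coveo implements mappings; the metadata values, the `rssauthor` field name, and the `apply_mappings` function are all hypothetical.

```python
def apply_mappings(metadata: dict[str, str], mappings: dict[str, str]) -> dict[str, str]:
    """Keep only metadata that has a mapping, keyed by the target field name."""
    return {
        field: metadata[name]
        for name, field in mappings.items()
        if name in metadata
    }


# Hypothetical metadata extracted from one RSS item.
extracted = {"title": "Release notes", "author": "Jane Doe", "description": "What changed"}

# Before manual mapping: only title reaches a field.
print(apply_mappings(extracted, {"title": "title"}))

# After mapping author to a hypothetical rssauthor field, it becomes available too.
print(apply_mappings(extracted, {"title": "title", "author": "rssauthor"}))
```

Metadata without a mapping, such as description in this sketch, stays extracted but can't be used in facets or result templates.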

Note

Not clear on the purpose of indexing metadata? Watch this video.

  1. On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > View and map metadata in the Action bar.

  2. Review the default metadata that your source is extracting from your content.

  3. Map to fields any metadata that isn’t currently indexed and that you want to use in facets or result templates.

    1. Click the metadata and then, at the top right, click Add to Index.

    2. In the Apply a mapping on all item types of a source panel, select the field you want to map the metadata to, or add a new field if none of the existing fields are appropriate.

      Note

      For advanced mapping configurations, like applying a mapping to a specific item type, see Manage mappings.

    3. Click Apply mapping.

  4. Return to the Sources (platform-ca | platform-eu | platform-au) page.

  5. To reindex your source with your new mappings, click your source, and then click More > Rebuild in the Action bar.

  6. Once the source is rebuilt, review your item field values. They should now include the values of the metadata you selected to index.

    1. On the Sources (platform-ca | platform-eu | platform-au) page, click your source, and then click More > Open in Content Browser in the Action bar.

    2. Select the card of the item for which you want to inspect properties, and then click Properties in the Action bar.

    3. In the panel that appears, select the Fields tab.

Required privileges

You can assign privileges to allow access to specific tools in the Coveo Administration Console. The following table indicates the privileges required to view or edit elements of the Sources (platform-ca | platform-eu | platform-au) page and associated panels. See Manage privileges and Privilege reference for more information.

Note

The Edit all privilege isn’t required to create sources. When granting privileges for the Sources domain, you can grant a group or API key the View all or Custom access level, instead of Edit all, and then select the Can Create checkbox to allow users to create sources. See Can Create ability dependence for more information.

| Actions | Service | Domain | Required access level |
| View sources, view source update schedules, and subscribe to source notifications | Content | Fields | View |
| | Content | Sources | View |
| | Organization | Organization | View |
| Edit sources, edit source update schedules, and edit source mappings | Organization | Organization | View |
| | Content | Fields | Edit |
| | Content | Sources | Edit |
| View and map metadata | Content | Source metadata | View |
| | Content | Fields | View |
| | Organization | Organization | View |
| | Content | Sources | Edit |

What’s next?