Index Contentful content
This article explains how to create and configure a generic Coveo GraphQL API source to use the Contentful GraphQL Content API endpoint, and optimize your indexing strategy. This approach ensures that you collect and effectively use your content for enhanced search experiences.
The examples provided show how to configure the source to index Article items, where each article becomes a document in the Coveo index, available for search and recommendations.
Create a GraphQL API source
Follow the instructions below to add a GraphQL API source that uses the Cloud content retrieval method.
- On the Sources (platform-ca | platform-eu | platform-au) page, click Add source.
- In the Add a source of content panel, select the Cloud tab.
- Click the GraphQL API tile.
Configure your source
When creating a GraphQL API source in the Coveo Administration Console, you must provide a JSON configuration that specifies which content to crawl and how to retrieve each type of item.
"Authentication" section
To establish a proper connection to the GraphQL Content API, enter your API key and other necessary authentication details in the Authentication section of the GraphQL source configuration panel.
For more information about the API Key, refer to Contentful API Key.
"Content to include" section
In the JSON configuration box, add the configuration that defines the call to perform against the Contentful GraphQL Content API endpoint and how Coveo will handle the response. See JSON configuration reference for details.
The following example shows how to configure the source to index Article items, where each article becomes a document in the Coveo index, available for search and recommendations.
The configuration has a single property, Services, whose value must be an array of objects.
Example
{
  "Services": [
    {
      "Url": "https://graphql.contentful.com",
      "Headers": {
        "Authorization": "Bearer @ApiKey"
      },
      "Endpoints": [
        {
          "paging": {
            "offsetType": "item",
            "pageSize": 25,
            "TotalCountKey": "data.barcaHelpCollection.total"
          },
          "Path": "content/v1/spaces/<CONTENTFUL_SPACE_ID>",
          "Method": "GET",
          "ItemPath": "data.barcaHelpCollection.items",
          "SkippableErrorCodes": "404",
          "ItemType": "Article",
          "Body": "%[body]",
          "Uri": "https://help.barca.group/article/%[cat_slug]",
          "ClickableUri": "https://help.barca.group/article/%[cat_slug]",
          "Title": "%[title]",
          "ModifiedDate": "%[sys.publishedAt]",
          "Metadata": {
            "articletags": "%[tags[*]]",
            "cat_slug": "%[slug]",
            "ec_category": "%[ecCategory[*]]",
            "ec_description": "%[ecShortdesc]",
            "ec_shortdesc": "%[ecShortdesc]",
            "eng_blogimage": "%[engBlogimage.url]",
            "objecttype": "Article",
            "thumbnailurl": "%[engBlogimage.url]?w=240"
          },
          "PayloadJsonContent": "@query"
        }
      ]
    }
  ]
}
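Before pasting a configuration into the JSON configuration box, a quick local check can catch structural mistakes. The following is a minimal Python sketch, not part of any Coveo tooling: the check_config helper is hypothetical, and the endpoint keys it looks for are an assumption based only on the example above, not a list of officially required parameters.

```python
import json

def check_config(raw: str) -> list:
    """Return a list of structural problems found in a configuration string."""
    problems = []
    config = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    services = config.get("Services")
    if not isinstance(services, list) or not all(isinstance(s, dict) for s in services):
        return ["'Services' must be an array of objects"]
    for i, service in enumerate(services):
        if "Url" not in service:
            problems.append(f"Services[{i}] has no 'Url'")
        for j, endpoint in enumerate(service.get("Endpoints", [])):
            # Keys taken from the example endpoint (assumed worth checking).
            for key in ("Path", "ItemPath", "ItemType", "Uri"):
                if key not in endpoint:
                    problems.append(f"Services[{i}].Endpoints[{j}] has no '{key}'")
    return problems

# A deliberately incomplete configuration: ItemPath and Uri are missing.
sample = json.dumps({
    "Services": [{
        "Url": "https://graphql.contentful.com",
        "Endpoints": [{"Path": "content/v1/spaces/<CONTENTFUL_SPACE_ID>", "ItemType": "Article"}],
    }]
})
for problem in check_config(sample):
    print(problem)
```

An empty list means the structure matches what the sketch expects; it says nothing about whether the source will build successfully.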
"GraphQL queries" section
Configure the desired GraphQL query to define the items and fields. Each item in the query will be a document in the Coveo index that will be part of search or recommendations. Each Coveo field will either be populated with the body of the document or a returned piece of metadata that you have mapped.
- Click Add query.
- Name your query (for example, @query). The PayloadJsonContent parameter in your JSON configuration references the query by this name.
- In the GraphQL query field, enter your query.
Use the GraphiQL App to define your GraphQL query.
Consider the following sample query, as you would write it in the GraphiQL App:
query Query {
  barcaHelpCollection(skip: @offset, limit: @pageSize) {
    total
    items {
      body {
        json
      }
      ecCategory
      ecShortdesc
      objecttype
      slug
      tags
      title
      engBlogimage {
        url
      }
      sys {
        publishedAt
      }
    }
  }
}
- Each document indexed by Coveo must have a unique URL. This is why the value of the Uri parameter (that is, https://help.barca.group/article/%[cat_slug]) in the sample JSON configuration is populated by concatenating:
  - a static value (that is, https://help.barca.group/article/), and
  - the extracted cat_slug dynamic metadata value (that is, %[cat_slug]).
- If the URL returned in the Query field doesn’t return the desired properties, consider adding an indexing pipeline extension (IPE) or using the Sitemap source instead.
- Include _modified or changed fields in your query to ensure that your scheduled content updates pick up changes in a timely manner and that your indexed content remains fresh.
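The unique-URL requirement in the notes above can be checked against your own content before indexing. The sketch below builds URIs the same way the sample Uri parameter does, by appending each slug to the static prefix, and reports any URI that more than one item would share. It's a hypothetical Python helper, not something the source runs, and the sample slugs are made up.

```python
from collections import Counter

BASE = "https://help.barca.group/article/"  # static part of the sample Uri

def build_uris(slugs):
    """Concatenate the static prefix and each dynamic slug value."""
    return [BASE + slug for slug in slugs]

def duplicate_uris(slugs):
    """Return the URIs that more than one item would share, sorted."""
    counts = Counter(build_uris(slugs))
    return sorted(uri for uri, n in counts.items() if n > 1)

slugs = ["getting-started", "billing-faq", "getting-started"]  # made-up slugs
print(duplicate_uris(slugs))
```

If the list isn't empty, the slug alone doesn't identify an item, and the Uri needs another dynamic value to stay unique.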
JSON configuration reference
"Services" object
"Services": [
  {
    "Url": "https://graphql.contentful.com",
    "Headers": {
      "Authorization": "Bearer @ApiKey"
    },
    "Endpoints": [
      {
        "paging": {
          "offsetType": "item",
          "pageSize": 25,
          "TotalCountKey": "data.barcaHelpCollection.total"
        },
        "Path": "content/v1/spaces/<CONTENTFUL_SPACE_ID>",
        "Method": "GET",
        "ItemPath": "data.barcaHelpCollection.items",
        "SkippableErrorCodes": "404",
        "ItemType": "Article",
        "Body": "%[body]",
        "Uri": "https://help.barca.group/article/%[cat_slug]",
        "ClickableUri": "https://help.barca.group/article/%[cat_slug]",
        "Title": "%[title]",
        "ModifiedDate": "%[sys.publishedAt]",
        "Metadata": {
          "articletags": "%[tags[*]]",
          "cat_slug": "%[slug]",
          "ec_category": "%[ecCategory[*]]",
          "ec_description": "%[ecShortdesc]",
          "ec_shortdesc": "%[ecShortdesc]",
          "eng_blogimage": "%[engBlogimage.url]",
          "objecttype": "Article",
          "thumbnailurl": "%[engBlogimage.url]?w=240"
        },
        "PayloadJsonContent": "@query"
      }
    ]
  }
]
"Url" object
The Url value is the Contentful GraphQL query URL.
The GraphQL API source configuration Url will be as follows:
"Url": "https://graphql.contentful.com"
"Authentication" object
To establish a proper connection to the GraphQL Content API, enter your API key and other necessary authentication details in the Authentication section of the GraphQL source configuration panel. For more information about the API Key, refer to Contentful API Key.
We recommend using the @ApiKey placeholder to retrieve the value specified in the GraphQL API source configuration panel.
"Headers": {
"Authorization": "Bearer @ApiKey"
}
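To test an API key outside of Coveo, you can reproduce the kind of request this configuration describes. The Python sketch below only builds the request, without sending it, so you can inspect the final URL and header. Several details are assumptions: that the Url and Path values are joined with a single slash, that the key replaces @ApiKey verbatim, and the use of POST with a JSON body (the Contentful endpoint accepts it, while the sample configuration uses GET). The space ID and key shown are placeholders.

```python
import json
import urllib.request

def build_request(base_url, path, api_key, query):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # same shape as "Bearer @ApiKey"
            "Content-Type": "application/json",
        },
    )

request = build_request(
    "https://graphql.contentful.com",
    "content/v1/spaces/my-space-id",  # hypothetical space ID
    "my-api-key",                     # hypothetical API key
    "query { barcaHelpCollection(limit: 1) { total } }",
)
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen would send it; a successful JSON response with a data object suggests the key and space ID are valid.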
"Endpoints" object
"Paging" object
Pagination divides a large dataset into smaller, more manageable chunks called pages. With the GraphQL API source, Coveo retrieves data from the API sequentially, where each request fetches one page of data. This approach helps optimize performance and reduces the amount of data transferred in each request.
When building the query, include the total field to return the total number of items. The server-side GraphQL implementation calculates this value as the count of items matching the query’s filters.
In the source configuration, include pageSize to specify the number of items per page, offsetType to tell Coveo how your API paginates its content, and TotalCountKey to specify the path, within the response data object, to the total expected number of items.
Then, in your GraphQL query, include the @pageSize and @offset tokens. Coveo replaces @pageSize with the value of the pageSize parameter in your paging configuration. Similarly, Coveo replaces @offset with the value extracted from the response, depending on the paging method selected in offsetType.
"paging": {
  "offsetType": "item",
  "pageSize": 25,
  "TotalCountKey": "data.barcaHelpCollection.total"
}
query Query {
  barcaHelpCollection (skip:@offset, limit:@pageSize) {
    ...
  }
}
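The token replacement described above can be simulated locally. The following Python sketch is an illustration under stated assumptions, not the crawler's actual implementation: it assumes that with an "item" offset type, @offset advances by the number of items requested so far, and it uses a made-up fake_fetch function in place of the real API.

```python
import re

QUERY = "query Query { barcaHelpCollection (skip:@offset, limit:@pageSize) { total items { slug } } }"

def fill_tokens(query, offset, page_size):
    """Replace the @offset and @pageSize tokens with concrete numbers."""
    return query.replace("@offset", str(offset)).replace("@pageSize", str(page_size))

def crawl(fetch, page_size=25):
    """Request pages until the offset reaches the total reported by the API."""
    items, offset = [], 0
    while True:
        response = fetch(fill_tokens(QUERY, offset, page_size))
        collection = response["data"]["barcaHelpCollection"]  # parent of TotalCountKey
        items.extend(collection["items"])
        offset += page_size  # assumed "item" behavior: skip what was already requested
        if offset >= collection["total"] or not collection["items"]:
            return items

# Stand-in for the API: 60 fake articles served according to skip/limit.
FAKE = [{"slug": f"article-{i}"} for i in range(60)]

def fake_fetch(query):
    skip, limit = map(int, re.search(r"skip:(\d+), limit:(\d+)", query).groups())
    page = FAKE[skip:skip + limit]
    return {"data": {"barcaHelpCollection": {"total": len(FAKE), "items": page}}}

print(len(crawl(fake_fetch)))
```

With a page size of 25 and 60 items, the loop issues three requests (offsets 0, 25, and 50) before the offset passes the total.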
"IndexingAction" object
This section determines whether to ignore or retrieve an item based on a condition.
When the condition is true, the Coveo crawler applies the specified action to the item.
Possible actions are Retrieve and Ignore, meaning the crawler can either index the item or ignore it.
An item ignored with IndexingAction isn’t indexed and therefore isn’t visible in the Content Browser (platform-ca | platform-eu | platform-au) or a search interface.
In the provided example, only items where my_locale equals "en" are indexed.
"IndexingAction": {
  "ActionOnItem": "Ignore",
  "Condition": "NOT(%[my_locale] == \"en\")"
}
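The effect of this rule can be previewed on sample data. The Python sketch below mirrors the sample condition only; it doesn't parse Coveo's condition syntax, and treating an item with no my_locale value as "not en" is an assumption. The items are made up.

```python
def action_for(item):
    """Mirror the sample rule: Ignore when NOT(my_locale == "en"), else Retrieve."""
    return "Ignore" if not (item.get("my_locale") == "en") else "Retrieve"

items = [
    {"slug": "welcome", "my_locale": "en"},
    {"slug": "bienvenue", "my_locale": "fr"},
    {"slug": "no-locale"},  # missing value treated as "not en" (assumption)
]

indexed = [item["slug"] for item in items if action_for(item) == "Retrieve"]
print(indexed)
```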
"Path" object
Set the Path as follows, replacing <CONTENTFUL_SPACE_ID> with your Contentful space ID:
"Path": "content/v1/spaces/<CONTENTFUL_SPACE_ID>"
"ItemType" object
Whether you’re using the Contentful GraphQL Content API alone or alongside an existing Contentful CMS, specify the type to assign to the items passed from Contentful into the Coveo index.
In the provided example, the query is configured to fetch articles, and the configuration looks like this:
"ItemType": "Article"
"Body" object
The body is the full text you want to index. All the text is free-text searchable and available in the quickview of the index. In the provided example, we assign the HTML body:
"Body": "%[body]"
"Metadata" object
This object lets you associate specific properties of each item in the JSON response with specific fields in Coveo.
Each key represents the metadata name of the item, while its value is the value path (simple path or JSONPath) in the JSON response. Typically, the value path consists of one or more dynamic values, since a static, hardcoded value would result in identical metadata for all items. However, you could use a hardcoded value so that the corresponding Coveo field is filled even if the API doesn’t provide this information.
In the provided example, we take all tags, the slug, all categories, the description, and the image URL.
"Metadata": {
  "articletags": "%[tags[*]]",
  "cat_slug": "%[slug]",
  "ec_category": "%[ecCategory[*]]",
  "ec_description": "%[ecShortdesc]",
  "ec_shortdesc": "%[ecShortdesc]",
  "eng_blogimage": "%[engBlogimage.url]",
  "objecttype": "Article",
  "thumbnailurl": "%[engBlogimage.url]?w=240"
}
When you build your source, Coveo retrieves the desired metadata. You can then review a summary of this metadata in the Administration Console and use it to create mapping rules for your source.
Don’t forget to create a field for each metadata name and to define its options (that is, Facet, Multi-value facet, and Sortable).
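To make the difference between dynamic and static metadata values concrete, the following Python sketch resolves a few of the sample Metadata entries against one made-up item. It's a simplified illustration, not Coveo's resolver: it only handles dotted paths and a trailing [*], and returning a list for a lone [*] placeholder is an assumption about how multi-value metadata might be represented.

```python
import re

# Matches %[path] or %[path[*]], where path is a dotted sequence of names.
PLACEHOLDER = re.compile(r"%\[([\w.]+(?:\[\*\])?)\]")

def lookup(item, path):
    """Follow a dotted path; a trailing [*] returns every element of a list."""
    wildcard = path.endswith("[*]")
    value = item
    for key in (path[:-3] if wildcard else path).split("."):
        value = value[key]
    return list(value) if wildcard else value

def resolve(template, item):
    """Resolve a value mixing static text and %[path] placeholders."""
    whole = PLACEHOLDER.fullmatch(template)
    if whole:  # a lone placeholder keeps its type, so lists stay lists
        return lookup(item, whole.group(1))
    return PLACEHOLDER.sub(lambda m: str(lookup(item, m.group(1))), template)

item = {  # made-up item shaped like one entry of the query response
    "slug": "getting-started",
    "tags": ["setup", "basics"],
    "engBlogimage": {"url": "https://images.example.com/a.png"},
}
metadata = {
    "articletags": "%[tags[*]]",
    "cat_slug": "%[slug]",
    "objecttype": "Article",  # static value, identical for every item
    "thumbnailurl": "%[engBlogimage.url]?w=240",
}
resolved = {name: resolve(value, item) for name, value in metadata.items()}
for name, value in resolved.items():
    print(name, "=", value)
```

The objecttype entry shows the hardcoded case: it contains no placeholder, so every item receives the same value.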
"PayloadJsonContent" object
The PayloadJsonContent parameter is a placeholder for your query name that will link to the GraphQL query defined in the GraphQL queries section.
"PayloadJsonContent": "@query"