Indexing optimization with concurrency and batch size

This is for:

Developer

Full indexing may be a time-consuming process, especially for large catalog data. To optimize the full indexing process, SAP Commerce Cloud allows for concurrent indexing of multiple items.

This article provides guidance on how to configure the concurrency and batch size settings to optimize the indexing process in SAP Commerce Cloud 2211.37 and later.

Recommendations

SAP Commerce Cloud exposes a configuration object called SnIndexerConfiguration with the following properties:

Property Description

concurrency

Defines the maximum number of threads that can be used for indexing.

The default value is 5.

batchSize

Defines the number of items that can be processed in a single batch.

The default and minimum value is 100.

While the default values are suitable for most use cases, you can adjust them if you want to optimize the indexing process. Consider the following recommendations:

  • If the catalog objects have many attributes, a value provider must be called for each of them. SAP Commerce Cloud then spends more time processing each batch, so it’s recommended to set a lower batchSize value.

  • If the catalog objects have few attributes, increasing the batchSize value can help to speed up the indexing process.

  • If the catalog objects have few attributes and the catalog data is large, you can increase both concurrency and batchSize values to speed up the indexing process.

    Note

    Such a configuration might require more processing power from the server.

    Important

    High concurrency and low batch size values can potentially hit the Stream API limits. If that happens, the indexing process will be interrupted and the retry mechanism will try to reprocess the failed items.

During indexing optimization, monitor your SAP Commerce Cloud server’s performance, track failed indexing, and adjust the SnIndexerConfiguration values accordingly.
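To get a feel for how the two settings interact, the arithmetic behind the recommendations above can be sketched as follows. This is a rough illustration only: it assumes that each indexing thread works on one batch at a time, which this article doesn't state, and the function name, the "rounds" measure, and the catalog size of 250,000 items are hypothetical. Only the default concurrency of 5 and the default and minimum batchSize of 100 come from the property descriptions.

```python
import math
from typing import Tuple

DEFAULT_CONCURRENCY = 5  # default value of the concurrency property
MIN_BATCH_SIZE = 100     # default and minimum value of the batchSize property


def estimate_batches(item_count: int,
                     batch_size: int = MIN_BATCH_SIZE,
                     concurrency: int = DEFAULT_CONCURRENCY) -> Tuple[int, int]:
    """Return (total batches, rounds of parallel batches).

    Assumption (not from the product documentation): every thread
    processes exactly one batch at a time, so a "round" is one batch
    per thread running in parallel.
    """
    if batch_size < MIN_BATCH_SIZE:
        raise ValueError(f"batchSize must be at least {MIN_BATCH_SIZE}")
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    batches = math.ceil(item_count / batch_size)
    rounds = math.ceil(batches / concurrency)
    return batches, rounds


# Hypothetical catalog of 250,000 items.
print(estimate_batches(250_000))           # default settings
print(estimate_batches(250_000, 500, 10))  # larger batches, more threads
```

With the default settings, the hypothetical catalog is split into 2,500 batches. With a batchSize of 500, it's split into 500 batches, so far fewer batches are sent overall. The sketch says nothing about how long a single batch takes, which depends on the number of attributes and value providers, so treat it as a starting point for experiments rather than a prediction.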

Set up concurrency and batch size settings in the Backoffice

For the required index type, create a new indexer configuration with the following values:

Identifier

Enter a unique identifier for a new configuration, for example, coveoIndexerConfiguration.

Name

Enter a name for the new indexer configuration, for example, Coveo Catalog Indexer Configuration.

Concurrency

Enter the maximum number of threads that can be used for indexing. The default value is 5.

Batch size

Enter the number of items that can be processed in a single batch. The default value is 100.

The configuration is created and attached to your full indexer. You can now run the indexer with the new settings applied. See Run the indexers.

Set up concurrency and batch size settings in an ImpEx file

You can also set up the concurrency and batch size settings in an ImpEx file. The following example repeats the configuration shown in the previous section.

# Create or update the indexer configuration with the concurrency and batch size values.
INSERT_UPDATE SnIndexerConfiguration; &indexerConfig            ; id[unique = true]         ; name                                    ; concurrency ; batchSize
                                    ; coveoIndexerConfiguration ; coveoIndexerConfiguration ; Coveo Electronics Indexer Configuration ; 5           ; 100

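# Attach the indexer configuration to the index type through the &indexerConfig document ID.
# The $coveoIndexConfiguration and $coveoProductIndexType macros must be defined earlier in the file.
INSERT_UPDATE SnIndexType; indexConfiguration(id)   ; id[unique = true]      ; name                     ; itemComposedType(code) ; identityProvider   ; listeners                            ; catalogs(id)              ; stores(uid) ; indexerConfiguration(&indexerConfig)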
                         ; $coveoIndexConfiguration ; $coveoProductIndexType ; CoveoElectronics Product ; Product                ; snIdentityProvider ; catalogVersionFilterSnSearchListener ; electronicsProductCatalog ; electronics ; coveoIndexerConfiguration