Inactivity Timeout During Rebuild

Inactivity timeouts are part of all rebuild processes, on any system. This situation is different from a system shutting down and restarting itself right after you install new software or enable new functions. In those cases, the system always warns you beforehand or suggests that you delay the restart, although delaying is not recommended.

Coveo for Sitecore and the platform are meant to synchronize automatically, except beyond a certain number of new documents, where rebuilding is necessary. A rebuild updates indexes in Sitecore by pushing all new documents from Coveo for Sitecore into the Sitecore source in the platform.

To trigger the rebuild process, you go through the following steps (see Analyzing the Rebuild Process for more detailed steps):

  1. You send a rebuild request. Permissions are then sent to Coveo Cloud.
  2. The system detects the number of documents to be pushed into the platform and then indexed.
  3. When pushing and indexing are completed, the rebuild proceeds to remove all old items from the indexes. However, it may happen that out of 567 new documents, only 489 are pushed into the indexes.

Different steps of the rebuild process have different inactivity timeouts:

  • The validation step times out after 1 hour of inactivity.
  • The step responsible for removing old items times out after 5 minutes of inactivity.

After running for one hour without activity, the rebuild interrupts itself. The difference between the numbers of committed and expected items then tells you that the Coveo indexes are not fully updated.

Symptoms

The most visible sign of an inactivity timeout issue is a characteristic error message in the Sitecore logs. When the rebuild is over, you should see that the last report on the number of committed documents did not reach the expected total.

The following gives an example of a situation where only 21,325 documents were verified searchable (committed) out of a total of 23,052 documents:

ManagedPoolThread #0 10:35:02 INFO  [...] [Rebuilding source "Coveo_web_index - [sourceName]"] Committed documents: 21325 / 23052
[...]
ManagedPoolThread #18 11:35:07 ERROR [ Coveo.SearchProvider.ProviderIndexBase PerformRebuild] [Rebuilding source "Coveo_master_index - [sourceName]"] An error occurred while rebuilding index "Coveo_master_index".
Exception: Coveo.Framework.Exceptions.CoveoSearchProviderException
Message: Inactivity timeout expired. Not all documents were committed in the allotted time (01:00:00). Aborting the rebuild task.
Source: Coveo.AbstractLayer
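
To spot this symptom quickly, the progress lines can be extracted from the log with a small script. The following is a minimal sketch, assuming the log lines follow the "Committed documents: X / Y" format shown in the excerpt above; the function names are hypothetical and not part of Coveo for Sitecore.

```python
import re

# Matches the progress report shown in the log excerpt,
# e.g. "Committed documents: 21325 / 23052".
COMMITTED_RE = re.compile(r"Committed documents:\s*(\d+)\s*/\s*(\d+)")


def last_commit_report(log_lines):
    """Return (committed, expected) from the last progress line, or None."""
    last = None
    for line in log_lines:
        match = COMMITTED_RE.search(line)
        if match:
            last = (int(match.group(1)), int(match.group(2)))
    return last


def missing_documents(log_lines):
    """Return how many expected documents were never committed, or None."""
    report = last_commit_report(log_lines)
    if report is None:
        return None
    committed, expected = report
    return expected - committed


# Sample lines copied from the log excerpt above.
sample = [
    'ManagedPoolThread #0 10:35:02 INFO  [...] [Rebuilding source '
    '"Coveo_web_index - [sourceName]"] Committed documents: 21325 / 23052',
    "Message: Inactivity timeout expired. Not all documents were committed "
    "in the allotted time (01:00:00). Aborting the rebuild task.",
]
print(missing_documents(sample))  # 1727
```

A non-zero result after the rebuild has ended suggests that the indexes are not fully updated.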

Causes

When a system does not detect activity for a certain number of minutes, it automatically shuts down. This can also happen when you install software and do not complete the required steps in time.

Platform Issues

Platform issues are the most common cause of inactivity timeouts. The platform usually fails when you send an invalid document to the index, such as a PDF that does not have an assigned user or owner, is locked, or fails to open. These issues are easy to diagnose, since they display error codes (Error_code), like CORRUPTED_DOCUMENT. The index essentially tries to protect itself and the platform from corrupted, potentially dangerous data (a document carrying viruses, for instance).

Permission Issues

Some permissions might not be sent, or some documents might hold bad permissions. When a document is pushed into the Sitecore source, it needs to hold the matching permission. If a permission is identified as X on the item but as Y in the Coveo index, the names do not match even though the permissions are virtually the same, and the Sitecore source will not recognize the permission. The document is left with an unknown permission, which prevents its indexing after the push.

If you send an item into the Sitecore source, but the users/permissions do not follow, the rebuild fails because one or more documents have no known user, and you run into an inactivity timeout. The next section links you to a page that explains how to solve that problem.

Number of Items and Reset Issues

Uploading many items in Sitecore can also slow down the rebuild process, as there is a maximum number of items and/or security identities that may be pushed each hour. If you want to push many items into the Sitecore source, the rebuild will simply take longer to conclude. However, if you are using an older version of Coveo for Sitecore, you might run into an inactivity timeout. Improvements have been made to more recent versions, and you are encouraged to upgrade to the most recent one.

However, cases of interruption related to the number of items are very rare. An interruption is more likely to be related to the timer, which resets whenever the upload process detects a change in the number of documents.

Time    Number of documents processed      Timer state
11h05   0/100                              0
11h10   10/100                             5 minutes, reset at 0
11h15   10/100                             5 minutes, no change
12h05   10/100                             55 minutes, no change
12h10   10/100                             1 hour, inactivity, reset at 0
        Or, if there is change: 11/100
        (which means that the next
        inactivity timeout will occur at
        13h10, unless there is change)
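
The timer behavior in the table above can be sketched as a small simulation. This is only an illustration, assuming a one-hour inactivity limit and periodic observations of the processed-document count; the function and variable names are hypothetical and do not reflect the actual Coveo for Sitecore implementation.

```python
from datetime import datetime, timedelta


def find_timeout(observations, timeout=timedelta(hours=1)):
    """Return the time at which the inactivity timeout fires, or None.

    observations: list of (time, processed_count) tuples sorted by time.
    The timer resets to zero whenever the processed count changes.
    """
    last_change, last_count = observations[0]
    for when, count in observations[1:]:
        if count != last_count:
            # Activity detected: reset the inactivity timer.
            last_change, last_count = when, count
        elif when - last_change >= timeout:
            # No change for the whole timeout period.
            return when
    return None


def at(hhmm):
    """Build a datetime from an 'HH:MM' string (the date part is irrelevant)."""
    return datetime.strptime(hhmm, "%H:%M")


# Observations taken from the table: the count stalls at 10/100.
stalled = [(at("11:05"), 0), (at("11:10"), 10), (at("11:15"), 10),
           (at("12:05"), 10), (at("12:10"), 10)]

# Same observations, but one more document is processed at 12h10.
progressing = stalled[:-1] + [(at("12:10"), 11)]

print(find_timeout(stalled).strftime("%Hh%M"))  # 12h10
print(find_timeout(progressing))                # None
```

In the second case the change at 12h10 resets the timer, so the next timeout would only occur at 13h10 if nothing else changes.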

Query Issues

The Waiting for documents to be searchable step fails when, although documents were pushed correctly, the query cannot find them. The reason for this failure is usually that the user executing the query does not have the right to see some indexed documents. Invalid permissions trigger security issues in the rebuild process, which might cause it to fail.

If an item states that a user is allowed to view it, but the user does not actually hold this permission, the rebuild interrupts itself before indexing the documents.

Resolution

To solve the problem of incomplete indexing, restart the rebuild. If the numbers of committed and expected documents match, test the rebuild by looking up random items in the Coveo indexes. If you search for an old document, it should not exist in the indexes anymore, while an item pushed from Coveo for Sitecore to the platform should be indexed. If this does not work, identify the possible cause and apply one of the following solutions.

Platform Issues

Look up which documents are broken and either remove them from the platform, or fix their issue.

Security Issues

Most problems related to users, groups, and permissions associated with pushed items can be validated directly in the Security tab in the Cloud Administration Console (see Security Issues).

Number of Items and Reset Issues

To avoid problems with permissions related to a large number of documents, make sure you are using the Push API batch process, which compresses items before processing them.

Query Issues

To understand the problem related to a failed query, check whether the indexed dates on indexed documents were updated during the rebuild. If so, the rebuild is simply still running.
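
The indexed-date check can be expressed as a simple comparison. The sketch below assumes you have already collected the indexed dates of a few documents and noted when the rebuild started; the values and names are hypothetical, and how you obtain the dates depends on your own setup.

```python
from datetime import datetime


def updated_during_rebuild(indexed_dates, rebuild_started):
    """Return the indexed dates that are at or after the rebuild start."""
    return [d for d in indexed_dates if d >= rebuild_started]


# Hypothetical values, for illustration only.
rebuild_started = datetime(2024, 1, 15, 10, 35)
indexed_dates = [
    datetime(2024, 1, 14, 9, 0),    # indexed before the rebuild
    datetime(2024, 1, 15, 10, 50),  # indexed during the rebuild
]

recent = updated_during_rebuild(indexed_dates, rebuild_started)
# A non-empty list suggests the rebuild is still making progress.
print(len(recent) > 0)
```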