Inactivity Timeout During Rebuild
An inactivity timeout is part of every rebuild process. It differs from the situation where a system shuts down and restarts itself right after you install new software or enable new functions. In those cases, the system always warns you beforehand or suggests that you delay the restart, although delaying isn’t recommended.
Coveo for Sitecore and the platform are meant to synchronize automatically. Beyond a certain number of new documents, however, a rebuild is necessary. A rebuild updates indexes in Sitecore by pushing all new documents from Coveo for Sitecore into the Sitecore source in the platform.
To trigger the rebuild process, you go through the following steps (see Analyzing the Rebuild Process for more detailed steps):
-
You send a rebuild request. Permissions are then sent to Coveo Cloud.
-
The system detects the number of documents to be pushed into the platform and then indexed.
-
When pushing and indexing are completed, the rebuild proceeds to remove all old items from the indexes. However, not every document necessarily makes it: for example, out of 567 new documents, only 489 might be pushed into the indexes.
Different steps of the rebuild process have different inactivity timeouts:
-
The validation step times out after 1 hour of inactivity.
-
The step responsible for removing old items times out after 5 minutes of inactivity.
After one hour without activity, the rebuild interrupts itself. The gap between the number of committed items and the number of expected items tells you that the Coveo indexes aren’t fully updated.
Symptoms
The most visible sign of an inactivity timeout issue is a characteristic error message in the Sitecore logs. When the rebuild is over, the last report on the number of committed documents shows a count below the expected total.
The following gives an example of a situation where only 21,325 documents were verified searchable (committed) out of a total of 23,052 documents:
ManagedPoolThread #0 10:35:02 INFO [...] [Rebuilding source "Coveo_web_index - [sourceName]"] Committed documents: 21325 / 23052
[...]
ManagedPoolThread #18 11:35:07 ERROR [ Coveo.SearchProvider.ProviderIndexBase PerformRebuild] [Rebuilding source "Coveo_master_index - [sourceName]"] An error occurred while rebuilding index "Coveo_master_index".
Exception: Coveo.Framework.Exceptions.CoveoSearchProviderException
Message: Inactivity timeout expired. Not all documents were committed in the allotted time (01:00:00). Aborting the rebuild task.
Source: Coveo.AbstractLayer
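To spot this symptom without reading the log by hand, the two telltale lines can be extracted with a short script. The following is a minimal sketch, not a Coveo tool: the function name `summarize_rebuild_log` is hypothetical, and the patterns assume that log lines look like the excerpt above, which may differ between versions.

```python
import re

# Patterns derived from the log excerpt above (assumption: your logs use the same wording).
COMMITTED_RE = re.compile(r"Committed documents:\s*(\d+)\s*/\s*(\d+)")
TIMEOUT_RE = re.compile(r"Inactivity timeout expired")


def summarize_rebuild_log(lines):
    """Return the last committed/expected counts and whether a timeout was logged."""
    committed = expected = None
    timed_out = False
    for line in lines:
        match = COMMITTED_RE.search(line)
        if match:
            # Keep overwriting so that only the last report is retained.
            committed, expected = int(match.group(1)), int(match.group(2))
        if TIMEOUT_RE.search(line):
            timed_out = True
    return {"committed": committed, "expected": expected, "timed_out": timed_out}


# Sample lines copied from the excerpt above.
sample = [
    'ManagedPoolThread #0 10:35:02 INFO [...] [Rebuilding source '
    '"Coveo_web_index - [sourceName]"] Committed documents: 21325 / 23052',
    "Message: Inactivity timeout expired. Not all documents were committed "
    "in the allotted time (01:00:00). Aborting the rebuild task.",
]

summary = summarize_rebuild_log(sample)
missing = summary["expected"] - summary["committed"]
print(summary, "missing:", missing)
```

With the sample lines, the summary reports 21325 committed out of 23052 expected, which leaves 1727 documents missing, and flags the timeout. To use it on a real log, pass the lines of the Sitecore log file instead of `sample`.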
Causes
When a system doesn’t detect activity for a certain number of minutes, it automatically shuts down. This can also happen while installing software and not completing the required steps in time.
Platform Issues
Platform issues are the most common cause of inactivity timeouts. The platform usually runs into trouble when you send an invalid document to the index, such as a PDF that doesn’t have an assigned user or owner, is locked, or fails to open. These issues are easy to diagnose, since they display error codes (Error_code), like CORRUPTED_DOCUMENT. The index essentially tries to protect itself and the platform from corrupted, potentially dangerous data (a document carrying viruses, for example).
Permission Issues
Some permissions might not be sent, or some documents might hold bad permissions. When a document is pushed into the Sitecore source, it needs to hold the matching permission. If a permission is identified as X on the item but as Y in the Coveo index, the names don’t match even though the permissions are virtually the same, so the Sitecore source won’t recognize it. The document is left with an unknown permission, which prevents its indexing after the push.
If you send an item into the Sitecore source but the users/permissions don’t follow, the rebuild fails because one or more documents have no known user, and you run into an inactivity timeout. The next section links to a page that explains how to solve that problem.
Number of Items and Reset Issues
Uploading many items in Sitecore can also slow down the rebuild process, as there’s a maximum number of items and/or security identities that may be pushed each hour. If you want to push many items into the Sitecore source, the rebuild will take longer to conclude. However, if you’re using an older version of Coveo for Sitecore, you might run into an inactivity timeout. Improvements have been made to more recent versions, and you’re encouraged to upgrade to the most recent one.
However, interruptions related to the number of items are very rare. What’s most likely to cause an interruption is the timer refreshing if the upload process detects a change in the number of documents.
Time | Number of documents processed | Timer state |
---|---|---|
11h05 | 0/100 | 0 |
11h10 | 10/100 | 5 minutes, reset at 0 |
11h15 | 10/100 | 5 minutes, no change |
12h05 | 10/100 | 55 minutes, no change |
12h10 | 10/100. Or, if there's change: 11/100 (which means that the next inactivity timeout will occur at 13h10, unless there's change) | 1 hour, inactivity, reset at 0 |
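The timer behavior in the table can be reproduced with a small simulation. This is a sketch under stated assumptions, not the actual Coveo implementation: `find_timeout` is a hypothetical helper, the timer is only evaluated at each observation, and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta

# One-hour limit, as stated in the error message (01:00:00).
INACTIVITY_LIMIT = timedelta(hours=1)


def find_timeout(observations, limit=INACTIVITY_LIMIT):
    """Return the observation time at which the inactivity timeout fires, or None.

    observations: chronological list of (datetime, processed_count) pairs.
    A change in the count resets the timer; otherwise the timer keeps running.
    """
    last_change_time, last_count = observations[0]
    for when, count in observations[1:]:
        if count != last_count:
            # Activity detected: reset the timer.
            last_change_time, last_count = when, count
        elif when - last_change_time >= limit:
            return when
    return None


def at(hour, minute):
    # Arbitrary date; only the time of day matters for this example.
    return datetime(2024, 1, 1, hour, minute)


# Rows of the table: no change after 11h10, so the timeout fires at 12h10.
no_change = [(at(11, 5), 0), (at(11, 10), 10), (at(11, 15), 10),
             (at(12, 5), 10), (at(12, 10), 10)]

# Alternative last row: one more document at 12h10 resets the timer.
with_change = no_change[:-1] + [(at(12, 10), 11)]

print(find_timeout(no_change))
print(find_timeout(with_change))
```

In the first scenario the function returns 12h10, one hour after the last change at 11h10. In the second, the change at 12h10 resets the timer, so no timeout is reported and the next one could only occur at 13h10.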
Query Issues
The Waiting for documents to be searchable step fails when documents were pushed correctly but the query can’t find them. The usual reason is that the user executing the query doesn’t have the right to see some indexed documents. Invalid permissions trigger security issues in the rebuild process, which might cause it to fail.
If an item states that a user is allowed to view it, but the user doesn’t actually hold this permission, the rebuild will interrupt itself before indexing the documents.
Resolution
To solve the problem of incomplete indexing, restart the rebuild. If the numbers of committed and expected documents match, test the rebuild by looking up random items in the Coveo indexes. An old document shouldn’t exist in the indexes anymore, while an item pushed from Coveo for Sitecore to the platform should be indexed. If this doesn’t work, identify the possible cause and apply one of the following solutions.
Platform Issues
Identify the broken documents and either remove them from the platform or fix them.
Security Issues
Most problems related to users, groups, and permissions associated with pushed items can be validated directly in the “Content Security” tab in the Cloud Administration Console.
Number of Items and Reset Issues
To avoid problems with permissions related to a large number of documents, make sure you’re using the Push API batch process, which compresses items before processing them.
Query Issues
To understand the problem related to a failed query, check whether the indexed dates on indexed documents were updated during the rebuild. If they were, the rebuild is simply still running.