[SERVER-32398] Primary freezes during background index build Created: 19/Dec/17 Updated: 05/Feb/18 Resolved: 21/Dec/17

| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance, WiredTiger |
| Affects Version/s: | 3.4.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Michael Smith | Assignee: | Kelsey Schubert |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | RF |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Ubuntu 16.04 Server |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
mongod 3.4.10 on Ubuntu 16.04 in a three-node replica set. The primary was building an index when the build stopped:
At this point it stalled, and no more "Index Build" lines were printed until I restarted mongod several hours later. From then on I could connect to the admin database as the root database user, but connecting as any other user would just hang. Logs below in case they're relevant (I suspect they're not).
Backtrace attached. |
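For context, a background build of a text index (the index type mentioned later in the thread) is typically started with something along these lines; the collection and field names below are placeholders, not taken from this ticket:

```javascript
// Hypothetical example of the kind of operation involved here: a background
// build of a text index on the primary. Collection and field names are
// placeholders, not taken from this ticket.
db.articles.createIndex(
    { body: "text" },       // text index, as mentioned in the comments below
    { background: true }    // background build: yields, but still generates oplog load
)
```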
| Comments |
| Comment by Kelsey Schubert [ 21/Dec/17 ] |
|
Hi Mike, Due to the issue described in
In summary, you should avoid operations that create replica lag while running with majority read concern enabled. For index builds in particular, you can avoid this issue by using rolling index builds, where you build the index on the secondaries one at a time before building it on the primary. To address your questions specifically: old pages are kept in cache on the primary until the corresponding writes have been replicated to a majority of members of the replica set. Consequently, in your case, both secondaries would need to be lagging for the issue to manifest. If both secondaries are lagging, I would expect "tracked dirty bytes in the cache" to increase, and for the "freeze" to occur once it reaches 20% of the maximum configured cache size. Kind regards, |
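A minimal shell sketch of the check described above, comparing "tracked dirty bytes in the cache" against 20% of the configured cache size. The field names are standard serverStatus() output; the 20% figure is simply the eviction trigger mentioned in the comment, not a tunable read from the server:

```javascript
// Compare dirty cache content against 20% of the configured cache size,
// the threshold at which application threads are described as stalling.
var cache = db.serverStatus().wiredTiger.cache;
var dirty = cache["tracked dirty bytes in the cache"];
var max   = cache["maximum bytes configured"];
print("dirty cache: " + (100 * dirty / max).toFixed(1) + "% of configured maximum");
if (dirty > 0.2 * max) {
    print("dirty cache exceeds 20% of cache size - eviction pressure may stall operations");
}
```

Run against the primary while an index build (or any other lag-inducing operation) is in progress to see whether eviction pressure is approaching the stall threshold.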
| Comment by Michael Smith [ 21/Dec/17 ] |
|
Hi Kelsey, What metric did you use to identify the cache pressure? Would it be something like db.serverStatus().wiredTiger.cache['tracked dirty bytes in the cache']? What is blocking pages from being evicted from the cache? Is it waiting until one secondary confirms that it has applied the oplog (i.e. a majority write), or is it waiting for ALL secondaries? I hit this again tonight when a heavy insert load on the primary overlapped with the nightly mongodump on both secondaries. When the secondaries both lagged by about 1000-1500 seconds, simple find queries on the primary started taking 30 seconds. I recovered by killing the mongodumps and restarting mongod on both secondaries. Thanks, |
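A small sketch of one way to watch for the kind of secondary lag described above, using rs.status() from the primary; this is an illustration, not a procedure referenced in the ticket:

```javascript
// Print each secondary's replication lag relative to the primary's optime.
// Run on the primary; member names are whatever rs.status() reports.
var status  = rs.status();
var primary = status.members.filter(function (m) { return m.stateStr === "PRIMARY"; })[0];
status.members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
        var lagSecs = (primary.optimeDate - m.optimeDate) / 1000;
        print(m.name + " lag: " + lagSecs + "s");
    }
});
```

Lag in the hundreds of seconds combined with a rising dirty-cache percentage is the combination this ticket associates with the primary stalling.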
| Comment by Michael Smith [ 19/Dec/17 ] |
|
Thanks Kelsey. Is there anywhere I could read up on how readConcernMajority, replication lag, and cache pressure interact? A couple of weeks ago I had a primary in another cluster fail:
Is the issue that data can't be evicted from the WiredTiger cache until one (or all?) of the replicas has acknowledged some portion of the oplog, so that once the cache fills up no operations can make progress? |
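As an aside on that question, one rough way to see the window of history the primary must keep available while majority read concern is enabled is to compare its applied optime with the majority-committed optime. The optimes fields below are an assumption based on replSetGetStatus output in this version family, so treat this as a sketch rather than an exact measure:

```javascript
// Compare the primary's latest applied optime with the majority commit point.
// The gap approximates how much recent history must remain available (and
// therefore pinned) while majority read concern is enabled.
var optimes    = db.adminCommand({ replSetGetStatus: 1 }).optimes;
var appliedTs  = optimes.appliedOpTime.ts;             // latest write applied on this node
var majorityTs = optimes.readConcernMajorityOpTime.ts; // latest majority-committed write
print("applied optime:  " + appliedTs);
print("majority optime: " + majorityTs);
print("majority commit point trails the primary by ~" + (appliedTs.t - majorityTs.t) + "s");
```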
| Comment by Kelsey Schubert [ 19/Dec/17 ] |
|
Hi Mike, Thanks for providing the logs and diagnostic.data for the secondary. After examining these files, we have a hypothesis that the lag on the secondary was caused by Kind regards, |
| Comment by Michael Smith [ 19/Dec/17 ] |
|
Hi Kelsey, I've uploaded the secondaries' logs and diagnostic.data to the portal. Best, |
| Comment by Kelsey Schubert [ 19/Dec/17 ] |
|
Hi Mike, Thanks for providing the diagnostic.data and logs for the affected mongod; they're very helpful. Would you please provide the same diagnostics for one of the secondaries? I see that the two secondaries begin lagging when the text index starts being built, eventually leading to cache pressure that slows the system. The logs and diagnostic.data from a secondary will give us a clearer picture of the state of the cluster as a whole preceding this issue. Thanks, |
| Comment by Michael Smith [ 19/Dec/17 ] |
|
Uploaded. Thanks! |
| Comment by Kelsey Schubert [ 19/Dec/17 ] |
|
Hi Mike, I've created a secure upload portal for you to use. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time. Thanks for your help, |
| Comment by Michael Smith [ 19/Dec/17 ] |
|
Hi @kelsey.schubert, Happy to upload diagnostic.data and logs to a secure portal if they can stay private. Best, |
| Comment by Kelsey Schubert [ 19/Dec/17 ] |
|
Hi mzs, Thank you for reporting this issue and providing stacks; we're still looking through them. It may help our investigation if we could review the diagnostic.data and complete mongod logs. Would you please upload an archive of the diagnostic.data directory and the complete mongod logs covering this event? If needed, I'd be happy to provide a secure portal. Thank you, |
| Comment by Michael Smith [ 19/Dec/17 ] |
|
Forgot to mention: during the stall, CPU was pegged at 400% (on a 4-processor machine). |