[SERVER-18829] Cache usage exceeds configured maximum during index builds under WiredTiger Created: 04/Jun/15  Updated: 30/Mar/16  Resolved: 26/Jun/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.3
Fix Version/s: 3.0.5, 3.1.5

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Michael Cahill (Inactive)
Resolution: Done Votes: 4
Labels: RF, WTmem, WTplaybook
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File incident.png     PNG File oom.png     PNG File partial-repro-ckpt.png     PNG File partial-repro-stacks.png     PNG File partial-repro.png     HTML File ss-alex.html     Text File ss-alex.log    
Issue Links:
Depends
is depended on by WT-1973 MongoDB changes for WiredTiger 2.7.0 Closed
Duplicate
is duplicated by SERVER-18842 WiredTiger & indexing: "kernel: Out o... Closed
is duplicated by SERVER-19066 Out Of Memory issues with WiredTiger ... Closed
is duplicated by SERVER-19339 crash with this log Closed
is duplicated by SERVER-19620 Mongod killed because of OOM during i... Closed
Related
is related to SERVER-20159 Out of memory on index build during i... Closed
is related to SERVER-18674 Very low throughput during portion of... Closed
is related to SERVER-18677 Throughput drop during transaction pi... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:

 Description   
Issue Status as of Jul 14, 2015

ISSUE SUMMARY
MongoDB running with the WiredTiger storage engine may, during large index builds, exceed the amount of memory allocated to the database cache.

This issue may prevent large index builds in some situations, such as during initial sync of new replica set members.

USER IMPACT
Excessive memory consumption may cause mongod to either abort with an out-of-memory condition, or be killed by the operating system's OOM killer, leading to a loss of availability of the affected node.

WORKAROUNDS
Lowering the amount of cache available to WiredTiger to the 1GB minimum may allow users affected by this issue to complete large index builds.
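
A minimal sketch of the workaround above, assuming mongod is launched from a script. The helper name `mongod_args` and the `/data/db` path are hypothetical; the `--wiredTigerCacheSizeGB` option (config-file equivalent: `storage.wiredTiger.engineConfig.cacheSizeGB`) is what caps the cache.

```python
def mongod_args(dbpath, cache_gb=1):
    """Build a mongod command line with a capped WiredTiger cache.

    Hypothetical helper for illustration; cache_gb is a whole number of GB.
    """
    if cache_gb < 1:
        raise ValueError("cache_gb must be at least 1 (the minimum noted above)")
    return [
        "mongod",
        "--storageEngine", "wiredTiger",
        "--dbpath", dbpath,
        "--wiredTigerCacheSizeGB", str(cache_gb),
    ]


# Example: print the command line for a 1 GB cache.
print(" ".join(mongod_args("/data/db")))
```

Once the index build has completed, the cache size can be raised again by restarting with the previous setting.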

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.4

FIX VERSION
The fix is included in the 3.0.5 production release.

Original description

This has been seen under somewhat different circumstances by a couple of customers.

Initial sync of a large db (multi-TB, a couple of billion documents). A foreground build of the _id index starts at A, and cache usage rises steadily to 6x the configured maximum before mongod is terminated by OOM.

Mongorestore of a 100 GB db. Multiple parallel background index builds begin at A; at B cache usage begins to grow until it reaches about 1.5x the configured maximum, and mongod is terminated by OOM.

No complete repro yet, although we may have a partial repro: during initial sync of a 500 GB db, cache usage briefly rose to about 120% of the configured maximum.
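
The ratios quoted above (6x, 1.5x, 120%) compare bytes held in the cache with the configured maximum. A rough sketch of that calculation, assuming a `serverStatus` document that carries the WiredTiger cache statistics "bytes currently in the cache" and "maximum bytes configured"; the function name and the sample numbers are illustrative only, not taken from the incidents.

```python
def cache_usage_ratio(server_status):
    """Return WiredTiger cache usage as a fraction of the configured maximum."""
    cache = server_status["wiredTiger"]["cache"]
    used = cache["bytes currently in the cache"]
    limit = cache["maximum bytes configured"]
    return used / limit


# Made-up sample: 12 GB in cache against a 10 GB configured maximum.
sample = {
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 12 * 1024**3,
            "maximum bytes configured": 10 * 1024**3,
        }
    }
}
ratio = cache_usage_ratio(sample)
print("cache at {:.0%} of configured maximum".format(ratio))
```

A ratio that stays above 1.0 during an index build is the symptom described in this ticket.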



 Comments   
Comment by Githook User [ 29/Jun/15 ]

Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

Message: SERVER-18829 Have pages start in the middle of the LRU queue for eviction.

(cherry picked from commit d57dc26729bbc59c5bc3928aa90bb2ac3cd15d6d)
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/28c395baf4be3cdceb398fea80eb8f7b7513759c

Comment by Alexander Gorrod [ 10/Jun/15 ]

A fix for this is now in MongoDB master, and will be included in the 3.1.5 release.

Comment by Githook User [ 10/Jun/15 ]

Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

Message: Merge pull request #2018 from wiredtiger/read-gen-midpoint

SERVER-18829 Have pages start in the middle of the LRU queue for eviction
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/d57dc26729bbc59c5bc3928aa90bb2ac3cd15d6d

Comment by Githook User [ 10/Jun/15 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-18829 Have pages start in the middle of the LRU queue for eviction.
Also make sure that when eviction first needs to run, it can find some pages to
evict.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/32144696f78cf726b8b1df8becca0a86d870efa3
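
For readers unfamiliar with the idea named in the commit message, here is a toy sketch of midpoint insertion in an LRU queue. It is an illustration of the general technique only, not WiredTiger's code: the class `MidpointLRU`, the list-based queue, and the page names are all assumptions made for the example.

```python
class MidpointLRU:
    """Toy LRU queue: index 0 is the next eviction candidate,
    the end of the list is the most recently used page."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []

    def touch(self, page):
        """Record an access to `page`; return the list of pages evicted."""
        evicted = []
        if page in self.queue:
            # Re-accessed pages move to the most-recently-used end.
            self.queue.remove(page)
            self.queue.append(page)
            return evicted
        # Make room by evicting from the cold end.
        while len(self.queue) >= self.capacity:
            evicted.append(self.queue.pop(0))
        # New pages start in the middle rather than at the hot end.
        self.queue.insert(len(self.queue) // 2, page)
        return evicted


lru = MidpointLRU(capacity=4)
for name in ["a", "b", "c", "d", "a"]:
    lru.touch(name)
evicted = lru.touch("e")
print(evicted, lru.queue)
```

In this toy model a newly added page does not displace pages that have been re-accessed, so a burst of new pages always leaves eviction candidates ahead of the hot end of the queue.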

Generated at Thu Feb 08 03:48:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.