[SERVER-22793] Unbounded memory usage by long-running query using projection Created: 22/Feb/16  Updated: 17/Nov/16  Resolved: 24/Feb/16

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.0.9, 3.2.3, 3.3.2
Fix Version/s: 3.0.10, 3.2.4, 3.3.3

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: David Storch
Resolution: Done Votes: 0
Labels: WTplaybook, code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File growth.png    
Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Query 11 (03/14/16)
Participants:

 Description   
  • Create a collection with 240 M records containing nothing but a default _id (i.e. db.c.insert({})).
  • Run mongoexport with a projection, e.g.

    mongoexport -d test -c c --type=csv -f _id -o /dev/null
    

Memory behavior (see attached growth.png) is as follows:

  • (The collection is being created between A and B; during this period WT cache usage grows, and at A there is a difference of about 500 MB between allocated memory and the WT cache. This is probably a small (~3%) WT accounting error related to unaccounted-for allocator overhead, and is unrelated to this ticket.)
  • Starting at B the mongoexport query is running, and we see apparently linear memory growth outside the WT cache ("tcmalloc allocated minus wt cache"), rising to several GB.
  • The growth pattern appears to correspond to a growing buffer that is doubled in size every time more space is needed; at C and D we see temporary spikes, presumably because the old buffer and the newly expanded buffer exist simultaneously while the old data is copied over (see the sketch after this list).
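
The doubling behavior described in the last bullet matches how a std::vector reallocates (the customer stack trace later in this ticket shows an allocation under std::vector's _M_emplace_back_aux inside IndexScan::work). The following standalone sketch is illustrative only; the buffer name and the loop bound are assumptions, not server code:

#include <cstdio>
#include <vector>

int main() {
    // Stand-in for a per-query buffer that receives one entry per document
    // scanned and is never cleared.
    std::vector<size_t> bufferedIds;
    size_t lastCapacity = 0;

    for (size_t doc = 0; doc < 240000000ULL; ++doc) {
        bufferedIds.push_back(doc);  // may trigger a reallocation

        if (bufferedIds.capacity() != lastCapacity) {
            // While the vector reallocates, the old buffer and the new,
            // roughly doubled buffer both exist until the copy finishes,
            // which would explain the temporary spikes seen at C and D.
            std::printf("capacity grew: %zu -> %zu elements\n",
                        lastCapacity, bufferedIds.capacity());
            lastCapacity = bufferedIds.capacity();
        }
    }
    return 0;
}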


 Comments   
Comment by Githook User [ 26/Feb/16 ]

Author:

David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-22793 always clear buffered WorkingSetIDs on saveState

This ensures that the set of WorkingSetIDs does not grow without
bound.
Branch: v3.0
https://github.com/mongodb/mongo/commit/cf6f3fb6341910f471f81172d75919ff5cf82eb9

Comment by Githook User [ 25/Feb/16 ]

Author:

David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-22793 clear buffered WorkingSetIDs on storage engines without document-level concurrency

This ensures that the set of WorkingSetIDs does not grow without bound
on MMAPv1.

(cherry picked from commit aaa5074a59327cdd2d0a462bb27c98f1c1c3ec6a)
Branch: v3.2
https://github.com/mongodb/mongo/commit/5149712212bd52eec0d511779427fd95781d7838

Comment by Githook User [ 24/Feb/16 ]

Author:

David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-22793 clear buffered WorkingSetIDs on storage engines without document-level concurrency

This ensures that the set of WorkingSetIDs does not grow without bound
on MMAPv1.
Branch: master
https://github.com/mongodb/mongo/commit/aaa5074a59327cdd2d0a462bb27c98f1c1c3ec6a

Comment by David Storch [ 24/Feb/16 ]

After further investigation, it appears that this issue also affects servers using the MMAP storage engine on the master and 3.2 branches. Servers running 3.2.x or 3.3.x versions of WiredTiger are not affected. On the 3.0 branch, both MMAP and WiredTiger are affected. However, only queries which use covered execution plans are affected on 3.0.x WiredTiger.

Given this new information, the plan is to use our normal backport workflow. Specifically,

  1. Develop a fix and push to the master branch.
  2. Backport the fix to the 3.2 branch.
  3. Develop a custom fix for the 3.0 branch, since 3.2/master have diverged from 3.0.
Comment by Bruce Lucas (Inactive) [ 23/Feb/16 ]

Yes, that fixes it, thanks.

Comment by David Storch [ 23/Feb/16 ]

Thanks bruce.lucas. After some more code reading I think I found the likely cause of the problem. Would you be able to try applying the following patch to the v3.0 branch and see if it fixes the problem?

diff --git a/src/mongo/db/query/plan_executor.cpp b/src/mongo/db/query/plan_executor.cpp
index 4cf50d9..1b03876 100644
--- a/src/mongo/db/query/plan_executor.cpp
+++ b/src/mongo/db/query/plan_executor.cpp
@@ -252,10 +252,7 @@ void PlanExecutor::saveState() {
     // mergeSort, sort) which are no longer protected by the storage engine's transactional
     // boundaries. Force-fetch the documents for any such record ids so that we have our
     // own copy in the working set.
-    //
-    // This is not necessary for covered plans, as such plans never use buffered record ids
-    // for index or collection lookup.
-    if (supportsDocLocking() && _collection && (!_qs.get() || _qs->root->fetched())) {
+    if (supportsDocLocking() && _collection) {
         WorkingSetCommon::forceFetchAllLocs(_opCtx, _workingSet.get(), _collection);
     }

If I'm right, the problem isn't a WorkingSetMember leak. Instead, it's some logic that buffers WorkingSetMember ids. In preparation for yield (when the storage engine supports document-granularity concurrency), we traverse this set of ids, do a bit of necessary prep work, and then clear the set. We skip this work for covered plans since it is not necessary for correctness, but it is necessary for keeping our memory footprint in check!
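
To make the above concrete, here is a rough sketch of the pattern being described. The class and method names (YieldBuffer, remember, forceFetch) are invented for illustration and are not the actual server identifiers; the real fix is the one-line condition change in the patch above.

#include <cstddef>
#include <unordered_set>

using WorkingSetID = std::size_t;

// Hypothetical stand-in for the bookkeeping described above: ids are
// buffered as the plan produces results, and drained when the executor
// prepares to yield.
class YieldBuffer {
public:
    // Called for each buffered record id produced while executing the plan.
    void remember(WorkingSetID id) {
        _bufferedIds.insert(id);
    }

    // Called when preparing to yield. Even when the per-id force-fetch is
    // not needed for correctness (covered plans), the buffer must still be
    // cleared; otherwise it grows for the lifetime of the query.
    void saveState(bool needsForceFetch) {
        if (needsForceFetch) {
            for (WorkingSetID id : _bufferedIds) {
                forceFetch(id);  // copy the document into the working set
            }
        }
        _bufferedIds.clear();  // clear unconditionally
    }

private:
    void forceFetch(WorkingSetID) { /* elided for the sketch */ }

    std::unordered_set<WorkingSetID> _bufferedIds;
};

int main() {
    YieldBuffer buffer;
    for (WorkingSetID id = 0; id < 1000; ++id) {
        buffer.remember(id);
    }
    // With the pre-fix behavior (skipping the clear for covered plans),
    // the buffer would keep growing across getMore batches.
    buffer.saveState(/*needsForceFetch=*/false);
    return 0;
}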

Comment by Bruce Lucas (Inactive) [ 23/Feb/16 ]

mongoexport is using a $snapshot query. With that information I was able to reproduce the problem with the following shell queries:

db.c.find({}, {_id:1}).snapshot().itcount()
db.c.find({}, {_id:1}).hint({_id:1}).itcount()

The problem does not reproduce with the following:

db.c.find({}).hint({_id:1}).itcount()
db.c.find({}, {_id:1}).hint({$natural:1}).itcount()

So it seems that both an index scan and a projection must be present.

Comment by David Storch [ 22/Feb/16 ]

bruce.lucas, it is possible that the vector holding the WorkingSetMembers is growing without bound if there is some code path which leaks a WorkingSetMember. I did a little bit of code inspection on HEAD of the 3.0 branch to see if I could spot such a leak, but didn't find one.

What is the actual operation that mongoexport is running against the server?

Comment by Bruce Lucas (Inactive) [ 22/Feb/16 ]

The problem does not reproduce without the projection, e.g. running mongoexport without -f.

Comment by Bruce Lucas (Inactive) [ 22/Feb/16 ]

Could not reproduce this on 3.2.3.

Comment by Bruce Lucas (Inactive) [ 22/Feb/16 ]

A customer reported an OOM error because of this, with the following stack trace:

mongod(_ZN5mongo29reportOutOfMemoryErrorAndExitEv+0x49) [0xf99ad9]
mongod(tc_new+0x1B5) [0x146b915]
mongod(_ZNSt6vectorImSaImEE19_M_emplace_back_auxIIRKmEEEvDpOT_+0x38) [0x9bdcd8]
mongod(_ZN5mongo9IndexScan4workEPm+0x2FE) [0xa12dbe]
mongod(_ZN5mongo15ProjectionStage4workEPm+0x54) [0xa209c4]
mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x142) [0xbe7be2]
mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x3B) [0xbe7efb]
mongod(_ZN5mongo7getMoreEPNS_16OperationContextEPKcixRNS_5CurOpEiRbPb+0x56E) [0xbb30ae]
mongod(_ZN5mongo15receivedGetMoreEPNS_16OperationContextERNS_10DbResponseERNS_7MessageERNS_5CurOpE+0x3A1) [0xabec51]
mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1469) [0xac4239]
mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xF5) [0x7fd995]
mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x339) [0xf49b59]
 libpthread.so.0(+0x79D1) [0x3a950079d1]
 libc.so.6(clone+0x6D)

The vector being added to in IndexScan::work, called from ProjectionStage::work, seems a likely culprit.
