[SERVER-12282] Memory leak in query PlanCache Created: 07/Jan/14  Updated: 11/Jul/16  Resolved: 21/Jan/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.5.4
Fix Version/s: 2.5.5

Type: Bug Priority: Critical - P2
Reporter: Rui Zhang (Inactive) Assignee: Benety Goh
Resolution: Done Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

SHA & Build Info:
04221630900335fc6ee0d9922edae9d29f02d66d
Build 2014-01-03
x86_64

Tested with the EC2 MongoDB AMI


Attachments: dmesg.log (text), memory_trend_248_vs_255.png, memory_trend_255.png, plancacheleak.svg
Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Add some data to your db (5 GB is fine). Then:

db.foo.ensureIndex({x:1})
while (true) { db.foo.insert({x:1}); db.foo.remove({x:1});}

Run multiple while(true) loops in parallel to leak faster.

Participants:

 Description   

Running a simple operation in a loop (such as inserting and removing the same document) results in steady heap growth. See the attached SVG for the heap analyzer result, which points at a std::vector inside PlanCache and the entries we store in it. See the reproduction steps above for debugging.
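A minimal sketch of the leak pattern described here, in illustrative C++: the type and function names (CachedSolution, PlanCache, getRunner*) are stand-ins, not the actual 2.5.x source. The cache hands back a heap-allocated object; if the caller drops the raw pointer without deleting it, every cached-plan lookup leaks.

```cpp
#include <memory>
#include <string>

// Counts live CachedSolution objects so a leak is observable.
static int liveSolutions = 0;

struct CachedSolution {
    std::string key;
    explicit CachedSolution(std::string k) : key(std::move(k)) { ++liveSolutions; }
    ~CachedSolution() { --liveSolutions; }
};

struct PlanCache {
    // The cache returns a heap-allocated copy that the caller must own.
    CachedSolution* get(const std::string& key) const {
        return new CachedSolution(key);
    }
};

// Leaky pattern: the raw pointer from the cache is used and then
// dropped without delete, so each call leaks one CachedSolution.
void getRunnerLeaky(const PlanCache& cache, const std::string& key) {
    CachedSolution* cs = cache.get(key);
    (void)cs;  // ... build a runner from *cs ...
    // missing: delete cs;
}

// Fixed pattern: take ownership immediately with a smart pointer,
// so the solution is freed on every exit path.
void getRunnerFixed(const PlanCache& cache, const std::string& key) {
    std::unique_ptr<CachedSolution> cs(cache.get(key));
    // ... build a runner from *cs ...
}  // *cs freed here automatically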



 Comments   
Comment by Rui Zhang (Inactive) [ 22/Jan/14 ]

For comparison, this is the memory usage for 2.4.8:

Total private: 37668 kb (d: 29836 | c: 7832)
Total private: 1810460 kb (d: 31768 | c: 1778692)
Total private: 2043816 kb (d: 31820 | c: 2011996)
Total private: 2075752 kb (d: 32036 | c: 2043716)
Total private: 2080084 kb (d: 32032 | c: 2048052)
Total private: 2080680 kb (d: 32032 | c: 2048648)
Total private: 2080780 kb (d: 32032 | c: 2048748)
Total private: 2080796 kb (d: 32032 | c: 2048764)
Total private: 2080804 kb (d: 32032 | c: 2048772)
Total private: 2080808 kb (d: 32036 | c: 2048772)

Closing this ticket.

Comment by Rui Zhang (Inactive) [ 22/Jan/14 ]

Made a build with the fix, and the memory leak is gone.

Test run with 100k operations per batch; memory usage taken after each run:

Total private: 49732 kb (d: 26216 | c: 23516)
Total private: 1822276 kb (d: 27628 | c: 1794648)
Total private: 2056200 kb (d: 27788 | c: 2028412)
Total private: 2087692 kb (d: 28056 | c: 2059636)
Total private: 2092008 kb (d: 28256 | c: 2063752)
Total private: 2092856 kb (d: 28496 | c: 2064360)
Total private: 2093296 kb (d: 28784 | c: 2064512)
Total private: 2093308 kb (d: 28788 | c: 2064520)
Total private: 2093320 kb (d: 28796 | c: 2064524)
Total private: 2093324 kb (d: 28800 | c: 2064524)

Will continue the QA-373 test with a much longer run and keep monitoring this. So far, it looks good.

Comment by Benety Goh [ 21/Jan/14 ]

Fixed the CachedSolution memory leak in get_runner.cpp as indicated by Eric's investigation.

Comment by Githook User [ 21/Jan/14 ]

Author: Benety Goh <benety@mongodb.com> (username: benety)

Message: SERVER-12282 fixed memory leak in get_runner. fixed planFromCache signature to reflect cached solution read-only access
Branch: master
https://github.com/mongodb/mongo/commit/06f7f3ca68c95277b1c11f469f01e61cbeb3cc76
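The commit message also mentions changing the planFromCache signature to reflect read-only access to the cached solution. A hedged sketch of what such a signature change can look like; the types and the function body here are hypothetical, not the actual MongoDB code:

```cpp
#include <string>

// Hypothetical stand-ins for the types named in the commit message.
struct CachedSolution { std::string key; };
struct QuerySolution  { std::string plannedFrom; };

// Before (sketch): taking CachedSolution* suggests the callee might
// mutate, or even take ownership of, the cached entry.
//   QuerySolution planFromCache(CachedSolution* cs);

// After (sketch): a const reference documents read-only access and
// leaves ownership unambiguously with the caller, who can hold the
// solution in a unique_ptr and free it on every path.
QuerySolution planFromCache(const CachedSolution& cs) {
    return QuerySolution{cs.key};
}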

Comment by Rui Zhang (Inactive) [ 07/Jan/14 ]

dan@10gen.com: updated the environment field with the SHA and build info.

schwerin & acm: I haven't been able to isolate it to that level. I already have traffic profiles for those; I will run them over the next few hours and overnight.

Comment by Daniel Pasette (Inactive) [ 07/Jan/14 ]

Is this running against 2.5.3 as indicated in the ticket? If not, please supply the git hash of the version used in testing.

Generated at Thu Feb 08 03:28:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.