[SERVER-9861] MapReduceFinishCommand Memory Leak
Created: 04/Jun/13  Updated: 11/Jul/16  Resolved: 06/Aug/13

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 2.2.4, 2.4.4
Fix Version/s: 2.4.7, 2.5.2

Type: Bug
Priority: Major - P3
Reporter: James Wahlin
Assignee: Greg Studer
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: MongoDB 2.2.4, Ubuntu 12.04
Attachments: 1456.txt
Issue Links:
Related: is related to SERVER-8442 Map-reduce memory leak (Closed)
Operating System: ALL
Steps To Reproduce:

  1. Set up a two-shard MongoDB 2.2.4 cluster with a mongos and a config server.
  2. Use mongorestore to restore the attached shard1, shard2, and config dumps.
  3. Run the attached map-reduce-loop.js script against the cluster's mongos.
    Memory grows on shard1 (it appears that no data is found on shard2).

Files to use for the restore are:
Map Reduce Script
Mongodump - Shard 1
Mongodump - Shard 2
Mongodump - Config

 Description   
Issue Status as of October 23rd, 2013

ISSUE SUMMARY
When mapReduce is run repeatedly on the same client connection, the mongod keeps tracking the temporary collections used by each run, even after they are dropped. These temporary collection names accumulate in a cache on the mongod, which appears as a slow memory leak.

USER IMPACT
This impacts users of mapReduce on sharded collections and manifests as a slow increase in non-mapped virtual memory on mongod. It is present in versions of MongoDB prior to and including v2.4.6.

SOLUTION
After each mapReduce completes and the temporary collections are dropped, also explicitly remove the collection name from the cache used for keeping track of namespace versioning.
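
The effect of the fix can be illustrated with a small model. This is only a sketch under stated assumptions, not the server code: `SeenNamespaceCache`, `run_jobs`, and the `test.tmp.mr.example_<n>` naming pattern are hypothetical names chosen for illustration. It shows how a set of remembered namespaces grows by one entry per job unless each temporary name is explicitly forgotten.

```python
class SeenNamespaceCache:
    """Toy stand-in for a per-connection set of namespaces already version-checked."""

    def __init__(self):
        self._seen = set()

    def check(self, ns):
        # Remember the namespace so later calls can skip the version check.
        self._seen.add(ns)

    def forget(self, ns):
        # Drop the entry once the namespace no longer exists.
        self._seen.discard(ns)

    def __len__(self):
        return len(self._seen)


def run_jobs(cache, jobs, forget_temp):
    """Simulate `jobs` mapReduce runs, each using one uniquely named temp collection."""
    for i in range(jobs):
        temp_ns = f"test.tmp.mr.example_{i}"  # hypothetical naming scheme
        cache.check(temp_ns)
        # ... the temporary collection would be dropped here ...
        if forget_temp:
            cache.forget(temp_ns)
    return len(cache)
```

Without the explicit `forget` call the cache holds one entry per completed job; with it, the cache is empty after every run.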

WORKAROUNDS
This issue can be worked around by stepping down the primary mongod of the shard.

PATCHES
Production release v2.4.7 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.

Original Description

When running map-reduce continually with the provided test set, we see two code paths with continual heap growth. The valgrind massif output for each is shown below. In my test, case 1 grew from 2 MB to 6 MB and case 2 grew from 1 MB to 3 MB over the course of a few hours.
Case 1:

->08.50% (6,239,857B) 0x50E9A87: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
| ->08.45% (6,208,357B) 0x50EA7F9: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
| | ->08.45% (6,208,323B) 0x50EA8DE: std::string::reserve(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
| | | ->08.44% (6,200,449B) 0x50EAE0B: std::string::append(std::string const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
| | | | ->08.44% (6,199,973B) 0x66A933: mongo::mr::MapReduceFinishCommand::run(std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (basic_string.h:2310)
| | | | | ->08.44% (6,199,973B) 0x67DBFD: mongo::_execCommand(mongo::Command*, std::string const&, mongo::BSONObj&, int, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1859)
| | | | |   ->08.44% (6,199,973B) 0x67E684: mongo::execCommand(mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1985)
| | | | |     ->08.44% (6,199,973B) 0x67F58B: mongo::_runCommands(char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (dbcommands.cpp:2069)
| | | | |       ->08.44% (6,199,973B) 0x74CFC8: mongo::runCommands(char const*, mongo::BSONObj&, mongo::CurOp&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (query.cpp:43)
| | | | |         ->08.44% (6,199,973B) 0x750AAD: mongo::runQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) (query.cpp:920)
| | | | |           ->08.44% (6,199,973B) 0x6FB166: mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:244)
| | | | |             ->08.44% (6,199,973B) 0x592777: mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (db.cpp:193)
| | | | |               ->08.44% (6,199,973B) 0x90DDC9: mongo::pms::threadRun(mongo::MessagingPort*) (message_server_port.cpp:85)
| | | | |                 ->08.44% (6,199,973B) 0x4E35E98: start_thread (pthread_create.c:308)
| | | | |                   ->08.44% (6,199,973B) 0x5950CCB: clone (clone.S:112)

Case 2 (looks like stats logging for temporary MR namespaces?):

->04.73% (3,470,360B) 0x5A8F4E: std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::_M_insert_(std::_Rb_tree_node_base const*, std::_Rb_tree_node_base const*, std::string const&) (new_allocator.h:92)
| ->04.73% (3,470,120B) 0x5BAA50: std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::_M_insert_unique(std::string const&) (stl_tree.h:1291)
| | ->04.73% (3,470,080B) 0x8B5BA7: mongo::ClientConnections::_check(std::string const&) (stl_set.h:410)
| | | ->04.73% (3,470,080B) 0x8B5C28: mongo::ClientConnections::get(std::string const&, std::string const&) (shardconnection.cpp:149)
| | |   ->04.73% (3,470,080B) 0x8B4142: mongo::ShardConnection::_init() (shardconnection.cpp:323)
| | |     ->04.73% (3,470,080B) 0x8B420C: mongo::ShardConnection::ShardConnection(std::string const&, std::string const&, boost::shared_ptr<mongo::ChunkManager const>) (shardconnection.cpp:316)
| | |       ->04.73% (3,470,080B) 0x5F8E86: mongo::ParallelSortClusteredCursor::_oldInit() (parallel.cpp:1377)
| | |         ->04.73% (3,470,080B) 0x669753: mongo::mr::MapReduceFinishCommand::run(std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (mr.cpp:1308)
| | |           ->04.73% (3,470,080B) 0x67DBFD: mongo::_execCommand(mongo::Command*, std::string const&, mongo::BSONObj&, int, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1859)
| | |             ->04.73% (3,470,080B) 0x67E684: mongo::execCommand(mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1985)
| | |               ->04.73% (3,470,080B) 0x67F58B: mongo::_runCommands(char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (dbcommands.cpp:2069)
| | |                 ->04.73% (3,470,080B) 0x74CFC8: mongo::runCommands(char const*, mongo::BSONObj&, mongo::CurOp&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (query.cpp:43)
| | |                   ->04.73% (3,470,080B) 0x750AAD: mongo::runQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) (query.cpp:920)
| | |                     ->04.73% (3,470,080B) 0x6FB166: mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:244)
| | |                       ->04.73% (3,470,080B) 0x592777: mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (db.cpp:193)
| | |                         ->04.73% (3,470,080B) 0x90DDC9: mongo::pms::threadRun(mongo::MessagingPort*) (message_server_port.cpp:85)
| | |                           ->04.73% (3,470,080B) 0x4E35E98: start_thread (pthread_create.c:308)
| | |                             ->04.73% (3,470,080B) 0x5950CCB: clone (clone.S:112)
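
Each massif line above has the form `->PP.PP% (N,NNNB) 0xADDR: frame (file:line)`. To compare byte counts across snapshots, the lines can be parsed mechanically. The following is a minimal sketch that assumes only the layout visible in the traces above; `parse_massif_line` is a hypothetical helper name, not part of valgrind.

```python
import re

# Matches the "->PP.PP% (N,NNNB) 0xADDR: frame" portion of a massif tree line.
MASSIF_LINE = re.compile(
    r"->(?P<pct>\d+\.\d+)% \((?P<bytes>[\d,]+)B\) (?P<addr>0x[0-9A-Fa-f]+): (?P<frame>.*)"
)


def parse_massif_line(line):
    """Return the percentage, byte count, address, and frame text, or None if no match."""
    m = MASSIF_LINE.search(line)
    if m is None:
        return None
    return {
        "percent": float(m.group("pct")),
        "bytes": int(m.group("bytes").replace(",", "")),  # strip thousands separators
        "address": m.group("addr"),
        "frame": m.group("frame").strip(),
    }
```

Applying it to the same frame in two snapshots taken some time apart gives the growth in bytes attributed to that call path.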



 Comments   
Comment by auto [ 02/Oct/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-9861 explicitly forget temporary collections for versioning in M/R
Branch: v2.4
https://github.com/mongodb/mongo/commit/8d082558e94bf937de09a1d6be62cc85441c1c5b

Comment by auto [ 06/Aug/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-9861 explicitly forget temporary collections for versioning in M/R
Branch: master
https://github.com/mongodb/mongo/commit/e0d9e65fd3c2dfaec1c82504c310288aacac9ce4

Comment by James Wahlin [ 04/Jun/13 ]

Attached ms_print formatted valgrind massif output.
