[SERVER-8442] Map-reduce memory leak Created: 02/Feb/13  Updated: 11/Jul/16  Resolved: 15/Feb/13

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: 2.2.2, 2.2.3, 2.3.2
Fix Version/s: 2.2.4, 2.4.0-rc1

Type: Bug Priority: Major - P3
Reporter: James Wahlin Assignee: Greg Studer
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Reproduced on Ubuntu 12.04.1 LTS and OS X


Attachments: Text File 31190.txt     Text File 31307.txt    
Issue Links:
Depends
Related
related to SERVER-8456 Mongod memory leak during MapReduce i... Closed
related to SERVER-9861 MapReduceFinishCommand Memory Leak Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

To reproduce:
1) Set up a 2-shard cluster with a mongos and a config server.
2) mongorestore the attached shard1, shard2 and config dumps.
3) Run the attached map-reduce-loop.js script against the cluster's mongos.
You will see memory grow on shard1 (it looks like no data is found on shard2).
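
The attached map-reduce-loop.js is not reproduced in this ticket. As a rough idea of what step 3 involves, here is a minimal sketch of a loop that reruns one map-reduce job through a database handle. The collection names ("source", "mr_out"), the "key" field, and the helper name runMapReduceLoop are hypothetical and are not taken from the attachment.

```javascript
// Hypothetical stand-in for the attached script; all names below are assumptions.

// Map: emit a count of 1 per document, keyed by an assumed "key" field.
function mapFn() {
    emit(this.key, 1);
}

// Reduce: sum the emitted counts for one key.
function reduceFn(key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i];
    }
    return total;
}

// Run the same map-reduce command `iterations` times and return how many
// runs reported ok. `database` is expected to be a mongo shell DB object
// connected to the mongos.
function runMapReduceLoop(database, iterations) {
    var okCount = 0;
    for (var i = 0; i < iterations; i++) {
        var res = database.runCommand({
            mapReduce: "source",        // assumed input collection
            map: mapFn,
            reduce: reduceFn,
            out: { replace: "mr_out" }  // assumed output collection
        });
        if (res.ok) {
            okCount++;
        }
    }
    return okCount;
}
```

In a mongo shell connected to the mongos, this would be called as runMapReduceLoop(db, 1000) while memory on the shard primaries is watched.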

 Description   

Non-mapped virtual memory grows on a mongod primary while map-reduce jobs are run. To prevent OOM failures, the primary has to be restarted about once a day. The growth occurs on only one shard of a sharded cluster.
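
For anyone tracking the same symptom, one way to estimate non-mapped virtual memory is to subtract the mapped size from the virtual size reported in the mem section of serverStatus. The sketch below assumes the mem.virtual, mem.mapped and mem.mappedWithJournal fields (all in megabytes) and assumes that, when journaling is on, mappedWithJournal is the right value to subtract. The helper name nonMappedVirtualMB is made up for illustration.

```javascript
// Estimate non-mapped virtual memory in MB from a serverStatus() document.
// Assumption: when mem.mappedWithJournal is present (journaling enabled),
// it is subtracted instead of mem.mapped.
function nonMappedVirtualMB(status) {
    var mem = status.mem;
    var mapped = (typeof mem.mappedWithJournal === "number")
        ? mem.mappedWithJournal
        : mem.mapped;
    return mem.virtual - mapped;
}
```

On a shard primary this could be sampled periodically with nonMappedVirtualMB(db.serverStatus()); a value that climbs steadily across map-reduce runs matches the behavior described here.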

Also attached is valgrind massif output from both shards. In the shard1 file, the memory allocated through the following call stack grows throughout the run:

->12.98% (10,360,855B) 0x96F7C9: mongo::ourmalloc(unsigned long) (allocator.h:28)
| ->08.59% (6,855,680B) 0xD6C6B1: mongo::MessagingPort::recv(mongo::Message&) (message_port.cpp:188)
| | ->08.59% (6,854,656B) 0x9AF673: mongo::DBClientConnection::recv(mongo::Message&) (dbclient.cpp:1145)
| | | ->08.59% (6,854,656B) 0x9D2D29: mongo::DBClientCursor::initLazyFinish(bool&) (dbclientcursor.cpp:89)
| | |   ->08.59% (6,854,656B) 0x9FC29E: mongo::ParallelSortClusteredCursor::_oldInit() (parallel.cpp:1425)
| | |     ->08.59% (6,854,656B) 0x9FB507: mongo::ParallelSortClusteredCursor::_init() (parallel.cpp:1282)
| | |       ->08.59% (6,854,656B) 0x9F0DD3: mongo::ClusteredCursor::init() (parallel.cpp:83)
| | |         ->08.59% (6,854,656B) 0xA7CADB: mongo::mr::MapReduceFinishCommand::run(std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (mr.cpp:1311)
| | |           ->08.59% (6,854,656B) 0xA98918: mongo::_execCommand(mongo::Command*, std::string const&, mongo::BSONObj&, int, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1859)
| | |             ->08.59% (6,854,656B) 0xA9968A: mongo::execCommand(mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1985)
| | |               ->08.59% (6,854,656B) 0xA9A0E3: mongo::_runCommands(char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (dbcommands.cpp:2069)
| | |                 ->08.59% (6,854,656B) 0xB8F3CC: mongo::runCommands(char const*, mongo::BSONObj&, mongo::CurOp&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (query.cpp:43)
| | |                   ->08.59% (6,854,656B) 0xB93DAD: mongo::runQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) (query.cpp:920)
| | |                     ->08.59% (6,854,656B) 0xB1E974: mongo::receivedQuery(mongo::Client&, mongo::DbResponse&, mongo::Message&) (instance.cpp:244)
| | |                       ->08.59% (6,854,656B) 0xB1F922: mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:390)
| | |                         ->08.59% (6,854,656B) 0x9757F1: mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (db.cpp:192)
| | |                           ->08.59% (6,854,656B) 0xD6F2A8: mongo::pms::threadRun(mongo::MessagingPort*) (message_server_port.cpp:85)
| | |                             ->08.59% (6,854,656B) 0x4E35E98: start_thread (pthread_create.c:308)
| | |                               ->08.59% (6,854,656B) 0x5950CBB: clone (clone.S:112)
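
To compare how a given frame grows between massif snapshots, the tree lines above can be parsed into numbers. This is a small sketch that assumes the line layout shown in this excerpt (percentage, byte count, address, function, then file and line in parentheses). The function name parseMassifLine is hypothetical.

```javascript
// Parse one call-tree line of the form
//   ->08.59% (6,855,680B) 0xD6C6B1: func(args) (file.cpp:188)
// Returns null when the line does not match that layout.
function parseMassifLine(line) {
    var re = /->\s*(\d+\.\d+)% \(([\d,]+)B\) 0x[0-9A-Fa-f]+: (.*) \(([^()]+):(\d+)\)\s*$/;
    var m = re.exec(line);
    if (!m) {
        return null;
    }
    return {
        percent: parseFloat(m[1]),
        bytes: parseInt(m[2].replace(/,/g, ""), 10), // drop thousands separators
        func: m[3],
        file: m[4],
        line: parseInt(m[5], 10)
    };
}
```

Running this over the same frame in successive snapshots gives a byte count per snapshot, which makes the growth under MessagingPort::recv easy to chart.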



 Comments   
Comment by auto [ 20/Mar/13 ]

Author: gregs <greg@10gen.com>
Date: 2013-02-08T19:03:14Z

Message: SERVER-8442 don't release() direct m/r cursors to shards
Branch: v2.2
https://github.com/mongodb/mongo/commit/33aa9495f99cc61659961ef7358fb31d8ba1b347

Comment by auto [ 11/Feb/13 ]

Author: gregs <greg@10gen.com>
Date: 2013-02-08T19:03:14Z

Message: SERVER-8442 don't release() direct m/r cursors to shards
Branch: master
https://github.com/mongodb/mongo/commit/4257f887cca2171cde6353a9aada243a7a02852e

Comment by James Wahlin [ 04/Feb/13 ]

Attached valgrind massif output for both shards during map-reduce job.

Comment by James Wahlin [ 02/Feb/13 ]

As part of the investigation, please check whether any workarounds are available for MongoDB 2.2.2.

Generated at Thu Feb 08 03:17:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.