[SERVER-8967] Memory leaks from DatabaseHolder::getOrCreate() and related Created: 13/Mar/13  Updated: 06/Dec/22  Resolved: 14/Sep/18

Status: Closed
Project: Core Server
Component/s: MMAPv1, Storage
Affects Version/s: 2.2.3, 2.4.0-rc2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ben Becker Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-2498 small memory leak when closing a data... Closed
Assigned Teams:
Storage Execution
Operating System: ALL
Steps To Reproduce:

for (var i = 0; i < 100000; i++) {
  var x = db.getSiblingDB('foo' + i).foo.findOne();
}

 Description   

There appear to be several related leaks:

  • DatabaseHolder::getOrCreate() allocates a new Database object and stores the pointer in _paths[path][dbname]. Database::closeDatabase() removes these entries, but there appears to be no logic to remove the entry for a database that is created but never written to.
  • When we getOrCreate() a db that is never used, the global mmfiles set gains a MongoFile* that has no file name. A comment in MongoFile::destroyed() indicates that function should be called when a derived class is destroyed, but ~MongoMMF() does not appear to be called for a database that is never written to. Note that ~MongoMMF() calls MongoMMF::close(), which calls MemoryMappedFile::close(), which finally calls MongoFile::destroyed(). ~MongoMMF() is invoked from a few places; notably ~MongoDataFile(), ~NamespaceIndex(), and the journal recovery code.
  • There are additional, as-yet-unidentified leaks that can be reproduced with the repro script. The script not only leaks when the unused databases are 'accessed', but also produces a continuously growing leak while mongod remains alive: roughly 2 MB per minute when the attached script is reduced to only 100 iterations.

Also possibly worth noting: DataFileSync must iterate over all unique unwritten database names.



 Comments   
Comment by Ben Becker [ 02/May/13 ]

Note that one manifestation of this bug is the following log entry, which accounts for more files than actually exist:

[DataFileSync] flushing mmaps took 12379ms  for 15642 files

Since MongoFile::_flushAll() holds LockMongoFilesShared while iterating over the set, additional lock contention may be observed for write operations, yielding, journaling, etc. Also, commands like listDatabases and serverStatus will generally be slower.

Generated at Thu Feb 08 03:18:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.