[SERVER-26593] Chunk metadata memory leak on refresh after migration commit Created: 12/Oct/16  Updated: 19/Nov/16  Resolved: 14/Oct/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.0-rc0
Fix Version/s: 3.4.0-rc1

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Esha Maharishi (Inactive)
Resolution: Done Votes: 0
Labels: DF
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File sharding.png    
Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2016-10-31

 Description   

A node undergoing active balancing was observed to accumulate excess allocated memory at a rate of about 2 GB per day for several days until the node hit OOM. The initial part of a run with heap profiling enabled shows the following stacks, all passing through _refreshMetadata, which are responsible for the bulk of the allocated memory.

heapProfile stack888: { 0: "tc_new", 1: "std::pair<std::_Rb_tree_iterator<std::pair<mongo::BSONObj const, mongo::BSONObj> >, bool> std::_Rb_tree<mongo::BSONObj, std::pair<mongo::BSONObj const...", 2: "mongo::CollectionMetadata::fillRanges", 3: "mongo::MetadataLoader::initChunks", 4: "mongo::MetadataLoader::makeCollectionMetadata", 5: "mongo::ShardingState::_refreshMetadata", 6: "mongo::ShardingState::refreshMetadataNow", 7: "mongo::MigrationSourceManager::MigrationSourceManager", 8: "0x55abc2eabea1", 9: "0x55abc2eaddeb", 10: "mongo::Command::run", 11: "mongo::Command::execCommand", 12: "mongo::runCommands", 13: "mongo::assembleResponse", 14: "mongo::ServiceEntryPointMongod::_sessionLoop", 15: "0x55abc261ce80", 16: "0x55abc326839a", 17: "0x7fd9de1acdc5", 18: "clone" }
heapProfile stack887: { 0: "tc_new", 1: "mongo::ConfigDiffTracker<mongo::BSONObj>::calculateConfigDiff", 2: "mongo::MetadataLoader::initChunks", 3: "mongo::MetadataLoader::makeCollectionMetadata", 4: "mongo::ShardingState::_refreshMetadata", 5: "mongo::ShardingState::refreshMetadataNow", 6: "mongo::MigrationSourceManager::MigrationSourceManager", 7: "0x55abc2eabea1", 8: "0x55abc2eaddeb", 9: "mongo::Command::run", 10: "mongo::Command::execCommand", 11: "mongo::runCommands", 12: "mongo::assembleResponse", 13: "mongo::ServiceEntryPointMongod::_sessionLoop", 14: "0x55abc261ce80", 15: "0x55abc326839a", 16: "0x7fd9de1acdc5", 17: "clone" }
heapProfile stack869: { 0: "tc_malloc", 1: "mongo::mongoMalloc", 2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", 4: "mongo::ChunkRange::fromBSON", 5: "mongo::ChunkType::fromBSON", 6: "mongo::ShardingCatalogClientImpl::getChunks", 7: "mongo::MetadataLoader::initChunks", 8: "mongo::MetadataLoader::makeCollectionMetadata", 9: "mongo::ShardingState::_refreshMetadata", 10: "mongo::ShardingState::refreshMetadataNow", 11: "mongo::MigrationSourceManager::MigrationSourceManager", 12: "0x55abc2eabea1", 13: "0x55abc2eaddeb", 14: "mongo::Command::run", 15: "mongo::Command::execCommand", 16: "mongo::runCommands", 17: "mongo::assembleResponse", 18: "mongo::ServiceEntryPointMongod::_sessionLoop", 19: "0x55abc261ce80", 20: "0x55abc326839a", 21: "0x7fd9de1acdc5", 22: "clone" }
heapProfile stack872: { 0: "tc_malloc", 1: "mongo::mongoMalloc", 2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", 4: "mongo::ChunkRange::fromBSON", 5: "mongo::ChunkType::fromBSON", 6: "mongo::ShardingCatalogClientImpl::getChunks", 7: "mongo::MetadataLoader::initChunks", 8: "mongo::MetadataLoader::makeCollectionMetadata", 9: "mongo::ShardingState::_refreshMetadata", 10: "mongo::ShardingState::refreshMetadataNow", 11: "mongo::MigrationSourceManager::MigrationSourceManager", 12: "0x55abc2eabea1", 13: "0x55abc2eaddeb", 14: "mongo::Command::run", 15: "mongo::Command::execCommand", 16: "mongo::runCommands", 17: "mongo::assembleResponse", 18: "mongo::ServiceEntryPointMongod::_sessionLoop", 19: "0x55abc261ce80", 20: "0x55abc326839a", 21: "0x7fd9de1acdc5", 22: "clone" }



 Comments   
Comment by Githook User [ 14/Oct/16 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com>

Message: SERVER-26593 decrement usage counter in ScopedCollectionMetadata's move constructor
Branch: master
https://github.com/mongodb/mongo/commit/c1a3bd37d4eb3af247f10398576dcfb5343e81e2

Comment by Kaloian Manassiev [ 13/Oct/16 ]

The leak happens because the move assignment operator of ScopedCollectionMetadata does not execute the destructor logic before accepting the moved-from object's contents, which means that the reference it previously held is leaked. I believe the only usage of this operator is in MigrationSourceManager::commitChunkMetadataOnConfig. We should preferably either get rid of the operator altogether or fix it so it is correct.
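
The leak pattern described in the comment above can be illustrated with a minimal, self-contained sketch. This is not the MongoDB source: Tracker, ScopedRef, usageCounter and remainingRefsAfterReassign are hypothetical names chosen for illustration, and the template flag exists only to show the leaking and non-leaking behaviour side by side. It assumes a scoped handle that increments a counter on acquisition and decrements it in its destructor.

```cpp
#include <cassert>

// Hypothetical stand-in for a metadata snapshot that counts its users.
struct Tracker {
    int usageCounter = 0;
};

// Hypothetical scoped handle. When ReleaseOnMoveAssign is false, move
// assignment overwrites the held pointer without releasing it first,
// which mirrors the leak described in the comment above.
template <bool ReleaseOnMoveAssign>
class ScopedRef {
public:
    ScopedRef() = default;
    explicit ScopedRef(Tracker* t) : _tracker(t) {
        if (_tracker) ++_tracker->usageCounter;
    }
    ~ScopedRef() { release(); }

    ScopedRef(const ScopedRef&) = delete;
    ScopedRef& operator=(const ScopedRef&) = delete;

    // Move construction: there is no previous reference to release.
    ScopedRef(ScopedRef&& other) noexcept : _tracker(other._tracker) {
        other._tracker = nullptr;
    }

    ScopedRef& operator=(ScopedRef&& other) noexcept {
        if (this != &other) {
            // Fixed variant: run the destructor logic for the reference
            // currently held before taking over other's reference.
            if (ReleaseOnMoveAssign) release();
            _tracker = other._tracker;
            other._tracker = nullptr;
        }
        return *this;
    }

private:
    void release() {
        if (_tracker) {
            --_tracker->usageCounter;
            _tracker = nullptr;
        }
    }
    Tracker* _tracker = nullptr;
};

// Returns how many references to the old tracker remain after a handle
// holding it has been move-assigned a new one and then destroyed.
template <bool ReleaseOnMoveAssign>
int remainingRefsAfterReassign() {
    Tracker oldMeta, newMeta;
    {
        ScopedRef<ReleaseOnMoveAssign> handle(&oldMeta);
        handle = ScopedRef<ReleaseOnMoveAssign>(&newMeta);  // move assignment
    }  // handle is destroyed here and releases newMeta
    assert(newMeta.usageCounter == 0);
    return oldMeta.usageCounter;
}
```

In this sketch, remainingRefsAfterReassign<false>() leaves one reference to the old tracker outstanding, while remainingRefsAfterReassign<true>() leaves none. If old metadata is only freed once its counter reaches zero, each leaked reference would keep one snapshot alive, which is consistent with memory growing on every refresh after a migration commit.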
