[SERVER-5093] shard3.js memory corruption in Linux 64-bit build Created: 25/Feb/12  Updated: 11/Jul/16  Resolved: 27/Feb/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 2.0.5, 2.1.1

Type: Bug
Priority: Major - P3
Reporter: Eric Milkie
Assignee: Eric Milkie
Resolution: Done
Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux 64


Issue Links:
Duplicate
Operating System: ALL
Participants:

 Description   

 m30000| Fri Feb 24 14:07:06 [conn13] { $err: "ns: test.foo ns: test.foo [test.foo] shard version not ok in Client::Context: collection was dropped or this shard no longer valid version: 0|0 client...", code: 13388, ns: "test.foo" }
*** glibc detected *** /mnt/home/buildbot/slave/Linux_64bit/mongo/mongos: corrupted double-linked list: 0x0000000000c09a60 ***
 m30998| Fri Feb 24 14:07:06 [conn1] retrying command: { count: "foo", query: {}, fields: {} }
 m30998| Fri Feb 24 14:07:06 [conn1] DBConfig unserialize: test { _id: "test", partitioned: true, primary: "shard0001" }
 m30998| Fri Feb 24 14:07:06 [conn1] DBConfig unserialize: test { _id: "test", partitioned: true, primary: "shard0001" }
 m30998| Fri Feb 24 14:07:06 [conn1] User Assertion: 10181:not sharded:test.foo
 m30998| Fri Feb 24 14:07:06 [conn1] warning: chunk manager not found for test.foo :: caused by :: 10181 not sharded:test.foo
======= Backtrace: =========
/lib64/libc.so.6[0x2aaaab6e49d3]
/lib64/libc.so.6[0x2aaaab6e657e]
/lib64/libc.so.6(cfree+0x8c)[0x2aaaab6e9f2c]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo14BSONObjBuilderD1Ev+0x222)[0x4d8842]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x17db)[0x4eb1ab]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo13ShardStrategy7queryOpERNS_7RequestE+0x993)[0x4e1b63]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo7Request7processEi+0x17c)[0x530afc]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x74)[0x546c54]
/mnt/home/buildbot/slave/Linux_64bit/mongo/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x2f2)[0x79af32]
/lib64/libpthread.so.0[0x2aaaaaccd407]
/lib64/libc.so.6(clone+0x6d)[0x2aaaab748b0d]

http://buildbot.mongodb.org:8081/builders/Linux%2064-bit/builds/4166/steps/test_4/logs/stdio



 Comments   
Comment by auto [ 19/Apr/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5093 fix logic error to avoid memory corruption.
Branch: v2.0
https://github.com/mongodb/mongo/commit/1911459b0dd643506afe9f0280288090fa123a06

Comment by auto [ 27/Feb/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5093 fix logic error to avoid memory corruption
Branch: master
https://github.com/mongodb/mongo/commit/f31ebbd79b7976b6b11124a42ed8b7f461b36b27

Comment by Eric Milkie [ 27/Feb/12 ]

Sorry, I was wrong about the threading issue. This is just a logic error. We take a reference to an object, free the object, then assign a new value through the now-dangling reference:

            CollectionInfo& ci = _collections[ns];
            
            bool earlyReload = ! ci.isSharded() && ( shouldReload || forceReload );
            if ( earlyReload ) {
                // this is to catch cases where this is a new sharded collection
                _reload();                              // <-- _collections is freed/reset here
                ci = _collections[ns];                  // <-- writes to freed memory
            }

The fix should be easy.
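
For illustration, here is a minimal sketch of one way to avoid the dangling reference: evaluate the reload condition on a temporary lookup and bind the reference only after the reload has finished. This is not the committed change (see the linked commits for that). `ConfigSketch`, its simplified `CollectionInfo`, `reload()`, and `isShardedAfterOptionalReload()` are hypothetical stand-ins written for this example, and the reload is simulated by clearing and repopulating a `std::map`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for mongo::DBConfig::CollectionInfo.
struct CollectionInfo {
    bool sharded = false;
    bool isSharded() const { return sharded; }
};

// Hypothetical stand-in for mongo::DBConfig; only the pieces needed here.
class ConfigSketch {
public:
    // Simulates _reload(): erases every cached entry and rebuilds the map.
    // Erasing destroys the map nodes, so any reference obtained earlier
    // from _collections[ns] dangles after this call.
    void reload() {
        _collections.clear();
        // Assumption for the example: the reloaded config reports
        // "test.foo" as sharded.
        _collections["test.foo"].sharded = true;
        ++_reloads;
    }

    // Safe ordering: no reference into _collections is held across reload().
    bool isShardedAfterOptionalReload(const std::string& ns, bool forceReload) {
        // The lookup result is used only within this statement.
        const bool earlyReload = !_collections[ns].isSharded() && forceReload;
        if (earlyReload) {
            reload();
        }
        // Bind the reference only after the map has stopped changing.
        CollectionInfo& ci = _collections[ns];
        assert(_collections.count(ns) == 1);  // operator[] guarantees the entry exists
        return ci.isSharded();
    }

    int reloads() const { return _reloads; }

private:
    std::map<std::string, CollectionInfo> _collections;
    int _reloads = 0;
};
```

The buggy ordering differs only in where the reference is bound: `CollectionInfo& ci = _collections[ns];` placed before `reload()` leaves `ci` pointing into an erased node, and because a C++ reference cannot be rebound, the later `ci = _collections[ns];` copy-assigns into that freed memory.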

Comment by Eric Milkie [ 27/Feb/12 ]

Issue: writing to freed memory due to modifying a red-black tree in one thread and accessing it from another thread:

 m30998| ==30621== Thread 9:
 m30998| ==30621== Invalid write of size 8
 m30998| ==30621==    at 0x64172F: mongo::DBConfig::getChunkManager(std::string const&, bool, bool) (bsonobj.h:474)
 m30998| ==30621==    by 0x642269: mongo::DBConfig::getChunkManagerIfExists(std::string const&, bool, bool) (config.cpp:260)
 m30998| ==30621==    by 0x53B0AB: mongo::VersionManager::forceRemoteCheckShardVersionCB(std::string const&) (shard_version.cpp:142)
 m30998| ==30621==    by 0x4F078A: mongo::SingleStrategy::queryOp(mongo::Request&) (strategy_single.cpp:73)
 m30998| ==30621==    by 0x4E89DA: mongo::ShardStrategy::queryOp(mongo::Request&) (strategy_shard.cpp:39)
 m30998| ==30621==    by 0x5179FB: mongo::Request::process(int) (request.cpp:129)
 m30998| ==30621==    by 0x525121: mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (server.cpp:95)
 m30998| ==30621==    by 0x5F713F: mongo::pms::threadRun(mongo::MessagingPort*) (message_server_port.cpp:77)
 m30998| ==30621==    by 0x3037007D8F: start_thread (pthread_create.c:309)
 m30998| ==30621==    by 0x3036CEF48C: clone (clone.S:115)
 m30998| ==30621==  Address 0x5634678 is 40 bytes inside a block of size 88 free'd
 m30998| ==30621==    at 0x4A06336: operator delete(void*) (vg_replace_malloc.c:457)
 m30998| ==30621==    by 0x57CB29: std::_Rb_tree<std::string, std::pair<std::string const, mongo::DBConfig::CollectionInfo>, std::_Select1st<std::pair<std::string const, mongo::DBConfig::CollectionInfo> >, std::less<std::string>, std::allocator<std::pair<std::string const, mongo::DBConfig::CollectionInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, mongo::DBConfig::CollectionInfo> >*) (new_allocator.h:98)
 m30998| ==30621==    by 0x644809: std::_Rb_tree<std::string, std::pair<std::string const, mongo::DBConfig::CollectionInfo>, std::_Select1st<std::pair<std::string const, mongo::DBConfig::CollectionInfo> >, std::less<std::string>, std::allocator<std::pair<std::string const, mongo::DBConfig::CollectionInfo> > >::_M_erase_aux(std::_Rb_tree_const_iterator<std::pair<std::string const, mongo::DBConfig::CollectionInfo> >, std::_Rb_tree_const_iterator<std::pair<std::string const, mongo::DBConfig::CollectionInfo> >) (stl_tree.h:799)
 m30998| ==30621==    by 0x6407C9: mongo::DBConfig::_load() (stl_tree.h:787)
 m30998| ==30621==    by 0x641714: mongo::DBConfig::getChunkManager(std::string const&, bool, bool) (config.cpp:281)
 m30998| ==30621==    by 0x642269: mongo::DBConfig::getChunkManagerIfExists(std::string const&, bool, bool) (config.cpp:260)
 m30998| ==30621==    by 0x53B0AB: mongo::VersionManager::forceRemoteCheckShardVersionCB(std::string const&) (shard_version.cpp:142)
 m30998| ==30621==    by 0x4F078A: mongo::SingleStrategy::queryOp(mongo::Request&) (strategy_single.cpp:73)
 m30998| ==30621==    by 0x4E89DA: mongo::ShardStrategy::queryOp(mongo::Request&) (strategy_shard.cpp:39)
 m30998| ==30621==    by 0x5179FB: mongo::Request::process(int) (request.cpp:129)
 m30998| ==30621==    by 0x525121: mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (server.cpp:95)
 m30998| ==30621==    by 0x5F713F: mongo::pms::threadRun(mongo::MessagingPort*) (message_server_port.cpp:77)
 m30998| ==30621== 
 m30998| ==30621== 
