[SERVER-25314] DBConfig::_dropShardedCollections doesn't use lock_guard for CollectionInfoMap Created: 27/Jul/16  Updated: 19/Nov/16  Resolved: 22/Sep/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.3.14

Type: Bug Priority: Major - P3
Reporter: Kamran K. Assignee: Nathan Myers
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2016-08-29, Sharding 2016-09-19, Sharding 2016-10-10
Participants:
Linked BF Score: 0

 Description   

DBConfig::_dropShardedCollections and its caller DBConfig::dropDatabase do not use a lock_guard before accessing _collections.

I think this is the cause of a crash triggered by the concurrent fuzzer:

I SHARDING [conn641] distributed lock with ts: 5798f5bc8ca63ab7b5856c00' unlocked.
I CONTROL  [conn641] *** unhandled exception (access violation) at 0x000007FEFB0CC890, terminating
I CONTROL  [conn641] *** access violation was a read from 0x000000340031003A
I CONTROL  [conn641] *** stack trace for unhandled exception:
I CONTROL  [conn641] VCRUNTIME140.dll                                                                                   memmove+0x60
I CONTROL  [conn641] MSVCP140.dll                                                                                       std::basic_streambuf<char,std::char_traits<char> >::xsputn+0x5f
I CONTROL  [conn641] mongos.exe        c:\program files (x86)\microsoft visual studio 14.0\vc\include\string(196)       std::operator<<<char,std::char_traits<char>,std::allocator<char> >+0x115
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\config.cpp(666)                                                  mongo::DBConfig::_dropShardedCollections+0x46b
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\config.cpp(571)                                                  mongo::DBConfig::dropDatabase+0x59a
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\commands\cluster_drop_database_cmd.cpp(112)                      mongo::`anonymous namespace'::DropDatabaseCmd::run+0x220
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\s_only.cpp(156)                                                  mongo::Command::execCommandClientBasic+0x63b
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\commands\strategy.cpp(110)                                       mongo::`anonymous namespace'::runAgainstRegistered+0x26d
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\commands\strategy.cpp(266)                                       mongo::Strategy::clientCommandOp+0x859
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\commands\request.cpp(110)                                        mongo::Request::process+0x3e8
I CONTROL  [conn641] mongos.exe        ...\src\mongo\s\service_entry_point_mongos.cpp(108)                              mongo::ServiceEntryPointMongos::_sessionLoop+0x1a5
I CONTROL  [conn641] mongos.exe        ...\src\mongo\transport\service_entry_point_utils.cpp(74)                        mongo::`anonymous namespace'::runFunc+0x193
I CONTROL  [conn641] mongos.exe        c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(247)  std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> > > > >::_Run+0x75
I CONTROL  [conn641] mongos.exe        c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(210)  std::_Pad::_Call_func+0x9
I CONTROL  [conn641] ucrtbase.DLL                                                                                       crt_at_quick_exit+0x7d
I CONTROL  [conn641] kernel32.dll                                                                                       BaseThreadInitThunk+0xd
I -        [conn641]
I CONTROL  [conn641] writing minidump diagnostic file C:\data\mci\b574bba2245ed393127dcbb0d1720cf7\src\mongos.2016-07-27T17-56-15.mdmp
I CONTROL  [conn641] *** immediate exit due to unhandled exception



 Comments   
Comment by Githook User [ 22/Sep/16 ]

Author: Nathan Myers (nathan-myers-mongo) <nathan.myers@10gen.com>

Message: SERVER-25314 lock sharded database while dropping
Branch: master
https://github.com/mongodb/mongo/commit/3a372a491db4c852dfb2258594a1b3b2846c3d5a

Comment by Randolph Tan [ 16/Sep/16 ]

One way a seg fault can manifest:

  1. Thread A calls dropDatabase on test.
  2. Thread B calls drop on test.foo.
  3. Thread A tries to drop all collections in test and gets an iterator pointing to the CollectionInfo entry of test.foo.
  4. Thread B gets dist lock on test.foo.
  5. Thread A tries to start drop on test.foo but gets blocked because thread B has dist lock.
  6. Thread B finishes dropping test.foo and unlocks dist lock.
  7. Thread B calls DBConfig::invalidateNS on test.foo. This deletes CollectionInfo entry for test.foo.
  8. Thread A gets the dist lock and tries to use the iterator for test.foo -> segfault
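
The sequence above can be sketched in a few lines. This is only an illustration of one way to avoid the stale iterator, not the change that was committed for this ticket: CollectionRegistry, invalidate, dropAll and the acquireDistLock callback are hypothetical names standing in for DBConfig, DBConfig::invalidateNS, DBConfig::_dropShardedCollections and the dist lock wait. The idea is to take a lock_guard for every access to the map, snapshot the names rather than keep an iterator across the blocking wait in step 5, and look each entry up again once the wait is over.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the owner of the CollectionInfoMap; not the real DBConfig.
class CollectionRegistry {
public:
    void add(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);
        _collections[ns] = true;
    }

    // Plays the role of step 7: another thread removes the entry.
    void invalidate(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);
        _collections.erase(ns);
    }

    // acquireDistLock is an assumed callback that models the blocking wait of step 5.
    // Returns the namespaces this call actually removed.
    std::vector<std::string> dropAll(
        const std::function<void(const std::string&)>& acquireDistLock) {
        // Snapshot the names under the mutex; no iterator outlives this scope.
        std::vector<std::string> names;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            for (const auto& entry : _collections) {
                names.push_back(entry.first);
            }
        }

        std::vector<std::string> dropped;
        for (const auto& ns : names) {
            acquireDistLock(ns);  // may block; the mutex is not held here

            // Fresh lookup under the mutex instead of reusing an old iterator.
            std::lock_guard<std::mutex> lk(_mutex);
            auto it = _collections.find(ns);
            if (it == _collections.end()) {
                continue;  // entry was removed while we were waiting
            }
            _collections.erase(it);
            dropped.push_back(ns);
        }
        return dropped;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _collections.size();
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, bool> _collections;
};
```

If the callback passed to dropAll calls invalidate("test.foo") while dropAll is waiting on that namespace, the fresh lookup finds nothing and the entry is skipped instead of being dereferenced through a dangling iterator.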