Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-10201

DistributedLockPinger::shouldKill not thread safe (sync6.js failure)

    XMLWordPrintable

Details

    • Windows

    Description

      Windows 64-bit 2008R2+ DEBUG Build #194 July 13 (rev 458cfbf633d)

      http://buildbot.mongodb.org/builders/Windows%2064-bit%202008R2%2B%20DEBUG/builds/194/steps/test_11/logs/stdio
      http://buildlogs.mongodb.org/Windows%2064-bit%202008R2%2B%20DEBUG/builds/194/test/sharding/sync6.js

      Ends with:

       m30004| Sat Jul 13 15:59:13.145 [thread-7] about to acquire distributed lock 'TimeSkewFailNewTest_lock_8/AMAZONA-VVETJQF:30004:1373745544:18467:thread-7:41:
       m30004| { "state" : 1,
       m30004|   "who" : "AMAZONA-VVETJQF:30004:1373745544:18467:thread-7:41",
       m30004|   "process" : "AMAZONA-VVETJQF:30004:1373745544:18467:thread-7:41",
       m30004|   "when" : { "$date" : "Wed Jul 17 14:06:52 2013" },
       m30004|   "why" : "Testing distributed lock with skew.",
       m30004|   "ts" : { "$oid" : "51e1b190f8b27d377bf2e2e6" } }
       m30004| { "_id" : "TimeSkewFailNewTest_lock_8",
       m30004|   "state" : 0,
       m30004|   "ts" : { "$oid" : "51e1b18af8b27d377bf2e2e2" } }
      Sat Jul 13 15:59:13.161 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:30004
      Sat Jul 13 15:59:13.161 SocketException: remote: 127.0.0.1:30004 error: 9001 socket exception [RECV_ERROR] server [127.0.0.1:30004] 
      Sat Jul 13 15:59:13.161 DBClientCursor::init call() failed
      Sat Jul 13 15:59:13.192 Error: error doing query: failed at src/mongo/shell/query.js:81
      failed to load: D:\slave\Windows_64bit_2008R2+_DEBUG\mongo\jstests\sharding\sync6.js

      Selected lines from log:

      m30004| Sat Jul 13 15:59:08.403 [LockPinger] *** C runtime error: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\xtree(321) : Assertion failed: map/set iterators incompatible
      // ... snip ...
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\mongo\util\stacktrace.cpp(169)                                        mongo::printStackTrace+0x5b
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\mongo\util\stacktrace.cpp(284)                                        mongo::crtDebugCallback+0x137
      m30004| Sat Jul 13 15:59:10.462 [thread-3] created new distributed lock for TimeSkewFailNewTest_lock_8 on localhost:30000,localhost:30001,localhost:30002 ( lock timeout : 4000, ping interval : 133, process : 1 )
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\dbgrptt.c(605)                    _VCrtDbgReportW+0x812
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\dbgrpt.c(242)                     _CrtDbgReportWV+0x43
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\dbgrpt.c(258)                     _CrtDbgReportW+0x4d
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\stdthrow.cpp(13)                  std::_Debug_message+0x34
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  c:\program files (x86)\microsoft visual studio 10.0\vc\include\xtree(322)     std::_Tree_const_iterator<std::_Tree_val<std::_Tset_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,0> > >::operator==+0x72
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  c:\program files (x86)\microsoft visual studio 10.0\vc\include\xtree(335)     std::_Tree_const_iterator<std::_Tree_val<std::_Tset_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,0> > >::operator!=+0x32
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  c:\program files (x86)\microsoft visual studio 10.0\vc\include\xutility(889)  std::_Distance2<std::_Tree_const_iterator<std::_Tree_val<std::_Tset_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,0> > >,unsigned __int64>+0x51
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  c:\program files (x86)\microsoft visual studio 10.0\vc\include\xutility(922)  std::_Distance<std::_Tree_const_iterator<std::_Tree_val<std::_Tset_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,0> > >,unsigned __int64>+0xc6
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  c:\program files (x86)\microsoft visual studio 10.0\vc\include\xtree(1445)    std::_Tree<std::_Tset_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,0> >::count+0xda
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\mongo\client\distlock.cpp(304)                                        mongo::DistributedLockPinger::shouldKill+0x8c
      m30004| Sat Jul 13 15:59:10.462 [thread-2] checking last ping for lock 'TimeSkewFailNewTest_lock_8/AMAZONA-VVETJQF:30004:1373745544:18467:thread-6:41' against process AMAZONA-VVETJQF:30004:1373745544:18467:thread-6:41 and ping Sun Jul 14 14:37:01 2013
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\mongo\client\distlock.cpp(104)                                        mongo::DistributedLockPinger::_distLockPingThread+0x342
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\mongo\client\distlock.cpp(236)                                        mongo::DistributedLockPinger::distLockPingThread+0xb0
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\third_party\boost\boost\bind\mem_fn_template.hpp(506)                 boost::_mfi::mf4<void,mongo::DistributedLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,unsigned __int64>::operator()+0xa0
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\third_party\boost\boost\bind\bind.hpp(526)                            boost::_bi::list5<boost::_bi::value<mongo::DistributedLockPinger * __ptr64>,boost::_bi::value<mongo::ConnectionString>,boost::_bi::value<__int64>,boost::_bi::value<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,boost::_bi::value<unsigned __int64> >::operator()<boost::_mfi::mf4<void,mongo::DistributedLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,unsigned __int64>,boost::_bi::list0>+0x173
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\third_party\boost\boost\bind\bind_template.hpp(21)                    boost::_bi::bind_t<void,boost::_mfi::mf4<void,mongo::DistributedLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,unsigned __int64>,boost::_bi::list5<boost::_bi::value<mongo::DistributedLockPinger * __ptr64>,boost::_bi::value<mongo::ConnectionString>,boost::_bi::value<__int64>,boost::_bi::value<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,boost::_bi::value<unsigned __int64> > >::operator()+0x61
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\third_party\boost\boost\thread\detail\thread.hpp(63)                  boost::detail::thread_data<boost::_bi::bind_t<void,boost::_mfi::mf4<void,mongo::DistributedLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,unsigned __int64>,boost::_bi::list5<boost::_bi::value<mongo::DistributedLockPinger * __ptr64>,boost::_bi::value<mongo::ConnectionString>,boost::_bi::value<__int64>,boost::_bi::value<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,boost::_bi::value<unsigned __int64> > > >::run+0x2f
      m30004| Sat Jul 13 15:59:10.462 [LockPinger] mongod.exe  ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185)               boost::`anonymous namespace'::thread_start_function+0x4c

      The DistributedLockPinger::shouldKill() routine does not lock the mutex before calling _kill.count(); this may explain how the set was seen to be corrupt.

      This is the MSVC code that threw the error:
      c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xtree lines 315 to 337

      	bool operator==(const _Myiter& _Right) const
      		{	// test for iterator equality
       #if _ITERATOR_DEBUG_LEVEL == 2
      		if (this->_Getcont() == 0
      			|| this->_Getcont() != _Right._Getcont())
      			{	// report error
      			_DEBUG_ERROR("map/set iterators incompatible");
      			_SCL_SECURE_INVALID_ARGUMENT;
      			}
       
       #elif _ITERATOR_DEBUG_LEVEL == 1
      		_SCL_SECURE_VALIDATE(this->_Getcont() != 0
      			&& this->_Getcont() == _Right._Getcont());
       #endif /* _ITERATOR_DEBUG_LEVEL */
       
      		return (this->_Ptr == _Right._Ptr);
      		}
       
      	bool operator!=(const _Myiter& _Right) const
      		{	// test for iterator inequality
      		return (!(*this == _Right));
      		}
      	};

      Attachments

        Activity

          People

            randolph@mongodb.com Randolph Tan
            tad Tad Marshall
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: