-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: 2.1.0
-
Component/s: Replication
-
None
-
ALL
If you run jstests/sharding/mongos_no_replica_set_refresh.js with Memcheck, you will get this:
m31100| ==19834== Thread 15: m31100| ==19834== Invalid read of size 8 m31100| ==19834== at 0x303E86E860: std::_Rb_tree_increment(std::_Rb_tree_node_base*) (tree.cc:62) m31100| ==19834== by 0x97EB15: mongo::ReplSetImpl::getMostElectable() (stl_tree.h:277) m31100| ==19834== by 0x788992: mongo::Manager::checkElectableSet() (manager.cpp:97) m31100| ==19834== by 0x789020: mongo::Manager::msgCheckNewState() (manager.cpp:171) m31100| ==19834== by 0x6E269A: mongo::task::Server::doWork() (function_template.hpp:760) m31100| ==19834== by 0x6E0D09: mongo::task::Task::run() (task.cpp:53) m31100| ==19834== by 0x5CFDE8: mongo::BackgroundJob::jobBody(boost::shared_ptr<mongo::BackgroundJob::JobStatus>) (background.cpp:61) m31100| ==19834== by 0x5D201A: boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > > >::run() (mem_fn_template.hpp:165) m31100| ==19834== by 0x4E3AC38: thread_proxy (thread.cpp:121) m31100| ==19834== by 0x3037007D8F: start_thread (pthread_create.c:309) m31100| ==19834== by 0x3036CEF48C: clone (clone.S:115) m31100| ==19834== Address 0x1ed87498 is 24 bytes inside a block of size 40 free'd m31100| ==19834== at 0x4A06336: operator delete(void*) (vg_replace_malloc.c:457) m31100| ==19834== by 0x78981F: std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*) (new_allocator.h:98) m31100| ==19834== by 0x789983: std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::erase(unsigned int const&) (stl_tree.h:799) m31100| ==19834== by 0x97EBAF: mongo::ReplSetImpl::getMostElectable() (stl_set.h:528) m31100| ==19834== by 0x788992: mongo::Manager::checkElectableSet() (manager.cpp:97) m31100| ==19834== by 0x789020: mongo::Manager::msgCheckNewState() (manager.cpp:171) m31100| ==19834== by 0x6E269A: mongo::task::Server::doWork() (function_template.hpp:760) m31100| ==19834== by 0x6E0D09: mongo::task::Task::run() (task.cpp:53) m31100| ==19834== by 0x5CFDE8: mongo::BackgroundJob::jobBody(boost::shared_ptr<mongo::BackgroundJob::JobStatus>) (background.cpp:61) m31100| ==19834== by 0x5D201A: boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > > >::run() (mem_fn_template.hpp:165) m31100| ==19834== by 0x4E3AC38: thread_proxy (thread.cpp:121) m31100| ==19834== by 0x3037007D8F: start_thread (pthread_create.c:309)
I actually ran into this on Windows first, then tried it with memcheck on Linux. On Windows, you get an assertion in the iterator code when you attempt to iterate over the freed memory in _electableSet – no memcheck required.
This may indicate a thread safety problem, where we are reading a std::set (which is a red-black tree under the hood) in one thread, while another thread is adding/removing nodes and rebalancing the tree.