[SERVER-7434] Startup race with --fork Created: 21/Oct/12  Updated: 11/Jul/16  Resolved: 01/Feb/13

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: 2.2.0, 2.2.2, 2.3.2
Fix Version/s: 2.2.3, 2.4.0-rc1

Type: Bug Priority: Minor - P4
Reporter: Martin Buechler Assignee: Andy Schwerin
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

All supported OSes except Windows.


Issue Links:
Duplicate
is duplicated by SERVER-8251 Startup hangs infinitely, DataFileSyn... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

Installed the 2.2.0 RPM package from the 10gen repo. 'service mongod start' creates 3 processes:

root 12603 9583 0 12:59 pts/0 00:00:00 /bin/sh /sbin/service mongod restart
root 12608 12603 0 12:59 pts/0 00:00:00 /bin/bash /etc/init.d/mongod restart
root 12622 12608 0 12:59 pts/0 00:00:00 runuser -s /bin/bash - mongod -c ulimit -S -c 0 >/dev/null 2>&1 ; numactl --interleave=all /usr/bin/mongod -f /etc/mongod.
mongod 12623 12622 0 12:59 ? 00:00:00 -bash -c ulimit -S -c 0 >/dev/null 2>&1 ; numactl --interleave=all /usr/bin/mongod -f /etc/mongod.conf
mongod 12645 12623 1 12:59 ? 00:00:00 /usr/bin/mongod -f /etc/mongod.conf
mongod 12647 12645 0 12:59 ? 00:00:00 /usr/bin/mongod -f /etc/mongod.conf
mongod 12648 12647 0 12:59 ? 00:00:00 /usr/bin/mongod -f /etc/mongod.conf

strace of PID 12648, the third (and evidently hung) process, gives:
...
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 67669616}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 60868464}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 167533952}) = -1 ETIMEDOUT (Connection timed out)
...

gdb:

Thread 2 (Thread 0x40a87940 (LWP 12649)):
#0 0x00000000008d2996 in base::internal::SpinLockDelay(int volatile*, int, int) ()
#1 0x000000000086210c in SpinLock::SlowLock() ()
#2 0x0000000000866056 in tcmalloc::ThreadCache::CreateCacheIfNecessary() ()
#3 0x00000000009b0857 in ?? ()
#4 0x0000000000c22872 in tc_malloc ()
#5 0x00000000009e63aa in boost::detail::get_once_per_thread_epoch() ()
#6 0x00000000007c4ff8 in void boost::call_once<void ()>(boost::once_flag&, void ()) ()
#7 0x00000000007c1e57 in boost::detail::set_current_thread_data(boost::detail::thread_data_base*) ()
#8 0x00000000007c3646 in ?? ()
#9 0x0000003124e0677d in start_thread () from /lib64/libpthread.so.0
#10 0x00000031246d3c1d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2b5613d478c0 (LWP 12648)):
#0 0x00000000008d2996 in base::internal::SpinLockDelay(int volatile*, int, int) ()
#1 0x000000000086210c in SpinLock::SlowLock() ()
#2 0x00000000008b92db in tcmalloc::CentralFreeList::Populate() ()
#3 0x00000000008b9498 in tcmalloc::CentralFreeList::FetchFromSpansSafe() ()
#4 0x00000000008b9534 in tcmalloc::CentralFreeList::RemoveRange(void*, void*, int) ()
#5 0x0000000000865c0d in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) ()
#6 0x00000000009b0f2f in ?? ()
#7 0x0000000000c21c95 in tc_new ()
#8 0x0000000000599e14 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > > >* boost::detail::heap_new_impl<boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > > >, boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > >&>(boost::_bi::bind_t<void, boost::_mfi::mf1<void, mongo::BackgroundJob, boost::shared_ptr<mongo::BackgroundJob::JobStatus> >, boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob*>, boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > >&) ()
#9 0x00000000005954fa in mongo::BackgroundJob::go() ()
#10 0x00000000005630cc in ?? ()
#11 0x0000000000565399 in main ()

This behaviour is intermittent: sometimes the startup succeeds.

Notes: I rebuilt mongod from source (r2.2.0) and stripped the binary manually; to my surprise, this binary does not show the behaviour. However, another binary installed with 'scons install' always hangs.



 Comments   
Comment by Andy Schwerin [ 11/Mar/13 ]

This issue probably has gperftools issue-496, resolved by r196, as its root cause. See link below.

https://code.google.com/p/gperftools/source/detail?r=196

The fixes applied to the mongodb source to resolve SERVER-7434 are all good changes, even absent the bug fix, so no further action is recommended.

Comment by auto [ 01/Feb/13 ]

Author: Andy Schwerin <schwerin@10gen.com> (2013-01-22T23:19:22Z)

Message: SERVER-7434 Do not spawn a thread in grandparent process when using --fork
Branch: master
https://github.com/mongodb/mongo/commit/8ab462ba516003db27c4e4574688deef0f8e8dbe
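The commit message says the fix is to not spawn a thread in the grandparent before forking. The ordering can be sketched roughly like this (a hypothetical illustration, not mongod's actual implementation; the function name and the "ok" readiness protocol are made up): perform all fork() calls while the process is still single-threaded, and only start helper threads in the final child.

```python
import os
import threading

def start_daemon(work):
    """Double-fork daemonizer that creates threads only in the final child."""
    r, w = os.pipe()
    pid = os.fork()                  # fork while the process is still single-threaded
    if pid > 0:                      # grandparent: wait for the child to signal readiness
        os.close(w)
        status = os.read(r, 8).decode()
        os.close(r)
        os.waitpid(pid, 0)
        return status
    os.setsid()                      # detach from the controlling terminal
    if os.fork() > 0:
        os._exit(0)                  # intermediate parent exits immediately
    # Final child: only now, after all fork() calls, is it safe to spawn threads,
    # because no thread can be holding a lock across a later fork().
    t = threading.Thread(target=work)
    t.start()
    os.write(w, b"ok")               # signal the grandparent that startup succeeded
    t.join()
    os._exit(0)
```

Calling start_daemon(lambda: None) from a single-threaded process returns "ok" in the original (grandparent) process once the daemonized child has started its worker thread.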

Comment by auto [ 23/Jan/13 ]

Author: Andy Schwerin <schwerin@10gen.com> (2013-01-22T19:29:18Z)

Message: SERVER-7434 Do not spawn a thread in grandparent process when using --fork.
Branch: v2.2
https://github.com/mongodb/mongo/commit/0a3b5f08e689b87b41792a1d4e87c493403ede58

Comment by Andy Schwerin [ 22/Jan/13 ]

I've confirmed the existence of a startup race when --fork is passed. When using --fork, there are three processes of interest. The "child" is the final process left running mongod. The "parent" is the process that invokes it, and the "grandparent" is the process that invokes the parent. These three processes are all images of mongod.

The problem occurs because the grandparent starts a thread before calling fork, to handle terminal interrupts. That thread allocates heap memory as part of its startup. If the grandparent calls fork() from the main thread while the interrupt thread holds a malloc lock, the parent and grandparent both execute in address spaces where the malloc lock is held but the holding thread does not exist.
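The sequence described above can be illustrated with a small stand-alone sketch (hypothetical repro code, not mongod's; a plain threading.Lock stands in for tcmalloc's internal spinlock): one thread holds the lock while the main thread calls fork(), and the child then observes a lock that is held but has no owner thread left to release it.

```python
import os
import threading
import time

lock = threading.Lock()              # stand-in for tcmalloc's internal spinlock
ready = threading.Event()

def holder():
    with lock:                       # like the interrupt thread allocating inside malloc
        ready.set()
        time.sleep(2)                # keep the lock held while the main thread forks

t = threading.Thread(target=holder)
t.start()
ready.wait()                         # make sure the lock is held before forking

r, w = os.pipe()
pid = os.fork()                      # only the calling thread exists in the child
if pid == 0:
    # The child inherited the lock in its locked state, but the holder thread
    # was not duplicated, so nothing can ever release it. A blocking acquire
    # (what malloc does internally) would spin forever; exactly the
    # futex(..., FUTEX_WAIT_PRIVATE, ...) loop seen in the strace output above.
    os.write(w, b"stuck" if not lock.acquire(blocking=False) else b"free")
    os._exit(0)

os.close(w)
child_view = os.read(r, 16).decode()
os.waitpid(pid, 0)
t.join()
print("child sees the lock as:", child_view)
```

The child's blocking path, which is the one malloc actually takes, would never return; that is why the hung mongod spins in futex() forever.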

Easiest repro is to install the product from RPM on a RHEL system, and run the following shell scriptlet as root:

service mongod start
while true; do service mongod restart; done

Then, just wait for it to hang, and use gdb -p to attach to the hung processes.

Because of the nature of the race, it is easier to trigger if you drive up the rate of context switches and drive down the number of available cores. numactl or running on a single-core VM helps with the latter, and a script like the following helps with the former:

while true; do
  for (( i=0; $i < 10; ++i )); do
    openssl rand -out /dev/null 20000000 &
  done
  wait
done

Comment by WangYu [ 20/Jan/13 ]

I hit the same issue when using a Python script to start MongoDB. ps shows three processes, and they hang. Stracing the last process gives the result below. This does not happen every time, but at random.

Any suggestion how to avoid this issue? Thanks!

-bash-4.1# ps -ef|grep mongo
root 9778 9671 0 00:45 pts/0 00:00:00 grep mongo
root 30444 30426 0 Jan19 ? 00:00:00 /opt/mongodb/mongodb-linux-x86_64-2.2.0/bin/mongod --fork --port 27018 --dbpath /opt/mongodb/data --logpath /opt/mongodb/logs/mongod.log --logappend --auth
root 30446 30444 0 Jan19 ? 00:00:00 /opt/mongodb/mongodb-linux-x86_64-2.2.0/bin/mongod --fork --port 27018 --dbpath /opt/mongodb/data --logpath /opt/mongodb/logs/mongod.log --logappend --auth
root 30447 30446 0 Jan19 ? 00:00:03 /opt/mongodb/mongodb-linux-x86_64-2.2.0/bin/mongod --fork --port 27018 --dbpath /opt/mongodb/data --logpath /opt/mongodb/logs/mongod.log --logappend --auth
-bash-4.1# strace -p 30447
Process 30447 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 62268768}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 263559680}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 126873152}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 243237408}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 86392656}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 167695552}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 177540480}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 218584944}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 73609392}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 108531680}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 66404528}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 94315584}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 18061952}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 36596160}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 110801056}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 16613728}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 47067712}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 145683984}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 256660144}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 164049408}) = -1 ETIMEDOUT (Connection timed out)
futex(0x13dc400, FUTEX_WAIT_PRIVATE, 2, {0, 209718064}^C <unfinished ...>
Process 30447 detached

Generated at Thu Feb 08 03:14:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.