Details
-
Bug
-
Resolution: Done
-
Major - P3
-
None
-
2.2.0
-
None
-
None
-
$ uname -a
Linux mongo02 2.6.38-8-virtual #42-Ubuntu SMP Mon Apr 11 04:06:34 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
-
ALL
Description
I was running a single MongoDB server and wanted to convert it to a replica set to improve durability.
I created a secondary and an arbiter,
then restarted the master in replSet mode.
> rs.initiate()
OK
> rs.reconfig(
)
OK
Wait...
The nodes start syncing.
Then, out of nowhere, the master node logs:
0xade6e1 0x5582d9 0x558862 0x7f3b3ce47c60 0x7f3b3c185bf6 0x580ee5 0x94e796 0x94227e 0x6b26b9 0xb5ba7d 0xb5d052 0x56fa52 0x5dbf11 0x7f3b3ce3ed8c 0x7f3b3c1e104d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9]
/usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x558862]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfc60) [0x7f3b3ce47c60]
/lib/x86_64-linux-gnu/libc.so.6(memcpy+0x296) [0x7f3b3c185bf6]
/usr/bin/mongod(_ZNK5mongo7BSONObj4copyEv+0x45) [0x580ee5]
/usr/bin/mongod(_ZN5mongo11ParsedQuery4initERKNS_7BSONObjE+0x516) [0x94e796]
/usr/bin/mongod(_ZN5mongo11ParsedQueryC1ERNS_12QueryMessageE+0x9e) [0x94227e]
/usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x39) [0x6b26b9]
/usr/bin/mongod() [0xb5ba7d]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0xb5d052]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x82) [0x56fa52]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0x5dbf11]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c) [0x7f3b3ce3ed8c]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f3b3c1e104d]
This happened a few times during the initial sync. The sync isn't finished; it's still running.
On another set of MongoDB servers I did exactly the same thing, and there it went smoothly: the replica set was created successfully.
I'm still running the faulting server in the hope that it will finish replicating to the secondary (it will take a few hours), but the crash itself is bad enough to report.
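
In case it helps with triage, here is a minimal Python sketch that splits frame lines like the ones in the trace above into module, mangled symbol, offset, and address. The names `FRAME_RE` and `parse_frame` are made up for this illustration, and the pattern only assumes the `module(symbol+0xOFFSET) [0xADDRESS]` layout seen in this report. It is not part of MongoDB or any other tool.

```python
import re

# Hypothetical helper for illustration only. It matches frame lines such as:
#   /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9]
#   /lib/x86_64-linux-gnu/libpthread.so.0(+0xfc60) [0x7f3b3ce47c60]
#   /usr/bin/mongod() [0xb5ba7d]
FRAME_RE = re.compile(
    r"^(?P<module>\S+)"                     # binary or shared library path
    r"\((?P<symbol>[^+()]*)"                # mangled symbol, may be empty
    r"(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)"  # optional offset into the symbol
    r"\s+\[(?P<address>0x[0-9a-fA-F]+)\]$"  # absolute return address
)


def parse_frame(line):
    """Return a dict describing one frame, or None if the line is not a frame."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "module": match.group("module"),
        "symbol": match.group("symbol") or None,  # None when no symbol is printed
        "offset": int(offset, 16) if offset else None,
        "address": int(match.group("address"), 16),
    }
```

Calling `parse_frame` on the `BSONObj::copy` frame would give the module `/usr/bin/mongod`, the mangled symbol `_ZNK5mongo7BSONObj4copyEv`, and the offset `0x45`. The mangled symbols could then be passed through a demangler such as `c++filt` to get readable C++ names.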