Type: Bug
Resolution: Done
Priority: Minor - P4
Affects Version/s: 2.0.1
Component/s: Replication
Environment: centos 5.5 x86_64

I'm using hostnames, not IPs, for my replica sets. I recently upgraded a secondary server in a replica set, and its underlying IP address changed. I updated the hosts files on all my boxes, but forgot to update the hosts file on the secondary itself, which still had an entry for its own hostname pointing to the old IP address. I added the secondary to the set before I caught my mistake, and it ended up crashing with this error:
Tue Dec 13 04:39:07 [rsMgr] replset msgReceivedNewConfig version: version: 13
Tue Dec 13 04:39:07 [rsMgr] replSet info saving a newer config version to local.system.replset
Tue Dec 13 04:39:07 [rsMgr] replSet saveConfigLocally done
Tue Dec 13 04:39:07 [rsMgr] self doesn't match: 3
Tue Dec 13 04:39:07 [rsMgr] Assertion failure false db/repl/rs.cpp 440
0x57eeb6 0x589d6b 0x7c214b 0x7c32f2 0x7c4080 0x7f5ec5 0x5939f3 0x591d25 0x591383 0x578d0f 0x57adc4 0xaa4560 0x2aaaaacce617 0x2aaaab748c2d
mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57eeb6]
mongod(_ZN5mongo8assertedEPKcS1_j+0xfb) [0x589d6b]
mongod(_ZN5mongo11ReplSetImpl14initFromConfigERNS_13ReplSetConfigEb+0xadb) [0x7c214b]
mongod(_ZN5mongo7ReplSet13haveNewConfigERNS_13ReplSetConfigEb+0xd2) [0x7c32f2]
mongod(_ZN5mongo7Manager20msgReceivedNewConfigENS_7BSONObjE+0x2e0) [0x7c4080]
mongod(_ZN5boost6detail8function26void_function_obj_invoker0INS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo7ManagerENS7_7BSONObjEEENS3_5list2INS3_5valueIPS8_EENSC_IS9_EEEEEEvE6invokeERNS1_15function_bufferE+0x65) [0x7f5ec5]
mongod(_ZNK5boost9function0IvEclEv+0x243) [0x5939f3]
mongod(_ZN5mongo4task6Server6doWorkEv+0x225) [0x591d25]
mongod(_ZN5mongo4task4Task3runEv+0x33) [0x591383]
mongod(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0xbf) [0x578d0f]
mongod(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74) [0x57adc4]
mongod(thread_proxy+0x80) [0xaa4560]
/lib64/libpthread.so.0 [0x2aaaaacce617]
/lib64/libc.so.6(clone+0x6d) [0x2aaaab748c2d]
Tue Dec 13 04:39:07 [rsMgr] replSet error unexpected exception in haveNewConfig() : 0 assertion db/repl/rs.cpp:440
Tue Dec 13 04:39:07 [rsMgr] replSet error fatal, stopping replication
— (repeats)
When I noticed the hosts file error, I corrected it. I then attempted to restart mongod with the --repair flag, but got repeated entries of this in the log:
—
Tue Dec 13 04:44:37 [initandlisten] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: local.oplog.rs top: { opid: 8, active: true, waitingForLock: false, secs_running: 0, op: "getmore", ns: "local.oplog.rs", query: {}, client: "0.0.0.0:0", desc: "initandlisten", threadId: "0x2aaaab9cce00", numYields: 0 }
— (repeats)
I killed the mongod process again and restarted it without the flag. This time it recovered and was successfully added back to the set.
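
For anyone wanting to catch this kind of mistake before adding a member, here is a rough pre-flight check that could be run on the member itself. It is only a sketch under assumptions: it is not how mongod decides which config entry is "self", the function names are made up for illustration, and `db2.example.com` is a hypothetical hostname. It resolves the hostname using the local resolver (which consults the hosts file) and then checks whether any resulting address can be bound on this machine. A stale entry pointing at an old IP would normally fail that check.

```python
import socket


def is_local_address(ip):
    """Return True if `ip` is assigned to this machine, tested by binding to it."""
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as s:
            # Port 0 asks the OS for any free port; the bind only succeeds
            # if the address belongs to a local interface.
            s.bind((ip, 0))
        return True
    except OSError:
        return False


def hostname_points_to_self(hostname):
    """Return True if `hostname` resolves to at least one local address."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The name does not resolve at all on this machine.
        return False
    addresses = {info[4][0] for info in infos}
    return any(is_local_address(ip) for ip in addresses)


if __name__ == "__main__":
    # Hypothetical member hostname; replace with the name used in the set config.
    name = "db2.example.com"
    if hostname_points_to_self(name):
        print(name, "resolves to a local address")
    else:
        print(name, "does NOT resolve to a local address; check the hosts file")
```

Running this on the secondary with the hostname listed in the replica set config, before adding the member, should flag a hosts entry that still points to an address the machine no longer owns.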