-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.5.2
-
Component/s: Replication
-
ALL
I removed a node from replica set, restarted mongod on a different port and added it back in with new host:port.
When coming back into the set, I noticed in the logs:
$ mongod --port 37020 --dbpath /data/rs3 --replSet asyaS17 --oplogSize 100 2013-09-21T11:21:30.780-0700 [initandlisten] MongoDB starting : pid=44500 port=37020 dbpath=/data/rs3 64-bit host=asyasmacbook.local 2013-09-21T11:21:30.780-0700 [initandlisten] 2013-09-21T11:21:30.780-0700 [initandlisten] ** NOTE: This is a development version (2.5.2) of MongoDB. 2013-09-21T11:21:30.780-0700 [initandlisten] ** Not recommended for production. 2013-09-21T11:21:30.780-0700 [initandlisten] 2013-09-21T11:21:30.780-0700 [initandlisten] ** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000 2013-09-21T11:21:30.780-0700 [initandlisten] 2013-09-21T11:21:30.780-0700 [initandlisten] db version v2.5.2 2013-09-21T11:21:30.780-0700 [initandlisten] git version: b283169ae51328f3fb5e3c8a3c9017a2e42c148f 2013-09-21T11:21:30.780-0700 [initandlisten] build info: Darwin bs-osx-106-x86-64-2.10gen.cc 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:32:41 PDT 2011; root:xnu-1504.15.3~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_49 2013-09-21T11:21:30.780-0700 [initandlisten] allocator: system 2013-09-21T11:21:30.780-0700 [initandlisten] options: { dbpath: "/data/rs3", oplogSize: 100, port: 37020, replSet: "asyaS17" } 2013-09-21T11:21:30.781-0700 [initandlisten] journal dir=/data/rs3/journal 2013-09-21T11:21:30.783-0700 [initandlisten] recover : no journal files present, no recovery needed 2013-09-21T11:21:30.824-0700 [initandlisten] waiting for connections on port 37020 2013-09-21T11:21:30.826-0700 [IndexRebuilder] assertion 13435 not master and slaveOk=false ns:cs.system.namespaces query:{} 2013-09-21T11:21:30.826-0700 [IndexRebuilder] warning: index rebuilding did not complete 2013-09-21T11:21:30.826-0700 [rsStart] replSet REMOVED 2013-09-21T11:21:30.826-0700 [rsStart] replSet info self not present in the repl set configuration: 2013-09-21T11:21:30.826-0700 [rsStart] { _id: "asyaS17", version: 7, members: [ { _id: 0, host: "asyasmacbook.local:37017", priority: 2.0 }, { _id: 1, host: "asyasmacbook.local:37018" } ] } 2013-09-21T11:21:30.826-0700 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again. 2013-09-21T11:21:50.827-0700 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again. 2013-09-21T11:21:50.899-0700 [initandlisten] connection accepted from 127.0.0.1:59868 #1 (1 connection now open) 2013-09-21T11:21:51.960-0700 [initandlisten] connection accepted from 127.0.0.1:59870 #2 (2 connections now open) 2013-09-21T11:22:09.971-0700 [conn2] end connection 127.0.0.1:59870 (1 connection now open) 2013-09-21T11:22:09.971-0700 [initandlisten] connection accepted from 127.0.0.1:59876 #3 (2 connections now open) 2013-09-21T11:22:10.827-0700 [rsStart] trying to contact asyasmacbook.local:37017 2013-09-21T11:22:10.829-0700 [rsStart] trying to contact asyasmacbook.local:37018 2013-09-21T11:22:10.830-0700 [rsStart] replSet I am asyasmacbook.local:37020 2013-09-21T11:22:10.830-0700 [rsStart] replSet got config version 8 from a remote, saving locally 2013-09-21T11:22:10.830-0700 [rsStart] replSet info saving a newer config version to local.system.replset 2013-09-21T11:22:10.836-0700 [rsStart] replSet saveConfigLocally done 2013-09-21T11:22:10.836-0700 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again. 2013-09-21T11:22:10.918-0700 [conn1] end connection 127.0.0.1:59868 (1 connection now open) 2013-09-21T11:22:10.918-0700 [initandlisten] connection accepted from 127.0.0.1:59879 #4 (2 connections now open) 2013-09-21T11:22:30.833-0700 [clientcursormon] mem (MB) res:36 virt:3088 2013-09-21T11:22:30.833-0700 [clientcursormon] mapped (incl journal view):576 2013-09-21T11:22:30.833-0700 [clientcursormon] connections:2 2013-09-21T11:22:30.837-0700 [rsStart] trying to contact asyasmacbook.local:37017 2013-09-21T11:22:30.838-0700 [rsStart] trying to contact asyasmacbook.local:37018 2013-09-21T11:22:30.838-0700 [rsStart] replSet I am asyasmacbook.local:37020 2013-09-21T11:22:30.839-0700 [rsStart] replSet STARTUP2 2013-09-21T11:22:32.840-0700 [rsHealthPoll] replSet member asyasmacbook.local:37017 is up 2013-09-21T11:22:32.840-0700 [rsHealthPoll] replSet member asyasmacbook.local:37017 is now in state PRIMARY 2013-09-21T11:22:32.841-0700 [rsHealthPoll] replSet member asyasmacbook.local:37018 is up 2013-09-21T11:22:32.841-0700 [rsHealthPoll] replSet member asyasmacbook.local:37018 is now in state SECONDARY
It seemed like it got the config and applied it successfully and yet logged that it couldn't and slept for another 20 seconds.
Looking at the code (I looked for 2.5.2 to make sure it matched my executables) I see this:
https://github.com/mongodb/mongo/blob/r2.5.2/src/mongo/db/repl/rs.cpp#L793
so false was returned from _loadConfigFinish and the only place I see that is https://github.com/mongodb/mongo/blob/r2.5.2/src/mongo/db/repl/rs.cpp#L701 and the messages at 11:22:10 clearly indicate that we're in saveConfigLocally https://github.com/mongodb/mongo/blob/r2.5.2/src/mongo/db/repl/rs.cpp#L705 after which only true is returned.
Not sure if I'm misreading something, but it seems strange that the logs indicate first successful save of new version of config and then a warning that we "Couldn't load config yet".
- is related to
-
SERVER-15770 Fassert after removing and re-adding node from replset
- Closed