Hi,
I hit the exact same problem that Linar did with MongoDB 3.0.8. I have a five-node, geographically distributed replica set: two nodes in Dallas (Texas), two nodes in Toronto (Canada), and an arbiter (also in Toronto, though we pretend it is offsite). The workload was a stress test against one database and one collection. Our test driver runs many writers that insert and remove documents (about 1 MB each) and a number of readers that count those documents. There are no indexes apart from the implicit one on the _id field.
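For context, here is a rough sketch of what one writer/reader pair in the driver does. This is illustrative only (the database and collection names are made up, and the real driver is a multi-threaded client rather than a shell script):

    // Hypothetical sketch of the stress workload, in the mongo shell.
    var payload = new Array(1024 * 1024).join("x");   // roughly 1 MB of filler
    var coll = db.getSiblingDB("stress").docs;        // illustrative names
    for (var i = 0; i < 100000; i++) {
        coll.insert({ seq: i, payload: payload });    // writer: ~1 MB insert
        if (i % 10 === 0) {
            coll.remove({ seq: { $lt: i - 50 } });    // writer: trim older documents
        }
    }
    // Readers simply count the documents:
    print(db.getSiblingDB("stress").docs.count());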
We wanted to see how failover behaves when all nodes are stressed (load averages > 1). Once the workload had saturated the primary and the secondaries, we pulled the plug on both nodes in Toronto. Within 10 seconds, one of the Dallas nodes was elected primary. We kept the Toronto nodes offline for a few minutes and then brought them back. Both transitioned to the ROLLBACK state and stayed there indefinitely (over an hour), with essentially zero activity on those boxes. After inspecting the rollback directory under the dbPath, we found only about three .bson files, and according to the replica set status those nodes were around 800 seconds behind. We could not find any messages like "replSet too much data to roll back" in the mongod logs on either Toronto node.
While in this ROLLBACK state, the mongod service would not shut down cleanly. After forcing it to die and bringing it back up, the following entries appear in the log:
2016-01-27T15:26:50.279-0600 I CONTROL ***** SERVER RESTARTED *****
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] MongoDB starting : pid=32235 port=27017 dbpath=/var/lib/mongodb 64-bit host=tor02
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] db version v3.0.8
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] git version: 83d8cc25e00e42856924d84e220fbe4a839e605d
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] build info: Linux ip-10-187-89-126 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] allocator: tcmalloc
2016-01-27T15:26:50.285-0600 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { port: 27017 }, replication: { replSetName: "wa-hadr" }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: true }, mmapv1: { smallFiles: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2016-01-27T15:26:50.309-0600 W - [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2016-01-27T15:26:50.333-0600 I JOURNAL [initandlisten] journal dir=/var/lib/mongodb/journal
2016-01-27T15:26:50.333-0600 I JOURNAL [initandlisten] recover begin
2016-01-27T15:26:50.333-0600 I JOURNAL [initandlisten] info no lsn file in journal/ directory
2016-01-27T15:26:50.333-0600 I JOURNAL [initandlisten] recover lsn: 0
2016-01-27T15:26:50.333-0600 I JOURNAL [initandlisten] recover /var/lib/mongodb/journal/j._0
2016-01-27T15:26:50.343-0600 I JOURNAL [initandlisten] recover cleaning up
2016-01-27T15:26:50.343-0600 I JOURNAL [initandlisten] removeJournalFiles
2016-01-27T15:26:50.344-0600 I JOURNAL [initandlisten] recover done
2016-01-27T15:26:50.467-0600 I JOURNAL [durability] Durability thread started
2016-01-27T15:26:50.467-0600 I JOURNAL [journal writer] Journal writer thread started
2016-01-27T15:26:50.699-0600 I NETWORK [initandlisten] waiting for connections on port 27017
2016-01-27T15:26:50.838-0600 I REPL [ReplicationExecutor] New replica set config in use: { _id: "wa-hadr", version: 10, members: [ { _id: 1, host: "tor02.mongohadr.ba.ibm.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 2.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "tor01.mongohadr.ba.ibm.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "dal01.mongohadr.ba.ibm.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 5, host: "dal02.mongohadr.ba.ibm.local:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 6, host: "tor03.mongohadr.ba.ibm.local:27017", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatTimeoutSecs: 10, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-01-27T15:26:50.838-0600 I REPL [ReplicationExecutor] This node is tor02.mongohadr.ba.ibm.local:27017 in the config
2016-01-27T15:26:50.838-0600 I REPL [ReplicationExecutor] transition to STARTUP2
2016-01-27T15:26:50.838-0600 I REPL [ReplicationExecutor] Starting replication applier threads
2016-01-27T15:26:50.840-0600 I REPL [ReplicationExecutor] transition to RECOVERING
2016-01-27T15:26:50.840-0600 I REPL [ReplicationExecutor] Member tor01.mongohadr.ba.ibm.local:27017 is now in state PRIMARY
2016-01-27T15:26:50.841-0600 I REPL [ReplicationExecutor] Member tor03.mongohadr.ba.ibm.local:27017 is now in state ARBITER
2016-01-27T15:26:50.842-0600 I REPL [ReplicationExecutor] transition to SECONDARY
2016-01-27T15:26:50.887-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.12:35141 #1 (1 connection now open)
2016-01-27T15:26:50.903-0600 I REPL [ReplicationExecutor] Member dal01.mongohadr.ba.ibm.local:27017 is now in state SECONDARY
2016-01-27T15:26:50.904-0600 I REPL [ReplicationExecutor] Member dal02.mongohadr.ba.ibm.local:27017 is now in state SECONDARY
2016-01-27T15:26:51.289-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.12:38122 #2 (2 connections now open)
2016-01-27T15:26:52.279-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.6:40460 #3 (3 connections now open)
2016-01-27T15:26:52.666-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.32:40157 #4 (4 connections now open)
2016-01-27T15:26:53.332-0600 I NETWORK [initandlisten] connection accepted from 10.1.232.114:55462 #5 (5 connections now open)
2016-01-27T15:26:53.840-0600 I REPL [ReplicationExecutor] syncing from: tor01.mongohadr.ba.ibm.local:27017
2016-01-27T15:26:53.944-0600 W REPL [rsBackgroundSync] we are too stale to use tor01.mongohadr.ba.ibm.local:27017 as a sync source
2016-01-27T15:26:53.944-0600 I REPL [ReplicationExecutor] syncing from: dal01.mongohadr.ba.ibm.local:27017
2016-01-27T15:26:54.014-0600 I REPL [SyncSourceFeedback] replset setting syncSourceFeedback to dal01.mongohadr.ba.ibm.local:27017
2016-01-27T15:27:11.472-0600 I REPL [rsBackgroundSync] replSet our last op time fetched: Jan 27 14:26:25:1b2
2016-01-27T15:27:11.472-0600 I REPL [rsBackgroundSync] replset source's GTE: Jan 27 14:27:09:1
2016-01-27T15:27:11.472-0600 I REPL [rsBackgroundSync] beginning rollback
2016-01-27T15:27:11.472-0600 I REPL [rsBackgroundSync] rollback 0
2016-01-27T15:27:11.472-0600 I REPL [ReplicationExecutor] transition to ROLLBACK
2016-01-27T15:27:11.472-0600 I REPL [rsBackgroundSync] rollback 1
2016-01-27T15:27:11.473-0600 I REPL [rsBackgroundSync] rollback 2 FindCommonPoint
2016-01-27T15:27:11.542-0600 I REPL [rsBackgroundSync] replSet info rollback our last optime: Jan 27 14:26:25:1b2
2016-01-27T15:27:11.542-0600 I REPL [rsBackgroundSync] replSet info rollback their last optime: Jan 27 14:39:57:ba
2016-01-27T15:27:11.542-0600 I REPL [rsBackgroundSync] replSet info rollback diff in end of log times: -812 seconds
2016-01-27T15:27:20.896-0600 I NETWORK [conn1] end connection 158.85.123.12:35141 (4 connections now open)
2016-01-27T15:27:20.896-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.12:35145 #6 (5 connections now open)
2016-01-27T15:27:21.774-0600 I NETWORK [conn2] end connection 10.41.72.12:38122 (4 connections now open)
2016-01-27T15:27:21.810-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.12:38126 #7 (5 connections now open)
2016-01-27T15:27:22.289-0600 I NETWORK [conn3] end connection 158.85.123.6:40460 (4 connections now open)
2016-01-27T15:27:22.290-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.6:40464 #8 (5 connections now open)
2016-01-27T15:27:23.163-0600 I NETWORK [conn4] end connection 10.41.72.32:40157 (4 connections now open)
2016-01-27T15:27:23.195-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.32:40161 #9 (5 connections now open)
2016-01-27T15:27:50.908-0600 I NETWORK [conn6] end connection 158.85.123.12:35145 (4 connections now open)
2016-01-27T15:27:50.909-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.12:35149 #10 (5 connections now open)
2016-01-27T15:27:52.302-0600 I NETWORK [conn8] end connection 158.85.123.6:40464 (4 connections now open)
2016-01-27T15:27:52.328-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.6:40468 #11 (5 connections now open)
2016-01-27T15:27:52.354-0600 I NETWORK [conn7] end connection 10.41.72.12:38126 (4 connections now open)
2016-01-27T15:27:52.387-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.12:38131 #12 (5 connections now open)
2016-01-27T15:27:53.699-0600 I NETWORK [conn9] end connection 10.41.72.32:40161 (4 connections now open)
2016-01-27T15:27:53.731-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.32:40165 #13 (5 connections now open)
2016-01-27T15:28:20.917-0600 I NETWORK [conn10] end connection 158.85.123.12:35149 (4 connections now open)
2016-01-27T15:28:20.918-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.12:35153 #14 (5 connections now open)
2016-01-27T15:28:22.335-0600 I NETWORK [conn11] end connection 158.85.123.6:40468 (4 connections now open)
2016-01-27T15:28:22.336-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.6:40472 #15 (5 connections now open)
2016-01-27T15:28:22.875-0600 I NETWORK [conn12] end connection 10.41.72.12:38131 (4 connections now open)
2016-01-27T15:28:22.907-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.12:38135 #16 (5 connections now open)
2016-01-27T15:28:24.236-0600 I NETWORK [conn13] end connection 10.41.72.32:40165 (4 connections now open)
2016-01-27T15:28:24.268-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.32:40169 #17 (5 connections now open)
2016-01-27T15:28:50.928-0600 I NETWORK [conn14] end connection 158.85.123.12:35153 (4 connections now open)
2016-01-27T15:28:50.928-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.12:35157 #18 (5 connections now open)
2016-01-27T15:28:52.352-0600 I NETWORK [conn15] end connection 158.85.123.6:40472 (4 connections now open)
2016-01-27T15:28:52.355-0600 I NETWORK [initandlisten] connection accepted from 158.85.123.6:40476 #19 (5 connections now open)
2016-01-27T15:28:53.387-0600 I NETWORK [conn16] end connection 10.41.72.12:38135 (4 connections now open)
2016-01-27T15:28:53.419-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.12:38139 #20 (5 connections now open)
2016-01-27T15:28:54.768-0600 I NETWORK [conn17] end connection 10.41.72.32:40169 (4 connections now open)
2016-01-27T15:28:54.801-0600 I NETWORK [initandlisten] connection accepted from 10.41.72.32:40173 #21 (5 connections now open)
2016-01-27T15:28:57.829-0600 I REPL [rsBackgroundSync] replSet rollback found matching events at Jan 27 14:23:51:8
2016-01-27T15:28:57.829-0600 I REPL [rsBackgroundSync] replSet rollback findcommonpoint scanned : 790428
2016-01-27T15:28:57.932-0600 I REPL [rsBackgroundSync] replSet rollback 3 fixup
As you can see, it is once again stuck after emitting "replSet rollback 3 fixup".
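In case it helps with diagnosis: the shell can still connect to the wedged node, so one generic way to check whether the rollback is doing anything at all is to dump the active in-progress operations. This is a sketch, not something from our original test procedure:

    // Generic sketch: list active operations on the stuck node to see
    // whether the rollback thread is making any visible progress.
    db.currentOp(true).inprog.forEach(function (op) {
        if (op.active) {
            printjson({ opid: op.opid, op: op.op, ns: op.ns,
                        secs_running: op.secs_running });
        }
    });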
Below is the rs.status() output from the node stuck in ROLLBACK:
{
    "set" : "wa-hadr",
    "date" : ISODate("2016-01-27T21:37:41.119Z"),
    "myState" : 9,
    "syncingTo" : "dal01.mongohadr.ba.ibm.local:27017",
    "members" : [
        {
            "_id" : 1,
            "name" : "tor02.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 9,
            "stateStr" : "ROLLBACK",
            "uptime" : 651,
            "optime" : Timestamp(1453926385, 434),
            "optimeDate" : ISODate("2016-01-27T20:26:25Z"),
            "syncingTo" : "dal01.mongohadr.ba.ibm.local:27017",
            "configVersion" : 10,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "tor01.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 650,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:37:39.297Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:37:39.297Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1453928225, 1),
            "electionDate" : ISODate("2016-01-27T20:57:05Z"),
            "configVersion" : 10
        },
        {
            "_id" : 3,
            "name" : "dal01.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 650,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:37:40.541Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:37:40.997Z"),
            "pingMs" : 33,
            "lastHeartbeatMessage" : "could not find member to sync from",
            "configVersion" : 10
        },
        {
            "_id" : 5,
            "name" : "dal02.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 650,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:37:40.570Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:37:40.096Z"),
            "pingMs" : 38,
            "lastHeartbeatMessage" : "could not find member to sync from",
            "configVersion" : 10
        },
        {
            "_id" : 6,
            "name" : "tor03.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 650,
            "lastHeartbeat" : ISODate("2016-01-27T21:37:39.297Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:37:40.786Z"),
            "pingMs" : 0,
            "configVersion" : 10
        }
    ],
    "ok" : 1
}
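Note that the optimes above are consistent with the "-812 seconds" figure from the rollback log. The leading component of an optime Timestamp is Unix seconds, so the gap works out in the shell as:

    // Optimes copied from the rs.status() output above.
    var ours   = Timestamp(1453926385, 434);  // tor02, stuck in ROLLBACK
    var theirs = Timestamp(1453927197, 186);  // tor01, the current PRIMARY
    print(theirs.t - ours.t);                 // 812, matching the log line
                                              // "diff in end of log times: -812 seconds"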
And here is the rs.status() output from the arbiter (which never went down during our tests); from its point of view every member is healthy (health: 1), including the one stuck in ROLLBACK:
{
    "set" : "wa-hadr",
    "date" : ISODate("2016-01-27T21:39:44.310Z"),
    "myState" : 7,
    "members" : [
        {
            "_id" : 1,
            "name" : "tor02.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 9,
            "stateStr" : "ROLLBACK",
            "uptime" : 775,
            "optime" : Timestamp(1453926385, 434),
            "optimeDate" : ISODate("2016-01-27T20:26:25Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:39:43.386Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:39:43.872Z"),
            "pingMs" : 0,
            "syncingTo" : "dal01.mongohadr.ba.ibm.local:27017",
            "configVersion" : 10
        },
        {
            "_id" : 2,
            "name" : "tor01.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 2633,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:39:43.406Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:39:43.400Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1453928225, 1),
            "electionDate" : ISODate("2016-01-27T20:57:05Z"),
            "configVersion" : 10
        },
        {
            "_id" : 3,
            "name" : "dal01.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 178034,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:39:42.705Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:39:43.733Z"),
            "pingMs" : 36,
            "configVersion" : 10
        },
        {
            "_id" : 5,
            "name" : "dal02.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 178034,
            "optime" : Timestamp(1453927197, 186),
            "optimeDate" : ISODate("2016-01-27T20:39:57Z"),
            "lastHeartbeat" : ISODate("2016-01-27T21:39:42.762Z"),
            "lastHeartbeatRecv" : ISODate("2016-01-27T21:39:43.420Z"),
            "pingMs" : 35,
            "configVersion" : 10
        },
        {
            "_id" : 6,
            "name" : "tor03.mongohadr.ba.ibm.local:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 178075,
            "configVersion" : 10,
            "self" : true
        }
    ],
    "ok" : 1
}