-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Critical - P2
-
None
-
Affects Version/s: 3.0.2
-
Component/s: Storage
-
None
env:
3-node RS on 3.0.2
sequence:
1) reconfig RS to have slaveDelay on both secondaries (say, 60 min)
2) insert a doc on the primary. Here is the oplog:
> db.oplog.rs.find().sort({$natural:1}) { "ts" : Timestamp(1443061365, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } { "ts" : Timestamp(1443064118, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "Reconfig set", "version" : 2 } } { "ts" : Timestamp(1443064145, 1), "h" : NumberLong("-5315938192568877039"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "o" : { "create" : "test" } } { "ts" : Timestamp(1443064145, 2), "h" : NumberLong("-1100597721925905403"), "v" : 2, "op" : "i", "ns" : "test.test", "o" : { "_id" : ObjectId("56036951c98eee1d6f729dae"), "x" : 1 } }
3) shut down all nodes, restart them as standalones, and manually alter the local.system.replset to remove slaveDelay
4) start both former secondaries back into the replica set - they will elect a new primary, and then insert more data, so the oplog will look like this:
{ "ts" : Timestamp(1443061365, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } { "ts" : Timestamp(1443065179, 1), "h" : NumberLong("-5315862695633747950"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "o" : { "create" : "test" } } { "ts" : Timestamp(1443065179, 2), "h" : NumberLong("-932390550458295111"), "v" : 2, "op" : "i", "ns" : "test.test", "o" : { "_id" : ObjectId("56036d5b8bfc46dee02d439f"), "x" : 2 } } { "ts" : Timestamp(1443065181, 1), "h" : NumberLong("2732461908951260525"), "v" : 2, "op" : "i", "ns" : "test.test", "o" : { "_id" : ObjectId("56036d5d8bfc46dee02d43a0"), "x" : 3 } }
5) start the former primary back and observe the ROLLBACK:
2015-09-24T11:33:46.852+0800 I REPL [ReplicationExecutor] New replica set config in use: { _id: "replset", version: 3, members: [ { _id: 0, host: "AD-MAC10G-2.local:27050", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "AD-MAC10G-2.local:27051", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "AD-MAC10G-2.local:27052", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatTimeoutSecs: 10, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } } 2015-09-24T11:33:46.852+0800 I REPL [ReplicationExecutor] This node is AD-MAC10G-2.local:27050 in the config 2015-09-24T11:33:46.853+0800 I REPL [ReplicationExecutor] transition to STARTUP2 2015-09-24T11:33:46.853+0800 I REPL [ReplicationExecutor] Starting replication applier threads 2015-09-24T11:33:46.854+0800 I REPL [ReplicationExecutor] transition to RECOVERING 2015-09-24T11:33:46.856+0800 I REPL [ReplicationExecutor] transition to SECONDARY 2015-09-24T11:33:46.857+0800 I REPL [ReplicationExecutor] Member AD-MAC10G-2.local:27051 is now in state PRIMARY 2015-09-24T11:33:46.857+0800 I REPL [ReplicationExecutor] Member AD-MAC10G-2.local:27052 is now in state SECONDARY 2015-09-24T11:33:49.866+0800 I REPL [ReplicationExecutor] syncing from: AD-MAC10G-2.local:27052 2015-09-24T11:33:49.870+0800 I REPL [SyncSourceFeedback] replset setting syncSourceFeedback to AD-MAC10G-2.local:27052 2015-09-24T11:33:49.870+0800 I REPL [rsBackgroundSync] replSet our last op time fetched: Sep 24 11:09:05:2 2015-09-24T11:33:49.871+0800 I REPL [rsBackgroundSync] replset source's GTE: Sep 24 11:26:19:1 2015-09-24T11:33:49.871+0800 I REPL [rsBackgroundSync] beginning rollback 2015-09-24T11:33:49.872+0800 I REPL [rsBackgroundSync] rollback 0 2015-09-24T11:33:49.873+0800 I REPL [ReplicationExecutor] transition to ROLLBACK 2015-09-24T11:33:49.876+0800 I REPL [rsBackgroundSync] rollback 1 2015-09-24T11:33:49.876+0800 I REPL [rsBackgroundSync] rollback 2 FindCommonPoint 2015-09-24T11:33:49.880+0800 I REPL [rsBackgroundSync] replSet info rollback our last optime: Sep 24 11:09:05:2 2015-09-24T11:33:49.881+0800 I REPL [rsBackgroundSync] replSet info rollback their last optime: Sep 24 11:26:21:1 2015-09-24T11:33:49.881+0800 I REPL [rsBackgroundSync] replSet info rollback diff in end of log times: -1036 seconds 2015-09-24T11:33:49.882+0800 I REPL [rsBackgroundSync] replSet rollback found matching events at Sep 24 10:22:45:1 2015-09-24T11:33:49.882+0800 I REPL [rsBackgroundSync] replSet rollback findcommonpoint scanned : 7 2015-09-24T11:33:49.882+0800 I REPL [rsBackgroundSync] replSet rollback 3 fixup 2015-09-24T11:33:49.883+0800 I REPL [rsBackgroundSync] rollback 3.5 2015-09-24T11:33:49.884+0800 I REPL [rsBackgroundSync] rollback 4 n:1 2015-09-24T11:33:49.885+0800 I REPL [rsBackgroundSync] replSet minvalid=Sep 24 11:26:21 56036d5d:1 2015-09-24T11:33:49.894+0800 I REPL [rsBackgroundSync] rollback 4.6 2015-09-24T11:33:49.895+0800 I REPL [rsBackgroundSync] replSet rollback drop: test.test 2015-09-24T11:33:49.898+0800 I REPL [rsBackgroundSync] rollback 4.7 2015-09-24T11:33:49.898+0800 E REPL [rsBackgroundSync] rollback cannot find object by id 2015-09-24T11:33:49.899+0800 I REPL [rsBackgroundSync] rollback 5 d:1 u:0 2015-09-24T11:33:49.900+0800 I REPL [rsBackgroundSync] rollback 6 2015-09-24T11:33:49.900+0800 I REPL [rsBackgroundSync] rollback done 2015-09-24T11:33:49.901+0800 I REPL [ReplicationExecutor] transition to RECOVERING 2015-09-24T11:33:49.902+0800 I REPL [rsBackgroundSync] rollback finished 2015-09-24T11:33:49.903+0800 I REPL [ReplicationExecutor] syncing from: AD-MAC10G-2.local:27052 2015-09-24T11:33:49.909+0800 I REPL [ReplicationExecutor] transition to SECONDARY
However, there is no "rollback" directory in the dbpath after the rollback:
├── db │ ├── journal │ │ ├── j._0 │ │ └── lsn │ ├── local.0 │ ├── local.1 │ ├── local.ns │ ├── mongod.lock │ ├── storage.bson │ ├── test.0 │ └── test.ns └── mongod.log
- duplicates
-
SERVER-18211 MongoDB fails to correctly roll back collection creation
- Closed
- is related to
-
SERVER-16841 rollback could exhaust all free file handles
- Closed
-
SERVER-15929 Verify rollback files are written without errors
- Closed