Details
-
Bug
-
Resolution: Won't Do
-
Major - P3
-
None
-
None
Description
I am following the instructions for resizing the oplog (https://docs.mongodb.com/manual/tutorial/change-oplog-size/).
However, those instructions do not account for the fact that a rollback error can occur when you resize the oplog on the primary. Instead of copying just the last saved entry, shouldn't we be copying the entire previous oplog?
Once we hit this error, the only way out appears to be to resync the cluster. The log from the affected node follows.
2017-04-22T06:02:26.334+0000 I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: -1, timestamp: Apr 22 05:46:50:e). source's GTE: (term: 832, timestamp: Apr 22 05:47:01:2) hashes: (-8212210602019233945/-4988990565338983357)
2017-04-22T06:02:26.334+0000 I REPL [rsBackgroundSync] beginning rollback
2017-04-22T06:02:26.334+0000 I REPL [rsBackgroundSync] rollback 0
2017-04-22T06:02:26.334+0000 I REPL [ReplicationExecutor] transition to ROLLBACK
2017-04-22T06:02:26.334+0000 I NETWORK [conn2] end connection 54.215.75.22:45878 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn4] end connection 54.215.75.22:45880 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn5] end connection 54.219.33.176:46482 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn7] end connection 10.33.131.5:43168 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn10] end connection 54.170.243.118:43286 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn12] end connection 10.123.166.151:40596 (8 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn6] end connection 10.30.222.199:45690 (10 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn1] end connection 54.170.243.118:43284 (9 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn8] end connection 54.151.49.85:42197 (8 connections now open)
2017-04-22T06:02:26.334+0000 I NETWORK [conn3] end connection 10.164.16.253:42084 (10 connections now open)
2017-04-22T06:02:26.334+0000 I REPL [rsBackgroundSync] rollback 1
2017-04-22T06:02:26.334+0000 I NETWORK [conn9] end connection 52.44.59.57:53018 (8 connections now open)
2017-04-22T06:02:26.362+0000 I NETWORK [initandlisten] connection accepted from 54.78.134.204:57990 #13 (1 connection now open)
2017-04-22T06:02:26.707+0000 I ACCESS [conn13] Successfully authenticated as principal __system on local
2017-04-22T06:02:26.862+0000 I REPL [rsBackgroundSync] rollback 2 FindCommonPoint
2017-04-22T06:02:26.927+0000 I REPL [rsBackgroundSync] rollback our last optime: Apr 22 05:46:50:e
2017-04-22T06:02:26.927+0000 I REPL [rsBackgroundSync] rollback their last optime: Apr 22 06:02:26:4
2017-04-22T06:02:26.927+0000 I REPL [rsBackgroundSync] rollback diff in end of log times: -936 seconds
2017-04-22T06:02:27.514+0000 I NETWORK [initandlisten] connection accepted from 54.215.75.22:45882 #14 (2 connections now open)
2017-04-22T06:02:27.574+0000 I NETWORK [initandlisten] connection accepted from 54.170.243.118:43288 #15 (3 connections now open)
2017-04-22T06:02:27.837+0000 I ACCESS [conn14] Successfully authenticated as principal __system on local
2017-04-22T06:02:27.899+0000 W REPL [rsBackgroundSync] ignoring op on rollback no ns TODO : { _id: ObjectId('58faefc2517306110e6961a5'), ts: Timestamp 1492840010000|14, h: -8212210602019233945 }
2017-04-22T06:02:27.900+0000 F REPL [rsBackgroundSync] rollback error RS101 reached beginning of local oplog
2017-04-22T06:02:27.900+0000 I REPL [rsBackgroundSync] scanned: 4551
2017-04-22T06:02:27.900+0000 I REPL [rsBackgroundSync] theirTime: Apr 22 05:46:50 58faee4a:d
2017-04-22T06:02:27.900+0000 I REPL [rsBackgroundSync] ourTime: Apr 22 05:46:50 58faee4a:e
2017-04-22T06:02:27.900+0000 E REPL [rsBackgroundSync] NoMatchingDocument: RS101 reached beginning of local oplog [2]
2017-04-22T06:02:27.900+0000 I REPL [rsBackgroundSync] rollback finished
2017-04-22T06:02:27.900+0000 I - [rsBackgroundSync] Fatal assertion 28723 UnrecoverableRollbackError: need to rollback, but unable to determine common point between local and remote oplog: NoMatchingDocument: RS101 reached beginning of local oplog [2] @ 18752
2017-04-22T06:02:27.900+0000 I - [rsBackgroundSync]
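To illustrate why a one-entry oplog can make the common-point search fail, here is a minimal Python sketch. It is a simplified model written for this ticket, not MongoDB's actual rollback code: the names `OplogEntry`, `NoCommonPoint`, and `find_common_point` are hypothetical, and the hash of the remote `58faee4a:d` entry is a made-up placeholder because the log does not show it. The timestamps and the other two hashes are taken from the log above (`58faee4a` hex is 1492840010 seconds; `:d` and `:e` are increments 13 and 14).

```python
from typing import List, NamedTuple, Tuple


class OplogEntry(NamedTuple):
    ts: Tuple[int, int]  # (seconds since epoch, increment)
    h: int               # per-entry hash


class NoCommonPoint(Exception):
    """Raised when one oplog is exhausted before a shared entry is found."""


def find_common_point(local: List[OplogEntry],
                      remote: List[OplogEntry]) -> OplogEntry:
    """Walk both oplogs (ordered oldest to newest) backward from the newest
    entry until an entry with the same timestamp and hash is found."""
    i, j = len(local) - 1, len(remote) - 1
    while True:
        if i < 0:
            raise NoCommonPoint("reached beginning of local oplog")
        if j < 0:
            raise NoCommonPoint("reached beginning of remote oplog")
        ours, theirs = local[i], remote[j]
        if ours.ts == theirs.ts:
            if ours.h == theirs.h:
                return ours
            # Same timestamp, different history: step back on both sides.
            i -= 1
            j -= 1
        elif ours.ts > theirs.ts:
            i -= 1  # our entry is newer, look at an older local entry
        else:
            j -= 1  # their entry is newer, look at an older remote entry


HYPOTHETICAL_HASH = 111  # placeholder, not shown in the log

# Sync source: has 05:46:50:d and then 05:47:01:2, but never saw 05:46:50:e.
remote = [
    OplogEntry((1492840010, 13), HYPOTHETICAL_HASH),
    OplogEntry((1492840021, 2), -4988990565338983357),
]

# Resized oplog seeded with only the last saved entry (05:46:50:e).
local_resized = [OplogEntry((1492840010, 14), -8212210602019233945)]

# Assumed shape of the same node's oplog had the earlier entry been kept.
local_full = [
    OplogEntry((1492840010, 13), HYPOTHETICAL_HASH),
    OplogEntry((1492840010, 14), -8212210602019233945),
]

try:
    find_common_point(local_resized, remote)
except NoCommonPoint as exc:
    print("resized oplog:", exc)

print("full oplog:", find_common_point(local_full, remote).ts)
```

Under these assumptions, the single seeded entry is one the sync source never received, so the backward walk runs off the start of the local oplog, which matches the `theirTime`/`ourTime` lines in the log. With the earlier entry still present, the same walk stops at `(1492840010, 13)`.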