[SERVER-21693] Log a rollback summary Created: 30/Nov/15  Updated: 18/Feb/20  Resolved: 29/Mar/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Improvement Priority: Major - P3
Reporter: James Wahlin Assignee: Kyle Suarez
Resolution: Done Votes: 0
Labels: rollback-non-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-30501 Create rollback logging component Closed
Documented
is documented by DOCS-12388 Update rollback file examples Closed
Related
related to SERVER-33986 Log stable checkpoint when restarting... Closed
is related to SERVER-29051 RollbackImpl should create rollback d... Closed
is related to SERVER-31062 Add more detailed diagnostics for rol... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2018-03-26, Repl 2018-04-09
Participants:
Linked BF Score: 60

 Description   

When a rollback is performed we should provide a summary statement in the mongoD log which provides:

  1. Success/failure of the rollback
  2. A clear summary of the actions performed
  3. The name and location of any rollback files written
  4. A unique tag/ID that could be used for alert purposes

The following is an example of what we print in 3.2.0-rc4. This example is rolling back 2 documents that this member contained and the primary did not. There is a summary of the actions taken ("rollback 5 d:2 u:0") but it is not in a form most users will understand.

data/replset/rs2/mongod.log:2015-11-30T10:19:30.578-0500 I REPL     [rsBackgroundSync] starting rollback: OplogStartMissing our last op time fetched: (term: 2, timestamp: Nov 30 10:18:47:1). source's GTE: (term: 3, timestamp: Nov 30 10:19:18:2) hashes: (605599081996457668/881505516521125255)
data/replset/rs2/mongod.log:2015-11-30T10:19:30.579-0500 I REPL     [rsBackgroundSync] beginning rollback
data/replset/rs2/mongod.log:2015-11-30T10:19:30.580-0500 I REPL     [rsBackgroundSync] rollback 0
data/replset/rs2/mongod.log:2015-11-30T10:19:30.580-0500 I REPL     [ReplicationExecutor] transition to ROLLBACK
data/replset/rs2/mongod.log:2015-11-30T10:19:30.581-0500 I REPL     [rsBackgroundSync] rollback 1
data/replset/rs2/mongod.log:2015-11-30T10:19:30.585-0500 I REPL     [rsBackgroundSync] rollback 2 FindCommonPoint
data/replset/rs2/mongod.log:2015-11-30T10:19:30.586-0500 I REPL     [rsBackgroundSync] rollback our last optime:   Nov 30 10:18:47:1
data/replset/rs2/mongod.log:2015-11-30T10:19:30.586-0500 I REPL     [rsBackgroundSync] rollback their last optime: Nov 30 10:19:18:2
data/replset/rs2/mongod.log:2015-11-30T10:19:30.587-0500 I REPL     [rsBackgroundSync] rollback diff in end of log times: -31 seconds
data/replset/rs2/mongod.log:2015-11-30T10:19:30.587-0500 I REPL     [rsBackgroundSync] rollback 3 fixup
data/replset/rs2/mongod.log:2015-11-30T10:19:30.589-0500 I REPL     [rsBackgroundSync] rollback 3.5
data/replset/rs2/mongod.log:2015-11-30T10:19:30.589-0500 I REPL     [rsBackgroundSync] rollback 4 n:1
data/replset/rs2/mongod.log:2015-11-30T10:19:30.591-0500 I REPL     [rsBackgroundSync] rollback 4.6
data/replset/rs2/mongod.log:2015-11-30T10:19:30.591-0500 I REPL     [rsBackgroundSync] rollback 4.7
data/replset/rs2/mongod.log:2015-11-30T10:19:30.594-0500 I REPL     [rsBackgroundSync] rollback 5 d:2 u:0
data/replset/rs2/mongod.log:2015-11-30T10:19:30.594-0500 I REPL     [rsBackgroundSync] rollback 6
data/replset/rs2/mongod.log:2015-11-30T10:19:30.595-0500 I REPL     [rsBackgroundSync] rollback done
data/replset/rs2/mongod.log:2015-11-30T10:19:30.597-0500 I REPL     [rsBackgroundSync] rollback finished



 Comments   
Comment by Githook User [ 29/Mar/18 ]

Author:

{'email': 'kyle.suarez@mongodb.com', 'name': 'Kyle Suarez', 'username': 'ksuarz'}

Message: SERVER-21693 generate a summary after rollback via recover to timestamp
Branch: master
https://github.com/mongodb/mongo/commit/e5f6c368bd34bb9122c0085401d2ead3d4d3f78c

Comment by Githook User [ 29/Mar/18 ]

Author:

{'email': 'kyle.suarez@mongodb.com', 'name': 'Kyle Suarez', 'username': 'ksuarz'}

Message: SERVER-21693 add host and port to oplog interface
Branch: master
https://github.com/mongodb/mongo/commit/e1cb7f58d887815ae696a824291fdfc59fc35861

Comment by Githook User [ 29/Mar/18 ]

Author:

{'email': 'kyle.suarez@mongodb.com', 'name': 'Kyle Suarez', 'username': 'ksuarz'}

Message: Revert "SERVER-21693 generate a summary after rollback via recover to timestamp"

This reverts commit fee5ac1166c2b4a0a11e00e58f2d9c12d4567e25.

Conflicts:
src/mongo/db/op_observer.h
Branch: master
https://github.com/mongodb/mongo/commit/6f706d7893cc54ebcf52ee4bc98ad3612dc0d99b

Comment by Kay Kim (Inactive) [ 28/Mar/18 ]

Thanks kyle.suarez – Probably will add a note to the rollback page to check the log message. We already have a generic https://docs.mongodb.com/manual/reference/log-messages/index.html#ROLLBACK section in the log message page, so it'll be a quick blurb that you can check the rollback status in the log messages or something.

Comment by Kyle Suarez [ 28/Mar/18 ]

kay.kim, I'm marking documentation changes as "Not Needed", as the new rollback algorithm doesn't change anything to make https://docs.mongodb.com/manual/core/replica-set-rollbacks/ outdated. However, we will now have a new rollback summary section that gets printed to the logs and looks something like this: https://gist.github.com/ksuarz/562faa6e84710331ff1be69df80475c1

Just giving the docs team a heads up in case this is something interesting to document.

Comment by Githook User [ 28/Mar/18 ]

Author:

{'email': 'kyle.suarez@mongodb.com', 'name': 'Kyle Suarez', 'username': 'ksuarz'}

Message: SERVER-21693 generate a summary after rollback via recover to timestamp
Branch: master
https://github.com/mongodb/mongo/commit/fee5ac1166c2b4a0a11e00e58f2d9c12d4567e25

Comment by Spencer Brody (Inactive) [ 16/Aug/17 ]

I agree that surfacing information about what happened during a rollback in a different way than we currently do could be valuable, but that is out of scope for this ticket. This ticket is simply about ensuring that the new rollback approach maintains parity with the old one. Can you file a separate feature request describing your proposal for what you'd like to see us do instead robert.guo?

Comment by Robert Guo (Inactive) [ 14/Aug/17 ]

It could be useful to also log the summary in a special collection (e.g. local.rollback_summary) so test suites that introduce random rollbacks can make more precise assertions on the correctness of the cluster during rollback by consulting the rollback summary collection

Generated at Thu Feb 08 03:58:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.