[SERVER-41261] Use the oplog entry after the common point to calculate rollbackTimeLimitSecs Created: 21/May/19 Updated: 29/Oct/23 Resolved: 12/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc3, 4.0.13, 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alyson Cabral (Inactive) | Assignee: | Jason Chan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v4.2, v4.0 |
| Sprint: | Repl 2019-07-01, Repl 2019-07-15 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
In Atlas you can pause a cluster, effectively shutting the nodes down for a period of time. Assume the cluster is paused for more than 24 hours and that all nodes are current, having committed all writes. When the nodes are restarted at the same time, we see two nodes running and two branches of history forming. Eventually, one node goes into rollback and hits an fassert because the common point is more than 24 hours old, even though only 1 or 2 very recent oplog entries are being rolled back. In this case the common point is from over 24 hours ago, whereas the oplog entry immediately after the common point is from less than 5 minutes ago. While we believe we are fixing the two nodes running at the same time problem via |
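The difference between the two measurements can be sketched as follows. This is only an illustration of the scenario described above, not the server's implementation: the function names, the 24-hour constant, and the timestamps are hypothetical, and the sketch assumes the limit is checked against the wall-clock span between the newest local oplog entry and a chosen reference entry.

```python
from datetime import datetime, timedelta

# Hypothetical limit, mirroring the 24-hour figure in the description.
ROLLBACK_TIME_LIMIT = timedelta(hours=24)


def span_since(top_of_oplog: datetime, reference_entry: datetime) -> timedelta:
    """Wall-clock span between the newest local oplog entry and a reference entry."""
    return top_of_oplog - reference_entry


def exceeds_limit(span: timedelta, limit: timedelta = ROLLBACK_TIME_LIMIT) -> bool:
    """True if the measured span would trip the rollback time limit."""
    return span > limit


# Illustrative timestamps: the cluster was paused for about 30 hours.
now = datetime(2019, 5, 21, 12, 0, 0)
common_point = now - timedelta(hours=30)            # last entry written before the pause
first_entry_after = now - timedelta(minutes=5)      # first divergent entry after restart
top_of_oplog = now - timedelta(minutes=1)           # newest entry on the rolling-back node

# Measuring from the common point counts the idle pause as rolled-back time.
span_from_common_point = span_since(top_of_oplog, common_point)

# Measuring from the entry after the common point counts only the divergent writes.
span_from_first_entry_after = span_since(top_of_oplog, first_entry_after)

print(exceeds_limit(span_from_common_point))
print(exceeds_limit(span_from_first_entry_after))
```

Under these assumed timestamps, the first measurement exceeds the limit purely because of the pause, while the second reflects the few minutes of writes that would actually be rolled back.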
| Comments |
| Comment by Githook User [ 19/Aug/19 ] |
|
Author: Jason Chan (jasonjhchan, jason.chan@10gen.com). Message: (cherry picked from commit a5d088eefec42927f339ff9288f9eb078d5a8686) |
| Comment by Githook User [ 17/Jul/19 ] |
|
Author: Jason Chan (jasonjhchan, jason.chan@10gen.com). Message: (cherry picked from commit a5d088eefec42927f339ff9288f9eb078d5a8686) |
| Comment by Githook User [ 12/Jul/19 ] |
|
Author: Jason Chan (jasonjhchan, jason.chan@10gen.com). Message: |