[SERVER-49732] Change _currentCommittedSnapshot to be an OpTime instead of OpTimeAndWallTime Created: 20/Jul/20 Updated: 29/Oct/23 Resolved: 31/Jul/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | William Schultz (Inactive) | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Minor Change |
| Sprint: | Repl 2020-08-10 |
| Description |
|
Right now, _currentCommittedSnapshot, which is used as the optime for majority readers, is stored as an OpTimeAndWallTime. After the removal of the stable optime candidates list in PM-1713, however, this value may no longer be sourced from a real oplog optime, so it may not have a "real" wall clock time. To work around this, we give it a dummy wall clock time when we construct it. The wall clock time of the committed snapshot shouldn't serve any functional purpose in the server, since flow control calculations use the lastCommittedOpTime. It is, however, still reported as the readConcernMajorityOpTime in replSetGetStatus. If we determine that there is no significant value in reporting an accurate wall clock time for the readConcernMajorityOpTime, then we should be able to convert _currentCommittedSnapshot to an OpTime, to prevent confusion in the code. Note that the committed snapshot was originally converted to include a wall clock time in |
| Comments |
| Comment by Githook User [ 31/Jul/20 ] |
|
Author: {'name': 'Tess Avitabile', 'email': 'tess.avitabile@mongodb.com', 'username': 'tessavitabile'} Message: |
| Comment by Tess Avitabile (Inactive) [ 31/Jul/20 ] |
|
This patch removes the readConcernMajorityWallTime field from the replSetGetStatus output, since this value is no longer tracked by the server. The value was roughly the minimum of lastCommittedWallTime and lastAppliedWallTime (which is just lastCommittedWallTime on a primary), so that value can be used as a substitute in calculations, such as for figuring out majority commit point lag. |
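The substitution described above can be sketched as follows. This is a minimal illustration, not server or driver code: the helper names `majority_wall_time` and `majority_commit_lag_seconds` are hypothetical, the sample document is hand-written rather than real output, and measuring lag against the current time (instead of, say, the primary's lastAppliedWallTime) is an assumption. It expects a replSetGetStatus-shaped document whose `optimes` subdocument carries `lastCommittedWallTime` and `lastAppliedWallTime` as timezone-aware datetimes.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def majority_wall_time(status: Mapping[str, Any]) -> datetime:
    """Approximate the removed readConcernMajorityWallTime (hypothetical helper).

    Per the comment above, the old value was roughly
    min(lastCommittedWallTime, lastAppliedWallTime).
    """
    optimes = status["optimes"]
    return min(optimes["lastCommittedWallTime"], optimes["lastAppliedWallTime"])


def majority_commit_lag_seconds(
    status: Mapping[str, Any], now: Optional[datetime] = None
) -> float:
    """Seconds between `now` and the approximated majority wall time.

    Assumption: all datetimes are timezone-aware, so subtraction is valid.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - majority_wall_time(status)).total_seconds()


# Hand-written sample shaped like replSetGetStatus output (not real data).
sample_status = {
    "optimes": {
        "lastCommittedWallTime": datetime(2020, 7, 31, 12, 0, 5, tzinfo=timezone.utc),
        "lastAppliedWallTime": datetime(2020, 7, 31, 12, 0, 7, tzinfo=timezone.utc),
    }
}
reference = datetime(2020, 7, 31, 12, 0, 10, tzinfo=timezone.utc)
print(majority_commit_lag_seconds(sample_status, reference))  # 5.0
```

On a primary the two wall times coincide with lastCommittedWallTime being the minimum, so the helper reduces to reading that single field.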
| Comment by Maria van Keulen [ 20/Jul/20 ] |
|
bruce.lucas Yup, Flow Control uses the lastCommitted wall clock time for its lag calculations. |
| Comment by Bruce Lucas (Inactive) [ 20/Jul/20 ] |
|
I'm a little surprised to learn that flow control uses lastCommittedOpTime and not readConcernMajorityOpTime, but that's probably just due to my vague understanding of the difference between the two. maria.vankeulen can you confirm? Beyond that, for diagnostic purposes we look at readConcernMajorityOpTime lag to understand cache pressure, flow control, etc., and it's sometimes nice to have a wall clock time for that, for higher precision. However, if the number that's currently reported there is not really the correct number and you want to remove it, that doesn't seem unreasonable, and we can look at the lastCommittedOpTime wall clock time instead. |
| Comment by William Schultz (Inactive) [ 20/Jul/20 ] |
|
bruce.lucas Do you know of anyone (or any tool) that relies on the wall clock time of the readConcernMajorityOpTime field from replSetGetStatus? |