-
Type: Task
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Replication
-
None
-
Fully Compatible
-
Execution Team 2019-12-30, Execution Team 2020-03-23, Execution Team 2020-04-06, Execution Team 2020-04-20, Execution Team 2020-06-01, Execution Team 2020-06-15
According to the findings in SERVER-41187, there is a period of time after an election where the lastCommitted OpTime is still from the previous term, and the lastApplied OpTime on the primary is from the new term. The majority committed "lag" is overstated as a result, because it includes time during which the replica set is not accepting writes.
Calculations of majority committed lag should ideally exclude this downtime.
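A minimal sketch of one way to exclude that window, assuming the lag is computed from wall-clock times attached to the two OpTimes. The `OpTime` class, its `wall_secs` field, and `majority_committed_lag` are hypothetical names used for illustration only; they are not the server's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpTime:
    """Hypothetical stand-in for a replication OpTime."""
    term: int          # election term in which the operation was written
    wall_secs: float   # wall-clock time of the operation, in seconds


def majority_committed_lag(last_applied: OpTime,
                           last_committed: OpTime) -> Optional[float]:
    """Return the lag in seconds, or None while it is not meaningful.

    When lastCommitted is still from a previous term, the gap between
    the two OpTimes includes time during which the set accepted no
    writes, so no lag is reported for that window.
    """
    if last_committed.term != last_applied.term:
        return None
    return max(0.0, last_applied.wall_secs - last_committed.wall_secs)
```

Returning `None` rather than `0` lets a consumer of the metric tell "no meaningful value yet" apart from "fully caught up"; whether that distinction is wanted is a design choice for whoever reports the metric.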
These are the percentile breakdowns of the time to the first majority committed write after a new term, for 4.0 and 4.2. All values are in seconds:
| Percentile | v40            | v42            |
|------------+----------------+----------------|
| 10         | 0.902999997139 | 0.925999879837 |
| 20         | 1.10899996758  | 0.996999979019 |
| 30         | 1.1819999218   | 1.04299998283  |
| 40         | 1.27499985695  | 1.14300012589  |
| 50         | 1.40999984741  | 1.38400006294  |
| 60         | 1.71499991417  | 1.89600014687  |
| 70         | 2.09000015259  | 1.97300004959  |
| 80         | 2.1930000782   | 2.03999996185  |
| 90         | 2.39699983597  | 2.2009999752   |
| 95         | 2.7619998455   | 2.77999997139  |
| 99         | 7.47925007343  | 3.97099995613  |
- is related to
-
SERVER-41187 Majority committed replication lag spikes after an election
- Closed