[SERVER-76299] Report writeConflicts in serverStatus on secondaries Created: 19/Apr/23 Updated: 29/Oct/23 Resolved: 14/Jul/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0, 7.0.1, 4.4.24, 5.0.20, 6.0.9 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Wenbin Zhu | Assignee: | Wenbin Zhu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | repl-shortlist |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Replication |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v7.0, v6.0, v5.0, v4.4 |
| Sprint: | Repl 2023-07-24 |
| Description |
|
The primary records the number of writeConflicts in serverStatus by calling recordCurOpMetrics() in the write path, but secondaries never do. Showing the writeConflicts counter in serverStatus on secondaries would help identify issues during secondary oplog application. For example, we saw a problem on a secondary where, because the transaction size exceeded the cache threshold, oplog application got stuck in the writeConflict retry loop. Because writeConflicts were not recorded on secondaries, the issue was hard to confirm and we had to rely on other, indirect evidence. |
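As a rough illustration of how the counter could be used for diagnosis, the sketch below reads `metrics.operation.writeConflicts` from a serverStatus document and computes the change between two samples. The helper names `write_conflicts` and `conflicts_delta` are hypothetical, and the sample documents are made up for the example; on a real node the document would come from running the serverStatus command against that node.

```python
from typing import Any, Mapping, Optional


def write_conflicts(server_status: Mapping[str, Any]) -> Optional[int]:
    """Return metrics.operation.writeConflicts, or None if the field is absent."""
    node: Any = server_status
    for key in ("metrics", "operation", "writeConflicts"):
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return int(node)


def conflicts_delta(before: Mapping[str, Any], after: Mapping[str, Any]) -> Optional[int]:
    """Return how much the counter grew between two serverStatus samples."""
    first, second = write_conflicts(before), write_conflicts(after)
    if first is None or second is None:
        return None
    return second - first


# Made-up samples standing in for two serverStatus results from one node.
# With a driver such as PyMongo (an assumption, not part of this ticket), a
# sample could be fetched with: client.admin.command("serverStatus")
sample_1 = {"metrics": {"operation": {"writeConflicts": 12}}}
sample_2 = {"metrics": {"operation": {"writeConflicts": 4012}}}

delta = conflicts_delta(sample_1, sample_2)
```

A counter that keeps growing between samples on a secondary would be consistent with oplog application retrying on write conflicts, although it does not by itself identify the cause.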
| Comments |
| Comment by Githook User [ 16/Aug/23 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}
Message: (cherry picked from commit 251dfc8c42679cd1c3527943bc641214cfcf1c1f) |
| Comment by Githook User [ 03/Aug/23 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}
Message: (cherry picked from commit 251dfc8c42679cd1c3527943bc641214cfcf1c1f) |
| Comment by Githook User [ 02/Aug/23 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}
Message: (cherry picked from commit 251dfc8c42679cd1c3527943bc641214cfcf1c1f) |
| Comment by Githook User [ 02/Aug/23 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}
Message: (cherry picked from commit 251dfc8c42679cd1c3527943bc641214cfcf1c1f) |
| Comment by Githook User [ 14/Jul/23 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'}
Message: |
| Comment by Louis Williams [ 20/Apr/23 ] |
|
A general improvement for both primaries and secondaries would be to increment the global writeConflict counter immediately without waiting until recordCurOpMetrics is called. |
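The difference between the two approaches can be sketched with a toy model. This is not MongoDB code: `GlobalCounters`, `Operation`, and the method names are hypothetical stand-ins for a per-operation counter that is folded into a global counter only when the operation finishes, versus a global counter that is bumped on every conflict.

```python
from dataclasses import dataclass


@dataclass
class GlobalCounters:
    """Hypothetical stand-in for the server-wide counter shown in serverStatus."""
    write_conflicts: int = 0


@dataclass
class Operation:
    """Hypothetical per-operation state; not MongoDB's actual CurOp."""
    local_conflicts: int = 0

    def on_conflict_deferred(self) -> None:
        # Deferred: only the operation-local count changes on each conflict.
        self.local_conflicts += 1

    def flush(self, counters: GlobalCounters) -> None:
        # Runs once, when the operation completes.
        counters.write_conflicts += self.local_conflicts
        self.local_conflicts = 0


def on_conflict_immediate(counters: GlobalCounters) -> None:
    # Immediate: the global counter is visible to observers right away.
    counters.write_conflicts += 1


# An operation that hits three conflicts and is still retrying (never flushed).
deferred = GlobalCounters()
stuck_op = Operation()
for _ in range(3):
    stuck_op.on_conflict_deferred()

immediate = GlobalCounters()
for _ in range(3):
    on_conflict_immediate(immediate)
```

In this model, an operation stuck in a retry loop never reaches `flush`, so the deferred global counter stays at zero while the immediate one already reflects all three conflicts.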
| Comment by Wenbin Zhu [ 19/Apr/23 ] |
|
Note: this is mostly based on code inspection; the implementer should probably double-check this behavior first. |