[SERVER-41291] Oldest timestamp not always advanced with --enableMajorityReadConcern=false, on secondary nodes
Created: 23/May/19 | Updated: 29/Oct/23 | Resolved: 24/May/19
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 4.0.5, 4.0.9 |
| Fix Version/s: | 4.0.10 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Eric Milkie |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Execution Team 2019-06-03 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
Run the following workload against a replica set with majority read concern disabled. It generates bursts of updates spaced 15 seconds apart (a sketch of an equivalent workload is included below).
FTDC shows the following:
During the bursts we are executing a couple hundred batches per second, but the oldest timestamp is not advanced on every batch, and in fact may not be advanced for minutes at a time. This can cause a large amount of data to be pinned in cache.
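The original repro script is not included in this export. As a stand-in, here is a minimal sketch of the kind of workload described (bursts of updates separated by 15-second pauses), using pymongo; the connection URI, namespace, and document/iteration counts are assumptions, not values from the ticket.

```python
# Hedged sketch of a burst-update workload (assumed URI, namespace, and sizes).
import time
from pymongo import MongoClient, UpdateOne

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")  # assumed replica-set URI
coll = client.test.repro                                           # assumed namespace

coll.drop()
coll.insert_many([{"_id": i, "n": 0} for i in range(1000)])        # seed documents to update

while True:
    # Burst: many update batches back-to-back, which the secondaries must
    # then apply as replication batches.
    for _ in range(50):
        coll.bulk_write([UpdateOne({"_id": i}, {"$inc": {"n": 1}}) for i in range(1000)])
    time.sleep(15)                                                  # quiet gap between bursts
```

Run it against the primary and watch the oldest timestamp on a secondary (e.g. via FTDC) while the bursts are in progress.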
| Comments |
| Comment by Githook User [ 24/May/19 ] |
Author: Eric Milkie <milkie@10gen.com> (username: milkie)
Message:
| Comment by Eric Milkie [ 24/May/19 ] |
The problem appears to be a race between the code path that sets the oplog read timestamp in oplogDiskLocRegister and the one that sets it in the WTOplogJournalThread (the oplog visibility thread).
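For illustration only (this is not MongoDB source code, and the names below are invented): a minimal sketch of the last-writer-wins race class described above, where two threads both publish an oplog read timestamp and a thread holding a stale value can overwrite a newer one, so visibility, and the oldest timestamp that follows it, stops advancing until a later write.

```python
# Illustrative sketch of the described race class; names and structure are
# assumptions, not MongoDB internals.
import threading
import time

oplog_read_timestamp = 0  # shared "oplog entries up to here are visible" value

def batch_writer():
    # Analogue of the caller of oplogDiskLocRegister: publishes the timestamp
    # of the batch it just wrote.
    global oplog_read_timestamp
    oplog_read_timestamp = 100

def visibility_thread():
    # Analogue of the oplog visibility thread: publishes a timestamp it
    # captured earlier, arriving late with a stale value.
    global oplog_read_timestamp
    time.sleep(0.01)
    oplog_read_timestamp = 90  # overwrites the newer value 100

a = threading.Thread(target=batch_writer)
b = threading.Thread(target=visibility_thread)
a.start(); b.start(); a.join(); b.join()
print(oplog_read_timestamp)  # 90: visibility moved backwards and stalls until the next write

# In this sketch the fix is to only ever advance the value, e.g. under a mutex:
#     oplog_read_timestamp = max(oplog_read_timestamp, new_value)
```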
| Comment by Bruce Lucas (Inactive) [ 23/May/19 ] |
By the way, a note on running the repro: the chart above is from a 2-node replica set. The behavior also reproduces on a 3-node replica set, but it seems a bit more complicated there: one secondary may exhibit the behavior more than the other, and they did not always both exhibit it at the same time. So from an analysis standpoint it may be simpler to start with a 2-node replica set.
| Comment by Bruce Lucas (Inactive) [ 23/May/19 ] |
This does not seem to reproduce on 4.1.11 or 3.6.12 (with this particular reproducer).