[SERVER-66529] The oplog manager thread updating the oplogReadTimestamp can race with a cappedTruncateAfter operation directly updating the oplogReadTimestamp Created: 17/May/22 Updated: 29/Oct/23 Resolved: 14/Jun/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.0.1, 6.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: |
v6.0
|
||||||||
| Sprint: | Execution Team 2022-06-27 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 0 | ||||||||
| Description |
|
TL;DR: if the timing is just right, the periodic thread updating the oplogReadTimestamp doesn't have sufficient mutex coverage to avoid immediately setting the oplogReadTimestamp forward again after cappedTruncateAfter directly updates the oplogReadTimestamp backwards.

-----------------------------------------------------------------

RecordStore::cappedTruncateAfter has special logic to update the oplogReadTimestamp if it's the record store for the oplog collection. Meanwhile, there's a thread that periodically updates the oplogReadTimestamp. Of note in the thread's logic: it releases the mutex protecting oplogReadTimestamp reads/writes while fetching the WT all_durable timestamp.

So here's what I propose happened:

1. The oplogReadTimestamp is T(5,30)

So in theory, any internal operation that truncates the oplog while the server is up and running (not during startup or rollback) could cause this race, if such code exists anywhere. Startup and rollback both restart the storage engine, resetting the all_durable timestamp, and do not have this issue with oplog truncation. |
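The interleaving described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the server's actual code: `OplogVisibility`, `truncateAfter`, `publishFetched`, and `simulateRace` are hypothetical names, timestamps are reduced to a single integer (the increment part of T(5,30)), the truncate point of 20 and the fetched all_durable value of 30 are made up, and the forward-only guard in the periodic update is assumed. The interleaving is replayed on one thread so that it is deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the oplog visibility state; not MongoDB's real class.
struct OplogVisibility {
    std::mutex mutex;                      // guards oplogReadTimestamp
    std::uint64_t oplogReadTimestamp = 0;  // simplified to a single counter
};

// Direct backwards update, as on the cappedTruncateAfter path.
void truncateAfter(OplogVisibility& v, std::uint64_t truncatePoint) {
    std::lock_guard<std::mutex> lk(v.mutex);
    v.oplogReadTimestamp = truncatePoint;
}

// Second half of the periodic update: publish a value that was fetched
// earlier, while the mutex was NOT held.
void publishFetched(OplogVisibility& v, std::uint64_t fetchedAllDurable) {
    std::lock_guard<std::mutex> lk(v.mutex);
    if (fetchedAllDurable > v.oplogReadTimestamp) {  // assumed forward-only guard
        v.oplogReadTimestamp = fetchedAllDurable;
    }
}

// Replays the problematic interleaving on one thread so it is deterministic.
std::uint64_t simulateRace() {
    OplogVisibility v;
    v.oplogReadTimestamp = 30;  // increment part of T(5,30)

    // Periodic thread: mutex released, all_durable fetched (hypothetical value).
    const std::uint64_t fetched = 30;

    // Truncation runs in the gap and moves the read timestamp backwards.
    truncateAfter(v, 20);  // hypothetical truncate point
    assert(v.oplogReadTimestamp == 20);

    // Periodic thread re-acquires the mutex and publishes the stale value.
    publishFetched(v, fetched);
    return v.oplogReadTimestamp;
}
```

In this sketch `simulateRace()` returns 30, so the read timestamp ends up past the truncate point even though every individual write happened under the mutex. The gap is the unlocked fetch, which is the window the description points at.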
| Comments |
| Comment by Githook User [ 20/Jul/22 ] |
|
Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'} Message: (cherry picked from commit d9e643fe4dfb7da4ab68d6929477613f244ad361) |
| Comment by Githook User [ 14/Jun/22 ] |
|
Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'} Message: |
| Comment by Dianna Hohensee (Inactive) [ 18/May/22 ] |
|
We should test what happens to the WT all_durable timestamp when the oplog is truncated. I'm not sure whether it unwinds; if it does, the unit test isn't going to work as designed. |