[SERVER-71247] DocumentSourceBackupFile should ensure backup cursor is open before returning data Created: 10/Nov/22 Updated: 29/Oct/23 Resolved: 30/Nov/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.0.4, 6.3.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | repl-shortlist | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Backport Requested: | v6.2, v6.1, v6.0 | ||||||||||||
| Sprint: | Repl 2022-12-12 | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
Currently, in file copy based initial sync (FCBIS), we call killCursor on the backup cursor in "fire and forget" mode once the files have been copied from the sync source. There is a theoretical race condition in which WiredTiger may have internally closed the backup cursor (e.g. due to a slow refresh), so the last file copied may be corrupt. The documented way to detect this is to check the result of the killCursor command. For FCBIS and any other users of DocumentSourceBackupFile, it is possible, more convenient, and less error-prone to return an error from DocumentSourceBackupFile if the backup cursor is no longer open after the data has been read. |
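The check proposed in the description can be sketched roughly as follows. This is a minimal, self-contained C++ illustration, not the server's code: `BackupCursorState`, `ReadResult`, and `readBackupBlock` are hypothetical names invented for the example, and an in-memory byte vector stands in for the backup file. The point it shows is the ordering: read the data first, then confirm the cursor is still open before returning anything.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for whatever tracks the backup cursor's state.
struct BackupCursorState {
    bool open = true;
};

// Hypothetical result type: either a block of data or an error message.
struct ReadResult {
    bool ok;
    std::vector<char> data;
    std::string error;
};

// Copies up to `length` bytes of `file` starting at `offset`, then verifies
// that the backup cursor is still open before handing the bytes back.
ReadResult readBackupBlock(const std::vector<char>& file,
                           std::size_t offset,
                           std::size_t length,
                           const BackupCursorState& cursor) {
    std::vector<char> block;
    if (offset < file.size()) {
        const std::size_t count = std::min(length, file.size() - offset);
        const auto first = file.begin() + static_cast<std::ptrdiff_t>(offset);
        block.assign(first, first + static_cast<std::ptrdiff_t>(count));
    }

    // Check *after* the read: if the cursor was closed at any point before
    // this check, the bytes just read may not be consistent, so they are
    // discarded and an error is returned instead.
    if (!cursor.open) {
        return {false, {}, "backup cursor is no longer open; data may be inconsistent"};
    }
    return {true, std::move(block), ""};
}
```

With this shape, a caller only has to check `ok` on each read, rather than remembering to inspect the result of killCursor at the end of the copy.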
| Comments |
| Comment by Githook User [ 09/Dec/22 ] |
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'} Message: (cherry picked from commit 1b48004aa0904b8e7610770fe3ab14812923ae9b) |
| Comment by Githook User [ 09/Dec/22 ] |
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'} Message: (cherry picked from commit ebdfdb87bc6679e053e388cfb7514689b5306869) |
| Comment by Githook User [ 30/Nov/22 ] |
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}Message: |
| Comment by Githook User [ 30/Nov/22 ] |
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}Message: |
| Comment by Matthew Russotto [ 29/Nov/22 ] |
|
Because of WiredTiger's checksumming, this will not result in silent corruption but rather a crash on the syncing node when it attempts to load a corrupted block. |
| Comment by Matthew Russotto [ 11/Nov/22 ] |
|
I like that idea. Taking an uncontended mutex is just an atomic instruction and a branch on Linux (as is releasing it), and in any case the cost should be small compared to reading a block. |
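The idea being agreed to is not quoted in this ticket, so the following is only a guess at its shape: a mutex-protected "is the cursor open" flag that is consulted once per block read. `BackupCursorTracker` and `blocksReturnedWhileOpen` are hypothetical names for illustration only, and the block read itself is left as a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical tracker: the open/closed flag is guarded by a mutex so that
// a reader and whoever closes the cursor see a consistent value.
class BackupCursorTracker {
public:
    void open() {
        std::lock_guard<std::mutex> lk(_mutex);
        _open = true;
    }
    void close() {
        std::lock_guard<std::mutex> lk(_mutex);
        _open = false;
    }
    bool isOpen() const {
        // One lock and one unlock per call; uncontended in the common case.
        std::lock_guard<std::mutex> lk(_mutex);
        return _open;
    }

private:
    mutable std::mutex _mutex;
    bool _open = false;
};

// Walks over `totalBlocks` blocks and counts how many could be returned
// before the cursor was observed to be closed.
std::size_t blocksReturnedWhileOpen(const BackupCursorTracker& tracker,
                                    std::size_t totalBlocks) {
    std::size_t returned = 0;
    for (std::size_t i = 0; i < totalBlocks; ++i) {
        // (A real implementation would read block i from disk here.)
        if (!tracker.isOpen()) {
            break;  // Stop: data read after the cursor closed is not trusted.
        }
        ++returned;
    }
    return returned;
}
```

The per-block overhead in this sketch is a single lock/unlock pair, which is the cost the comment argues is negligible next to the disk read it accompanies.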