[SERVER-84121] Backup cursor service reports incorrect 'ns' field in the backup cursor response. Created: 12/Dec/23 Updated: 25/Jan/24 Resolved: 25/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 8.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Wei Hu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | storex-shortlist |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Execution Team 2024-01-08, Execution Team 2024-01-22, Execution Team 2024-02-05 |
| Participants: | |
| Linked BF Score: | 6 |
| Description |
|
This is a bug in the MongoDB code, particularly this part of the code. We don't use this mdb_catalog checkpoint cursor that was opened before the backup cursor; instead, we use a different mdb_catalog checkpoint cursor (getParsedCatalogEntry() opens a new checkpoint cursor), opened after the backup cursor, to fill in the 'ns' field in the backup cursor response. This means that if a checkpoint occurs after the BackupCursorOpenConflictWithCheckpoint check, the second checkpoint cursor may operate on a different snapshot than the backup cursor, potentially resulting in incorrect 'ns' information in the backup cursor response. As a result, selective_backup_restore_e2e.js inadvertently skips copying files that are actually part of the backup snapshot, and the restore node crashes because of the missing files.

=================

1) The "BackupCursorOpenConflictWithCheckpoint" check currently performs two checks: "LastStableRecoveryTimestamp" and "checkpoint id". It would be beneficial to either enhance the error message or add a debug log message that specifies which check failed and gives details of the mismatched checkpoint and recovery timestamp.

2) Use ReadSourceScope RAII instead of explicitly setting the TimestampReadSource here and here, for better readability. The RAII type also makes sure we abandon the snapshot before explicitly setting the TimestampReadSource in the recovery unit. |
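To illustrate the RAII suggestion in item 2, here is a minimal sketch of a scope guard that abandons the snapshot, switches the read source, and restores the original read source on exit. ToyRecoveryUnit, ToyReadSourceScope, and scopedReadRestoresSource are hypothetical stand-ins invented for this sketch; they are not the server's RecoveryUnit or ReadSourceScope, and the real signatures may differ.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration only. These are NOT the server's real
// RecoveryUnit or ReadSourceScope classes.
enum class ReadSource { kNoTimestamp, kProvided };

class ToyRecoveryUnit {
public:
    void openSnapshot() { _snapshotOpen = true; }
    void abandonSnapshot() {
        _snapshotOpen = false;
        ++_abandonCount;
    }
    void setTimestampReadSource(ReadSource source, std::uint64_t ts = 0) {
        // Changing the read source while a snapshot is still open is the
        // hazard the scope guard is meant to rule out.
        assert(!_snapshotOpen);
        _source = source;
        _timestamp = ts;
    }
    ReadSource readSource() const { return _source; }
    std::uint64_t timestamp() const { return _timestamp; }
    bool snapshotOpen() const { return _snapshotOpen; }
    int abandonCount() const { return _abandonCount; }

private:
    ReadSource _source = ReadSource::kNoTimestamp;
    std::uint64_t _timestamp = 0;  // 0 means "no timestamp" in this toy model
    bool _snapshotOpen = false;
    int _abandonCount = 0;
};

// Scope guard: abandons the snapshot and sets the read source on entry,
// then abandons the snapshot again and restores the original source on exit.
class ToyReadSourceScope {
public:
    ToyReadSourceScope(ToyRecoveryUnit& ru, ReadSource source, std::uint64_t ts = 0)
        : _ru(ru), _originalSource(ru.readSource()), _originalTimestamp(ru.timestamp()) {
        _ru.abandonSnapshot();
        _ru.setTimestampReadSource(source, ts);
    }
    ~ToyReadSourceScope() {
        _ru.abandonSnapshot();
        _ru.setTimestampReadSource(_originalSource, _originalTimestamp);
    }
    ToyReadSourceScope(const ToyReadSourceScope&) = delete;
    ToyReadSourceScope& operator=(const ToyReadSourceScope&) = delete;

private:
    ToyRecoveryUnit& _ru;
    ReadSource _originalSource;
    std::uint64_t _originalTimestamp;
};

// Returns true if the read source was kProvided at `ts` inside the scope
// and was restored, with no snapshot left open, after the scope ended.
bool scopedReadRestoresSource(ToyRecoveryUnit& ru, std::uint64_t ts) {
    const ReadSource before = ru.readSource();
    bool sawProvided = false;
    {
        ToyReadSourceScope scope(ru, ReadSource::kProvided, ts);
        ru.openSnapshot();  // reads inside the scope would use `ts`
        sawProvided = ru.readSource() == ReadSource::kProvided && ru.timestamp() == ts;
    }
    return sawProvided && ru.readSource() == before && !ru.snapshotOpen();
}
```

With a guard like this, the caller cannot forget to abandon the snapshot before changing the read source, and the original read source comes back even on an early return.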
| Comments |
| Comment by Githook User [ 25/Jan/24 ] |
|
Author: {'name': 'Wei Hu', 'email': 'wei.hu@mongodb.com', 'username': 'wh5a'}
Message: GitOrigin-RevId: 11295651dbc664d562a03ab98295af48a14535f0 |
| Comment by Suganthi Mani [ 16/Jan/24 ] |
|
Copy-paste of my Slack message.
|