-
Type:
Improvement
-
Resolution: Done
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Live Restore
-
Storage Engines
-
5
-
StorEng - 2025-02-28, StorEng - 2025-03-14
Live restore sets an nbits=-1 metadata entry to indicate that the live restore for that file is complete. This is fine on its own, but if we take a backup and subsequently restore from it, we will see the stale nbits=-1 entries and wrongly conclude that the files in the database are already complete.
Backup rewrites the metadata file on restore, which makes it a good place to remove any existing -1 entries and to check for any other invalid state.
We could remove the -1 entries at runtime; however, there is no guarantee that we would be ready to clear the -1 by the time the sweep server closes out a file. This is because the GLRS state needs to transition to the cleanup phase before the -1 can be removed. We can imagine the following scenario:
1. Begin live restore.
2. Fill holes in a.wt, update the metadata so nbits=-1.
3. The sweep server cleans up a.wt and closes out the btree.
4. Transition to the GLRS cleanup phase, cleaning metadata for open b-trees.
5. Don't remove the -1.
6. Take a backup.
7. Restore from that backup, find -1.
In summary, because the sweep server operates concurrently, cleaning up the -1 at runtime is not guaranteed to work. One way to avoid this would be to prevent sweeping during live restore, but then we'd run into potential cache issues.
The backup / restore metadata file rewrite therefore seems like the best and cleanest option.
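The race described in the scenario can be illustrated with a toy model. This is only a sketch: `struct file_model`, `sweep_close`, `cleanup_open_btrees` and `stale_entries` are hypothetical names invented for illustration, not WiredTiger structures or functions, and resetting `nbits` to 0 stands in for "the entry was cleaned up".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one file's live restore state (not a WiredTiger type). */
struct file_model {
    const char *name;
    bool btree_open; /* false once the sweep server has closed the handle */
    int nbits;       /* -1 marks the file's live restore as complete */
};

/* Model of the sweep server closing out an idle b-tree. */
static void
sweep_close(struct file_model *f)
{
    assert(f != NULL);
    f->btree_open = false;
}

/*
 * Model of the cleanup phase: only b-trees that are still open get their
 * metadata cleaned. Returns the number of entries that were cleared.
 */
static int
cleanup_open_btrees(struct file_model *files, size_t n)
{
    int cleared = 0;

    assert(files != NULL);
    for (size_t i = 0; i < n; i++)
        if (files[i].btree_open && files[i].nbits == -1) {
            files[i].nbits = 0; /* assumption: 0 means "no completion marker" */
            cleared++;
        }
    return cleared;
}

/* Count the entries that would still carry nbits=-1 into a backup. */
static int
stale_entries(const struct file_model *files, size_t n)
{
    int stale = 0;

    assert(files != NULL);
    for (size_t i = 0; i < n; i++)
        if (files[i].nbits == -1)
            stale++;
    return stale;
}
```

In this model, if a.wt and b.wt both reach nbits=-1 and a.wt is swept before the cleanup phase runs, only b.wt is cleaned and `stale_entries` reports one leftover entry, which is the one a later restore would find.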
Scope:
- Implement a replacement function for __metadata_entry_worker
- Test
edit: We'll first try stripping the -1s when taking the backup, as this leaves the backup file in a clean state with no live restore metadata. This makes it easier for us to handle upgrades/downgrades if they become necessary.
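As a rough illustration of stripping the -1s, the sketch below filters `nbits=-1` out of a flat, comma-separated `key=value` string. It is not the WiredTiger implementation: `strip_nbits_complete` is a hypothetical helper, the flat string layout is an assumption, and nested or parenthesised configuration values are deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * strip_nbits_complete --
 *     Hypothetical helper: copy a flat, comma-separated key=value string into
 *     out, skipping every entry that is exactly "nbits=-1". Empty entries are
 *     dropped as well. Returns 0 on success, -1 if out is too small.
 */
static int
strip_nbits_complete(const char *config, char *out, size_t outlen)
{
    static const char marker[] = "nbits=-1";
    const size_t marker_len = sizeof(marker) - 1;
    size_t used = 0;

    assert(config != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    const char *p = config;
    while (*p != '\0') {
        /* Find the end of the current entry. */
        const char *end = strchr(p, ',');
        size_t len = end == NULL ? strlen(p) : (size_t)(end - p);

        /* Match the whole entry so that e.g. "nbits=-10" is kept. */
        bool is_marker = len == marker_len && strncmp(p, marker, len) == 0;

        if (!is_marker && len > 0) {
            /* Room for a separator (if needed), the entry and the NUL. */
            size_t need = len + (used > 0 ? 1 : 0);
            if (used + need + 1 > outlen)
                return -1;
            if (used > 0)
                out[used++] = ',';
            memcpy(out + used, p, len);
            used += len;
            out[used] = '\0';
        }
        p = end == NULL ? p + len : end + 1;
    }
    return 0;
}
```

Matching whole entries rather than substrings avoids accidentally rewriting a value that merely begins with `nbits=-1`. A real fix would presumably go through the existing metadata and configuration parsing code rather than raw string handling.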
- is related to
-
WT-14278 Clean up live restore bitmap metadata on backup open, not on backup close
-
- Closed
-