-
Type: Technical Debt
-
Resolution: Unresolved
-
Priority: Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: Live Restore
-
Storage Engines, Storage Engines - Persistence
-
8
-
StorEng - 2025-04-25
Motivating conversation: https://github.com/wiredtiger/wiredtiger/pull/11729#discussion_r1996060799
Interactions between the live restore state lock and the turtle lock are tricky, so we currently intercept every call that reads or writes the turtle file in order to take the state lock first.
This ticket is to investigate whether we can simplify that by never taking the state lock when touching non-btree files.
It is important that we don't reintroduce deadlocks here. Writing the turtle file requires the state lock, because the live restore state determines what is written. Both the state lock and the turtle lock are therefore held at the same time, and every path that takes both locks must take them in the same order.
Some paths to consider when working on this ticket:
- __wti_live_restore_set_state takes the state lock (A) and then opens the turtle file, which requires the turtle lock (B). Inside the turtle logic it fetches the live restore state via a read lock (re-entrant A).
- Any path that writes, rewrites, or updates the turtle file will open the turtle file (B) and then fetch the live restore state (A). We intercept these calls to take the state lock (A) first, so the lock order is consistent with the path above.
- All read paths follow the same lock order, but they don't need to take the state lock (A) a second time, as happens when writing the turtle file.
- We also take the state lock when opening any file via __live_restore_setup_lr_fh_file, or when we atomically copy a non-btree file. See the linked GitHub comment for details.
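The lock ordering described above can be illustrated with a minimal sketch. This is not WiredTiger's implementation: every name in it (state_lock, turtle_lock, set_state, turtle_write, and the helpers) is hypothetical, and it uses plain pthread mutexes with a per-thread "already held" flag to stand in for the re-entrant read of the state lock. It only shows the shape of the rule: every path takes A before B, and a nested path skips A when the thread already holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: A = live restore state lock, B = turtle lock. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;  /* A */
static pthread_mutex_t turtle_lock = PTHREAD_MUTEX_INITIALIZER; /* B */

/* Per-thread flag so a thread that already holds A does not block on it again. */
static _Thread_local bool state_lock_held = false;

static int live_restore_state = 0; /* protected by A */
static int turtle_contents = -1;   /* protected by B */

/* Take A unless this thread already holds it. Returns true if A was taken here. */
static bool
state_lock_enter(void)
{
    if (state_lock_held)
        return (false);
    pthread_mutex_lock(&state_lock);
    state_lock_held = true;
    return (true);
}

/* Release A only if the matching state_lock_enter() call actually took it. */
static void
state_lock_leave(bool took)
{
    if (!took)
        return;
    state_lock_held = false;
    pthread_mutex_unlock(&state_lock);
}

/* Inner turtle logic: the caller must hold A. Takes B, then reads the state. */
static void
turtle_write_locked(void)
{
    assert(state_lock_held); /* Enforce the A-before-B order. */
    pthread_mutex_lock(&turtle_lock);
    turtle_contents = live_restore_state;
    pthread_mutex_unlock(&turtle_lock);
}

/* Intercepted entry point for any turtle write: A first, then B. */
void
turtle_write(void)
{
    bool took = state_lock_enter();
    turtle_write_locked();
    state_lock_leave(took);
}

/* Models the set-state path: A is held, then the turtle write takes B. */
void
set_state(int new_state)
{
    bool took = state_lock_enter();
    live_restore_state = new_state;
    turtle_write(); /* Nested call: state_lock_enter() is a no-op here. */
    state_lock_leave(took);
}
```

In this sketch, dropping the state lock for non-btree files would mean removing the state_lock_enter() call from some entry points. The assert in turtle_write_locked() marks where such a change would have to prove that A is either still held or no longer needed.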