- Type: Bug
- Resolution: Fixed
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: DHandles
- None
- Storage Engines - Foundations
- None
- None
- 1
Outdated btrees are a result of picking up a new checkpoint. When a new checkpoint arrives, btrees that were opened at a previous checkpoint are located and marked as "outdated". This forces any future open of the btree to do a complete open, which picks up the new checkpoint. However, there is no special handling of outdated btrees past that point.
At some point, all cursors that were open on the outdated btree are closed. From then on, the sweep machinery treats this dhandle like any other: the dhandle is marked with a timeofdeath, and we wait until close_idle_time seconds have passed before closing it. By default WT sets close_idle_time to 30. However, it looks like it is set to 600 for MongoDB here: https://github.com/mongodb/mongo/blob/46fd0f67d113797c0a78c34d6ed96cf40e4f24be/src/mongo/db/storage/wiredtiger/wiredtiger_global_options.idl#L201
So we are probably keeping those old trees around for 10 minutes. Once an outdated btree has all of its cursors closed, there should be no path for any session to reopen it, so it should be closed immediately.
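To make the difference concrete, here is a minimal Python sketch of the sweep decision described above. It is a simplified model, not WiredTiger's C implementation: `DHandle`, `sweep_mark`, `should_close`, the `open_cursors` counter and the `expire_outdated` flag are all hypothetical names introduced for illustration. Only `timeofdeath`, `close_idle_time` and the values 30 and 600 come from the description.

```python
from dataclasses import dataclass
from typing import Optional

CLOSE_IDLE_TIME_DEFAULT = 30  # seconds, the WT default quoted in this ticket


@dataclass
class DHandle:
    """Hypothetical stand-in for a data handle; not the real WT struct."""
    name: str
    open_cursors: int = 0          # assumed: number of cursors still using the handle
    outdated: bool = False         # set when a newer checkpoint has been picked up
    timeofdeath: Optional[int] = None  # when the handle was first seen idle


def sweep_mark(dh: DHandle, now: int) -> None:
    """Mark pass: stamp an idle handle, clear the stamp if it is in use again."""
    if dh.open_cursors > 0:
        dh.timeofdeath = None
    elif dh.timeofdeath is None:
        dh.timeofdeath = now


def should_close(dh: DHandle, now: int,
                 close_idle_time: int = CLOSE_IDLE_TIME_DEFAULT,
                 expire_outdated: bool = True) -> bool:
    """Decide whether the sweep may close this handle.

    With expire_outdated=False this models the behavior reported here:
    outdated handles wait out close_idle_time like any other handle.
    With expire_outdated=True it models the proposal: an outdated handle
    with no open cursors is closed right away.
    """
    if dh.open_cursors > 0:
        return False
    if expire_outdated and dh.outdated:
        return True
    if dh.timeofdeath is None:
        return False
    return now - dh.timeofdeath > close_idle_time


# Usage: an outdated handle whose last cursor closed at t=0, with the
# MongoDB setting of 600 seconds.
old = DHandle("file:a.wt", outdated=True)
sweep_mark(old, now=0)
reported = should_close(old, now=1, close_idle_time=600, expire_outdated=False)
proposed = should_close(old, now=1, close_idle_time=600, expire_outdated=True)
```

In this model the reported behavior keeps the handle open until more than 600 seconds after the mark pass, while the proposed rule closes it on the first sweep after the last cursor goes away.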
- blocks
  - WT-16334 crash on disagg follower during reconfig to new checkpoint (Open)
  - WT-16156 Cross-checkpoint caching POC (Open)
- causes
  - WT-16591 Fix prune timestamp not advanced (In Code Review)
- is depended on by
  - WT-15449 test/format (disagg.mode=switch) cache stuck (Open)
- is duplicated by
  - WT-15043 Allow outdated History Store dhandles to be swept (Closed)
- is related to
  - WT-15800 Eviction server trying to write the page from a follower on shutdown (Closed)
  - WT-16292 Follower should use shared history store at a checkpoint that matches the cursor (Closed)
  - WT-16506 task-timed-out: model-test-long-random-config-disagg on amazon2023-arm64-asan [wiredtiger @ f164a994] (Closed)
- related to
  - WT-16334 crash on disagg follower during reconfig to new checkpoint (Open)
  - WT-16232 Complete connection sweeps should run for disagg (Backlog)
  - WT-15527 We should open the checkpoint of shared metadata in the follower (In Code Review)
  - WT-16591 Fix prune timestamp not advanced (In Code Review)
  - WT-16508 Revert change to quickly expire btrees due to regressions (Closed)
  - WT-16509 Immediately expire outdated btrees, and fix/mitigate problems with earlier fix (Closed)
  - WT-16528 Investigate whether a separate internal session is required to apply checkpoints (Open)
  - WT-16517 failed: unit-test-extra-long on rhel80 [wiredtiger @ fd829370] (Open)
  - WT-16506 task-timed-out: model-test-long-random-config-disagg on amazon2023-arm64-asan [wiredtiger @ f164a994] (Closed)