Issue Summary
Add a connection configuration option that disables the shutdown checkpoint, primarily for debugging purposes. With this option, developers could simulate a primary that writes data but never completes the checkpoint, without having to crash the primary or copy the PALite directory.
Context
- The proposed option could live under the debug= config, as it is not expected to be used by the server in production.
- This would facilitate testing and debugging by enabling simulation of incomplete checkpoints.
- Adding special flags to mark shutdown checkpoints should be avoided, as such flags would complicate testing and introduce unnecessary special cases.
- Removing asserts or marking double frees as acceptable is discouraged, as it would weaken tests and potentially allow subtle bugs to slip through.
Proposed Solution
- Implement a connection config option (potentially debug=disable_shutdown_checkpoint) to disable shutdown checkpoint for debugging and testing purposes.
- Ensure this option is only available for testing/debugging and not used in production.
- Do not introduce special flags or remove asserts related to shutdown checkpoint handling.
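To make the proposal concrete, here is a minimal sketch of how a test might opt in to the option when assembling its connection config string. The exact spelling has not been decided in this ticket: the nested form `debug=(disable_shutdown_checkpoint=true)` and the helper name `with_shutdown_checkpoint_disabled` are assumptions for illustration only, not an existing WiredTiger API.

```python
# Hypothetical option string. The ticket only suggests that it could live
# under "debug="; the nested syntax below is an assumption.
DEBUG_OPTION = "debug=(disable_shutdown_checkpoint=true)"


def with_shutdown_checkpoint_disabled(base_config: str, enabled: bool = True) -> str:
    """Return base_config with the hypothetical debug option appended.

    When enabled is False the config is returned unchanged, so production
    callers never pick up the option by accident.
    """
    if not enabled:
        return base_config
    # Config strings are comma-separated; avoid a leading comma when the
    # base config is empty.
    return f"{base_config},{DEBUG_OPTION}" if base_config else DEBUG_OPTION


if __name__ == "__main__":
    print(with_shutdown_checkpoint_disabled("create,cache_size=100MB"))
```

Keeping the option in a helper that defaults to "unchanged" when disabled fits the requirement that it be used only by tests and debugging sessions, never by the server in production.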
Original Slack thread: Slack Thread
This ticket was generated by AI from a Slack thread.
- is related to
  - SERVER-121269 Request to implement plh_get_page_ids for improved page discard coverage (Closed)
- related to
  - WT-16876 Assertion failure in PALITE due to multiple discarded root pages during shutdown checkpoint (Closed)
  - WT-16870 Review the workflow of reopen_disagg_conn to disable shutdown checkpoint (Backlog)
  - WT-17110 Update test_layered27.py to use new debug option for skipping checkpoint close (Closed)