[SERVER-64507] Substitute checkpoint mutex in WT KV Engine with smart waiting Created: 15/Mar/22  Updated: 26/Oct/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Jordi Olivares Provencio
Assignee: Backlog - Storage Execution Team
Resolution: Unresolved
Votes: 0
Labels: former-storex-namer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-64523 Move all our checkpoint'ing logic ont... Backlog
Assigned Teams:
Storage Execution
Participants:

 Description   

When a thread attempts to perform a checkpoint, it simply blocks on the _checkpointMutex in wiredtiger_kv_engine.cpp until it acquires the lock.

If two threads attempt to checkpoint at the same time, the two checkpoints are performed one after the other. If possible, we should replace the mutex with something smarter: a thread that detects a checkpoint already in progress would wait for it to finish and then return, so that we don't perform another checkpoint afterwards.
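
A minimal sketch of what such "smart waiting" could look like, assuming plain standard-library primitives rather than the server's own mutex and condition variable types. CheckpointCoalescer, checkpoint() and doCheckpoint are hypothetical names chosen for illustration; nothing here is taken from wiredtiger_kv_engine.cpp. The sketch only covers the waiting behaviour and says nothing about whether the waiter's data is actually covered by the checkpoint it waited on.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>

// Hypothetical helper for illustration; not part of wiredtiger_kv_engine.cpp.
class CheckpointCoalescer {
public:
    // Runs doCheckpoint unless another thread is already checkpointing, in
    // which case the caller blocks until that checkpoint finishes.
    // Returns true if this call performed the checkpoint itself.
    bool checkpoint(const std::function<void()>& doCheckpoint) {
        std::unique_lock<std::mutex> lk(_mutex);
        if (_inProgress) {
            // Wait for the checkpoint that is currently running to complete.
            const std::uint64_t target = _completed + 1;
            _cv.wait(lk, [&] { return _completed >= target; });
            return false;
        }
        _inProgress = true;
        lk.unlock();

        doCheckpoint();  // Assumed not to throw in this sketch.

        lk.lock();
        assert(_inProgress);
        _inProgress = false;
        ++_completed;
        lk.unlock();
        _cv.notify_all();
        return true;
    }

    // Number of checkpoints completed so far.
    std::uint64_t completed() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _completed;
    }

private:
    mutable std::mutex _mutex;
    std::condition_variable _cv;
    bool _inProgress = false;
    std::uint64_t _completed = 0;
};
```

Waiting on a completion counter rather than on the _inProgress flag means a waiter is not confused if a new checkpoint starts before it wakes up. A real implementation would also need to reset the flag if the checkpoint fails.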



 Comments   
Comment by Jordi Olivares Provencio [ 15/Mar/22 ]

Suppose we have the following case:

  1. Thread A begins a checkpoint
  2. Thread B does some operation
  3. Thread B begins a checkpoint attempt, sees A doing a checkpoint and waits for A to finish
  4. Thread A finishes the checkpoint and notifies B
  5. Thread B believes it has checkpointed its data when in fact it hasn't, possibly leading to an inconsistency

I don't know whether this scenario is possible with WT, but we should verify that having Thread B rely on Thread A's checkpoint does not lead to inconsistencies.
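
The five-step interleaving above can be replayed in a toy model to make the concern concrete. This is only a sketch under a loud assumption: the model treats a checkpoint as persisting exactly the state that existed when the checkpoint began. Whether WT checkpoints behave that way is the open question in this comment. ToyEngine and replayScenario are hypothetical names and do not correspond to any WiredTiger or server API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model, not WiredTiger: a checkpoint is ASSUMED to persist the state
// that existed at the moment the checkpoint began.
struct ToyEngine {
    std::vector<int> live;      // in-memory writes
    std::vector<int> snapshot;  // state captured when the checkpoint began
    std::vector<int> durable;   // state persisted by the last checkpoint

    void write(int value) { live.push_back(value); }
    void beginCheckpoint() { snapshot = live; }
    void finishCheckpoint() { durable = snapshot; }

    bool isDurable(int value) const {
        return std::find(durable.begin(), durable.end(), value) != durable.end();
    }
};

// Replays steps 1-5 on a single thread and returns whether Thread B's write
// is durable once Thread A's checkpoint completes.
bool replayScenario() {
    ToyEngine engine;
    engine.write(1);             // data written before any checkpoint
    engine.beginCheckpoint();    // 1. Thread A begins a checkpoint
    engine.write(2);             // 2. Thread B does some operation
                                 // 3. Thread B sees A checkpointing and waits
    engine.finishCheckpoint();   // 4. Thread A finishes and notifies B
    assert(engine.isDurable(1)); // the earlier write is covered
    return engine.isDurable(2);  // 5. is B's write covered?
}
```

In this model replayScenario() returns false: B's write arrived after A's checkpoint began, so B would need a checkpoint that starts after its own write. If real WT checkpoints have the same property, a waiter could only safely skip its own checkpoint when the one it waited on began after the waiter's last operation.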

Generated at Thu Feb 08 06:00:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.