[SERVER-81829] RecoveryUnit SnapshotIDs do not need to increment atomic for each snapshot Created: 03/Oct/23  Updated: 13/Nov/23  Resolved: 13/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution NAMER
Sprint: Execution NAMR Team 2023-10-16, Execution Team 2023-10-30, Execution Team 2023-11-13, Execution Team 2023-11-27
Participants:

 Description   

We use the RecoveryUnit snapshot ID for comparison purposes to tell us two things:

  • Has my snapshot changed on the same RecoveryUnit? (e.g. normal yielding)
  • Has my snapshot changed from a different RecoveryUnit? (e.g. with getMore)

We need snapshot IDs to remain globally unique, but instead of incrementing the global atomic for every snapshot, we can give each RecoveryUnit a globally unique RecoveryUnit ID and a locally unique snapshot counter. This should be sufficient to reduce contention on the Snapshot ID atomic while still maintaining uniqueness.
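A minimal sketch of the proposed scheme, not MongoDB's actual code. The names SketchRecoveryUnit, allocateUnitId, and onNewSnapshot are hypothetical, and the 32/32-bit split packed into one 64-bit value is an assumption inferred from the 32-bit roll-over mentioned in the closing comment. The shared atomic is touched once per RecoveryUnit rather than once per snapshot.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: hands out a process-wide unique ID per RecoveryUnit.
// This is the only place the shared atomic is incremented.
inline std::uint32_t allocateUnitId() {
    static std::atomic<std::uint32_t> next{1};
    return next.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical stand-in for a RecoveryUnit; not the real class.
class SketchRecoveryUnit {
public:
    SketchRecoveryUnit() : _unitId(allocateUnitId()) {}

    // Called when this unit opens a new snapshot. Only the local counter
    // changes, so no shared state is contended.
    void onNewSnapshot() { ++_localCounter; }

    // Packs (unit ID, local counter) into one comparable 64-bit value.
    // Assumed layout: unit ID in the high 32 bits, counter in the low 32.
    std::uint64_t snapshotId() const {
        return (static_cast<std::uint64_t>(_unitId) << 32) | _localCounter;
    }

private:
    const std::uint32_t _unitId;
    std::uint32_t _localCounter = 0;  // wraps after 2^32 snapshots
};
```

Under this layout, two IDs differ in the low half when the snapshot changed on the same RecoveryUnit and in the high half when they come from different RecoveryUnits. Uniqueness holds only until either 32-bit field wraps.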



 Comments   
Comment by Louis Williams [ 13/Nov/23 ]

Closing this as "Won't Do". The performance benefit is negligible. The proposed change also increases the risk of a SnapshotID collision: a collision would require only a 32-bit integer to roll over, where previously it required a 64-bit integer to roll over. Historically, SnapshotID collisions have led to server crashes, so the change is too risky for such a marginal improvement.
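For a rough sense of the difference in roll-over risk, the sketch below estimates how long a counter takes to wrap at a steady rate. The rate of one million increments per second is a purely illustrative assumption, not a figure from this ticket, and secondsToWrap is a hypothetical helper.

```cpp
#include <cmath>

// Hypothetical helper: seconds until a counter of the given bit width
// wraps, assuming a constant number of increments per second.
inline double secondsToWrap(int bits, double incrementsPerSecond) {
    return std::ldexp(1.0, bits) / incrementsPerSecond;  // 2^bits / rate
}
```

At the assumed rate, secondsToWrap(32, 1e6) is about 4295 seconds (roughly 72 minutes), while secondsToWrap(64, 1e6) corresponds to roughly 584,000 years.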

Generated at Thu Feb 08 06:47:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.