[SERVER-82648] Improve the locking in LogicalTimeValidator::get/set Created: 01/Nov/23  Updated: 05/Feb/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Critical - P2
Reporter: Mark Benvenuto Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: perf-tiger, perf-tiger-handoff, perf-tiger-non-q4, perf-tiger-poc, sharding-nyc-subteam3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Sprint: Cluster Scalability 2023-11-13, Cluster Scalability 2023-11-27, Cluster Scalability 2023-12-11, Cluster Scalability 2023-12-25, Cluster Scalability 2024-1-8, Cluster Scalability 2024-1-22, Cluster Scalability 2024-2-5, Cluster Scalability 2024-2-19
Participants:

 Description   

LogicalTimeValidator::get/set uses a global mutex to protect a shared pointer. C++20 adds std::atomic<std::shared_ptr<T>>, but our v4 toolchain compilers do not yet support it. LogicalTimeValidator is set once or twice in its lifetime, during startup (see SERVER-78036), and never again.
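For illustration, the pattern described above looks roughly like the sketch below. `Validator` and `MutexValidatorHolder` are hypothetical stand-ins invented for this example, not the actual server classes; the point is only that every reader and the rare writer take the same mutex.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for LogicalTimeValidator; not the real server class.
struct Validator {
    int keyGeneration = 0;
};

// Illustrative holder: a single mutex guards a single shared_ptr.
class MutexValidatorHolder {
public:
    // Read path: all readers serialize on the same mutex.
    std::shared_ptr<Validator> get() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _validator;
    }

    // Write path: per the description, used once or twice during startup.
    void set(std::shared_ptr<Validator> v) {
        std::lock_guard<std::mutex> lk(_mutex);
        _validator = std::move(v);
    }

private:
    std::mutex _mutex;
    std::shared_ptr<Validator> _validator;
};
```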

Ideally, a cheap memory reclamation technique such as hazard pointers could be used to protect this shared pointer against concurrent writes. Since nothing like that exists in the code base, an alternative is a partitioned reader/writer lock, so that readers do not all have to touch the same cache line when taking the lock.
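A minimal sketch of the partitioned reader/writer lock idea, assuming C++17 and std::shared_mutex. This is not the POC code: `Validator`, `PartitionedSharedMutex`, `PartitionedValidatorHolder`, the partition count of 16, and the 64-byte alignment are all assumptions made for illustration. Each reader takes a shared lock on one partition chosen by hashing its thread id; the rare writer takes every partition exclusively.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <memory>
#include <shared_mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for LogicalTimeValidator; not the real server class.
struct Validator {
    int keyGeneration = 0;
};

// Illustrative partitioned reader/writer lock. The partition count and the
// 64-byte alignment are assumptions, not values taken from the ticket or POC.
class PartitionedSharedMutex {
public:
    static constexpr std::size_t kPartitions = 16;

    // Readers lock only the partition selected by hashing their thread id.
    std::shared_lock<std::shared_mutex> lockShared() {
        const std::size_t idx =
            std::hash<std::thread::id>{}(std::this_thread::get_id()) % kPartitions;
        return std::shared_lock<std::shared_mutex>(_partitions[idx].mutex);
    }

    // Writers lock every partition, always in index order, so two
    // concurrent writers cannot deadlock against each other.
    void lockExclusive() {
        for (auto& p : _partitions) p.mutex.lock();
    }

    void unlockExclusive() {
        for (auto& p : _partitions) p.mutex.unlock();
    }

private:
    // Padding each mutex to its own 64-byte slot keeps readers on different
    // partitions off each other's cache line (assuming 64-byte lines).
    struct alignas(64) Partition {
        std::shared_mutex mutex;
    };
    std::array<Partition, kPartitions> _partitions;
};

// Illustrative holder using the partitioned lock instead of a single mutex.
class PartitionedValidatorHolder {
public:
    std::shared_ptr<Validator> get() {
        auto lk = _lock.lockShared();
        return _validator;  // copied while the shared lock is still held
    }

    void set(std::shared_ptr<Validator> v) {
        _lock.lockExclusive();
        _validator = std::move(v);
        _lock.unlockExclusive();
    }

private:
    PartitionedSharedMutex _lock;
    std::shared_ptr<Validator> _validator;
};
```

The trade-off in this sketch is that set becomes more expensive (it takes all partitions), which matches a pointer that is written once or twice at startup and read from then on.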

POC: https://github.com/10gen/mongo/commit/b28b01966c174f13527ecc19c0e507d0ba38ab36



 Comments   
Comment by Randolph Tan [ 04/Nov/23 ]

Note: set is called only during startup (for the replSet case, or when sharding is not yet initialized) and during sharding initialization. Even once we only allow sharding topologies, we can't simply get rid of the mutex without handling the case where you create a new shard and add it to the cluster. One possible solution is for the shard to insert the shard identity doc itself, with the help of --configdb, before accepting incoming connections. This would ensure that sharding is always initialized by the time we start accepting connections.

The other alternative is to change the logical time key validation protocol so that it doesn't require the shard to talk to the config server, which would let the validator be created independently of sharding initialization.

Comment by Ryan Scott [ 03/Nov/23 ]

mark.benvenuto@mongodb.com Are there results available that show the perf gains from the POC?

Generated at Thu Feb 08 06:49:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.