[SERVER-31595] Generate shardMaps outside MODE_X collection lock Created: 17/Oct/17  Updated: 27/Oct/23  Resolved: 12/Nov/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Kevin Pulo Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Gone away Votes: 0
Labels: ShardingRoughEdges, former-quick-wins, gm-ack, max-triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-30148 Move force primary refresh functional... Closed
is related to SERVER-31428 Poor performance when many concurrent... Closed
Assigned Teams:
Sharding
Participants:
Case:

 Description   

SERVER-31428 moved the generation of the CollectionMetadata chunksMap into the MODE_X collection lock section, as a way of serialising concurrent metadata refreshes and thereby avoiding the performance penalty of every refreshing thread redundantly generating the map at the same time. However, all that is actually required is that the map is generated only once, by one thread; i.e. it isn't necessary to hold the MODE_X collection lock to create this data structure. Therefore it would be better if the map were created outside the collection lock (by just one thread), to reduce the time that the lock is held.

Most likely this will involve a variant of CatalogCache::getCollectionRoutingInfo that accepts a lambda as a callback, which is guaranteed to be invoked by only a single thread; this lambda is what updates the metadata on the CollectionShardingState.
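
A minimal sketch of the idea, not the server implementation: the names below (ChunksMap, CollectionState, SingleRefresh, runConcurrentRefresh) are hypothetical stand-ins, a std::mutex stands in for the MODE_X collection lock, and std::call_once is assumed as the mechanism that lets only one thread build the map. The expensive build runs with no lock held; the install callback takes the lock only to swap in the finished map.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the real server types.
using ChunksMap = std::map<std::string, std::string>;  // chunk max key -> shard id

struct CollectionState {
    std::mutex exclusiveLock;                   // stands in for the MODE_X collection lock
    std::shared_ptr<const ChunksMap> metadata;  // currently installed metadata
};

// One instance represents one refresh round. std::call_once lets exactly one
// caller run the body; the other callers block until it has finished.
class SingleRefresh {
public:
    using Build = std::function<ChunksMap()>;
    using Install = std::function<void(std::shared_ptr<const ChunksMap>)>;

    void run(const Build& build, const Install& install) {
        std::call_once(_once, [&] {
            // Expensive step: no collection lock is held here.
            auto built = std::make_shared<const ChunksMap>(build());
            // Cheap step: the callback takes the lock only to swap the pointer.
            install(std::move(built));
        });
    }

private:
    std::once_flag _once;
};

struct RefreshStats {
    int builds;
    int installs;
    std::size_t installedChunks;
};

// Starts numThreads concurrent refreshers and reports how often each step ran.
RefreshStats runConcurrentRefresh(int numThreads) {
    CollectionState coll;
    SingleRefresh refresh;
    std::atomic<int> builds{0};
    std::atomic<int> installs{0};

    auto build = [&]() -> ChunksMap {
        ++builds;
        return {{"k100", "shard0"}, {"k200", "shard1"}};
    };
    auto install = [&](std::shared_ptr<const ChunksMap> built) {
        assert(built != nullptr);
        std::lock_guard<std::mutex> lk(coll.exclusiveLock);
        coll.metadata = std::move(built);
        ++installs;
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&] { refresh.run(build, install); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return {builds.load(), installs.load(),
            coll.metadata ? coll.metadata->size() : std::size_t{0}};
}
```

Calling runConcurrentRefresh(8) builds and installs the map once, however many threads race. In this sketch the exclusive lock is held only for the pointer swap, which is the reduction in lock hold time the description asks for.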



 Comments   
Comment by Max Hirschhorn [ 12/Nov/21 ]

The changes from 0d07bf5 as part of SERVER-40258 made it so only a collection IX lock was needed to build the ChunkManager. With the advent of the dedicated RecoverRefreshThread (e.g. SERVER-45983), no locks are held during the shard version refresh itself - only when installing the new collection metadata.

Comment by Dianna Hohensee (Inactive) [ 27/Oct/17 ]

Kal and I were talking about the routing table refresh logic recently, and agreed that the active CollectionMetadata object can probably be updated under an IX lock rather than an X lock. We believe it is safe, but aren't totally sure. We'd like to add to this ticket the task of investigating whether this is possible and then updating the locks, since it alleviates the problem this ticket is trying to address.

Generated at Thu Feb 08 04:27:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.