[SERVER-46062] Prevent ChunkManagerTargeter from accessing all shard versions before targeting a write Created: 10/Feb/20  Updated: 29/Oct/23  Resolved: 11/Feb/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Improvement
Priority: Major - P3
Reporter: Blake Oler
Assignee: Blake Oler
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Sharding 2020-02-24
Participants:

 Description   

The cluster writer calls ChunkManagerTargeter::targetCollection() to determine whether a write targets the config server or the shard servers. In doing so, the targeter retrieves the shard version for every shard, because a shard version is required to construct a ShardEndpoint object. However, the cluster writer never consumes the ShardEndpoint data: it only checks whether any of the targeted endpoints is the config server, then discards the objects.

This has an unintended side effect. Since PM-1633, retrieving the shard version of a shard that has been marked stale throws an exception. Consequently, any attempt to target a write through the cluster writer stalls on a catalog cache refresh whenever any shard is stale, regardless of whether that particular write targets a stale shard.
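To make the stall concrete, here is a minimal, self-contained sketch of the behaviour described above. Every name in it (StaleShardError, SketchEndpoint, SketchRoutingInfo, targetCollectionSketch) is a hypothetical stand-in invented for illustration, not the real server code; the only thing it models is that building an endpoint for every shard reads every shard version, so a single stale shard makes the whole call throw.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real server classes.
struct StaleShardError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct SketchEndpoint {
    std::string shardName;
    int shardVersion;
};

class SketchRoutingInfo {
public:
    SketchRoutingInfo(std::map<std::string, int> versions, std::set<std::string> staleShards)
        : _versions(std::move(versions)), _staleShards(std::move(staleShards)) {}

    // Models the behaviour described in the ticket: retrieving the version
    // of a shard that is marked stale throws.
    int getVersion(const std::string& shardName) const {
        if (_staleShards.count(shardName) > 0) {
            throw StaleShardError("shard marked stale: " + shardName);
        }
        return _versions.at(shardName);
    }

    const std::map<std::string, int>& versions() const {
        return _versions;
    }

private:
    std::map<std::string, int> _versions;
    std::set<std::string> _staleShards;
};

// Builds an endpoint for every shard, so it reads every shard version,
// even when the caller only cares about which shards are involved.
std::vector<SketchEndpoint> targetCollectionSketch(const SketchRoutingInfo& info) {
    std::vector<SketchEndpoint> endpoints;
    for (const auto& entry : info.versions()) {
        endpoints.push_back(SketchEndpoint{entry.first, info.getVersion(entry.first)});
    }
    return endpoints;
}
```

In this sketch, marking any one shard stale is enough to make targetCollectionSketch() throw, even if the caller would have discarded the versions anyway.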

Note that we aren't using ::targetCollection() for its intended purpose: we collect shard versions that we never use. Fortunately, the cluster writer is the only caller of ::targetCollection(), so removing ::targetCollection() entirely prevents a shard version lookup from triggering a refresh.

I propose removing ::targetCollection() and replacing it with a new function on NSTargeter/ChunkManagerTargeter, endpointIsConfigServer(), which returns a boolean indicating whether the targeted endpoints represent the config server. This retains the existing logic but skips creating ShardEndpoints, and so avoids the shard version issue altogether.
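A rough sketch of the shape of this proposal, under stated assumptions: SketchTargeter and kConfigServerShardIdSketch are hypothetical names, the "config" shard id is an assumption, and the targeted shard ids are passed in directly rather than derived from routing information. It is not the committed implementation; it only shows that the config-server check can be answered from shard ids alone, without reading any shard version.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Assumption for illustration: the config server is identified by this shard id.
const std::string kConfigServerShardIdSketch = "config";

// Hypothetical stand-in for a targeter; not the real ChunkManagerTargeter.
class SketchTargeter {
public:
    explicit SketchTargeter(std::set<std::string> targetedShardIds)
        : _targetedShardIds(std::move(targetedShardIds)) {}

    // Answers the only question the cluster writer asks. No endpoint objects
    // are built and no shard version is read, so a stale shard cannot make
    // this check throw.
    bool endpointIsConfigServer() const {
        return _targetedShardIds.count(kConfigServerShardIdSketch) > 0;
    }

private:
    std::set<std::string> _targetedShardIds;
};
```

Because the check depends only on shard ids, the stale-shard state of any individual shard never enters the picture.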

I'm proud to report that from local testing, implementing this change gives a 6% performance improvement in targeted performance workloads.



 Comments   
Comment by Githook User [ 11/Feb/20 ]

Author: Blake Oler (BlakeIsBlake) <blake.oler@mongodb.com>

Message: SERVER-46062 Prevent ChunkManagerTargeter from accessing all shard versions before targeting a write
Branch: master
https://github.com/mongodb/mongo/commit/df1bdc87c2dca27e3f64eb3fa23411a0c0b758db

Generated at Thu Feb 08 05:10:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.