[SERVER-71915] In catalog shard mode, there is a race between Query Analysis Sampler and Query Analysis Coordinator. Created: 06/Dec/22  Updated: 29/Oct/23  Resolved: 22/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Task Priority: Major - P3
Reporter: Kshitij Gupta Assignee: Wenqin Ye
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-71818 Use the new cluster role class in the... Closed
Assigned Teams:
Sharding NYC
Backwards Compatibility: Fully Compatible
Sprint: Sharding NYC 2023-03-06, Sharding NYC 2023-03-20, Sharding NYC 2023-04-03
Participants:

 Description   

The query analysis sampler is only supposed to run on the shard servers, and it refreshes configurations, which adds samplers. The query analysis coordinator is only supposed to run on the config server and, on startup, checks that there aren't any samplers. In catalog shard mode, there is a race in which the sampler adds samplers before the coordinator starts up, causing the invariant check to fail. As an unblocker, this failure was avoided by performing the check only when featureFlagCatalogShard is turned off.

 

We should figure out a long-term solution for this.
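
To make the unblocker concrete, here is a minimal sketch of a startup check that is skipped when a feature flag is on. It is an illustration only: the function name `check_no_samplers_on_startup`, its parameters, and the dictionary of samplers are hypothetical and are not taken from the server code.

```python
def check_no_samplers_on_startup(samplers, catalog_shard_enabled):
    """Hypothetical model of the coordinator's startup check (not server code)."""
    if catalog_shard_enabled:
        # Unblocker described in the ticket: skip the check in catalog shard mode.
        return
    # Outside catalog shard mode, no sampler may exist when the coordinator starts.
    assert len(samplers) == 0, "samplers registered before coordinator startup"


# With the flag on, a sampler that was added early is tolerated.
check_no_samplers_on_startup({"shard0": object()}, catalog_shard_enabled=True)
# With the flag off, an empty sampler map passes the check.
check_no_samplers_on_startup({}, catalog_shard_enabled=False)
```

The gate hides the failure rather than removing the race, which is why the ticket asks for a long-term solution.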



 Comments   
Comment by Githook User [ 21/Mar/23 ]

Author:

{'name': 'wenqinYe', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'}

Message: SERVER-71915: In catalog shard mode, there is a race between Query Analysis Sampler and Query Analysis Coordinator
Branch: master
https://github.com/mongodb/mongo/commit/b138e8951449fad26e9e1f8a6f39d269c64e53da

Comment by Wenqin Ye [ 16/Mar/23 ]

I believe this race condition is currently not possible.

A sampler can only be added if the MongoD is open for connections. The MongoD only opens for connections here on line 902, which is after the QueryAnalysisCoordinator has finished starting up here on line 726. So it is guaranteed that `_samplers` will be empty when the QueryAnalysisCoordinator starts up, and the invariant will always hold.
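
The ordering argument above can be sketched as a toy model. This is a simplified illustration, not the server's startup code: `ToyCoordinator`, `ToyNode`, and their methods are hypothetical names, and the only thing modeled is that the coordinator's startup check runs before the node accepts connections.

```python
class ToyCoordinator:
    """Hypothetical stand-in for the QueryAnalysisCoordinator."""

    def __init__(self):
        self.samplers = {}
        self.started = False

    def on_startup(self):
        # Models the invariant from the ticket: no samplers may exist yet.
        assert not self.samplers, "samplers registered before coordinator startup"
        self.started = True

    def add_sampler(self, name):
        self.samplers[name] = True


class ToyNode:
    """Hypothetical stand-in for a MongoD acting as both config server and shard."""

    def __init__(self):
        self.coordinator = ToyCoordinator()
        self.accepting_connections = False

    def start(self):
        # Step 1: the coordinator starts up and runs its check.
        self.coordinator.on_startup()
        # Step 2: only afterwards does the node open for connections.
        self.accepting_connections = True

    def handle_sampler_refresh(self, name):
        # A sampler can only register through an open connection.
        if not self.accepting_connections:
            raise ConnectionError("node is not accepting connections yet")
        self.coordinator.add_sampler(name)


node = ToyNode()
node.start()
node.handle_sampler_refresh("shard0")
```

Under this ordering, any refresh that arrives before `start()` completes is rejected, so the sampler map is necessarily empty when the startup check runs.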

I also ran a patch with the feature flag check around the invariant removed, and the invariant is never hit: https://spruce.mongodb.com/version/6412408ad1fe07bdbb51eeae/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

Generated at Thu Feb 08 06:20:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.