[SERVER-71915] In catalog shard mode, there is a race between Query Analysis Sampler and Query Analysis Coordinator. Created: 06/Dec/22 Updated: 29/Oct/23 Resolved: 22/Mar/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.0.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kshitij Gupta | Assignee: | Wenqin Ye |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Sharding NYC
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Sprint: | Sharding NYC 2023-03-06, Sharding NYC 2023-03-20, Sharding NYC 2023-04-03 | ||||||||
| Participants: | |||||||||
| Description |
|
The query analysis sampler is only supposed to run on the shard servers; it refreshes configurations, which adds samplers. The query analysis coordinator is only supposed to run on the config server and, on startup, checks that there aren't any samplers. In catalog shard mode, there is a race where the sampler adds samplers before the coordinator starts up, causing the invariant check to fail. As an unblocker, this failure was avoided by conditioning the check on featureFlagCatalogShard being turned off.
We should figure out a long-term solution for this. |
| Comments |
| Comment by Githook User [ 21/Mar/23 ] |
|
Author: wenqinYe (wenqin908@gmail.com) Message: |
| Comment by Wenqin Ye [ 16/Mar/23 ] |
|
I believe this race condition is currently not possible. A sampler can only be added if the MongoD is open for connections. The MongoD only opens for connections here on line 902, which is after the QueryAnalysisCoordinator has finished starting up here on line 726. So it is guaranteed that when the QueryAnalysisCoordinator starts up, `_samplers` will be empty and the invariant will always hold. I also ran a patch with the feature flag check around the invariant removed, and the invariant is never hit: https://spruce.mongodb.com/version/6412408ad1fe07bdbb51eeae/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC |
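
The ordering argument in the comment above can be illustrated with a small, simplified model. This is only a sketch under stated assumptions, not the server's actual code: `Node`, `Coordinator`, `handle_sampler_refresh`, and `InvariantFailure` are hypothetical names invented for the illustration, and the only behavior taken from the ticket is that the coordinator checks for an empty `_samplers` on startup and that samplers can only be added once the node is open for connections.

```python
class InvariantFailure(Exception):
    """Hypothetical stand-in for a failed server invariant."""


class Coordinator:
    """Hypothetical, simplified model of the query analysis coordinator."""

    def __init__(self):
        self._samplers = set()

    def on_startup(self):
        # The check described in the ticket: no samplers may exist yet.
        if self._samplers:
            raise InvariantFailure("samplers were added before coordinator startup")

    def add_sampler(self, name):
        self._samplers.add(name)


class Node:
    """Hypothetical node that acts as both config server and shard."""

    def __init__(self):
        self.coordinator = Coordinator()
        self.accepting_connections = False

    def start(self):
        # Step 1: the coordinator starts while the node is still closed.
        self.coordinator.on_startup()
        # Step 2: only afterwards does the node open for connections.
        self.accepting_connections = True

    def handle_sampler_refresh(self, sampler_name):
        # A refresh arrives over a connection, so it cannot add a sampler
        # before the node is open for connections.
        if not self.accepting_connections:
            return False
        self.coordinator.add_sampler(sampler_name)
        return True


node = Node()
refresh_before_start = node.handle_sampler_refresh("shard0")  # rejected
node.start()                                                  # invariant holds
refresh_after_start = node.handle_sampler_refresh("shard0")   # accepted
```

In this model the invariant can only fail if a sampler is added before `start()` runs, which the connection gate rules out. That mirrors the reasoning in the comment, under the assumption that adding a sampler always requires an open connection.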