[SERVER-26030] ConfigServerMetadata is not returned on all responses from shards to mongos Created: 08/Sep/16 Updated: 08/Jan/24 Resolved: 02/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Sharding
|
||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Platforms 2016-09-19, Platforms 2016-10-10 | ||||||||
| Participants: | |||||||||
| Description |
|
Certain commands, such as count, still use DBClient to run commands on the shards, which uses the legacy OP_QUERY form of running commands. On the shards, when we are using OP_QUERY, commands use a LegacyReplyBuilder to generate the command response. LegacyReplyBuilder::setMetadata only looks for ShardingMetadata and throws out all other metadata written to the metadata response (including ConfigServerMetadata, where the config server optime is given). We either need to add handling of ConfigServerMetadata to LegacyReplyBuilder::setMetadata, or switch to using OP_COMMAND for all intra-cluster communication. The fact that the ConfigServerMetadata is not included in some responses from shards can result in extra round trips being necessary on some operations. For example, if you run an aggregation and the shard returns a StaleConfigException, the mongos will attempt to refresh its routing table, but may not get new-enough results from the config server to pick up the new version needed, and thus may re-run the aggregation with the same version, resulting in yet another StaleConfigException. This would continue until the mongos is able to actually pick up the new information from config.chunks, or hits the max number of retries. |
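The retry behaviour described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not MongoDB code: the metadata keys `"sharding"` and `"configsvr"`, the helper names, the optime and version numbers, and the retry cap are all hypothetical. It compares a reply builder that drops everything except sharding metadata with one that passes the config server optime through, and counts how many attempts a simplified mongos needs before its routing table matches the shard.

```python
# Toy model only: names, keys, and numbers are illustrative assumptions,
# not MongoDB's real wire format or retry limit.

MAX_RETRIES = 10      # hypothetical retry cap
SHARD_VERSION = 2     # chunk version the shard expects
CONFIG_HISTORY = {4: 1, 5: 2}  # config optime -> chunk version visible at that optime


def legacy_set_metadata(metadata):
    """Mimic the filtering described in the ticket: keep sharding metadata only."""
    return {k: v for k, v in metadata.items() if k == "sharding"}


def full_set_metadata(metadata):
    """Pass all metadata through, including the config server optime."""
    return dict(metadata)


def refresh(after_optime, node_optime):
    """Read the chunk version from a config node.

    If the caller knows a newer optime than the node has applied, the read
    is modelled as waiting until the node reaches that optime.
    """
    return CONFIG_HISTORY[max(after_optime, node_optime)]


def attempts_needed(reply_filter, lag_rounds):
    """Return how many attempts the toy mongos makes before succeeding.

    The config node that the mongos reads from stays at optime 4 for the
    first `lag_rounds` refreshes. Returns None if the retry cap is hit.
    """
    known_optime = 0
    routing_version = 1
    for attempt in range(1, MAX_RETRIES + 1):
        if routing_version == SHARD_VERSION:
            return attempt  # the command runs with an up-to-date version
        # The shard rejects the command as stale and attaches metadata.
        meta = reply_filter({
            "sharding": {"version": SHARD_VERSION},
            "configsvr": {"opTime": 5},
        })
        known_optime = max(known_optime, meta.get("configsvr", {}).get("opTime", 0))
        node_optime = 5 if attempt > lag_rounds else 4
        routing_version = refresh(known_optime, node_optime)
    return None


if __name__ == "__main__":
    print("with config optime:", attempts_needed(full_set_metadata, 3))
    print("legacy filtering:  ", attempts_needed(legacy_set_metadata, 3))
```

In this model the mongos that receives the config optime recovers after a single refresh, while the one behind the legacy filter keeps retrying until the lagging config node catches up on its own, or gives up at the cap.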
| Comments |
| Comment by Sheeri Cabral (Inactive) [ 02/Jan/20 ] |
|
We've removed most uses of DBClient and don't think this is a source of a lot of wasted overhead. |
| Comment by Andy Schwerin [ 02/Jun/17 ] |
|
I think we might want to keep ConfigServerMetadata, so that the granularity of the readAfter:clusterTime reads done against the config server use the oldest legal cluster time for operations against the config server, in order to minimize stalls waiting for the config server clock to advance. |
| Comment by Misha Tyulenev [ 02/Jun/17 ] |
|
The change will happen automatically once ConfigServerMetadata is deprecated and replaced with readAfter:clusterTime. The latter is possible only if the change is requested for majority readConcern, because clusterTime does not include the optime term and hence can be rolled back. It does not seem to be directly connected to the Causal Consistency project, and there are no plans to make the requested changes in the current release, hence moving it to the backlog. |
| Comment by Mira Carey [ 11/Oct/16 ] |
|
Marking as 3.3 desired for the downconversion of metadata for 3.4.X portion. I'll farm out a separate ticket for 3.5 for removing dbclient |
| Comment by Justin Cohler [ 19/Sep/16 ] |
|
mira.carey@mongodb.com - can you follow up with Spencer on this? |
| Comment by Randolph Tan [ 08/Sep/16 ] |
|
spencer I can't think of anything off the top of my head aside from chunk metadata commands (split/merge/move). I took a quick peek at the auto split code and it appears to be using the task executor to send the splitChunk command, so it should be using OP_COMMAND... |
| Comment by Spencer Brody (Inactive) [ 08/Sep/16 ] |
|
I don't think this can cause a correctness bug, as that could only happen if one of the commands that is using OP_QUERY resulted in the shard performing a write to the config servers, and I think that can only happen from the autosplit logic in write commands, and write commands use OP_COMMAND. That said, it's still a bit worrisome... renctan, can you think of any other times a mongos can run a command against a shard and then the shard would perform a write to the config server as part of that? |