  Core Server / SERVER-26030

ConfigServerMetadata is not returned on all responses from shards to mongos

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Networking, Sharding
    • Labels: None
    • Operating System: ALL
    • Sprint: Platforms 2016-09-19, Platforms 2016-10-10

      Certain commands, such as count, still use DBClient to run commands on the shards, which uses the legacy OP_QUERY form of running commands. On the shards, commands received via OP_QUERY use a LegacyReplyBuilder to generate the command response. LegacyReplyBuilder::setMetadata only looks for ShardingMetadata and throws away all other metadata written to the metadata response (including ConfigServerMetadata, which carries the config server optime). We either need to add handling of ConfigServerMetadata to LegacyReplyBuilder::setMetadata, or switch to using OP_COMMAND for all intra-cluster communication.
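
      Below is a minimal, self-contained sketch of the downconversion problem, not the actual LegacyReplyBuilder code. It models a reply body and the metadata section as plain string maps, assumes $gleStats / $configServerState as illustrative wire field names for ShardingMetadata / ConfigServerMetadata, and the buildLegacyReply* helpers are hypothetical. The "current" path forwards only the sharding metadata, while the "proposed" path also forwards the config server optime.

      #include <iostream>
      #include <map>
      #include <string>

      // Stand-in for a BSON document: field name -> serialized value.
      using Doc = std::map<std::string, std::string>;

      // Models today's behaviour: only the sharding metadata survives the
      // downconversion into an OP_QUERY-style reply.
      Doc buildLegacyReplyCurrent(const Doc& body, const Doc& metadata) {
          Doc reply = body;
          auto it = metadata.find("$gleStats");           // ShardingMetadata
          if (it != metadata.end())
              reply.insert(*it);
          // Everything else in 'metadata' (including "$configServerState",
          // which carries the config server optime) is silently dropped.
          return reply;
      }

      // Sketch of the first proposed fix: also forward the config server
      // optime so mongos can use it when refreshing its routing table.
      Doc buildLegacyReplyProposed(const Doc& body, const Doc& metadata) {
          Doc reply = buildLegacyReplyCurrent(body, metadata);
          auto it = metadata.find("$configServerState");  // ConfigServerMetadata
          if (it != metadata.end())
              reply.insert(*it);
          return reply;
      }

      int main() {
          Doc body = {{"n", "42"}, {"ok", "1"}};
          Doc metadata = {{"$gleStats", "{...}"},
                          {"$configServerState", "{opTime: {ts: ..., t: 3}}"}};

          std::cout << "current keeps $configServerState?  "
                    << buildLegacyReplyCurrent(body, metadata).count("$configServerState")
                    << "\nproposed keeps $configServerState? "
                    << buildLegacyReplyProposed(body, metadata).count("$configServerState")
                    << "\n";
      }

      The alternative fix, moving all intra-cluster traffic to OP_COMMAND, avoids the downconversion entirely because the metadata section is carried natively.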

      The fact that ConfigServerMetadata is not included in some responses from shards can result in extra round trips on some operations. For example, if you run an aggregation and the shard returns a StaleConfigException, the mongos will attempt to refresh its routing table, but it may not get new-enough results from the config server to pick up the needed version, and so may re-run the aggregation with the same version, resulting in yet another StaleConfigException. This continues until the mongos actually picks up the new information from config.chunks, or hits the maximum number of retries.
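
      The following is a toy model of that retry loop, not the router code; the names (kMaxStaleConfigRetries, readRoutingInfo, runOnShard) and the optime values are invented for illustration. Without the gossiped config optime, each refresh can keep returning the same stale routing table until the retry budget is exhausted; with it, the refresh can insist on reading at (or after) that optime and the retry succeeds.

      #include <algorithm>
      #include <cstdint>
      #include <iostream>

      constexpr int kMaxStaleConfigRetries = 10;  // illustrative retry budget

      struct ConfigServer {
          std::int64_t committedOpTime = 5;  // optime at which the new chunk version is visible
          std::int64_t lagBehind = 3;        // how stale a plain read may be
          // Read routing info; 'minOpTime' is the optime the caller insists on
          // having seen (0 means "whatever happens to be visible right now").
          std::int64_t readRoutingInfo(std::int64_t minOpTime) const {
              std::int64_t visible = committedOpTime - lagBehind;
              return std::max(visible, std::min(minOpTime, committedOpTime));
          }
      };

      // Returns true if the operation eventually ran with a new-enough version.
      bool runOnShard(const ConfigServer& cfg, std::int64_t requiredOpTime,
                      std::int64_t gossipedOpTime /* 0 if the metadata was dropped */) {
          std::int64_t routingOpTime = 0;
          for (int attempt = 1; attempt <= kMaxStaleConfigRetries; ++attempt) {
              if (routingOpTime >= requiredOpTime) {
                  std::cout << "  attempt " << attempt << ": success\n";
                  return true;
              }
              std::cout << "  attempt " << attempt << ": StaleConfigException, refreshing\n";
              // Without the gossiped optime the refresh may return the same stale table.
              routingOpTime = cfg.readRoutingInfo(gossipedOpTime);
          }
          return false;  // gave up after kMaxStaleConfigRetries attempts
      }

      int main() {
          ConfigServer cfg;
          std::cout << "metadata dropped:\n" << (runOnShard(cfg, 5, 0) ? "ok" : "gave up") << "\n";
          std::cout << "metadata gossiped:\n" << (runOnShard(cfg, 5, 5) ? "ok" : "gave up") << "\n";
      }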

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            spencer@mongodb.com Spencer Brody (Inactive)
            Votes:
            0
            Watchers:
            9
