  1. Core Server
  2. SERVER-32871

ReplicaSetMonitorRemoved and ShardNotFound errors on fanout query after removing a shard

    • Fully Compatible
    • ALL
    • v4.2, v4.0
    • Steps to reproduce:
      • Shard a collection, test.test, across shard1, shard2, etc.
      • Make sure that no queries, inserts, or other operations touch test.test during or after the shard removal
      • On mongos1, run {removeShard: "shard1"}
      • Wait until the removal is complete (i.e. removeShard reports that the removal is complete)
      • On each mongos, call db.test.count()
    • Sharding 2019-09-09

      We've noticed that after removing a shard, fanout queries (e.g. a count against a sharded collection) return ReplicaSetMonitorRemoved or ShardNotFound errors. From our investigation, it looks like the internal chunk cache holds a stale config: getShardVersion on the collection returns an old version. It appears that as long as no non-fanout queries (or inserts/removes) are issued after the removal has completed, fanout queries on some mongos have a relatively high chance of failing consistently.
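      The suspected mechanism can be sketched with a toy model in plain JavaScript. This is not MongoDB's routing code: ToyConfigServer, ToyMongos, refresh, fanoutCount, and targetedOp are hypothetical names invented for illustration, and the rule that a targeted operation triggers a cache refresh while a fanout does not is an assumption drawn from the behavior described above.

```javascript
// Toy model only: every name here is hypothetical, not a MongoDB internal.
class ToyConfigServer {
  constructor(docsByShard) {
    this.version = 1; // stands in for the collection's routing version
    this.docsByShard = new Map(Object.entries(docsByShard));
  }

  // Drain a shard into a recipient, drop it, and bump the version.
  removeShard(name, recipient) {
    const moved = this.docsByShard.get(name);
    this.docsByShard.delete(name);
    this.docsByShard.set(recipient, this.docsByShard.get(recipient) + moved);
    this.version += 1;
  }
}

class ToyMongos {
  constructor(config) {
    this.config = config;
    this.refresh();
  }

  // Reload the cached routing table from the config server.
  refresh() {
    this.cachedVersion = this.config.version;
    this.cachedShards = [...this.config.docsByShard.keys()];
  }

  // Fanout: target every shard in the cached table, with no version check
  // (assumption: this is why the stale entry is never corrected).
  fanoutCount() {
    let total = 0;
    for (const shard of this.cachedShards) {
      if (!this.config.docsByShard.has(shard)) {
        throw new Error(`ShardNotFound: ${shard}`);
      }
      total += this.config.docsByShard.get(shard);
    }
    return total;
  }

  // Targeted operation: assumed to notice the version mismatch and refresh.
  targetedOp() {
    if (this.cachedVersion !== this.config.version) {
      this.refresh();
    }
  }
}

const config = new ToyConfigServer({ shard1: 10, shard2: 20 });
const mongos1 = new ToyMongos(config);
const mongos2 = new ToyMongos(config);

config.removeShard("shard1", "shard2");
mongos1.refresh(); // assume the mongos that ran removeShard sees the change

let staleError = null;
try {
  mongos2.fanoutCount(); // mongos2 still routes to shard1
} catch (e) {
  staleError = e.message;
}

mongos2.targetedOp(); // a non-fanout operation refreshes the cache
const countAfterRefresh = mongos2.fanoutCount();

console.log(staleError, countAfterRefresh);
```

      In this model, mongos2 keeps failing its fanout count until some targeted operation forces a refresh, which matches the observation that the errors persist only while no non-fanout traffic reaches the collection.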

        1. logs.txt (284 kB)
        2. logs-3.6.txt (474 kB)
        3. remove4.js (3 kB)
        4. remove4-3.6.js (3 kB)

            Assignee:
            matthew.saltz@mongodb.com Matthew Saltz (Inactive)
            Reporter:
            bartle David Bartley
            Votes:
            0
            Watchers:
            12

              Created:
              Updated:
              Resolved: