Core Server / SERVER-41093

Primary shard can accept writes for chunk it does not own if _shardsvrShardCollection receives error when writing to config server

    Details

    • Operating System:
      ALL
    • Linked BF Score:
      25

      Description

      During shardCollection, the following sequence is possible:
      1. _configsvrShardCollection is sent to the config server
      2. The config server sends _shardsvrShardCollection to a shard
      3. The shard sends an update to the config server to add the collection to the 'config.collections' collection
      4. That update succeeds on the config server, but the shard receives an error such as NetworkInterfaceExceededTimeLimit, so it never hears back from the config server and cannot tell whether the write succeeded
      5. _shardsvrShardCollection fails on the shard, but the config server now records the collection as sharded, so the shard and the config server disagree about the collection's state
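
The sequence above can be reproduced in a toy model. This is only a sketch and not MongoDB code: `ConfigServer`, `Shard`, `NetworkTimeout`, and the `drop_reply` flag are hypothetical names invented for illustration, and the model assumes the only failure is a reply lost after the write has been applied.

```python
class NetworkTimeout(Exception):
    """Stands in for an error such as NetworkInterfaceExceededTimeLimit."""


class ConfigServer:
    def __init__(self):
        # Toy stand-in for the 'config.collections' collection.
        self.collections = {}

    def upsert_collection(self, ns, drop_reply=False):
        # The write is applied first...
        self.collections[ns] = {"sharded": True}
        # ...and only then is the reply lost on its way back (assumed failure mode).
        if drop_reply:
            raise NetworkTimeout(ns)
        return {"ok": 1}


class Shard:
    def __init__(self, config):
        self.config = config
        self.known_sharded = set()  # the shard's local view of sharded collections

    def shard_collection(self, ns, drop_reply=False):
        try:
            self.config.upsert_collection(ns, drop_reply=drop_reply)
        except NetworkTimeout:
            # The shard cannot tell whether the write was applied, so the command fails.
            return False
        self.known_sharded.add(ns)
        return True


def simulate_lost_ack():
    config = ConfigServer()
    shard = Shard(config)
    ok = shard.shard_collection("test.coll", drop_reply=True)
    # (command succeeded?, config server thinks sharded?, shard thinks sharded?)
    return ok, "test.coll" in config.collections, "test.coll" in shard.known_sharded


print(simulate_lost_ack())  # (False, True, False)
```

The final tuple is the mismatch from step 5: the command reports failure, yet the config server's metadata already marks the collection as sharded.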

      If the shard then receives an unversioned write to that collection, it does not refresh its routing table; it treats the collection as unsharded and writes the document locally. This becomes a problem if another write goes through a different router that does refresh from the config server, finds that the collection is sharded, and writes its data to the correct place. The locally written document may then sit on a shard that does not own its chunk.
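
The difference between the unversioned and the versioned write can be sketched the same way. Everything here is a simplifying assumption: the `StaleShard` class, the key split at 50, the two shard names, and the integer `version` argument are hypothetical, and the real protocol (stale-metadata errors, router retries) is considerably more involved.

```python
def owner(key):
    # Hypothetical chunk layout as recorded by the config server.
    return "shard0" if key < 50 else "shard1"


class StaleShard:
    """Toy shard whose cached metadata still says the collection is unsharded."""

    def __init__(self, name):
        self.name = name
        self.cached_sharded = False  # stale: the config server says sharded
        self.docs = []

    def refresh(self):
        # Stand-in for reloading metadata from the config server.
        self.cached_sharded = True

    def write(self, doc, version=None):
        if version is not None and not self.cached_sharded:
            # A versioned request disagrees with the cached view, forcing a refresh.
            self.refresh()
        if self.cached_sharded and owner(doc["key"]) != self.name:
            return "rejected"
        # Unversioned path: no check, the document is written locally.
        self.docs.append(doc)
        return "accepted"


def simulate_writes():
    shard0 = StaleShard("shard0")
    unversioned = shard0.write({"key": 75})           # no version attached
    versioned = shard0.write({"key": 75}, version=1)  # version attached
    return unversioned, versioned, len(shard0.docs)


print(simulate_writes())  # ('accepted', 'rejected', 1)
```

In this model the unversioned write leaves one document on `shard0` for a key that the config server assigns to `shard1`, while the versioned write for the same key is turned away after the refresh.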

              People

              Assignee:
              backlog-server-sharding-emea Backlog - Sharding EMEA
              Reporter:
              matthew.saltz Matthew Saltz
              Votes:
              0
              Watchers:
              3
