SERVER-46845 (Core Server)

Shard which received a StaleShardVersion can get stuck indefinitely in a moveChunk command


Details

    • Fully Compatible
    • ALL
    • v4.4
    • Sharding 2020-03-23, Sharding 2020-04-06, Sharding 2020-04-20
    • 15

    Description

      When a shard updates its knowledge of its shard version, post migration commit, it logs a message which looks like this:

      [ShardedClusterFixture:job0:shard1:primary] 2020-02-04T15:18:15.510+0000 I  COMMAND  [conn291] command admin.$cmd appName: "tid:54" command: getMore { getMore: 3975999160804323048, collection: "$cmd.aggregate", lsid: { id: UUID("d2eb0b2e-2ff8-4263-ab12-e5f9514ff6a4"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, $clusterTime: { clusterTime: Timestamp(1580829495, 39), signat
      [ShardedClusterFixture:job0:shard1:primary] 2020-02-04T15:18:11.560+0000 I  SHARDING [conn55] Updating metadata for collection config.system.sessions from collection version: 15|0||5e398add924cca4d6c4487b2, shard version: 0|0||5e398add924cca4d6c4487b2 to collection version: 16|0||5e398add924cca4d6c4487b2, shard version: 16|0||5e398add924cca4d6c4487b2 due to version change
      

      This log line comes from here, and if we look inside CollectionMetadata::toStringBasic(), the call that logs the current shard version invokes ChunkManager::getVersion(ShardId).

      If the CatalogCache's entry for a collection gets invalidated with the local shard id while a migration is running concurrently, the completion of the chunk migration can get stuck indefinitely: ChunkManager::getVersion will keep throwing ShardInvalidatedForTargetingInfo exceptions, and the operation will keep getting retried under refreshFilteringMetadataUntilSuccess.
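
The failure mode described above can be illustrated with a toy model. This is only a sketch in Python, not the server's C++ code: ToyChunkManager, to_string_basic, and refresh_until_success are hypothetical stand-ins named after the identifiers in this ticket, and the sketch assumes that nothing inside the retried step clears the invalidation. The retry count is bounded here only so that the example terminates.

```python
class ShardInvalidatedForTargetingInfo(Exception):
    """Toy stand-in for the server exception of the same name."""


class ToyChunkManager:
    """Hypothetical model of routing info with per-shard invalidation."""

    def __init__(self, shard_versions):
        self._shard_versions = dict(shard_versions)
        self._invalidated = set()

    def invalidate_shard(self, shard_id):
        # Models the cache entry being marked stale for one shard.
        self._invalidated.add(shard_id)

    def get_version(self, shard_id):
        # Throws for as long as the shard stays marked as invalidated.
        if shard_id in self._invalidated:
            raise ShardInvalidatedForTargetingInfo(shard_id)
        return self._shard_versions[shard_id]


def to_string_basic(chunk_manager, shard_id):
    # Building the log message needs the shard version, so it can throw.
    return f"shard version: {chunk_manager.get_version(shard_id)}"


def refresh_until_success(refresh_once, max_attempts):
    """Retry refresh_once; return the attempt that succeeded, else None.

    The loop described in the ticket has no attempt limit; max_attempts
    exists only to keep this sketch finite.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            refresh_once()
            return attempt
        except ShardInvalidatedForTargetingInfo:
            continue  # nothing here clears the invalidation, so it repeats
    return None


cm = ToyChunkManager({"shard1": "16|0"})
cm.invalidate_shard("shard1")
# Every attempt fails inside the logging call, so no attempt succeeds.
stuck = refresh_until_success(lambda: to_string_basic(cm, "shard1"), max_attempts=5)
# stuck is None
```

In this model the refresh can only make progress if the log call is removed from the retried step or if something else clears the invalidation, which matches the observation that the hang depends on whether this line is logged.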

      I think the bug is currently not happening because, for some reason, this line is no longer logged in the test output after the logging changes were committed. For example here.

          People

            blake.oler@mongodb.com Blake Oler
            kaloian.manassiev@mongodb.com Kaloian Manassiev