SERVER-88191

CheckRoutingTableConsistency might check sharding's catalog in transient state

    • Catalog and Routing
    • Fully Compatible
    • ALL
    • CAR Team 2024-04-01

      SERVER-85441 added a new balancer policy that moves unsharded collections using moveCollection. Internally, moveCollection uses the resharding infrastructure to move data online. As a result, every suite that runs the balancer now triggers resharding at random points, including suites that automatically run the CheckRoutingTableConsistency hook, which checks that every chunk has a matching collection in config.collections.
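To make the invariant concrete, here is a minimal sketch of the kind of check described above, run against in-memory documents rather than a real cluster. It assumes that chunk documents and collection documents are linked by a shared `uuid` field. The helper name `find_orphaned_chunks` and the sample documents are hypothetical and are not taken from the hook's actual code.

```python
def find_orphaned_chunks(chunks, collections):
    """Return the chunks whose collection UUID has no entry in `collections`.

    Assumption: both document types carry a `uuid` field that links a chunk
    to its collection. The real hook may use a different join key.
    """
    known_uuids = {coll["uuid"] for coll in collections}
    return [chunk for chunk in chunks if chunk["uuid"] not in known_uuids]


# Hypothetical catalog contents in which every chunk maps to a collection.
collections = [{"_id": "test.coll", "uuid": "uuid-1"}]
chunks = [{"_id": 1, "uuid": "uuid-1"}, {"_id": 2, "uuid": "uuid-1"}]
consistent_result = find_orphaned_chunks(chunks, collections)

# The same catalog plus one chunk whose collection entry is missing.
chunks_with_orphan = chunks + [{"_id": 3, "uuid": "uuid-2"}]
inconsistent_result = find_orphaned_chunks(chunks_with_orphan, collections)
```

A non-empty result is what such a check would report as a metadata inconsistency.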

      These two things are incompatible because the commit phase of resharding might temporarily leave chunks without a matching collection, so the following interleaving can happen:

      • The balancer issues a moveCollection
      • The test finishes, starting the CheckRoutingTableConsistency hook
      • CheckRoutingTableConsistency might check the sharding catalog before the commit phase of resharding finishes

      This causes a false-positive metadata inconsistency failure. There is an initiative to use CheckMetadataConsistency instead (SERVER-76646), which serializes with DDL operations so that the check runs in a steady state. However, that will require some work, and until it is done, this false positive will cost developers time investigating failures in their patches. We should add a temporary workaround: wait for all resharding operations to finish before running the CheckRoutingTableConsistency checks.
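The proposed workaround amounts to a poll-until-idle step placed before the consistency check. The sketch below shows that pattern only and is not the actual fix. The function name `wait_for_resharding_to_finish`, its parameters, and the `count_active_ops` callable are all hypothetical, and how in-progress resharding operations are detected on a real cluster is deliberately left to the caller.

```python
import time


def wait_for_resharding_to_finish(count_active_ops, timeout_secs=60.0,
                                  poll_interval_secs=0.5):
    """Block until `count_active_ops()` returns 0, or raise TimeoutError.

    `count_active_ops` is a caller-supplied callable (an assumption of this
    sketch) that returns the number of resharding operations in progress.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        if count_active_ops() == 0:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("resharding operations did not finish in time")
        time.sleep(poll_interval_secs)


# Stand-in for a real query: reports 2, then 1, then 0 active operations.
remaining = iter([2, 1, 0])
polls = []


def fake_count_active_ops():
    value = next(remaining)
    polls.append(value)
    return value


wait_for_resharding_to_finish(fake_count_active_ops, timeout_secs=5.0,
                              poll_interval_secs=0.01)
# Only after this call returns would the consistency check run.
```

The timeout is there so that a stuck resharding operation surfaces as a distinct, clearly named failure instead of hanging the hook indefinitely.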

            Assignee:
            paolo.polato@mongodb.com Paolo Polato
            Reporter:
            marcos.grillo@mongodb.com Marcos José Grillo Ramirez
