Core Server / SERVER-87973

Fix race condition in block_chunk_migrations_without_hashed_shard_key_index

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • 5

      The block_chunk_migrations_without_hashed_shard_key_index test migrates chunks between shards and finishes with the following sequence:

      • wait for the balancer round to finish
      • wait until a chunk shows up on the recipient shard
      • check that the shardVersion on the CSRS matches the one on the donor shard

      The check can fail due to a race condition if the CSRS steps down during the balancer round.
      In that case the following happens:

      • CSRS stepdown -> balancer round ends
      • donor commands the recipient to update the local catalog (new chunks show up)
      • donor commits the change on CSRS
      • CSRS updates shardVersion
      • check the shardVersion equivalence on CSRS and the donor shard
      • CSRS responds to the update -> donor has the updated shardVersion

      In this scenario the check runs at the wrong moment, before the donor has received the updated shardVersion, so the test fails.
      Waiting for the balancer round to finish was meant to eliminate this race condition, but it can still occur if a CSRS stepdown ends the balancer round while the migration is still in progress.

      Recommended solution:
      After the balancer is stopped, wait for the migrations to finish with the internal _shardsvrJoinMigrations command before checking the shardVersion.
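
      The ordering problem and the effect of the recommended fix can be illustrated with a toy model. This is only a sketch of the event ordering described in this ticket: Node, run_scenario, and deliver_commit_response are hypothetical names, not server or test code, and the "join" branch merely stands in for waiting on the in-flight migration (which the ticket proposes doing via _shardsvrJoinMigrations).

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a node that tracks a shardVersion (hypothetical)."""
    shard_version: int = 1


def run_scenario(join_migrations_before_check: bool) -> bool:
    """Return True if the shardVersion equivalence check passes.

    Models only the ordering of events; no real cluster is involved.
    """
    csrs = Node()
    donor = Node()

    # The donor commits the migration on the CSRS, which bumps the
    # shardVersion on the CSRS side first.
    csrs.shard_version += 1

    def deliver_commit_response() -> None:
        # The CSRS response reaches the donor, which then adopts the
        # updated shardVersion.
        donor.shard_version = csrs.shard_version

    if join_migrations_before_check:
        # Assumed fix: wait until the migration has fully finished,
        # i.e. the donor has processed the commit response, then check.
        deliver_commit_response()
        return csrs.shard_version == donor.shard_version

    # Racy ordering: the check runs before the response is delivered.
    check_passed = csrs.shard_version == donor.shard_version
    deliver_commit_response()
    return check_passed


if __name__ == "__main__":
    print("check without join passes:", run_scenario(False))
    print("check with join passes:   ", run_scenario(True))
```

      In this model the check fails only when it runs between the CSRS update and the delivery of the response to the donor. Waiting for the migration to complete before checking removes that window.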

            Assignee:
            adam.farkas@mongodb.com Wolfee Farkas
            Reporter:
            adam.farkas@mongodb.com Wolfee Farkas
            Votes:
            0
            Watchers:
            1
