Core Server / SERVER-65905

Shard split test times out non-deterministically

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • ALL

      On a patch for a change unrelated to shard split, on top of master, the shard_split_write_during_split_stepdown.js test timed out.

      This test does the following:

      • Set the pauseShardSplitAfterBlocking failpoint
      • Start a shard split, wait for the failpoint
      • Start async write operations
      • Step down the primary
      • Wait for shard split to return InterruptedDueToReplStateChange
      • Check the write thread returns TenantMigrationCommitted
      • Stop the test

      In this run, a shard split timeout occurred at the same time as the stepdown. Because the wait on the blocking failpoint is interruptible, either the timeout or the stepdown can unblock it. After the initial primary (d20520) stepped down, another node (d20522) resumed the split. The split ran to completion on d20522, which then waited for the forget command. As expected, the test framework did not send this command and simply stopped the test. d20522 never stopped, and the test timed out.
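
The hang can be illustrated with a small toy model. This is only a sketch under stated assumptions: SplitOperation, resumeOn, forget, and runScenario are hypothetical names invented for illustration and are not MongoDB server or test-framework APIs. The model captures just one point from the description: a split resumed on the new primary stays pending until a forget command arrives, so a node holding such an operation cannot finish if the command is never sent.

```javascript
"use strict";

// Toy model only. None of these names are MongoDB APIs.
class SplitOperation {
  constructor(node) {
    this.node = node;
    this.state = "blocking"; // paused at the interruptible failpoint
  }

  // An interruptible wait can be released by more than one cause
  // (here: "timeout" or "stepdown").
  interrupt(cause) {
    if (this.state === "blocking") {
      this.state = "interruptedBy:" + cause;
    }
  }

  // Assumption: the newly elected primary resumes the split, runs it to
  // completion, and then waits for a forget command.
  static resumeOn(node) {
    const op = new SplitOperation(node);
    op.state = "waitingForForget";
    return op;
  }

  forget() {
    if (this.state === "waitingForForget") {
      this.state = "done";
    }
  }

  // The operation still holds the node open while it is blocking or
  // waiting for the forget command.
  isPending() {
    return this.state === "blocking" || this.state === "waitingForForget";
  }
}

function runScenario({ sendForget }) {
  const original = new SplitOperation("d20520");
  original.interrupt("stepdown"); // the initial primary steps down

  const resumed = SplitOperation.resumeOn("d20522");
  if (sendForget) {
    resumed.forget();
  }

  return {
    originalState: original.state,
    resumedState: resumed.state,
    canStop: !resumed.isPending(),
  };
}

console.log(runScenario({ sendForget: false }));
console.log(runScenario({ sendForget: true }));
```

In this model, the scenario without a forget command leaves the resumed operation in the waitingForForget state, which corresponds to d20522 never stopping. Sending the forget command lets the operation finish.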

            Assignee:
            didier.nadeau@mongodb.com Didier Nadeau
            Reporter:
            didier.nadeau@mongodb.com Didier Nadeau