  1. Core Server
  2. SERVER-32692

Make zbigMapReduce.js, sharding_balance4.js, and bulk_shard_insert.js more resilient under slow machines


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.1
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.0, v3.6
    • Sprint:
      Sharding 2019-09-09, Sharding 2019-09-23, Sharding 2019-10-07
    • Linked BF Score:
      37

      Description

      zbigMapReduce.js fails occasionally because more than 5 migrations manage to finish after the start of either of the two bulk writes it executes; the write never establishes a shard version, so the test fails. As was done for sharding_balance4.js in SERVER-28697, we should ignore a certain number of NoProgressMade errors to make the test fail less frequently.
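A minimal sketch of what "ignore a certain number of NoProgressMade errors" could look like, written as plain JavaScript so it runs outside the shell. The helper `makeErrorBudget`, the stand-in `fakeBulkWrite`, the `codeName` field on the thrown error, and the budget of 3 are all assumptions made for illustration; none of them is taken from the actual tests.

```javascript
// Hypothetical helper (not from the real test): tolerate a bounded number of
// errors with a given codeName and rethrow everything else.
function makeErrorBudget(codeName, maxIgnored) {
    let ignored = 0;
    return {
        // Absorbs `err` if it matches and budget remains; otherwise rethrows it.
        absorb(err) {
            if (err.codeName !== codeName || ignored >= maxIgnored) {
                throw err;
            }
            ignored++;
        },
        count() {
            return ignored;
        },
    };
}

// Stand-in for a bulk write that fails its first two attempts, as it might
// while migrations keep completing underneath it.
let attempts = 0;
function fakeBulkWrite() {
    attempts++;
    if (attempts <= 2) {
        const err = new Error("no progress was made executing batch write op");
        err.codeName = "NoProgressMade";
        throw err;
    }
    return {nInserted: 100};
}

const budget = makeErrorBudget("NoProgressMade", 3);  // illustrative budget
let writeResult = null;
while (writeResult === null) {
    try {
        writeResult = fakeBulkWrite();
    } catch (e) {
        budget.absorb(e);  // rethrows once the budget is spent
    }
}
```

Keeping the budget small means a test that genuinely cannot make progress still fails instead of looping forever.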

      sharding_balance4.js and bulk_shard_insert.js occasionally fail because more than 10 migrations complete during the course of a find command, exhausting mongos's retry attempts and failing the test. Modifying the tests to retry a couple of times on StaleShardVersion should make them fail less often.
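The test-side retry described above might be sketched as follows. This is plain JavaScript with a stand-in for the find; the helper name `retryOnCodeNames`, the `codeName` field on the thrown error, and the attempt limit are hypothetical, chosen only to illustrate the idea.

```javascript
// Hypothetical helper: call `fn` up to `maxAttempts` times, retrying only
// when it throws an error whose codeName is listed in `retryableCodeNames`.
function retryOnCodeNames(fn, retryableCodeNames, maxAttempts) {
    let lastErr;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return fn(attempt);
        } catch (e) {
            if (!retryableCodeNames.includes(e.codeName)) {
                throw e;  // unrelated failure: surface it immediately
            }
            lastErr = e;
        }
    }
    throw lastErr;  // still stale after every attempt
}

// Stand-in for a find that hits a stale shard version on its first attempt.
function fakeFind(attempt) {
    if (attempt === 1) {
        const err = new Error("stale shard version detected");
        err.codeName = "StaleShardVersion";
        throw err;
    }
    return [{_id: 1}, {_id: 2}];
}

const docs = retryOnCodeNames(fakeFind, ["StaleShardVersion"], 3);
```

Errors with any other code name are rethrown on the first occurrence, so the retry does not hide unrelated failures.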

      We can also consider making a generic override for read commands that retries on StaleShardVersion errors, so it can be load()-ed into tests that involve frequent migrations.
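One possible shape for such an override, sketched against a plain object rather than the shell's real connection type. It assumes the connection exposes a `runCommand(dbName, cmdObj, options)` method that returns a response with `ok` and `codeName` fields; the function name `installStaleVersionRetry`, the list of read commands, and the retry limit are all hypothetical.

```javascript
// Assumed set of read commands; a real override would choose its own list.
const kReadCommands = new Set(["find", "aggregate", "count", "distinct"]);
const kMaxRetries = 3;  // illustrative value

// Hypothetical installer: wraps conn.runCommand so that read commands are
// re-sent while the response reports StaleShardVersion.
function installStaleVersionRetry(conn) {
    const original = conn.runCommand;
    conn.runCommand = function(dbName, cmdObj, options) {
        const cmdName = Object.keys(cmdObj)[0];
        let res = original.call(this, dbName, cmdObj, options);
        if (!kReadCommands.has(cmdName)) {
            return res;  // writes and other commands pass through untouched
        }
        for (let i = 0; i < kMaxRetries && res.ok === 0 &&
             res.codeName === "StaleShardVersion";
             i++) {
            res = original.call(this, dbName, cmdObj, options);
        }
        return res;
    };
}

// Stand-in connection whose first two find attempts come back stale.
const fakeConn = {
    calls: 0,
    runCommand(dbName, cmdObj, options) {
        this.calls++;
        if (cmdObj.find !== undefined && this.calls < 3) {
            return {ok: 0, codeName: "StaleShardVersion"};
        }
        return {ok: 1};
    },
};

installStaleVersionRetry(fakeConn);
const findRes = fakeConn.runCommand("test", {find: "coll"}, 0);
```

Restricting the wrapper to read commands keeps it safe to apply broadly, since re-sending a read has no side effects.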

              People

              Assignee:
              matthew.saltz Matthew Saltz
              Reporter:
              jack.mulrow Jack Mulrow
              Participants:
              Votes:
              1
              Watchers:
              7
