  1. WiredTiger
  2. WT-8003

Fix frequent duplicate keys returned by random cursor in resharding test

Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • None
    • None
    • None
    • 8
    • Prioritised_Pipeline

    Description

      Resharding uses $sample internally, which means it relies on a WT random cursor. A resharding performance test occasionally fails when $sample repeatedly fails to find 100 unique documents.

      In this ticket we should reproduce the failure, adding instrumentation to WT as needed. Once we understand the issue, we should find a way to make random cursors behave better in the problem case.

      The problem test is the ReshardCollection.yml genny workload. It inserts 100,000 10KB documents split evenly across two shards, then reshards the cluster while 100 threads perform reads and writes (find and update commands). Resharding tries to get ~200 samples from each shard via $sample. Occasionally the sample includes duplicate keys, and we see an error when 100 consecutive attempts to get ~200 unique keys all fail.
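      As a rough illustration of the retry behaviour described above, the following sketch models one shard as 50,000 integer keys (100,000 documents split evenly across two shards) and retries until a sample of 200 keys has no duplicates. The helper name sample_until_unique, the uniform draw via random.Random, and the fixed seed are all assumptions made for illustration; this is not the resharding code or the WT random cursor.

```python
import random


def sample_until_unique(draw_key, sample_size=200, max_attempts=100):
    """Return the 1-based attempt whose sample had no duplicate keys.

    Returns None if every attempt contained at least one duplicate,
    which corresponds to the error condition described in the ticket.
    draw_key is any zero-argument callable returning one sampled key.
    """
    for attempt in range(1, max_attempts + 1):
        keys = [draw_key() for _ in range(sample_size)]
        if len(set(keys)) == sample_size:
            return attempt
    return None


# Hypothetical stand-in for a well-behaved random cursor:
# every key on the shard is equally likely on each draw.
rng = random.Random(8003)
docs_per_shard = 50_000
winning_attempt = sample_until_unique(lambda: rng.randrange(docs_per_shard))
print("first duplicate-free attempt:", winning_attempt)
```

      Swapping in a different draw_key callable (for example, one that replays keys logged from an instrumented run) would let the same loop be used to compare observed behaviour against the uniform baseline.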

      $sample is allowed to return duplicate keys. But given the number of keys (about 50,000 per shard) and the size of the sample (~200), hitting duplicates in 100 consecutive attempts is surprising and undesirable.
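      To put a number on "surprising", here is a back-of-the-envelope calculation. It assumes each draw is independent and uniform over 50,000 keys per shard and that the sample size is exactly 200 (the ticket says ~200). The real random cursor may not satisfy these assumptions, which is presumably the point of the investigation. Under them, roughly one attempt in three would contain a duplicate, yet 100 failures in a row should essentially never occur.

```python
import math


def prob_duplicate(num_keys, sample_size):
    """Probability that sample_size uniform draws (with replacement)
    from num_keys distinct keys contain at least one duplicate."""
    # P(all unique) = prod_{i=0}^{k-1} (1 - i/n); sum logs for stability.
    log_p_unique = sum(math.log1p(-i / num_keys) for i in range(sample_size))
    return 1.0 - math.exp(log_p_unique)


p_dup = prob_duplicate(50_000, 200)   # one attempt on one shard
p_all_fail = p_dup ** 100             # 100 independent attempts all fail
print(f"P(duplicate in one attempt) = {p_dup:.3f}")
print(f"P(100 consecutive failures) = {p_all_fail:.1e}")
```

      If the observed failure rate is far above this baseline, that would suggest the draws are not close to uniform in the problem case.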

            People

              Assignee: Backlog - Storage Engines Team (backlog-server-storage-engines)
              Reporter: Keith Smith (keith.smith@mongodb.com)
              Votes: 0
              Watchers: 4
