Core Server / SERVER-56318

timeseries_sample.js can fail spuriously due to pseudorandomness

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Fully Compatible
    • ALL

      This can be reproduced reliably by applying the following patch to the test, and then running it locally with resmoke.py:

      diff --git a/jstests/noPassthrough/timeseries_sample.js b/jstests/noPassthrough/timeseries_sample.js
      index 71110731a0..d2dd6f773a 100644
      --- a/jstests/noPassthrough/timeseries_sample.js
      +++ b/jstests/noPassthrough/timeseries_sample.js
      @@ -117,9 +117,12 @@ let runSampleTests = (measurementsPerBucket, backupPlanSelected) => {
           assertUniqueDocuments(result);
      
           // Check that we have executed the correct branch of the TrialStage.
      -    const optimizedSamplePlan =
      -        coll.explain("executionStats").aggregate([{$sample: {size: sampleSize}}]);
      -    assertPlanForSample(optimizedSamplePlan, backupPlanSelected);
      +    for (let i = 0; i < 5000; ++i) {
      +        print("attempt number: " + i);
      +        const optimizedSamplePlan =
      +            coll.explain("executionStats").aggregate([{$sample: {size: sampleSize}}]);
      +        assertPlanForSample(optimizedSamplePlan, backupPlanSelected);
      +    }
      
           // Run an agg pipeline with optimization disabled.
           result = coll.aggregate([{$_internalInhibitOptimization: {}}, {$sample: {size: 1}}]).toArray();
      

      I've never seen the loop added in this patch run more than 1000 times before the test fails, which is why I infer that the probability of the assertion failing is >0.1%.
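
      The inference above can be sanity-checked with a small sketch. It assumes each loop iteration fails independently with a fixed probability, which is a simplification not stated in this ticket; `probAllPass` is a hypothetical helper, not part of the test.

```javascript
// Chance that `iterations` independent attempts all pass, given a fixed
// per-attempt failure probability (an assumed, simplified model).
function probAllPass(failureProbability, iterations) {
    return Math.pow(1 - failureProbability, iterations);
}

// If the failure probability were only 0.1%, how often would the patched
// loop get through 1000 or all 5000 iterations without failing?
const survive1000 = probAllPass(0.001, 1000);
const survive5000 = probAllPass(0.001, 5000);
console.log("P(no failure in 1000 iterations):", survive1000);
console.log("P(no failure in 5000 iterations):", survive5000);
```

      Under that model, a 0.1% failure rate would still let roughly a third of runs get past 1000 iterations, so never reaching that point across repeated runs suggests the true rate is higher.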

    • Query Execution 2021-05-03
    • 28

      The test is inherently subject to randomness, since it is testing our random sampling implementation. It intends to make assertions based on the probability of an event being minuscule. However, this assertion can fail with non-negligible probability. I've shown experimentally that the probability of this assertion failing strictly due to randomness is >0.1%. Since this test will indeed run thousands of times, the probability of failure needs to be many orders of magnitude lower.

      In order to pass as currently written, the ARHASH algorithm needs to obtain 5 valid samples in 100 iterations. The buckets are 1% full, so the likelihood of a single iteration obtaining a valid document is ~1%. Getting 5 hits in 100 attempts is apparently not as unlikely as it needs to be!
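
      As a rough check on the numbers in the paragraph above, the chance of at least 5 hits in 100 attempts can be estimated with a binomial model. This is a sketch under the assumption that every iteration is an independent trial with exactly a 1% hit rate, which the real ARHASH implementation may not satisfy; `choose` and `probAtLeast` are hypothetical helpers.

```javascript
// Binomial coefficient C(n, k), built up multiplicatively to avoid factorials.
function choose(n, k) {
    let result = 1;
    for (let i = 1; i <= k; ++i) {
        result = (result * (n - k + i)) / i;
    }
    return result;
}

// P(X >= minHits) for X ~ Binomial(attempts, hitProbability),
// computed as 1 minus the probability of fewer than minHits hits.
function probAtLeast(minHits, attempts, hitProbability) {
    let below = 0;
    for (let k = 0; k < minHits; ++k) {
        below += choose(attempts, k) * Math.pow(hitProbability, k) *
            Math.pow(1 - hitProbability, attempts - k);
    }
    return 1 - below;
}

// 5 valid samples needed, 100 iterations, ~1% chance per iteration.
const pFiveHits = probAtLeast(5, 100, 0.01);
console.log("P(at least 5 hits in 100 attempts):", pFiveHits);
```

      Under these assumptions the result is roughly 0.3%, which is in line with the experimentally observed rate of >0.1%.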

            Assignee: David Storch (david.storch@mongodb.com)
            Reporter: David Storch (david.storch@mongodb.com)