Full coll scan sample not found on read path


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • ALL

      This test case, when added to cbr_persistent_sample.js, fails because the read path does not find the sample persisted by the analyze command.

       

      {
          jsTest.log.info(
              "Testing chunk sampling with small collection (200 docs) forces full collection scan - persistent sample hit",
          );
          assert.commandWorked(db.adminCommand({setParameter: 1, internalQuerySamplingCEMethod: "chunk"}));

          const kSmallCollSize = 200;
          coll.drop();
          PersistentSamplesUtils.dropSamplesColl(db);
          assert.commandWorked(
              coll.insert(Array.from({length: kSmallCollSize}, (_, i) => ({_id: i, a: i, tag: "small"}))),
          );

          assert.commandWorked(
              db.runCommand({
                  analyze: collName,
                  mode: "sample",
                  samplingMethod: "chunk",
                  sampleSize: kSampleSize,
                  numChunks: kNumChunks,
              }),
          );

          // Since the default sample size is greater than 200, we expect to get a full collection scan sample.
          const expectedDocCount = 200;

          const meta = getWinningPlanMetadata({a: {$gte: 0}});

          assert.eq(meta.sampleSource, "persisted", "expected persisted sample on hit", {meta});
          assert.eq(meta.sampleTechnique, "chunk", "expected chunk technique", {meta});
          assert.eq(meta.sampleNumChunks, kNumChunks, "expected numChunks to match", {meta});
          assert.eq(meta.sampleDocCount, expectedDocCount, "expected docCount to match persisted sample size", {meta});
          assert.eq(meta.sampleRequestedDocCount, kSampleSize, "expected requestedDocCount to match", {meta});
      }

       


      When the collection size is less than the default sample size of 384, generateFullCollScanSample() is called instead of generateRandomSample() or generateChunkSample(). In this case, we don't call tryLoadPersistentSample() like we do in the random/chunk sampling functions, so we never find a persisted sample if there is one. We should modify generateFullCollScanSample() to call tryLoadPersistentSample().
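
      The control flow described above can be illustrated with a toy model. This is only a sketch in plain JavaScript, not the server implementation: the function names come from this ticket, while the in-memory `persistedSamples` map, the `Current`/`Proposed` suffixes, the function signatures, and the "generated" source label are assumptions made up for illustration.

```javascript
// Toy model only. The real sampling code lives in the server; the signatures
// below are hypothetical and exist just to show the missing lookup.

// Hypothetical in-memory stand-in for the persisted samples collection.
const persistedSamples = new Map();

// Returns the persisted sample for a collection, or null if there is none.
function tryLoadPersistentSample(collName) {
    return persistedSamples.has(collName) ? persistedSamples.get(collName) : null;
}

// Behavior described in the ticket: the full collection scan path builds a
// fresh sample and never consults the persisted one.
function generateFullCollScanSampleCurrent(collName, docs) {
    return {sampleSource: "generated", docs: docs.slice()};
}

// Proposed behavior: try the persisted sample first, then fall back to a scan.
function generateFullCollScanSampleProposed(collName, docs) {
    const persisted = tryLoadPersistentSample(collName);
    if (persisted !== null) {
        return {sampleSource: "persisted", docs: persisted};
    }
    return {sampleSource: "generated", docs: docs.slice()};
}

// A 200-document collection, smaller than the default sample size of 384,
// with a sample already persisted (standing in for the analyze command).
const docs = Array.from({length: 200}, (_, i) => ({_id: i, a: i, tag: "small"}));
persistedSamples.set("small", docs.slice());

console.log(generateFullCollScanSampleCurrent("small", docs).sampleSource);  // "generated"
console.log(generateFullCollScanSampleProposed("small", docs).sampleSource); // "persisted"
```

      In this model the proposed variant reports a "persisted" source with 200 documents, which is what the test case above asserts through `meta.sampleSource` and `meta.sampleDocCount`.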

            Assignee:
            Natalie Hill
            Reporter:
            Natalie Hill
            Votes:
            0
            Watchers:
            2
