Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-98479

Revisit full covered index scans as collection scan alternative

    • Query Optimization

      There have been some changes in index scanning code recently. It seems that full covered index scans are now faster than collection scans for simple queries.

      If this is indeed the case, should we revisit the decision to always prefer collection scans in this case? Locally running 

      db.c.aggregate([{$match: {a:1}}, {$project: {_id:0, a:1}}, {$group: {_id: null, count: {$count: {}}}}]);

      On ~300k identical {a:1} documents, seems that workingTimeMillis in the slow query log drops from about 141ms to 109ms when adding an index. In another case, running

      db.c.aggregate([{$match: {a: {$mod: [2, 1]}}}, {$project: {_id:0, a:1}}, {$group: {_id: null, count: {$count: {}}}}]); 

      on ~350k identical {a:1, b:1} documents, seeing a slight drop from ~295ms to ~281ms when adding an index {a, b}. Here are the index bounds

      "indexBounds" : {
          "a" : [
              "[nan.0, inf.0]"
          ],
          "b" : [
              "[MinKey, MaxKey]"
          ]
      }

      There may be something I'm missing here - for instance, this isn't a thorough consideration of the EXACT, INEXACT_COVERED, INEXACT_FETCH cases, so I'm not sure if these examples are truly apples-to-apples comparisons. Without the group stage to force the query to read the entire table, IIRC, the collection scan case uses classic.

      Chris Harris suggested adding something to QO triage to revisit this in light of the recent code changes. Chris also pointed to SERVER-23406 and SERVER-20066 as possibly related issues.

       

            Assignee:
            christopher.harris@mongodb.com Chris Harris
            Reporter:
            evan.bergeron@mongodb.com Evan Bergeron
            Votes:
            0 Vote for this issue
            Watchers:
            15 Start watching this issue

              Created:
              Updated: