Aggregation Pipeline with $in filter not picking DISTINCT_SCAN plan

XMLWordPrintableJSON

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      When running an aggregation pipeline where the implicit result is a distinct set filtering documents using the $in operator reverts to IXSCAN instead of DISTINCT_SCAN

       

      Example:

      const statuses = ["PENDING", "COMPLETE", "SHIPPED", "CANCELLED", "RETURNED"];
      const bulk = [];
      for (let i = 1; i <= 1000; i++) {
        for (let r = 0; r < 10; r++) {
          bulk.push({
            customer_id: i,
            status: statuses[Math.floor(Math.random() * statuses.length)]
          });
        }
      }
      db.col1.insertMany(bulk);
      
      db.col1.createIndex({customer_id:1,status:1}) 
      

      Running the following:

      // Used DISTINCT_SCAN
      db.col1.explain("executionStats").distinct( "status", { customer_id: { $in: [1, 2, 3] } } ) 
      
      // Used DISTINCT_SCAN
      db.col1.explain("executionStats").distinct( "status", { customer_id: { $in: [1, 500, 750] } } )
      
      // Used DISTINCT_SCAN
      db.col1.aggregate([ { $match: { customer_id: { $in: [1] } } },  { $group: { _id: "$status" } } ]).explain("executionStats") // single value on $in
      
      // Used IXSCAN
      db.col1.aggregate([ { $match: { customer_id: { $in: [1, 2, 3] } } }, { $group: { _id: "$status" } } ]).explain("executionStats") // multiple values on $in
       

       

      • The distinct method is picking the correct plan for this case.
      • Aggregation rewritten into a distinct query where the $in operator is the same as $eq ($in: [1]= $eq: 1) is using DISTINCT_SCAN.
      • Aggregation rewritten into a distinct query where the $in operator pass multiple values revert to IXSCAN.

            Assignee:
            Unassigned
            Reporter:
            Guilherme Wahlbrink
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated: