Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-95427

IXScan chosen over distinct scan for multikey query

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 6.0.6
    • Component/s: None
    • None
    • Query Optimization
    • ALL
    • Hide
      db.foo.drop()
      for (var cnt = 0; cnt < 10; cnt++) {
          db.foo.insert({cust: 1, labels: ["Alpha", "Beta"]})
      }
      db.foo.createIndex({cust: 1, labels: 1})
      printjson(db.foo.explain('executionStats').distinct('labels', {cust: 1})) # shows IXScan->Fetch with nReturned=10 -- expect a distinct scan
      print(db.foo.distinct('labels', {cust: 1})) # ['Alpha', 'Beta'] 

       

      Show
      db.foo.drop() for (var cnt = 0; cnt < 10; cnt++) {     db.foo.insert({cust: 1, labels: ["Alpha", "Beta"]}) } db.foo.createIndex({cust: 1, labels: 1}) printjson(db.foo.explain('executionStats').distinct('labels', {cust: 1})) # shows IXScan->Fetch with nReturned=10 -- expect a distinct scan print(db.foo.distinct('labels', {cust: 1})) # ['Alpha', 'Beta']  

      Suppose we have a collection with a large number of documents. Where each document has two fields. One is a scalar value and the other is a list of strings.

      Suppose there's an index on {scalar: 1, list: 1}. And we perform operations of the form:

       foo.distinct('list', {scalar: <exact match>})

      We expect it's a legal optimization to use a distinct scan on the index bounded on the scalar match (despite multikey idiosyncracies).

      What we're observing is that mongod instead does an index scan and fetches each matching document to aggregate the distinct "list" values.

      I suspect this (lack of) optimization dates back to SERVER-28952. IIUC, that ticket describes a correctness problem. But I believe SERVER-28952 is a slight variation as its query predicate also depends on the multikey field. But happy to be wrong here and learn that the proposed optimization is in fact not legal for this simpler case.

            Assignee:
            Unassigned Unassigned
            Reporter:
            danny.gottlieb@gmail.com Daniel Gottlieb
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: