Core Server / SERVER-18734

The match $in operator is using a ValueSet(std::set)

    • ALL

      While implementing a feature to handle CSV-like input of the form:

      A,B,C // header
      1,2,3
      4,5,6
      etc...
      

      We naively implemented it with the following $match condition:

      $or: [
          { A: 1, B: 2, C: 3},
          { A: 4, B: 5, C: 6},
          etc...
      ]
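
For illustration, the condition above could be generated from parsed CSV rows roughly as follows. This is a minimal sketch: `buildOrMatch`, `header` and `rows` are hypothetical names that do not appear in the original report, and the rows are assumed to be already parsed into arrays of values.

```javascript
// Hypothetical helper: turn parsed CSV rows into the $or condition shown above.
function buildOrMatch(header, rows) {
  // Each row becomes one clause such as { A: 1, B: 2, C: 3 }.
  const clauses = rows.map((row) => {
    const clause = {};
    header.forEach((field, i) => {
      clause[field] = row[i];
    });
    return clause;
  });
  return { $match: { $or: clauses } };
}

// Sample data taken from the CSV snippet above.
const header = ["A", "B", "C"];
const rows = [
  [1, 2, 3],
  [4, 5, 6],
];

const orStage = buildOrMatch(header, rows);
console.log(JSON.stringify(orStage, null, 2));
```

The number of clauses grows linearly with the number of CSV rows, which is the scaling concern this report is about.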
      

      After seeing that this approach performed and scaled poorly, we tried two alternatives (both inside an aggregation pipeline):

      • One with $in:
      $project: {
          computed_obj: { "1": "$A", "2": "$B", "3": "$C" }
      },
      $match: {
          computed_obj: { 
              $in: [
                  { "1": 1, "2": 2, "3": 3 },
                  { "1": 3, "2": 4, "3": 5 },
                  etc...
              ]
          }
      }
      
      • One with $setIsSubset:
      $project: {
          condition_value: {
              $setIsSubset: [
                  {
                      $map: {
                          input: [null], 
                          as: "var__", 
                          in: { "1": "$A", "2": "$B", "3": "$C" }
                      }
                  }, 
                  [
                     {"1": 1, "2": 2, "3": 3},
                     {"1": 3, "2": 4, "3": 5},
                     etc...
                  ]
              ]
          }
      }, 
      $match: { condition_value: true }
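
Both alternatives can be generated from the same parsed rows. The sketch below is only an illustration under assumptions: the helper names (`positionalProjection`, `positionalRow`, `buildInPipeline`, `buildSetIsSubsetPipeline`) are hypothetical, and the rows are assumed to be arrays of already-parsed values.

```javascript
// Hypothetical helpers: build the two alternative pipelines from parsed CSV rows.

// Map each header field to a positional key: { "1": "$A", "2": "$B", ... }.
function positionalProjection(header) {
  const obj = {};
  header.forEach((field, i) => {
    obj[String(i + 1)] = "$" + field;
  });
  return obj;
}

// Encode one CSV row with the same positional keys: { "1": 1, "2": 2, ... }.
function positionalRow(row) {
  const obj = {};
  row.forEach((value, i) => {
    obj[String(i + 1)] = value;
  });
  return obj;
}

// Variant 1: project a computed object, then match it with $in.
function buildInPipeline(header, rows) {
  return [
    { $project: { computed_obj: positionalProjection(header) } },
    { $match: { computed_obj: { $in: rows.map(positionalRow) } } },
  ];
}

// Variant 2: wrap the computed object in a one-element array via $map,
// then test that array against the candidate set with $setIsSubset.
function buildSetIsSubsetPipeline(header, rows) {
  return [
    {
      $project: {
        condition_value: {
          $setIsSubset: [
            { $map: { input: [null], as: "var__", in: positionalProjection(header) } },
            rows.map(positionalRow),
          ],
        },
      },
    },
    { $match: { condition_value: true } },
  ];
}

const csvHeader = ["A", "B", "C"];
const csvRows = [[1, 2, 3], [4, 5, 6]];
console.log(JSON.stringify(buildInPipeline(csvHeader, csvRows)));
console.log(JSON.stringify(buildSetIsSubsetPipeline(csvHeader, csvRows)));
```

Both variants compare whole embedded objects, so the key order of the computed object and of the candidate objects has to agree.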
      

      We found that once the sets became large enough, the $in approach was in fact slower than the $setIsSubset one, and did not even appear to have the same complexity.
      We then noticed that $setIsSubset uses a std::unordered_set, whereas $in uses a plain std::set.

      Is there a reason why $in uses a std::set rather than a std::unordered_set?
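
To make the complexity difference concrete: a lookup in an ordered set such as std::set takes O(log n) comparisons, while a lookup in a hash set such as std::unordered_set takes O(1) on average. The sketch below is an analogy only, not the server's implementation. It assumes a sorted array with binary search as a stand-in for the ordered set and a JavaScript `Set` of canonical strings (typically hash-based) as a stand-in for the hash set. All function names are hypothetical.

```javascript
// Illustrative analogy only: this is not MongoDB's implementation.

// Canonical string key for a small flat object such as { "1": 1, "2": 2, "3": 3 }.
function keyOf(obj) {
  return JSON.stringify(obj);
}

// Ordered lookup via binary search; counts how many elements are examined.
function orderedLookup(sortedKeys, key) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let comparisons = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    comparisons += 1;
    if (sortedKeys[mid] === key) return { found: true, comparisons };
    if (sortedKeys[mid] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, comparisons };
}

// Build n candidate objects shaped like the ones in the arrays above.
function makeCandidates(n) {
  const out = [];
  for (let i = 0; i < n; i++) out.push({ "1": i, "2": i + 1, "3": i + 2 });
  return out;
}

const candidates = makeCandidates(1024);
const sortedKeys = candidates.map(keyOf).sort(); // ordered-set stand-in
const hashed = new Set(candidates.map(keyOf));   // hash-set stand-in

const probe = keyOf({ "1": 1000, "2": 1001, "3": 1002 });
const ordered = orderedLookup(sortedKeys, probe);

// The number of examined elements grows with log2(n) for the ordered lookup.
console.log(ordered.found, ordered.comparisons);
// The hashed lookup hashes the key once, independent of n on average.
console.log(hashed.has(probe));
```

When the elements are whole documents, each comparison in the ordered case is itself a field-by-field comparison, so the logarithmic factor is paid per input document flowing through the pipeline.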

            Assignee:
            charlie.swanson@mongodb.com Charlie Swanson
            Reporter:
            antoine.hom@amadeus.com Antoine Hom
            Votes:
            1
            Watchers:
            14