Core Server / SERVER-94693

Equality of strings under collation is not always transitive

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      For collations similar to:

      {'locale': 'en', 'strength': 1, 'alternate': 'shifted', 'maxVariable': 'space'}
      

      string equality under the collation is not guaranteed to be transitive.

      For three example values:

      a = '\u0020' // Space
      b = '\u0359' // Combining Asterisk Below
      c = '\u00a0' // No-Break Space
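
The shell session below uses `coll`, `collation`, `a`, `b` and `c` without showing how they were defined. A minimal setup sketch follows, assuming the collation and values quoted above; the collection name `collation_repro` is hypothetical and not taken from the report.

```javascript
// Collation quoted in the report.
const collation = { locale: 'en', strength: 1, alternate: 'shifted', maxVariable: 'space' };

// The three example values from the report.
const a = '\u0020'; // Space
const b = '\u0359'; // Combining Asterisk Below
const c = '\u00a0'; // No-Break Space

// One document per value, named after the variable holding it.
const docs = [
  { value: a, name: 'a' },
  { value: b, name: 'b' },
  { value: c, name: 'c' },
];

// `db` only exists inside the mongo shell, so guard the insert;
// this lets the snippet load in a plain JavaScript runtime as well.
// 'collation_repro' is a hypothetical collection name.
let coll = null;
if (typeof db !== 'undefined') {
  coll = db.getCollection('collation_repro');
  coll.insertMany(docs);
}
```

With this loaded in the shell, the `coll.find(...).collation(collation)` calls in the session can be run as written.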
      
      > coll.find()
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb94"), "value" : " ", "name" : "a" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb96"), "value" : " ", "name" : "c" }
      > coll.find({value:{"$eq":a}}).collation(collation)
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb94"), "value" : " ", "name" : "a" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      > coll.find({value:{"$eq":b}}).collation(collation)
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb94"), "value" : " ", "name" : "a" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb96"), "value" : " ", "name" : "c" }
      > coll.find({value:{"$eq":c}}).collation(collation)
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb96"), "value" : " ", "name" : "c" }
      

      a == b && b == c but a != c.
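
The violation can be checked mechanically. The sketch below hard-codes the matches observed in the three `$eq` queries above (it does not call the server or ICU) and lists every triple for which equality fails to carry over. The helper name `findTransitivityViolations` is made up for this illustration.

```javascript
// Matches observed in the $eq queries above, keyed by document name:
// querying for a returned a and b, for b returned a, b and c, for c returned b and c.
const observedEqual = {
  a: ['a', 'b'],
  b: ['a', 'b', 'c'],
  c: ['b', 'c'],
};

// Equality relation derived from the observed matches.
const eq = (x, y) => observedEqual[x].includes(y);

// Return every ordered triple of distinct names (x, y, z)
// where x == y and y == z hold but x == z does not.
function findTransitivityViolations(names, equal) {
  const violations = [];
  for (const x of names) {
    for (const y of names) {
      for (const z of names) {
        const distinct = new Set([x, y, z]).size === 3;
        if (distinct && equal(x, y) && equal(y, z) && !equal(x, z)) {
          violations.push([x, y, z]);
        }
      }
    }
  }
  return violations;
}

const violations = findTransitivityViolations(['a', 'b', 'c'], eq);
console.log(violations); // [ [ 'a', 'b', 'c' ], [ 'c', 'b', 'a' ] ]
```

Both reported triples pass through `b`, the value that compares equal to each of the other two.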


      InListData relies on sort and unique to sort and dedupe values, using the provided collator. Because equality under this collator is not an equivalence relation, and the ordering it induces is not a strict weak ordering, sort and unique do not produce the desired behaviour: the deduplicated list depends on the order in which the values are encountered.

      As such, the order of elements in $in can change the results.

      e.g.,

      > coll.find({value:{"$in":[a, b, c]}}).collation(collation)
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb94"), "value" : " ", "name" : "a" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      > coll.find({value:{"$in":[c, a, b]}}).collation(collation)
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb94"), "value" : " ", "name" : "a" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb95"), "value" : "͙", "name" : "b" }
      { "_id" : ObjectId("6712505b1fd232a2bd6bfb96"), "value" : " ", "name" : "c" }
      

      The two queries also give different winning plans:

                      "winningPlan" : {
                              "isCached" : false,
                              "stage" : "COLLSCAN",
                              "filter" : {
                                      "value" : {
                                              "$eq" : " "  /// value of a
                                      }
                              },
                              "direction" : "forward"
                      },
      

      vs

                      "winningPlan" : {
                              "isCached" : false,
                              "stage" : "COLLSCAN",
                              "filter" : {
                                      "value" : {
                                              "$in" : [
                                                      " ",  /// value of  a
                                                      " "  /// value of c
                                              ]
                                      }
                              },
                              "direction" : "forward"
                      },
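
The order dependence can be illustrated with a toy model. This is only a sketch and not the server's implementation: the insertion sort, the dedupe step that compares each value with its immediate predecessor, and the sign chosen for the (a, c) comparison are all assumptions made for the example. Only the facts that a == b, b == c and a != c come from the report.

```javascript
// Toy three-way comparison on the names 'a', 'b', 'c'; 0 means "equal".
// a == b and b == c as in the report. Ordering a before c is an assumption.
function toyCompare(x, y) {
  if (x === y) return 0;
  if (x === 'a' && y === 'c') return -1;
  if (x === 'c' && y === 'a') return 1;
  return 0; // every remaining pair involves b, which equals both a and c
}

// Plain insertion sort, so the result is fully determined by the comparator.
function insertionSort(values, compare) {
  const out = [];
  for (const v of values) {
    let i = out.length;
    while (i > 0 && compare(v, out[i - 1]) < 0) i--;
    out.splice(i, 0, v);
  }
  return out;
}

// Drop a value when it compares equal to the value directly before it.
function dropAdjacentDuplicates(sorted, compare) {
  return sorted.filter((v, i) => i === 0 || compare(sorted[i - 1], v) !== 0);
}

const sortAndDedupe = (values) =>
  dropAdjacentDuplicates(insertionSort(values, toyCompare), toyCompare);

const first = sortAndDedupe(['a', 'b', 'c']);
const second = sortAndDedupe(['c', 'a', 'b']);
console.log(first);  // [ 'a' ]
console.log(second); // [ 'a', 'c' ]
```

The same three values collapse to different lists depending only on input order. The outcome also depends on details such as which neighbour the dedupe step compares against; with a true equivalence relation those details would not matter.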
      

      The affected versions have not yet been narrowed down; the issue reproduces at least on current master around ebd06e3f122.

            Assignee: Unassigned
            Reporter: James Harrison (james.harrison@mongodb.com)