SERVER-23881 (Core Server)

Allow regex word character (\w) and word boundary (\b) escapes to be Unicode-aware

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 3.0.11
    • Component/s: Querying
    • Query Optimization

      Provide a way to use regular expressions in MongoDB where the word character (\w) and word boundary (\b) escapes work for code points greater than or equal to 256.

      Original description

      The $regex word boundary (\b) gives a wrong match because it treats the Danish character ø as a non-word character.

      db.collection.find({ "name" : { "$regex" : ".*\\bden\\b.*" , "$options" : "i"} })
      

      returns a document:

      {  "name": "Death Is A Caress(Døden Er Et Kjærtegn).sub" }
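
      The behavior above can be reproduced outside the server. The following is a minimal sketch in plain JavaScript (Node.js), not MongoDB's server-side regex engine, so it only illustrates the idea. The names asciiBoundary and unicodeBoundary are hypothetical, and the lookaround pattern is one assumed way to spell out a Unicode-aware word boundary, not a feature the server provides.

```javascript
// Illustration in plain JavaScript (Node.js); this is NOT MongoDB's regex engine.
const name = "Death Is A Caress(Døden Er Et Kjærtegn).sub";

// ASCII-only \b: "ø" is not counted as a word character, so a word
// boundary is seen between "ø" and "d" inside "Døden".
const asciiBoundary = /\bden\b/i;

// Hypothetical Unicode-aware boundary, written out with lookarounds:
// "den" must not be preceded or followed by a letter, digit, or underscore.
// The "u" flag is required for the \p{...} property escapes.
const unicodeBoundary = /(?<![\p{L}\p{N}_])den(?![\p{L}\p{N}_])/iu;

const asciiMatch = asciiBoundary.test(name);       // matches inside "Døden"
const unicodeMatch = unicodeBoundary.test(name);   // no standalone "den" here
const standaloneMatch = unicodeBoundary.test("Hvor er den?");

console.log(asciiMatch, unicodeMatch, standaloneMatch); // true false true
```

      With the ASCII-only boundary the file name matches, as in the report. With the Unicode-aware boundary it does not, while a string containing "den" as a separate word still matches.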
      

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            niccottrell Nic Cottrell (Personal)
            Votes:
            2
            Watchers:
            15
