SERVER-18447 (Core Server)

RLP fails to tokenize Chinese strings with ESC control characters

    • Type: Bug
    • Resolution: Done
    • Priority: Minor - P4
    • Affects Version/s: 3.1.2
    • Component/s: Text Search
    • ALL
    • Steps to reproduce:
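      // Start from an empty collection with a text index on field "a".
      var t = db.rlp;
      t.drop();
      
      assert.commandWorked(t.ensureIndex({a: 'text'}));
      // Expected: no matches and no error. On affected versions the query fails with code 28627.
      assert.eq(t.find({$text: {$search: '\u001b', $language: 'zht'}}).itcount(), 0);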
      

      This is a bug in RLP, but it can cause issues for MongoDB users who query for, or index, Chinese strings containing the ESC control character (U+001B).

      It affects both Traditional Chinese (zht) and Simplified Chinese (zhs) strings.

      > var t = db.rlp;
      > t.drop();
      true
      
      > t.ensureIndex({a: 'text'});
      {
      	"createdCollectionAutomatically" : true,
      	"numIndexesBefore" : 1,
      	"numIndexesAfter" : 2,
      	"ok" : 1
      }
      
      // Traditional Chinese
      > t.find({$text: {$search: '\u001b', $language: 'zht'}});
      Error: error: {
      	"$err" : "Unable to process the document with return code: -10005, and document '\u001b'.",
      	"code" : 28627
      }
      
      // Simplified Chinese
      > t.find({$text: {$search: '\u001b', $language: 'zhs'}});
      Error: error: {
      	"$err" : "Unable to process the document with return code: -10005, and document '\u001b'.",
      	"code" : 28627
      }
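
      Until the RLP bug is fixed, one possible client-side workaround is to strip the ESC character from strings before they reach a $text query or a text-indexed field. The sketch below is an assumption, not something confirmed in this ticket: stripEsc and textFilter are hypothetical helper names, and whether removing U+001B avoids return code -10005 in every case has not been verified against RLP.

```javascript
// Hypothetical helper (not part of MongoDB or RLP): removes every
// ESC control character (U+001B) from a string.
function stripEsc(s) {
  return s.replace(/\u001b/g, '');
}

// Hypothetical helper: builds a $text filter whose search string has
// been sanitized with stripEsc. The language code is passed through as-is.
function textFilter(search, language) {
  return {$text: {$search: stripEsc(search), $language: language}};
}
```

      In the shell this would be used as t.find(textFilter(userInput, 'zht')), where userInput is whatever string the application received. Documents would need the same sanitizing before insertion, since the ticket says indexing is affected as well as querying.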
      

            Assignee: backlog-server-platform (DO NOT USE - Backlog - Platform Team)
            Reporter: kamran.khan, Kamran K. (Inactive)
            Votes: 0
            Watchers: 4
