  1. Core Server
  2. SERVER-18266

Standardize token-length limits between RLP and Snowball

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1.5
    • Affects Version/s: 3.1.2
    • Component/s: Text Search
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Steps To Reproduce:
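      (function() {
          'use strict';

          var t = db.fts_rlp;
          t.drop();

          assert.commandWorked(t.ensureIndex({a: 'text'}));

          // Each string is 16KB + 1 'a' characters (joining N slots yields N - 1 characters).
          // The same token is inserted under two different document languages.
          assert.writeOK(t.insert({a: new Array(1024 * 16 + 2).join('a'), language: 'en'}));
          assert.writeOK(t.insert({a: new Array(1024 * 16 + 2).join('a'), language: 'zht'}));
      }());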
      
    • Sprint: Platform 4 06/05/15, Platform 5 06/26/16

      The 16KB limit on RLP tokens makes write operations behave inconsistently across languages: a token one character over 16KB is accepted when the document language is 'en' but rejected when it is 'zht'. It'd be nice to remove (or sufficiently increase) the limit so that language handling is transparent to clients:

      > db.foo.insert({a: new Array(1024 * 16 + 2).join('a'), language: 'en'});
      WriteResult({ "nInserted" : 1 })
      
      > db.foo.insert({a: new Array(1024 * 16 + 2).join('a'), language: 'zht'});
      WriteResult({
      	"nInserted" : 0,
      	"writeError" : {
      		"code" : 28632,
      		"errmsg" : "Maximum token size reached"
      	}
      })
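
For reference, the size of the string used above can be checked in plain JavaScript. This is only a sketch of the arithmetic: the constant name `ASSUMED_RLP_TOKEN_LIMIT` is hypothetical, and it assumes the limit is exactly 16 * 1024 (for an all-'a' string, characters and UTF-8 bytes coincide). In the mongo shell, `print` can be used in place of `console.log`.

```javascript
// Assumption: the "16KB" limit from the report, taken as exactly 16 * 1024.
// The name is hypothetical and does not come from the server source.
var ASSUMED_RLP_TOKEN_LIMIT = 16 * 1024;

// Joining an array of N empty slots with 'a' yields N - 1 copies of 'a',
// so this is the same string the inserts above use.
var token = new Array(1024 * 16 + 2).join('a');

console.log(token.length);                            // 16385
console.log(token.length - ASSUMED_RLP_TOKEN_LIMIT);  // 1
```

So the test token overshoots the assumed limit by a single character, which is the smallest input that separates the two behaviors shown above.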
      

            Assignee: Mark Benvenuto (mark.benvenuto@mongodb.com)
            Reporter: Kamran K. (kamran.khan)
            Votes: 0
            Watchers: 5

              Created:
              Updated:
              Resolved: