[SERVER-12518] Text index should not prevent insertion of documents having large keys Created: 28/Jan/14  Updated: 11/Jul/16  Resolved: 13/Feb/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Text Search
Affects Version/s: 2.5.5
Fix Version/s: 2.6.0-rc0

Type: Task Priority: Major - P3
Reporter: Tyler Brock Assignee: Benety Goh
Resolution: Done Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Participants:

 Description   

> db.test.drop();
> db.test.ensureIndex({a: 'text'});
> var long = '';
> for(var i=0; i<1024; i++){ long = long + 'a'; }
> db.test.insert({a: long})

Shell says:

SingleWriteResult({
	"writeErrors" : [
		{
			"index" : 0,
			"code" : 17280,
			"errmsg" : "insertDocument :: caused by :: 17280 Btree::insert: key too large to index, failing test.test.$a_text 1047 { : \"<long-key>...\", : 1.1 }",
			"op" : {
				"_id" : ObjectId("52e833df70104fa5ad62f2b1"),
				"a" : "<long-key>"
			}
		}
	],
	"writeConcernErrors" : [ ],
	"nInserted" : 0,
	"nUpserted" : 0,
	"nUpdated" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})

Mongod says:

2014-01-28T17:49:03.940-0500 [conn2] test.test Btree::insert: key too large to index, failing test.test.$a_text 1047 { : "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...", : 1.1 }
2014-01-28T17:49:03.940-0500 [conn2] test.test  caught assertion addKeysToIndex test.test.$a_text_id: ObjectId('52e833df70104fa5ad62f2b1')

This is new behavior for indexing in 2.6; in 2.4 the document was allowed to be inserted even though the index entry would not be created. The concern is that this default might not be the best for a text index, given that the data being indexed may commonly trigger this error.

Clients could obviously catch the "writeErrors" entry in the SingleWriteResult and handle it at the application level by storing the field that would have created the large index key under a non-indexed field name and attempting to re-save, but it would be nice if they didn't have to.
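
As a rough illustration of the application-level workaround described above, the following is a minimal sketch only, not the fix that was committed for this ticket. It assumes the collection's insert() returns an object shaped like the SingleWriteResult shown earlier (with a "writeErrors" array); the helper name insertWithFallback and the fallback field name are hypothetical.

```javascript
// Error code reported in the write error above for "key too large to index".
const KEY_TOO_LARGE = 17280;

/**
 * Insert `doc` into `coll`. If the insert fails because `field` produces an
 * index key that is too large, move the value to `fallbackField` (a
 * hypothetical, non-indexed field name) and retry once.
 *
 * Assumption: `coll.insert` returns an object shaped like the
 * SingleWriteResult shown above, i.e. with a `writeErrors` array.
 */
function insertWithFallback(coll, doc, field, fallbackField) {
  const first = coll.insert(doc);
  const errors = first.writeErrors || [];
  const tooLarge = errors.some(function (e) {
    return e.code === KEY_TOO_LARGE;
  });
  if (!tooLarge) {
    // Either the insert succeeded or it failed for an unrelated reason;
    // in both cases the caller gets the original result back.
    return first;
  }
  // Copy the document so the caller's object is not modified by the retry.
  const retryDoc = Object.assign({}, doc);
  retryDoc[fallbackField] = retryDoc[field];
  delete retryDoc[field];
  return coll.insert(retryDoc);
}
```

With the reproduction above, a call such as insertWithFallback(db.test, {a: long}, 'a', 'a_unindexed') would retry with the long string stored under a_unindexed. The trade-off is that such a document is no longer reachable through the text index on a, which is part of why handling this on the server side is preferable.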



 Comments   
Comment by Githook User [ 13/Feb/14 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-12518 support insertion of documents with large keys into text index
Branch: master
https://github.com/mongodb/mongo/commit/4b992a6f1cb2c2422ac7ca602f366a847f07e795

Comment by Tyler Brock [ 12/Feb/14 ]

Could we do that sort of hash in the shell or in an application easily? Why not md5 of the whole text so it's easily findable? I believe murmur hash is faster and also what we use for hashed sharding, so I would understand if everyone dislikes my idea and happily not interfere.
