[SERVER-14002] issue querying data Created: 20/May/14  Updated: 10/Dec/14  Resolved: 21/May/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Robert Fehrmann Assignee: Thomas Rueckstiess
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

I just came across an interesting issue in 2.4.9. I'm not sure whether this is a known issue, but I thought I'd send it across.
If you create a document with a string attribute longer than 1011 characters, a search via a regular expression will not find it, yet if you give the query a hint to use the _id index, the document is found. BTW, the problem is fixed in 2.6.1, i.e. it is not a problem for us anymore, since we are already upgrading to 2.6.1.

Insert the 2 test records below into collection test on a 2.4.9 instance: the first query returns 1 document, while the second query returns 2 documents. In 2.6.1 both queries return 2 documents. The only difference between the documents is that attribute c has 1 additional character in the second one.

Thanks
Robert

db.test.find({"c" : /123456789/})
db.test.find({"c" : /123456789/}).hint("_id_")

{
"_id" : "1",
"c" : "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
}
{
"_id" : "2",
"c" : "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901"
}



 Comments   
Comment by Thomas Rueckstiess [ 21/May/14 ]

You are welcome, Robert. I'm glad I could clarify that issue.

I'm resolving the ticket now.

Regards,
Thomas

Comment by Robert Fehrmann [ 21/May/14 ]

Thomas,

In fact I hadn't even noticed the second index on "c", and you are right: this problem only occurs when there is an index on attribute c, and in 2.6.1 you can't create an index on a field whose value is too long.

So I got exactly the same results as you. Thank you so much for looking into this case.

Thanks
Robert

Comment by Thomas Rueckstiess [ 20/May/14 ]

Hi Robert,

I believe this issue has to do with the maximum size of an index key entry, which is 1024 bytes in MongoDB (including structural BSON overhead); see MongoDB Limits. But I'd like to ask some questions to clarify:

  1. Did you have an index on {c:1} in the example with 2.4.9 ? I can only reproduce the behavior you describe if this index is present.
  2. For 2.6.1 I'm assuming you did not have an index on {c:1}, because we've changed how overly large index entries are handled. They are now rejected (rather than silently ignored), and therefore it's not possible to have that document indexed. Can you confirm?

So my hypothesis is this: you had an index on {c:1} in 2.4.9, but the too-large document was not present in the index. The log file would have contained lines like this when building the index:

Tue May 20 18:22:58.567 [conn3] build index test.test { c: 1.0 }
Tue May 20 18:22:58.568 [conn3]  test.system.indexes Btree::insert: key too large to index, skipping test.test.$c_1 1025 { : "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789..." }
Tue May 20 18:22:58.568 [conn3] warning: not all entries were added to the index, probably some keys were too large
Tue May 20 18:22:58.568 [conn3] build index done.  scanned 2 total records. 0.001 secs

Now if you query with the regex (or any other query that uses the index on c), it will only return the indexed document. A query with hint() on _id will return both documents, as both are in the _id index.
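
As a rough illustration of this hypothesis, the plain JavaScript sketch below models an index build that skips oversized keys and then compares the two query paths. It is not MongoDB code. The names `keySize` and `buildIndex24` are hypothetical. The 13-byte overhead is an assumption inferred from the log line above, on the premise that the 1025-byte key belongs to a 1012-character value, as the description implies.

```javascript
// Hypothetical model of the 2.4.9 behavior described above; not MongoDB code.
const MAX_KEY_BYTES = 1024;  // index key limit cited in this comment
const ASSUMED_OVERHEAD = 13; // assumption: inferred from the "1025" in the log line

// Build a digit string shaped like the test values ("0123456789" repeated).
function digits(n) {
  let s = "";
  for (let i = 0; i < n; i++) s += String(i % 10);
  return s;
}

const docs = [
  { _id: "1", c: digits(1011) },
  { _id: "2", c: digits(1012) },
];

// The values are ASCII only, so string length equals byte length here.
function keySize(value) {
  return value.length + ASSUMED_OVERHEAD;
}

// 2.4-style build: an oversized key is skipped, and the document stays in the collection.
function buildIndex24(allDocs) {
  const indexed = [];
  const skipped = [];
  for (const d of allDocs) {
    (keySize(d.c) > MAX_KEY_BYTES ? skipped : indexed).push(d);
  }
  return { indexed, skipped };
}

const cIndex = buildIndex24(docs);

// A query answered from the index on c only sees the indexed documents.
const viaCIndex = cIndex.indexed.filter((d) => /123456789/.test(d.c)).map((d) => d._id);

// A query forced through the _id index walks every document.
const viaIdIndex = docs.filter((d) => /123456789/.test(d.c)).map((d) => d._id);

console.log(viaCIndex);  // [ '1' ]
console.log(viaIdIndex); // [ '1', '2' ]
```

Under these assumptions the 1011-character value produces a key of exactly 1024 bytes and is indexed, while the 1012-character value produces 1025 bytes and is skipped, which matches the one-document versus two-document results reported.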

I was not able to reproduce what you are seeing with 2.6.1 because, like I said, it's not possible to have an index on {c:1} with such a document. If you repeated the test in 2.6.1 without an index on {c:1}, then you would always get both documents back.
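
The 2.6.1 side of the explanation can be sketched in the same spirit. The class `Collection26` and its methods are hypothetical, the error text is made up for illustration, and the 13-byte overhead is again an assumption taken from the 2.4.9 log line. The model only captures "reject instead of skip", not the real server logic.

```javascript
// Hypothetical model of the 2.6-style behavior; not MongoDB code.
const KEY_LIMIT_BYTES = 1024;    // limit cited in this comment
const OVERHEAD_ASSUMPTION = 13;  // assumption inferred from the 2.4.9 log line

class Collection26 {
  constructor(hasIndexOnC) {
    this.hasIndexOnC = hasIndexOnC;
    this.docs = [];
  }

  insert(doc) {
    // ASCII-only values assumed, so string length equals byte length.
    const keyBytes = doc.c.length + OVERHEAD_ASSUMPTION;
    if (this.hasIndexOnC && keyBytes > KEY_LIMIT_BYTES) {
      // Illustrative message only; the real server error text differs.
      throw new Error("key too large to index: " + keyBytes + " bytes");
    }
    this.docs.push(doc);
  }

  find(regex) {
    return this.docs.filter((d) => regex.test(d.c)).map((d) => d._id);
  }
}

const longValue = "0123456789".repeat(101) + "01"; // 1012 characters
const shortValue = longValue.slice(0, 1011);       // 1011 characters

// Without an index on c, both documents are stored and both match.
const plain = new Collection26(false);
plain.insert({ _id: "1", c: shortValue });
plain.insert({ _id: "2", c: longValue });
const plainResult = plain.find(/123456789/);

// With an index on c, the oversized document is rejected at insert time.
const indexed = new Collection26(true);
indexed.insert({ _id: "1", c: shortValue });
let rejected = false;
try {
  indexed.insert({ _id: "2", c: longValue });
} catch (e) {
  rejected = true;
}

console.log(plainResult, rejected);
```

In this model there is no state in which the index on c exists and the long document is silently missing from it, which is why the two queries cannot disagree.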

Can you repeat your queries on 2.6.1 but add an .explain() to the end of the queries and paste the output below?

Thanks,
Thomas
