[SERVER-4408] _id index should be used for find queries against nested fields of the _id object Created: 02/Dec/11  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: 2.0.1
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Thilo Planz Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-4518 match nested fields of index keys Closed
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

> db.textarea.vermongo.find();
{ "_id" : { "_id" : "y", "_version" : 1 }, "name" : "Thilo Planz", "_version" : 1 }
 
> db.textarea.vermongo.find( {'_id': {'_id': 'y', '_version': 1 }});
{ "_id" : { "_id" : "y", "_version" : 1 }, "name" : "Thilo Planz", "_version" : 1 }
 
explain =>
	"cursor" : "BtreeCursor _id_",
 
> db.textarea.vermongo.find( {'_id._id': 'y'});
 
explain =>
       "cursor" : "BasicCursor",

When using a BSONObject as the _id for the record, querying against only a part of the object should still use the _id btree index. I appreciate that because the existence and order of fields in the nested document is not known, this will not be a range scan (not even for an _id field of the nested document), but a "fast-full-index-scan" type of access path should be possible and faster than a table scan.



 Comments   
Comment by Thilo Planz [ 02/Dec/11 ]

Come to think of it, this reasoning does not only apply for the _id index, but for any kind of index built on nested documents.

And the workaround would be to build an extra index on the nested field to be queried:

> db.textarea.vermongo.ensureIndex(

{'_id._id': 1}

);
> db.textarea.vermongo.find(

{'_id._id': 'y'}

).explain();
{
"cursor" : "BtreeCursor _id._id_1",

However, I was putting these nested documents into the _id field mainly for the purpose of not requiring an additional index in addition to the mandatory _id index.

(And the priority for all this is low, because I can achieve what I want to do by doing an explicit range query on the _id field, as I am querying an the first field of the nested document, so binary sort order in the _id index serves fine. However, if I wanted to query on the second field it would be nice to have a index fast-full scan).

Generated at Thu Feb 08 03:05:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.