[SERVER-5620] _id index underused Created: 16/Apr/12  Updated: 07/Mar/14  Resolved: 16/Apr/12

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 2.0.4
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Aristarkh Zagorodnikov Assignee: Unassigned
Resolution: Done Votes: 0
Labels: _id, document, indexing, query
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Minor Change
Operating System: ALL
Participants:

 Description   

I'm not sure if the functionality I expect is intended, but the results I got are certainly unexpected.
Consider the following schema:
{_id:

{t:<time>, i:<integer>}

, m:<string>}
And a collection, containing about 1600 documents with different values.
I query the collection (trying to get the latest element by "_id.t") with:
db.coll.find().sort(

{"_id.t":-1}

).limit(1)
It's horribly slow, up to several hundred milliseconds.
When I run explain() on it it says:
{
"cursor" : "BasicCursor",
"nscanned" : 589,
"nscannedObjects" : 589,
"n" : 1,
"scanAndOrder" : true,
"millis" : 113,
"nYields" : 1,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {

}
}
So, it appears to scan the table sequentially.
I have to add index by

{"_id.t":-1}

, to be able to query the collection fast, then explain returns more satistactory results:
{
"cursor" : "BtreeCursor id.t-1",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"_id.t" : [
[

{ "$maxElement" : 1 }

,

{ "$minElement" : 1 }

]
]
}
}

I thought that "_id" index behaves like the others, so I can query, then sort by its subelements left-to-right. Is this not the case?
P.S. I ruled out query optimizer by using hint({_id:1}) – didn't help, it used btree cursor, but still scanned entire collection.

So, is this something that should be expected (then, I think, documentation should state that this is a special case, since http://www.mongodb.org/display/DOCS/Indexes#Indexes-UsingDocumentsasKeys states otherwise), or is it a bug that should be fixed?



 Comments   
Comment by Aristarkh Zagorodnikov [ 16/Apr/12 ]

My bad, it appears that I wasn't reading the docs attentively =) Sorry for the false alarm.

Comment by Aristarkh Zagorodnikov [ 16/Apr/12 ]

As a side note, it only happens when sorting on a "subkey", so this is not exactly the same case that is described in the documentation.

Comment by Eliot Horowitz (Inactive) [ 16/Apr/12 ]

An index on

{ "_id" : 1 }

is very different than an index on

{ "_id.t" : 1 }

this is true of the _id index and all others.

will try to make the documentation clearer

Comment by Scott Hernandez (Inactive) [ 16/Apr/12 ]

What you are describing is how indexes work on embedded documents (which is what you get when you put a document in the _id field). See the docs here: http://www.mongodb.org/display/DOCS/Indexes#Indexes-UsingDocumentsasKeys

Comment by Aristarkh Zagorodnikov [ 16/Apr/12 ]

"about 1600 documents" should read "about 600 documents" (actually 589 as first explain shows).

Generated at Thu Feb 08 03:09:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.