-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: 1.7.0
-
Component/s: None
-
None
-
Environment:1.7.2-pre-
-
ALL
Problem:
It appears that the limit() clause is not being pushed down to the shard server for evaluation, since the explain() indicates more documents returned than the limit.
For example, given the shard key "x" has values 0-100 and the split is at 50. Given the query
db.test20101005.find( {x : { $lt : 60}} ).sort(
{ x:-1}).limit(1).explain()
produces
> db.test20101005.find( {x : { $lt : 60}} ).sort(
{ x:-1} ).limit(1).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"10.195.79.0:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 50,
"nscannedObjects" : 50,
"n" : 50,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :
}
],
"10.203.83.53:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 10,
"nscannedObjects" : 10,
"n" : 10,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :
}
]
},
"n" : 60,
"nChunkSkips" : 0,
"nYields" : 0,
"nscanned" : 60,
"nscannedObjects" : 60,
"millisTotal" : 0,
"millisAvg" : 0,
"numQueries" : 2,
"numShards" : 2
}
The value of "n" indicates that shard0000 returns 50 documents, and shard0001 returns 10 before the final sort & limit is applied by the mongos.
Reproduce:
> for (i=0; i < 100; i++) { db.test20101005.insert(
); }
> db.test20101005.ensureIndex(
)
> use admin
switched to db admin
> db.runCommand(
)
{ "ok" : 1 }> db.runCommand( { shardcollection : "test.test20101005", key :
{ x : 1 }} )
{ "collectionsharded" : "test.test20101005", "ok" : 1 }> db.runCommand( { split : "test.test20101005", middle : { x : 50 }} );
{ "ok" : 1 }> db.runCommand( { moveChunk: "test.test20101005", find :
{ x : 51}, to : "shard0001" })
{ "millis" : 2413, "ok" : 1 }> use test
switched to db test
> db.test20101005.find( {x : { $lt : 60}} ).sort(
).limit(1).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"10.195.79.0:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 50,
"nscannedObjects" : 50,
"n" : 50,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :
}
],
"10.203.83.53:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 10,
"nscannedObjects" : 10,
"n" : 10,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :
}
]
},
"n" : 60,
"nChunkSkips" : 0,
"nYields" : 0,
"nscanned" : 60,
"nscannedObjects" : 60,
"millisTotal" : 0,
"millisAvg" : 0,
"numQueries" : 2,
"numShards" : 2
}
Solution:
Certainly in this case, the sort and limit could be applied to the data set returned by each shard. The final sort and limit will still need to happen in the mongos.
Business Case:
Performance.