[SERVER-1896] limit() clause is not being pushed down to the shard for evaluation, causing more documents to be returned than are needed Created: 05/Oct/10  Updated: 12/Jul/16  Resolved: 10/Nov/10

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.7.0
Fix Version/s: 1.7.3

Type: Bug Priority: Major - P3
Reporter: Alvin Richards (Inactive) Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

1.7.2-pre-


Operating System: ALL
Participants:

 Description   

Problem:
It appears that the limit() clause is not being pushed down to the shard server for evaluation, since the explain() indicates more documents returned than the limit.

For example, given the shard key "x" has values 0-100 and the split is at 50. Given the query

db.test20101005.find( {x : { $lt : 60}} ).sort(

{ x:-1}

).limit(1).explain()

produces

> db.test20101005.find( {x : { $lt : 60}} ).sort(

{ x:-1}

).limit(1).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"10.195.79.0:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 50,
"nscannedObjects" : 50,
"n" : 50,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :

{ "x" : [ [ 60, -1.7976931348623157e+308 ] ] }

}
],
"10.203.83.53:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 10,
"nscannedObjects" : 10,
"n" : 10,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :

{ "x" : [ [ 60, -1.7976931348623157e+308 ] ] }

}
]
},
"n" : 60,
"nChunkSkips" : 0,
"nYields" : 0,
"nscanned" : 60,
"nscannedObjects" : 60,
"millisTotal" : 0,
"millisAvg" : 0,
"numQueries" : 2,
"numShards" : 2
}

The value of "n" indicates that shard0000 returns 50 documents, and shard0001 returns 10 before the final sort & limit is applied by the mongos.

Reproduce:
> for (i=0; i < 100; i++) { db.test20101005.insert(

{ x: i}

); }
> db.test20101005.ensureIndex(

{x:1}

)
> use admin
switched to db admin
> db.runCommand(

{ enablesharding : "test"}

)

{ "ok" : 1 }

> db.runCommand( { shardcollection : "test.test20101005", key :

{ x : 1 }

} )

{ "collectionsharded" : "test.test20101005", "ok" : 1 }

> db.runCommand( { split : "test.test20101005", middle : { x : 50 }} );

{ "ok" : 1 }

> db.runCommand( { moveChunk: "test.test20101005", find :

{ x : 51}

, to : "shard0001" })

{ "millis" : 2413, "ok" : 1 }

> use test
switched to db test
> db.test20101005.find( {x : { $lt : 60}} ).sort(

{ x:-1}

).limit(1).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"10.195.79.0:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 50,
"nscannedObjects" : 50,
"n" : 50,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :

{ "x" : [ [ 60, -1.7976931348623157e+308 ] ] }

}
],
"10.203.83.53:27000" : [
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 10,
"nscannedObjects" : 10,
"n" : 10,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :

{ "x" : [ [ 60, -1.7976931348623157e+308 ] ] }

}
]
},
"n" : 60,
"nChunkSkips" : 0,
"nYields" : 0,
"nscanned" : 60,
"nscannedObjects" : 60,
"millisTotal" : 0,
"millisAvg" : 0,
"numQueries" : 2,
"numShards" : 2
}

Solution:
Certainly in this case, the sort and limit could be applied to the data set returned by each shard. The final sort and limit will still need to happen in the mongos.

Business Case:
Performance.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 10/Nov/10 ]

Note: as far as I can tell this is just an explain() issue, not a normal query issue.
If you think its on normal queries as well - please open a different case.

Comment by auto [ 10/Nov/10 ]

Author:

{'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: fix limit on explain SERVER-1896
/mongodb/mongo/commit/af8370b1e2ff5681858b855f0cb816e9197b1230

Comment by Yuri Buyanov [ 19/Oct/10 ]

As I can see in our test environment, this also seem to affect 1.6.3 (debian mongodb-stable 20101018).

> db.related.find(

{"user.old_pk" : 902}

).limit(1).explain()
{
"clusteredType" : "SerialServer",
"shards" : {
"192.168.4.204:27017" : [
{
"cursor" : "BtreeCursor user.old_pk_1_posted_-1",
"nscanned" : 1205,
"nscannedObjects" : 1205,
"n" : 1205,
"millis" : 4,
"indexBounds" : {
"user.old_pk" : [
[
902,
902
]
],
"posted" : [
[

{ "$maxElement" : 1 }

,

{ "$minElement" : 1 }

]
]
}
}
]
},
"nscanned" : 1205,
"nscannedObjects" : 1205,
"n" : 1205,
"millisTotal" : 4,
"millisAvg" : 4,
"numQueries" : 1,
"numShards" : 1
}

Comment by Alvin Richards (Inactive) [ 06/Oct/10 ]

Added jstest/sharding/limit_push.js.

Execution is commented out, so you will need to uncomment in order to get the test to fail.

Per the update above, on a non-sharded system, n=1. It appears at least from the explain output that each shard is return n > 1.

Comment by auto [ 06/Oct/10 ]

Author:

{'login': 'alvin10gen', 'name': 'Alvin Richards', 'email': 'alvin@10gen.com'}

Message: Added jstest for http://jira.mongodb.org/browse/SERVER-1896
http://github.com/mongodb/mongo/commit/ddb16cb7e47806406592a49155e223f5132fd993

Comment by Alvin Richards (Inactive) [ 06/Oct/10 ]

confirmed on a non-shared system that "n" is 1 as you would expect (see below). Will write a js test for this

> db.test20101005.find( {x : { $lt : 60}} ).sort(

{ x:-1}

).limit(1).explain()
{
"cursor" : "BtreeCursor x_1 reverse",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"indexBounds" :

{ "x" : [ [ 60, -1.7976931348623157e+308 ] ] }

}

Comment by Eliot Horowitz (Inactive) [ 06/Oct/10 ]

Yes - please attach a full example w/o sharding with asserts (in js test style)

Comment by Eliot Horowitz (Inactive) [ 05/Oct/10 ]

pretty sure that's notmal behavior when using limit(1)
try it without sharding
kristina knows all about this.
try against a mongod.
assuming i'm right - please close the case

Generated at Thu Feb 08 02:58:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.