[SERVER-27716] mongos 3.2.10, sharded cluster, skip returns duplicates Created: 17/Jan/17 Updated: 27/Oct/23 Resolved: 03/Feb/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Alessandro Gherardi | Assignee: | David Storch |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Query 2017-02-13 | ||||||||
| Participants: | |||||||||
| Description |
|
Hi, This can be verified using the following script:
The script prints several lines with "duplicate: ....". When run against the same database with MongoDB 3.0.4 (mongod + mongos), the script detects no duplicates. |
| Comments |
| Comment by David Storch [ 03/Feb/17 ] |
|
Hi agherardi,
Thanks for the clear repro steps: this was very useful in investigating this issue! I looked into the behavior you reported and have concluded that this is working as designed.
Your repro script issues repeated queries with no predicate but with a skip+limit for pagination. This means that each query will execute as a collection scan. (You can see this in the explain output as the COLLSCAN stage.) When running repeated collection scans against an idle server, you might reasonably expect each collection scan to return the results in the same order. Returning the results in a consistent order is, of course, a requirement for your repeated pagination queries to correctly avoid returning duplicates.
As you have discovered, however, assuming that repeated COLLSCANs return results in the same order, without an explicit .sort(), is a very dangerous assumption. It might be the case, but the system does not guarantee that it is so. You shouldn't expect the system to give you a sort without asking it for one.
Indeed, the repeated queries are not returning the results in the same order in a sharded environment. There is a component of the query execution path in mongos called the AsyncResultsMerger which asks for batches of data from the shards and presents these individual streams as a single logical stream of results: https://github.com/mongodb/mongo/blob/master/src/mongo/s/query/async_results_merger.h
Without a sort requested, the AsyncResultsMerger can happen to return results in a different order, depending on the timing with which it gets results back from the shards. This appears to be what is happening when I run your repro script.
To double-check, I instrumented mongos to check itself for duplicates during the execution of an individual sharded query. Even when I see duplicates appear from your repro script, mongos detects no duplicates being returned from an individual query, which would have been a clear sign of a bug.
The behavior you observe is not present in 3.0.x versions because the AsyncResultsMerger is a new component implemented as part of an overhaul of the mongos read path done for version 3.2.0. Let me know if you have any further questions or concerns. Best, |
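The effect described in this comment can be illustrated with a small standalone simulation. This is only a sketch: the helper names (mergeUnsorted, mergeSorted, paginate, duplicates), the two-shard data, and the arrival timings are all hypothetical and are not taken from mongos or from the reporter's script. It assumes each page is fetched by a separate query, so each page can see a different merge order.

```javascript
// Simulation only: these helpers are hypothetical and are not mongos internals.

// Each simulated shard returns its own documents in a stable local order.
var shards = [
  [1, 3, 5, 7, 9].map(function (id) { return { _id: id }; }),
  [2, 4, 6, 8, 10].map(function (id) { return { _id: id }; })
];

// Unsorted merge: documents are emitted in whatever order the shard
// responses happen to arrive. `arrivals` lists which shard answers next.
function mergeUnsorted(shards, arrivals) {
  var cursors = shards.map(function () { return 0; });
  var out = [];
  arrivals.forEach(function (s) {
    if (cursors[s] < shards[s].length) out.push(shards[s][cursors[s]++]);
  });
  return out;
}

// Sorted merge: always emit the smallest _id among the shard heads, so the
// result does not depend on arrival timing. Assumes each shard stream is
// already ordered by _id, as it would be when a sort on _id is requested.
function mergeSorted(shards) {
  var cursors = shards.map(function () { return 0; });
  var out = [];
  for (;;) {
    var best = -1;
    for (var s = 0; s < shards.length; s++) {
      if (cursors[s] >= shards[s].length) continue;
      if (best === -1 || shards[s][cursors[s]]._id < shards[best][cursors[best]]._id) best = s;
    }
    if (best === -1) return out;
    out.push(shards[best][cursors[best]++]);
  }
}

// Each page is a separate query: run it, then apply that page's skip/limit window.
function paginate(runQuery, pageSize, pages) {
  var ids = [];
  for (var p = 0; p < pages; p++) {
    runQuery(p).slice(p * pageSize, (p + 1) * pageSize).forEach(function (d) {
      ids.push(d._id);
    });
  }
  return ids;
}

// Return every id that already appeared earlier in the list.
function duplicates(ids) {
  return ids.filter(function (id, i) { return ids.indexOf(id) !== i; });
}

// Two different arrival timings, one per page query.
var timings = [
  [0, 0, 0, 0, 0, 1, 1, 1, 1, 1], // first query: shard 0 answers first
  [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  // second query: shard 1 answers first
];

var unsortedIds = paginate(function (p) { return mergeUnsorted(shards, timings[p]); }, 5, 2);
var sortedIds = paginate(function () { return mergeSorted(shards); }, 5, 2);

console.log("unsorted duplicates:", duplicates(unsortedIds));
console.log("sorted duplicates:", duplicates(sortedIds));
```

In this toy setup neither individual query returns a duplicate. The duplicates only appear across pages, because the second page is cut from a differently ordered stream than the first. That matches the observation that mongos detects no duplicates within a single query.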
| Comment by Kelsey Schubert [ 18/Jan/17 ] |
|
Hi alessandro.gherardi@yahoo.com, Thank you for the detailed report. We're able to reproduce this behavior and are investigating. Kind regards, |
| Comment by Alessandro Gherardi [ 17/Jan/17 ] |
|
The only workaround we found is sorting by a unique attribute, e.g., the _id. If we change the query in the script to: db.dataSources.find().skip(skip).sort({_id:1}).limit(limit) we see no duplicates. |
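A pagination loop built around this workaround might look like the sketch below. The function name fetchAllPages and the pageSize parameter are hypothetical and do not come from the original repro script, which is not shown in this ticket. The sketch assumes the collection is not being modified while the pages are read, and that the collection object exposes the usual find(), sort(), skip(), limit() and toArray() cursor methods.

```javascript
// Sketch of skip/limit pagination with an explicit sort on a unique field.
// `fetchAllPages` and `pageSize` are hypothetical names used for illustration.
function fetchAllPages(coll, pageSize) {
  var ids = [];
  for (var skip = 0; ; skip += pageSize) {
    // Sorting on the unique _id field gives every page query the same total
    // order, so the skip/limit windows do not overlap on an idle dataset.
    var page = coll.find().sort({ _id: 1 }).skip(skip).limit(pageSize).toArray();
    if (page.length === 0) break; // past the last page
    page.forEach(function (doc) { ids.push(doc._id); });
  }
  return ids;
}

// In the mongo shell, against the collection named in this ticket:
//   var ids = fetchAllPages(db.dataSources, 100);
```

The sort key needs to be unique for this to hold. Sorting on a field with ties would still leave the relative order of tied documents unspecified between queries.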