[SERVER-18077] First reads can be routed to the wrong shard if collection is sharded by other mongos Created: 15/Apr/15 Updated: 14/Apr/16 Resolved: 23/Sep/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.1.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | Add:
just one line above the shardCollection command here: and run the test. |
||||||||
| Sprint: | Sharding 2 04/24/15, Sharding 3 05/15/15, Sharding 4 06/05/15, Sharding 5 06/26/15, Sharding 6 07/17/15, Sharding 9 (09/18/15), Sharding A (10/09/15) | ||||||||
| Participants: | |||||||||
| Comments |
| Comment by Randolph Tan [ 23/Sep/15 ] |
|
The earlier diagnosis was incorrect: an exception is thrown on version mismatch when establishing the version here: https://github.com/mongodb/mongo/blob/r3.1.2/src/mongo/client/parallel.cpp#L609 This causes ParallelSortClusteredCursor::startInit to call itself recursively and re-evaluate the shards that it needs to talk to. |
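
For readers unfamiliar with the retry path, the behavior described above can be modeled roughly as follows. This is a toy Python sketch, not the mongos code: the `Router` class, the `StaleVersionError` exception, the `config` dictionary, and the `max_retries` limit are all hypothetical names introduced for illustration, and the assumption that the metadata refresh happens just before the exception is raised is a simplification.

```python
class StaleVersionError(Exception):
    """Hypothetical: raised when the cached collection version is out of date."""


class Router:
    """Toy stand-in for a mongos that holds a cached routing table."""

    def __init__(self, config):
        # `config` plays the role of the authoritative sharding metadata.
        self.config = config
        # Stale view: this router has not yet seen the shardCollection
        # issued through another router.
        self.cached = {"version": 0, "shards": ["shard0"]}
        self.attempts = []  # records the shard set chosen on each attempt

    def refresh(self):
        # Reload the routing table from the authoritative metadata.
        self.cached = {
            "version": self.config["version"],
            "shards": list(self.config["shards"]),
        }

    def set_shard_version(self, shard):
        # On a mismatch, refresh the metadata and signal the caller
        # (assumption: the refresh precedes the exception).
        if self.cached["version"] != self.config["version"]:
            self.refresh()
            raise StaleVersionError(shard)

    def start_init(self, max_retries=3):
        # Shards are chosen from the routing table as it is *now*.
        targets = list(self.cached["shards"])
        self.attempts.append(targets)
        try:
            for shard in targets:
                self.set_shard_version(shard)
        except StaleVersionError:
            if max_retries == 0:
                raise
            # Recursive retry: the target shards are re-evaluated
            # against the refreshed routing table.
            return self.start_init(max_retries - 1)
        return targets


router = Router({"version": 1, "shards": ["shard0", "shard1"]})
final_targets = router.start_init()
```

In this model the first attempt targets only the stale shard set, the version mismatch aborts it, and the recursive call picks the targets again from the refreshed table, so the query is never run against the shards chosen before the refresh.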
| Comment by Randolph Tan [ 23/Sep/15 ] |
|
Yes. |
| Comment by Andy Schwerin [ 23/Sep/15 ] |
|
Are find commands safe from this, as they do not use ParallelSortClusteredCursor? |
| Comment by Randolph Tan [ 14/May/15 ] |
|
Diagnosis: The code at https://github.com/mongodb/mongo/blob/r3.1.1/src/mongo/client/parallel.cpp#L602 sets the version correctly on the shard connection and even loads a new chunk manager, but ParallelSortClusteredCursor::startInit goes ahead and uses the shards chosen before the metadata refresh to perform the queries anyway. Since the version on the connection then matches, the shard does not complain. |