[SERVER-24457] Some commands fail when a shard they need to talk to has no primary, even when they are okay to run on secondaries Created: 07/Jun/16 Updated: 02/Apr/20 Resolved: 02/Apr/20 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Sprint: | Sharding 16 (06/24/16), Sharding 2020-04-20 |
| Description |
|
Some query-like commands (such as distinct) fail when one of the shards they target has no primary, even when using the 'secondary' read preference. |
| Comments |
| Comment by Randolph Tan [ 02/Apr/20 ] |
|
No longer an issue in current master |
| Comment by Spencer Brody (Inactive) [ 06/Jul/16 ] |
|
Reproduced in 3.0 as well. |
| Comment by Andy Schwerin [ 16/Jun/16 ] |
|
Spencer is going to find out if this is a regression from 3.0, and then bounce it back to needs triage with the additional information. |
| Comment by Spencer Brody (Inactive) [ 07/Jun/16 ] |
|
My current hypothesis is that this is due to the commands trying to run setShardVersion against the primary (via use of ShardConnection), which they do even when processing a slaveOk operation, in an attempt to ensure that the operation is still being routed to the proper shard. |
| Comment by Spencer Brody (Inactive) [ 07/Jun/16 ] |
|
Attaching a jstest that demonstrates the issue with the 'distinct' command. Weirdly, in my manual testing I was also seeing 'count' and 'aggregate' fail in this case, but the jstest didn't reproduce those failures.