[SERVER-22771] multiUpdate and multiRemove ops from a stale mongos may cause repeated work Created: 19/Feb/16 Updated: 08/Mar/16 Resolved: 08/Mar/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.3.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Esha Maharishi (Inactive) | Assignee: | Esha Maharishi (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | The following tests demonstrate the behavior: multiUpdate:
multiRemove:
|
| Participants: |
| Description |
|
mongos targets shards for multiUpdate and multiRemove, and each shard checks the shard version on the request before it starts executing the updates or removes. If one shard returns StaleShardVersion, mongos refreshes its metadata, re-targets, and re-sends the request to all relevant shards. A shard that did not return StaleShardVersion has already applied the operation, so it applies the multiUpdate or multiRemove a second time.

This is harmless for multiRemoves (since removes are idempotent), but for non-idempotent multi-updates the write can be applied more than once to the same document.

It's worth noting that the semantics of multiUpdate and multiRemove already allow for situations like this even on a single mongod; this issue just increases the likelihood of seeing such behavior.

Right now, mongos targets shards for multiUpdate and multiRemove, and the shards check versioning. A full fix requires both mongos and mongod to do the opposite of what they currently do. The four options are:

1) mongos targets shards, shards check version (what we have now)
2) mongos sends request to all shards, shards check version
3) mongos targets shards, shards ignore versioning
4) mongos sends request to all shards, shards ignore versioning (what we should be doing) |
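
The retry behavior described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the server's implementation: the `Shard` and `Mongos` classes, the integer versions, and the `multi_inc` method are all hypothetical stand-ins (for a shard, a router, the shard version, and a non-idempotent `$inc` multi-update), and requests are sent sequentially rather than in parallel.

```python
from dataclasses import dataclass, field


class StaleShardVersion(Exception):
    """Raised by a (simulated) shard when the request carries an outdated version."""


@dataclass
class Shard:
    # Hypothetical model of a shard: a name, a version, and {_id: counter} documents.
    name: str
    version: int
    docs: dict = field(default_factory=dict)

    def multi_inc(self, sent_version: int, amount: int) -> None:
        # The version is checked before any document is modified.
        if sent_version != self.version:
            raise StaleShardVersion(self.name)
        for _id in self.docs:
            self.docs[_id] += amount


class Mongos:
    # Hypothetical model of a router that caches each shard's version.
    def __init__(self, shards):
        self.shards = shards
        self.cached = {s.name: s.version for s in shards}

    def refresh(self) -> None:
        self.cached = {s.name: s.version for s in self.shards}

    def multi_inc(self, amount: int, max_attempts: int = 3) -> None:
        for _ in range(max_attempts):
            stale = False
            for shard in self.shards:
                try:
                    shard.multi_inc(self.cached[shard.name], amount)
                except StaleShardVersion:
                    stale = True
            if not stale:
                return
            # One shard was stale: refresh and re-send to ALL targeted shards,
            # including those that already applied the update.
            self.refresh()
        raise RuntimeError("too many stale retries")


shard_a = Shard("A", version=1, docs={1: 0})
shard_b = Shard("B", version=1, docs={2: 0})
mongos = Mongos([shard_a, shard_b])

# Shard B's version changes (e.g. after a migration); mongos is now stale for B only.
shard_b.version = 2

mongos.multi_inc(1)
print(shard_a.docs)  # the document on A was incremented twice
print(shard_b.docs)  # the document on B was incremented once
```

In this model, shard A accepts the first attempt while shard B rejects it, so after the refresh the second attempt increments A's document again. Replacing the increment with a delete would leave the end state unchanged, which matches the observation that multiRemove is unaffected.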
| Comments |
| Comment by Andy Schwerin [ 20/Feb/16 ] |
|
IIRC, we used to do no versioning for multi-update and multi-remove. |
| Comment by Scott Hernandez (Inactive) [ 20/Feb/16 ] |
|
When we have to re-target on a shard version mismatch/stale, can't we only issue the operation to shards which haven't already seen it in the new target set? I'm a little confused since I understood that we first established a shard version on all targeted shards, and then issued the operation. Has this changed recently? |