[SERVER-44484] Changestream with updateLookup uasserts on updates from before collection was sharded Created: 07/Nov/19 Updated: 29/Oct/23 Resolved: 25/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.0.0, 4.2.0 |
| Fix Version/s: | 4.3.3, 4.0.28, 4.2.19 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bernard Gorman | Assignee: | Bernard Gorman |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | qexec-team | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.2, v4.0 | ||||||||
| Sprint: | Query 2019-12-02, Query 2019-12-30, Query 2020-01-13, Query 2020-01-27, Query 2020-02-10 | ||||||||
| Participants: | |||||||||
| Description |
|
An update operation writes the modified document's documentKey to the o2 field of its oplog entry; $changeStream uses this to look up the document in the cluster when the updateLookup option is specified. An update on a sharded collection writes the shard key plus _id to the o2 field; an update on an unsharded collection writes only the _id. As a result, if an unsharded collection is subsequently sharded on a key other than _id, the updateLookup for every pre-sharding update event attempts to target the lookup by _id alone, cannot target a single shard, and therefore always fails with an exception. This failure can occur in one of two ways:
2. An invariant failure on a mongos process which would look something like this in the logs:
|
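The mismatch described above can be illustrated with a small, self-contained Python sketch. It is not server code: `document_key` and `can_target_single_shard` are hypothetical helpers, and the `region` shard key is an assumed example. The sketch only models the rule stated in the description, namely that the o2 field holds the shard key plus _id on a sharded collection and only the _id on an unsharded one.

```python
from typing import Any, Dict, List


def document_key(doc: Dict[str, Any], shard_key_fields: List[str]) -> Dict[str, Any]:
    """Model the documentKey written to o2: shard key fields (if any) plus _id.

    Hypothetical helper; an unsharded collection is modelled as an empty
    shard_key_fields list.
    """
    key = {field: doc[field] for field in shard_key_fields}
    key["_id"] = doc["_id"]
    return key


def can_target_single_shard(doc_key: Dict[str, Any], current_shard_key: List[str]) -> bool:
    """A lookup can be routed to one shard only if every shard key field is present."""
    return all(field in doc_key for field in current_shard_key)


doc = {"_id": 1, "region": "eu", "qty": 5}

# Update recorded while the collection was still unsharded: o2 holds _id only.
pre_sharding_key = document_key(doc, [])

# Update recorded after sharding on the assumed key {region: 1}.
post_sharding_key = document_key(doc, ["region"])

current_shard_key = ["region"]
pre_targetable = can_target_single_shard(pre_sharding_key, current_shard_key)
post_targetable = can_target_single_shard(post_sharding_key, current_shard_key)
```

In this model the pre-sharding key lacks `region`, so the lookup cannot be routed to a single shard, which is the condition that triggers the failure reported here.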
| Comments |
| Comment by Githook User [ 23/Dec/21 ] |
|
Author: {'name': 'Rishab Joshi', 'email': 'rishab.joshi@mongodb.com', 'username': 'rishvin'}Message: |
| Comment by Githook User [ 13/Dec/21 ] |
|
Author: {'name': 'Rishab Joshi', 'email': 'rishab.joshi@mongodb.com', 'username': 'rishvin'}Message: |
| Comment by Githook User [ 03/Dec/21 ] |
|
Author: {'name': 'Rishab Joshi', 'email': 'rishab.joshi@mongodb.com', 'username': 'rishvin'}Message: |
| Comment by Githook User [ 25/Jan/20 ] |
|
Author: {'username': 'gormanb', 'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com'}Message: create mode 100644 jstests/sharding/change_streams_unsharded_update_resume.js |
| Comment by Bernard Gorman [ 07/Dec/19 ] |
|
Met with asya and charlie.swanson today to discuss this. We decided that the appropriate solution is simply to perform the updateLookup by _id even though we cannot target it to a single shard, in the expectation that only a single valid document will be returned. Because the absence of a shard key in the documentKey indicates that the document was inserted while the collection was unsharded, the lookup can return more than one result only if the user has actively inserted another document with the same _id since the collection became sharded. We already have code that will uassert if the updateLookup returns more than one document, which is the correct course of action in this case. If this does happen, the user can either remove the offending document and re-insert it with a different _id, or resume the stream without updateLookup in order to bypass this entry in the oplog. Moving this back to Needs Scheduling for re-triage. |
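The approach agreed in this comment can be sketched in plain Python as follows. This is an illustrative model under stated assumptions, not the server implementation: shards are represented as in-memory lists, and `update_lookup_by_id` and `TooManyMatchingDocuments` are hypothetical names standing in for the untargeted lookup and the existing uassert.

```python
from typing import Any, Dict, List, Optional


class TooManyMatchingDocuments(Exception):
    """Stands in for the uassert raised when the lookup finds more than one document."""


def update_lookup_by_id(
    shards: Dict[str, List[Dict[str, Any]]], doc_id: Any
) -> Optional[Dict[str, Any]]:
    """Look up a document by _id on every shard, since no shard key is available.

    Returns the single matching document, or None if it no longer exists.
    Raises TooManyMatchingDocuments if the same _id exists more than once.
    """
    matches = [
        doc
        for shard_docs in shards.values()
        for doc in shard_docs
        if doc["_id"] == doc_id
    ]
    if len(matches) > 1:
        raise TooManyMatchingDocuments(
            f"found {len(matches)} documents with _id {doc_id!r}"
        )
    return matches[0] if matches else None


# Assumed example data: a collection sharded on "region" across two shards.
shards = {
    "shard0": [{"_id": 1, "region": "eu"}, {"_id": 2, "region": "eu"}],
    "shard1": [{"_id": 3, "region": "us"}],
}

found = update_lookup_by_id(shards, 3)      # exactly one match across all shards
missing = update_lookup_by_id(shards, 99)   # no match: the document was deleted

# A second document with the same _id, inserted after sharding, makes the lookup ambiguous.
shards["shard1"].append({"_id": 1, "region": "us"})
try:
    update_lookup_by_id(shards, 1)
    ambiguous = False
except TooManyMatchingDocuments:
    ambiguous = True
```

The ambiguous case corresponds to the situation described in the comment, where the user would need to remove the duplicate document or resume the stream without updateLookup.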
| Comment by Craig Homa [ 12/Nov/19 ] |
|
Moving this to 'investigating' as the next step here is to determine how this can be fixed. |