[SERVER-49161] Change stream event with empty fullDocument field for existing document Created: 29/Jun/20 Updated: 11/Oct/21 Resolved: 05/Oct/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Change streams, Sharding |
| Affects Version/s: | 4.0.19, 4.2.8, 5.0.0, 4.4.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Artem Navrotskiy | Assignee: | Rishab Joshi (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
I tested on host:
|
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Steps To Reproduce: | On Linux host:
|
||||||||||||||||
| Sprint: | Query 2020-11-30, Query 2020-12-14, Query 2020-12-28, Query 2021-01-11, Query 2021-01-25, Query Execution 2021-02-22, Query Execution 2021-03-08, Query Execution 2021-03-22, Query Execution 2021-04-05, Query Execution 2021-04-19, Sharding EMEA 2021-07-12, QE 2021-10-18 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
While investigating a flaky test in our codebase I found a strange issue: sometimes I receive a change stream update event with an empty fullDocument field for a document that exists. I reproduced this behaviour by looping over the same collection:
Environment details:
On my computer, a spike of errors occurs in the first 10-30 iterations. After that the probability of errors drops sharply. I wrote a simple Go program to reproduce this issue (main.go). The program uses go.mongodb.org/mongo-driver v1.3.4 for MongoDB access, but the issue was originally reproduced with a fork of github.com/globalsign/mgo. |
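For readers without the attachment, the condition being counted as an error can be sketched roughly as below. This is not the attached main.go. It assumes the event has already been decoded into a generic map (the driver's bson.M has the same underlying type), and missingFullDocument is a hypothetical helper name. Only the operationType and fullDocument event fields are used.

```go
package main

import "fmt"

// missingFullDocument reports whether a decoded change stream event is an
// update whose fullDocument field is absent or null.
// Hypothetical helper for illustration; not taken from the attached main.go.
func missingFullDocument(event map[string]interface{}) bool {
	if op, _ := event["operationType"].(string); op != "update" {
		return false
	}
	doc, present := event["fullDocument"]
	return !present || doc == nil
}

func main() {
	// An update event as described in this ticket: the document exists,
	// but the event carries no post-image.
	bad := map[string]interface{}{
		"operationType": "update",
		"fullDocument":  nil,
	}
	// An update event with the looked-up document attached.
	good := map[string]interface{}{
		"operationType": "update",
		"fullDocument":  map[string]interface{}{"_id": 1, "value": 2},
	}
	fmt.Println(missingFullDocument(bad), missingFullDocument(good))
}
```

Counting how often this check fires per loop iteration would give the kind of error rate described above.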
| Comments |
| Comment by Rishab Joshi (Inactive) [ 05/Oct/21 ] | |
|
I ran the scenarios multiple times with and without the change stream optimization feature flag. Every time, the issue happens with the feature flag disabled and does not happen with it enabled. It looks like the post-image stage reordering work fixed the issue. I can no longer see this issue happening, so I am marking it as a duplicate of the post-image work done as part of the change stream optimization project. | |
| Comment by Tommaso Tocci [ 15/Jul/21 ] | |
|
I've been able to reproduce the problem using this js test: change_streams_empty_document.js
From what I understand, this is what is happening:
In general, the information that we get through the ChangeStream could be misaligned with respect to the metadata that we have cached on the mongos. In fact we could have:
The problem described in this ticket is a symptom of the latter. Unfortunately, just comparing the received UUID with the cached one does not let us differentiate between the two situations, because we don't know which of the two UUIDs is newer. Recently we added a timestamp field to the collection entry that can be used to properly compare collection metadata. I'm not sure whether we can compare the oplog entry timestamp with the collection metadata timestamp; if not, another solution would be to add the collection timestamp along with the UUID to the ChangeStream events. |