[SERVER-79446] `insert` ignores `collectionUUID` for time-series collections Created: 28/Jul/23 Updated: 08/Nov/23 Resolved: 28/Sep/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.2.0-rc0, 7.0.3, 6.0.12 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Felipe Gasper | Assignee: | Gregory Noma |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Storage Execution NAMER
|
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Backport Requested: |
v7.0, v6.0
|
| Sprint: | Execution NAMR Team 2023-10-02 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
Demonstration:
This risks data corruption in mongosync, which relies on `collectionUUID` to indicate when a collection has been renamed or dropped. Previously I thought this was just a matter of the server deprioritizing CollectionUUIDMismatch errors (see JIRA history), but it’s worse.
Expected behavior: I would expect CollectionUUIDMismatch to be returned as with non-time-series collections. |
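As a hedged illustration of why a client depends on this error, the sketch below classifies an error document by its `codeName`. It assumes that a CollectionUUIDMismatch error document carries an `actualCollection` field naming the collection that currently holds the UUID, and that the field is null when no collection in that database has it. The function name `interpret_uuid_mismatch` and the collection name `weather_tmp` are hypothetical.

```python
def interpret_uuid_mismatch(error):
    """Classify a server error document (a plain dict).

    Returns ("renamed", new_name) when the UUID now belongs to another
    collection, ("missing", None) when no collection in the database has
    that UUID, and None when the error is not a CollectionUUIDMismatch.
    """
    if error.get("codeName") != "CollectionUUIDMismatch":
        return None
    # Assumption: actualCollection is null/absent when the UUID is gone.
    actual = error.get("actualCollection")
    if actual is None:
        return ("missing", None)
    return ("renamed", actual)


# Hypothetical error documents, for illustration only.
renamed = {
    "codeName": "CollectionUUIDMismatch",
    "expectedCollection": "weather",
    "actualCollection": "weather_tmp",
}
bad_value = {"code": 2, "codeName": "BadValue"}

print(interpret_uuid_mismatch(renamed))
print(interpret_uuid_mismatch(bad_value))
```

With the behavior reported here, an insert into a time-series collection falls into the second case, so the client receives no signal that the collection was renamed or dropped.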
| Comments |
| Comment by Githook User [ 05/Oct/23 ] | ||||||||||||||||||||||||||||
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: (cherry picked from commit 05c59ba19ef3eddb62cc856129ceb8f8a627d29d) | ||||||||||||||||||||||||||||
| Comment by Githook User [ 05/Oct/23 ] | ||||||||||||||||||||||||||||
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: (cherry picked from commit 05c59ba19ef3eddb62cc856129ceb8f8a627d29d) | ||||||||||||||||||||||||||||
| Comment by Githook User [ 28/Sep/23 ] | ||||||||||||||||||||||||||||
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: | ||||||||||||||||||||||||||||
| Comment by Felipe Gasper [ 26/Sep/23 ] | ||||||||||||||||||||||||||||
| Comment by Felipe Gasper [ 22/Sep/23 ] | ||||||||||||||||||||||||||||
|
I’m changing this to a bug because I just found it in another context. In this test run, mongosync’s DDL applier lagged behind a CRUD applier. Thus, this series of events on the source:
… became this on the destination:
… which we normally expect to yield a CollectionUUIDMismatch. Mongosync can then repeat the insert to the temporary collection name, and all is well. Instead, though, we’re getting:
Code 2, “BadValue”, also happens to be one that mongosync doesn’t normally handle. We could special-case it, but we’d be relying on the fact that non-time-series documents lack one or more required time-series fields, which needn’t be the case. Thus, there’s a chance of data corruption.
To reproduce this, just create a time-series collection and try to insert into it with a document {foo:1} and a collectionUUID like UUID("5ef55d9f-a049-48a0-88b1-f776a983614a"). You’ll get an error like the above. Then try to insert
CollectionUUIDMismatch really seems like it should supersede document-specific errors. Alternatively, in the specific case of time-series, it might be reasonable to create a distinct, time-series-specific error that says, “collectionUUID is invalid for time-series collections”. | ||||||||||||||||||||||||||||
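The reproduction command and the retry that mongosync expects can be sketched as follows. This is only an illustration under stated assumptions: it builds the `insert` command document without sending it (sending would need a driver and a running server, and how the UUID is encoded depends on the driver's settings), and it assumes the mismatch error document exposes an `actualCollection` field. The names `build_insert`, `retarget_after_mismatch`, and `weather_tmp` are hypothetical.

```python
import uuid


def build_insert(collection, documents, collection_uuid):
    """Build an `insert` command document pinned to a collection UUID.

    The command name must be the first key; Python dicts keep insertion order.
    """
    return {
        "insert": collection,
        "documents": list(documents),
        "collectionUUID": collection_uuid,
    }


def retarget_after_mismatch(command, error):
    """Return a copy of `command` aimed at the collection that now holds
    the UUID, or None when the error gives no such target."""
    if error.get("codeName") != "CollectionUUIDMismatch":
        return None  # e.g. BadValue: nothing tells us where to retry
    actual = error.get("actualCollection")
    if not actual:
        return None  # assumption: null means no collection has the UUID
    retry = dict(command)      # shallow copy keeps "insert" as the first key
    retry["insert"] = actual
    return retry


cmd = build_insert(
    "weather",
    [{"foo": 1}],
    uuid.UUID("5ef55d9f-a049-48a0-88b1-f776a983614a"),
)

# Hypothetical responses: the expected mismatch, and the BadValue seen instead.
mismatch = {"codeName": "CollectionUUIDMismatch", "actualCollection": "weather_tmp"}
bad_value = {"code": 2, "codeName": "BadValue"}

print(retarget_after_mismatch(cmd, mismatch))
print(retarget_after_mismatch(cmd, bad_value))
```

The second call returns nothing to retry against, which is the gap described above: a document-level error hides the information the client needs to redirect the write.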
| Comment by Felipe Gasper [ 15/Aug/23 ] | ||||||||||||||||||||||||||||
|
connie.chen@mongodb.com It doesn’t appear to, but it leaves us in a tight spot because we have to assume that OperationNotSupportedInTransaction only happens to mongosync in the context of a time-series collection. That happens to be true right now, but it could well change in the future. | ||||||||||||||||||||||||||||
| Comment by Felipe Gasper [ 28/Jul/23 ] | ||||||||||||||||||||||||||||
|
Example disparate responses (“weather” is a time-series collection):
|