[SERVER-60796] Allow "undefined" BSON type in metaField in the BucketUnpacker Created: 18/Oct/21 Updated: 06/Dec/22 Resolved: 02/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Gregory Noma | Assignee: | Backlog - Query Execution |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | neweng | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Execution
|
| Participants: |
| Description |
|
The BucketUnpacker currently has an assertion that the metaField for a bucket is not undefined. However, there is nothing stopping a user from inserting documents into a time-series collection that have undefined for the metaField, or performing an update to change the metaField to undefined. In either of these cases, the user would then be unable to query their time-series collection until these documents are removed. |
| Comments |
| Comment by Kyle Suarez [ 02/Nov/21 ] | |||||||||||||||
|
After discussion in the triage meeting, the consensus is that there isn't value in doing this work. rushan.chen and michael.gargiulo, let us know if you disagree – if there is work to be done, Query Execution feels like it should be banning undefined in more places, like on the insert and update codepaths. | |||||||||||||||
| Comment by David Storch [ 27/Oct/21 ] | |||||||||||||||
|
I still don't see why we would spend time on this. Undefined has been deprecated for ages. We want to get rid of it, not spend time improving it. | |||||||||||||||
| Comment by Gregory Noma [ 26/Oct/21 ] | |||||||||||||||
|
kyle.suarez in that case we'd also have to disallow undefined on the update path. Also, another strange behavior is that undefined is only disallowed for the top level of the metaField, but any level of nesting like [undefined] is allowed. | |||||||||||||||
| Comment by Kyle Suarez [ 26/Oct/21 ] | |||||||||||||||
|
Per discussion in the triage meeting, it sounds like there's an inconsistency in that a user can insert a BSON type undefined field but not get it back. Can we just ban inserting undefined in the meta field here? | |||||||||||||||
| Comment by Gregory Noma [ 21/Oct/21 ] | |||||||||||||||
|
I think the motivation would just be that the current behavior is pretty strange and confusing, especially since it only applies to the metaField and only at query time. If we decide that this is okay since we don't care about undefined, that's totally fine but I just want to make sure we're consciously making the decision that we want this behavior. | |||||||||||||||
| Comment by David Storch [ 20/Oct/21 ] | |||||||||||||||
|
Yes, the restriction is purely artificial, but given that this is a new collection type I think it's ok for us to impose this restriction. At the very least, why would we take any special steps to make it work? The sooner we can get to a world without undefined, the better.
We have a project on our backlog about this, but it's not something that we've prioritized or thought much about recently. So the short answer is "no". It would also be helpful if we could somehow figure out how much data is out there in real world MongoDB data that contains undefined. The undefined type has been deprecated for so long that it may be little used in practice, but we'd have to do some research to confirm this. | |||||||||||||||
| Comment by Gregory Noma [ 20/Oct/21 ] | |||||||||||||||
|
david.storch the confusing thing right now is just that we disallow undefined only in the BucketUnpacker, and only for the metaField. And as far as I know, this restriction is purely artificial; removing the assertion would simply work. We could potentially also extend this artificial restriction to the insert and update paths for time-series collections, but I think doing so may not be quite as straightforward.
In terms of getting rid of the undefined BSON type completely, do we have any ideas about how we would go about that? | |||||||||||||||
| Comment by David Storch [ 19/Oct/21 ] | |||||||||||||||
|
What's the value of supporting the undefined BSON type at all for timeseries collections? This type has long been deprecated, and we would like to get rid of it completely. In fact, I'd like to explicitly document if we don't already that the undefined BSON type is not supported in timeseries collections. |