[SERVER-25717] Allow for negative $push indexes. Created: 20/Aug/16 Updated: 06/Apr/23 Resolved: 28/Mar/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Write Ops |
| Affects Version/s: | 3.3.11 |
| Fix Version/s: | 3.5.6 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Dissatisfied Former User | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | pull-request | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Sprint: | Query 2017-03-27, Query 2017-04-17 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
Requiring a query to determine the length of a document's array and calculate the appropriate positive integer index introduces a race condition. By allowing negative indexes (indicating distance from the end) on the $position modifier, this race condition could be avoided. This would match Python semantics for list manipulation. An example, given the following document:
And the following update operation:
The resulting document should be:
The default insert location, the end (when $position is omitted or exceeds the length of the array), is unaffected. Thank you for your consideration! |
| Comments |
| Comment by Githook User [ 29/Mar/17 ] | ||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: | ||||
| Comment by Dissatisfied Former User [ 29/Mar/17 ] | ||||
|
After some student discussion ("i just looked at the PR for mongoDB regarding negative pushes... all of my wat") and team meeting, additional humour has been identified: a race condition persists, but is now more grievously subtle. In the unlikely event that you know the negative index of the insertion point you wish to utilize, but this insertion point happens to be near the opposite (left) edge, say, actually inserting at positive position 0, and that first element is removed prior to the update pushing a new value, then the value will be dropped at the end of the array instead of being naturally inserted (correctly!) at the start of the array. Again, in the context of security data such as embedded access control lists this can have catastrophic consequences, so pro tip to anyone wishing to attempt this approach: don't. Just thought you should know that this "fix" adds more problems, using a more concrete example. (Edited to add: of course, the ordering of any two updates to an array this way becomes dependent on atomic ordering of those operations, so I'm not quibbling over the hypothetical problem of the distinct ordering of any two elements in the above, just the literal far-edge case which is most likely to occur and violate intuitive expectations.) | ||||
| Comment by Dissatisfied Former User [ 28/Mar/17 ] | ||||
|
> If the $position is negative in $push, the elements are added at position (length of array + $position). If (length of array + $position) < 0, the elements are added at the end of the array. This does not match Python list manipulation semantics and my absolutely fantastic level of unimpressed continues to grow. I've kept my most vocal complaints to only prompted private e-mail exchanges s far, but six months only to get it wrong by adding a special case that need not be present is one hell of a way to improve the situation.
All negative indexes were formerly an immediate rejection of the query. The special cased behaviour of "over-large negative indexes" being treated the same as omission and over-large positive indexes is inexplicable. As stated in the e-mail exchange, I no longer care if you fix this or not, but the cavalcade of horror must stop at some point. My 68 year old mother who has zero computer experience intuitively grasps the Python semantics, where the sign determines the side to start counting from, and over-large values represent the opposite side, but if this is your solution, so be it. This is hilarious as it is sad, and I'll continue to watch the other issues I've subscribed to (some of which have been open for multiple years) with un-baited breath. (Edited for formatting.) | ||||
| Comment by Githook User [ 28/Mar/17 ] | ||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: | ||||
| Comment by Asya Kamsky [ 28/Dec/16 ] | ||||
|
amcgregor, my apologies for our terrible communication on this issue. We are we are currently in the design phase for 3.6 and are tracking our planned improvements to update capabilities for MongoDB in The update enhancements we are designing are tracked by Again, my deepest apologies to amcgregor for our poor coordination between Jira and github communication. Regards, | ||||
| Comment by Dissatisfied Former User [ 20/Dec/16 ] | ||||
|
The pull request on GitHub has been closed by someone who has not been visibly involved on this ticket. I am highly dissatisfied with the explanation given, which appears as a boilerplate dismissal. I could have used that dismissal in August instead of pointlessly waiting for 3.4's release only to have the functional fix I produced thrown out in favour of Either there's some lack of communication going on or some behind-the-scenes policy has changed, near as I can figure. The result is the same: effort expended, that effort dismissed, the concern leading to that effort not addressed (the race condition remains), and expectations (that whoever ends up being involved would actually read the ticket after 3.4 was released, and something positive might come from this all) violated. To borrow the Babbage quote I used on the PR again: I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a resolution. | ||||
| Comment by Charlie Swanson [ 16/Sep/16 ] | ||||
|
Hi amcgregor, thanks for submitting the pull request! We are currently in our stabilization period for the upcoming 3.4 release. We are focused solely on quality and performance changes until branching for the release, and therefore unfortunately cannot accept code for new features. In the meantime, we will put the proposed feature up for review according to our usual design review process. You will hear back from us after 3.4 is out. | ||||
| Comment by Ramon Fernandez Marina [ 03/Sep/16 ] | ||||
|
Thanks amcgregor, I've sent this to the Query team for review. Please be aware that we have recently entered our stabilization period in preparation for the release of MongoDB 3.4. We're doing our best to freeze new feature development in order to focus on quality and performance, and to ensure that downstream work (such as that required for the documentation and drivers) can be completed, so we may need to punt on this ticket for a few weeks. We'll keep you posted. Regards, | ||||
| Comment by Dissatisfied Former User [ 01/Sep/16 ] | ||||
|
Done and done. | ||||
| Comment by Kelsey Schubert [ 01/Sep/16 ] | ||||
|
Hi amcgregor, Thank you for submitting a pull request. Before we can take a look, would you please sign the contributor agreement? Kind regards, | ||||
| Comment by Dissatisfied Former User [ 01/Sep/16 ] | ||||
|
Implemented in this pull request: https://github.com/mongodb/mongo/pull/1112 Tested manually by running the following against the freshly built development server. It's super effective.
| ||||
| Comment by Dissatisfied Former User [ 01/Sep/16 ] | ||||
|
After some back-of-the-napkin calculations, the single largest possible array in one 16MB BSON document hovers right around 2 million NULLs. 2 million is way, way smaller than the possible maximum offered by a size_t, making compatibility with people inappropriately using extremely large values the only real concern for swapping that out to a more reasonable value, such as a signed 32-bit integer. The current signed-ness test would need to be replaced with an overflow test—or just min/max it to the range of the type used, since "extremely large negative values" (if that makes sense) would behave similarly to the positive approach by simply inserting at the beginning of the array. | ||||
| Comment by Dissatisfied Former User [ 01/Sep/16 ] | ||||
|
Having a chance to examine the code a bit today, the optimistic use of a size_t for _startPosition in modifier_push.h will somewhat hamper efforts to implement this ticket. There is no signed value that can hold all possible size_t values, meaning the change to a signed value here may be backwards-incompatible with any use of $push containing "very large numbers" to indicate the end. Such use is sub-optimal (just omit $position if you want the end) but allowed (possibly encouraged) by the documentation. The backing C++ code uses a generic BSONElement reference when pulling apart the update operation, so there's no particular issue with signed-ness there. My initial thought was that the check for negative indices could be replaced by a length lookup and calculation, however the check occurs early (during parsing, not execution) here, just prior to casting to a size_t: https://github.com/mongodb/mongo/blob/master/src/mongo/db/ops/modifier_push.cpp#L337-L340 — this would prevent the "easy approach" (maintaining the use of size_t internally) from working as the target record is not yet available for inspection. The point of general update would be to add a recalculation of _startPosition just after the calculation of _preparedState->arrayPreModSize here: https://github.com/mongodb/mongo/blob/master/src/mongo/db/ops/modifier_push.cpp#L549 The size_t thing, though, is a bit of a thorn. |