Core Server / SERVER-1243

New operator to update all matching items in an array

    Details

      Description

      Issue Status as of Aug 19, 2015

      MongoDB appreciates the challenge of maintaining applications utilizing schemas with large arrays, especially with respect to updating many or all array elements, and we very much recognize the interest from the community around this ticket.

      Unfortunately, implementing such a major feature comes with serious requirements:

      • a specification for new language features (like update modifiers or expressions), since we cannot break existing uses
      • must include support for matching all array elements, as well as those matching a query
        • requests for support to update the last element(s) could be considered.
      • must support all existing update modifiers (correctly, and in a non-backwards-breaking way)
        • $rename, $set, $unset, $pull, $push, $bit...
      • must work efficiently with all arrays, including those having thousands of elements
      • cannot change current update semantics or behaviors which existing applications and deployments depend on (= non-backwards-breaking).
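The second requirement — updating only the array elements that match a query, not just all of them — can be sketched client-side. This is a minimal illustration in plain JavaScript with an in-memory document standing in for a live collection; `updateMatching` is a hypothetical helper, not a server or driver API:

```javascript
// In-memory document shaped like the example in this ticket.
const doc = {
  comments: [
    { by: "joe", votes: 3 },
    { by: "jane", votes: 7 },
  ],
};

// Apply `update` only to array elements matching `predicate` --
// the element-level analogue of "update all matching items".
function updateMatching(arr, predicate, update) {
  for (const elem of arr) {
    if (predicate(elem)) update(elem);
  }
}

// Set votes to 0 only where votes > 5.
updateMatching(doc.comments, (c) => c.votes > 5, (c) => { c.votes = 0; });
console.log(doc.comments); // joe keeps 3, jane's 7 becomes 0
```

The point of the requirement is that the server would need to perform this element-level match-and-modify itself, atomically, rather than forcing it into client code.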

      In summary, adding this feature is not trivial. It would require engineering resources that are currently committed to other projects; in short, it is a matter of prioritization against other parts of the server as a whole.

      Please see this comment below for additional details.

      Original description

      Given the following:

      > var obj = t.findOne()
      { "_id" : ObjectId("4b97e62bf1d8c7152c9ccb74"), "title" : "ABC",
        "comments" : [ { "by" : "joe", "votes" : 3 }, { "by" : "jane", "votes" : 7 } ] }
      

      One should be able to modify each item in the comments array by using an update command like the following:

      > t.update( obj, {$set:{'comments.$.votes':1}}, false, true )
      > t.find()
      { "_id" : ObjectId("4b97e62bf1d8c7152c9ccb74"), "title" : "ABC",
        "comments" : [ { "by" : "joe", "votes" : 1 }, { "by" : "jane", "votes" : 1 } ] }
      

        Issue Links

          Activity

          jacaetevha Jason Rogers added a comment -

          to Jeremy Martin's response. Honestly Asya Kamsky, at this point saying...

          We do recognize the interest the community is expressing via this ticket and we do take this input into consideration during project planning.

          ... seems like a blowoff. You may not have meant it that way, but after 5 years I'm not sure what else you expect the community to get from that. Perhaps the community does not have a loud enough voice in regards to this feature request.

          RainyCode Noah added a comment (edited)

          Thank you Asya Kamsky for your reply and insight into the challenges of this feature.

          jaan@hebsdigital.com Jaan Paljasma added a comment -

          To be fair and square - I am disappointed. The feature has been open for several years, and I agree with Jason Rogers that the community has done all it could to raise awareness.
          Having many documents with large arrays is the whole point of the NoSQL document model. For key-value (or key-object) storage there are faster and more scalable systems out there.

          jmar777 Jeremy Martin added a comment -

          There is one possible workaround, which I didn't see mentioned: read the full (matched) document, create a targeted update to the array elements to change, and then update the document only if no other changes have been made to the field (array) being updated.

          Unless I'm missing something, this seems tantamount to read-update-write with two-phase commit, right? Maybe it's not technically the same, but it places an equal burden on the client to ensure the correctness of the transaction. And, just like in two-phase commit, a real-world application of this technique would require retry logic if the array actually had been updated in the meantime, resulting in yet another query (on top of the initial one that already feels superfluous for an update operation). And this becomes exponentially more problematic when updating highly contentious resources.
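          The read-modify-write workaround described above amounts to optimistic concurrency: re-check the array's current value at write time and retry if a concurrent write intervened. A minimal sketch in plain JavaScript against a tiny in-memory collection — `findOne` and `compareAndSwap` are stand-ins for illustration, not a real driver API (the server-side equivalent of the swap check would be matching on the old array value in the update's query document):

```javascript
// Tiny in-memory stand-in for a collection, just enough to show the
// optimistic read-modify-write loop described in the comment above.
const store = new Map();
store.set(1, {
  _id: 1,
  comments: [{ by: "joe", votes: 3 }, { by: "jane", votes: 7 }],
});

// Deep-clone on read so the caller's copy can diverge from the store.
const findOne = (id) => JSON.parse(JSON.stringify(store.get(id)));

// "Update only if the array is unchanged since we read it."
function compareAndSwap(id, oldComments, newComments) {
  const current = store.get(id);
  if (JSON.stringify(current.comments) !== JSON.stringify(oldComments)) {
    return false; // someone else wrote in the meantime
  }
  current.comments = newComments;
  return true;
}

// Client-side retry loop: read, modify every element, attempt the swap.
function setAllVotes(id, value, maxRetries = 5) {
  for (let i = 0; i < maxRetries; i++) {
    const doc = findOne(id);
    const updated = doc.comments.map((c) => ({ ...c, votes: value }));
    if (compareAndSwap(id, doc.comments, updated)) return true;
  }
  return false; // gave up: highly contended document
}

setAllVotes(1, 1);
console.log(store.get(1).comments); // both elements now have votes: 1
```

          The retry loop is exactly the burden being objected to: every caller must carry this logic, and under contention the loop can spin or give up — which a single atomic server-side operator would avoid.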

          I would like to caution folks that if you have many documents with huge arrays, it's likely not an efficient schema design if you need to be querying and updating multiple array elements (especially if the array is indexed).

          I of course would have to agree with this, but this is ultimately a problem class that is independent of dataset size.

          Alright, obviously I'm frustrated. After 5 years and a hundred comments, I guess this just is what it is. And I honestly don't want to belabor a bunch of points about transactions and atomicity that I know you (Asya) understand at a much deeper level than me. It just seems like Mongo in general can get fuzzy on this topic. I.e., there's a lot of advice to put things in separate documents if you need certain update semantics, and to put them into the same document if you need certain transaction semantics, and then things get kind of hand wavy if you need both.

          I get that no database can be "all things to all users", but this one is perplexing. We're not talking about actually changing transactional or atomicity semantics at all. We simply need a way to utilize those existing semantics via an extension of the query language. I'm not claiming it's trivial, but it doesn't seem intractable, either.

          salzamt riccardo salzer added a comment -

          In my opinion this missing feature breaks the whole concept of MongoDB. It's meant to nest all the information I need in the place where I need it, mostly as an array of subdocuments. What if this is, for example, user information inside comments? The user changes his image, and one has to update all of those subdocuments inside all of the comments. MongoDB even recommends this method at their conferences. But if you actually do this, how is one supposed to update all that information afterwards if something changes?
          As much as I love this database in general, it is embarrassing that this very fundamental, must-have feature (especially because this approach is recommended by Mongo's own design guidance) is still not available after 5 years! It doesn't matter how you implement it; it will still be faster than doing it in application code.

