Uploaded image for project: 'Mongoid'
  1. Mongoid
  2. MONGOID-2951

Attributes of container types are not persisted when mutated

      If a field is a container object (Set or Array, for example), changes to this object are not recognized by Mongoid and are not persisted.

      The value of the attribute must be reset for Mongoid to persist the new data.

      This issue was originally filed for Set fields, https://jira.mongodb.org/browse/MONGOID-3743 is another instance of this issue for Array fields.

      Not sure what's going on here but I cannot add to sets embedded in a mongoid document.

      class Foo
        include Mongoid::Document
        field :test, type: Set, default: -> {Set.new}

      same result with default: Set.new

      >> f = Foo.new(test: Set.new)
      => #<Foo _id: 5162efe3463a990219000001, test: []>
      >> f.test.add 1
      => #<Set: {1}>
      >> f.test
      => #<Set: {}>
      >> f.test.object_id
      => 70154345163900
      >> f.test.object_id
      => 70154345825180

      Here is an example with a string field:

      > foo = Foo.first
      > foo.bar
      "My random value"
      > foo.bar.gsub!(/random/, '')
      "My  value"> foo.changed?
      > foo.save # does not changes value of bar to "My  value"
      > foo.bar
      "My  value"
      > foo = Foo.first
      > foo.bar
      "My  value" 

      We identified the following separate issues falling under the umbrella of modifying container attributes:

      0. Mongoid provides two ways of accessing attributes: the #attributes method (returning the hash of attributes before demongoization) and #read_attribute / per-attribute reader method (#field_name) (returning attribute values after demongoization).
      1. The demongoized value is constructed on every call to attribute reader / #read_attribute, and is not retained by Mongoid. As a result, any mutation of the demongoized value will not be persisted by Mongoid. One use case is when One use case for persisting mutated demongoized values is when custom field types are used, e.g. the CompressedString example given below in https://gist.github.com/jarthod/6c7ddbea1c47b9ca5f159b6c28f81374.
      2. Assuming the previous issue is resolved, for example by storing in a model the demongoized value, we still have the issue of dirty tracking. Meaning, when the container value is mutated outside of Mongoid, Mongoid needs to know that this happened sometime before the next persistence operation (e.g. #save).
      3. The dirty tracking ideally needs to have the demongoized and raw (the ones in #attributes hash) values in sync at all times. In other words, it is insufficient for Mongoid to know or be told that a container value changed when #save is being executed, ideally Mongoid updates its #attributes hash the moment the container value is mutated in the application (outside of Mongoid).
      4. When accessing resizable field types (Array, Hash) using the getters, the fields are marked as changed and added to the changed_attributes hash. On save, the value for that field at the time it was accessed is compared to the current value of that field to see if it has changed, and if it has it is added to the changed hash. The changed hash contains the fields that are updated during the save. When accessing the resizable fields using the subscript operator [], these fields are not marked as changed, so their updates are not persisted to the database.
      5. A container value may be given to Mongoid model and subsequently the original value may be modified. When a model is instantiated, the original value is mongoized, and subsequent attribute access returns the demongoization of the mongoized value which is a different object from the original value. As a result, mutations of original value are not reflected in the Mongoid model:

      irb(main):053:0> class Foo; field :a, type: Array; end
       @options={:type=>Array, :klass=>Foo}>
      irb(main):054:0> a=Foo.new(a: [1])
      => #<Foo _id: 628e9e3ea15d5d5519d566c0, f: nil, a: [1]>
       irb(main):063:0> a=[42]
      => [42]
       irb(main):065:0> Foo.new(a:a)
      => #<Foo _id: 628e9e9fa15d5d5519d566c1, f: nil, a: [42]>
       irb(main):067:0> a.unshift(3)
      => [3, 42]
      # b.a did not get modified
       irb(main):068:0> b
      => #<Foo _id: 628e9ea4a15d5d5519d566c2, f: nil, a: [42]>

      6. Mongoid implements two-stage dirty tracking: a container attribute is marked changed upon access (using read method only right now, potentially also after being accessed with #read_attribute assuming solution C is applied) and during persistence operation (#save), the value is deep-compared against the original and isn't written out if the container did not actually change. However, after first persistence operation the changed flag is cleared, and subsequent persistence operations will not write the new value if the value changed between first and second persistence operations.

      irb(main):066:0> b=Foo.new(a:a)
      => #<Foo _id: 628e9ea4a15d5d5519d566c2, f: nil, a: [42]>
      irb(main):067:0> a.unshift(3)
      => [3, 42]
      irb(main):068:0> b
      => #<Foo _id: 628e9ea4a15d5d5519d566c2, f: nil, a: [42]>
      irb(main):069:0> ^C
      irb(main):069:0> a=Foo.new(a: [1])
      => #<Foo _id: 628ea3fba15d5d5519d566c3, f: nil, a: [1]>
      irb(main):070:0> b=a.a
      => [1]
      irb(main):071:0> b.unshift(2)
      => [2, 1]
      irb(main):072:0> a
      => #<Foo _id: 628ea3fba15d5d5519d566c3, f: nil, a: [2, 1]>
      irb(main):073:0> a.save!
      => true
      irb(main):074:0> b.unshift(4)
      => [4, 2, 1]
      irb(main):075:0> a
      => #<Foo _id: 628ea3fba15d5d5519d566c3, f: nil, a: [4, 2, 1]>
      irb(main):076:0> a.save!
      => true
      irb(main):077:0> a.reload
      => #<Foo _id: 628ea3fba15d5d5519d566c3, f: nil, a: [2, 1]>

      7. Returning mutable objects from attribute methods may be surprising, just like returning duplicated objects is currently surprising to a different subset of users. Strings are for example mutable, if they get the same behavior as previous issues are discussing for container types, then:

      band = Band.new(name: 'mushroom')
      band.name.gsub('room', '')

      ... would mutate the name of the band in the model, which could be unexpected.

      Possible solutions:

      A. Store demongoized value in Mongoid when it's returned. Solves problem #1 and #5.
      B. Implement subclasses of Array, Hash, Set which override all mutating methods to write the new value back into #attributes hash. Example with CompressedString:

      require 'zlib'
      class CompressedString < String
        def mongoize
        def initialize(model, field_name)
          @model = model
          @field_name = field_name
        class << self
          # Get the object as it was stored in the database, and instantiate
          # this custom class from it.
          def demongoize(object)
            case object
            when BSON::Binary then CompressedString.new(Zlib::Inflate.inflate(object.data))
            when String then CompressedString.new(object)
            else object
          # Takes any possible object and converts it to how it would be
          # stored in the database.
          def mongoize(object)
            case object
            when CompressedString then object.mongoize
            when String then CompressedString.new(object).mongoize
            else object
          # Converts the object that was supplied to a criteria and converts it
          # into a database friendly form.
          alias_method :evolve, :mongoize
        def gsub(*args)
          super.tap do
            @model.send("#{@field_name}=", self)
        mutator :gsub

      This requires the value object to be constructed with a reference to original model and the field name so that the value object can write back to the original model. Solves problems #2, #3, #4, #6, doesn't solve problem #1. Imposes additional requirements on custom field implementation. Mongoid has to define subclasses for all container types it supports natively (Hash, Array, Set). This solution would also permit getting rid of deep copy & deep comparison done in reader methods for container fields, thus improving performance (opposite of what solution C proposes performance-wise).

      C. Mark attribute changed when #read_attribute is called (the attributes are already marked changed when reader methods are called). Solves problem #4. Adds a minor performance penalty due to deep copy on access & deep comparison on save, however both of these already happen when the same attribute is accessed using the reader method.

            Unassigned Unassigned
            tal tal
            1 Vote for this issue
            4 Start watching this issue