MONGOID-4889

Assigning many embedded docs performs a very large number of comparisons

      I've encountered a severe performance problem when assigning many embedded documents. Here's a small example benchmarking the assignment of 1,000 to 5,000 embedded documents:

      require 'mongoid'
      
      class Foo
        include Mongoid::Document
        embeds_many :bars
      end
      
      class Bar
        include Mongoid::Document
        embedded_in :foo
      end
      
      require 'benchmark'
      array_1k = Array.new(1000) { Bar.new }
      array_2k = Array.new(2000) { Bar.new }
      array_3k = Array.new(3000) { Bar.new }
      array_4k = Array.new(4000) { Bar.new }
      array_5k = Array.new(5000) { Bar.new }
      
      Benchmark.bm do |x|
        x.report('1k') { Foo.new.bars = array_1k }
        x.report('2k') { Foo.new.bars = array_2k }
        x.report('3k') { Foo.new.bars = array_3k }
        x.report('4k') { Foo.new.bars = array_4k }
        x.report('5k') { Foo.new.bars = array_5k }
      end
      

      Results:

              user     system      total        real
      1k   0.854672   0.004126   0.858798 (  0.862249)
      2k   2.736819   0.006788   2.743607 (  2.750780)
      3k   5.965530   0.005802   5.971332 (  5.978580)
      4k  11.478882   0.023406  11.502288 ( 11.528631)
      5k  19.855050   0.048544  19.903594 ( 19.958878)

      As you can see, assigning 5,000 embedded documents takes almost 20 seconds!

      Note that this all happens within Ruby; the code doesn't hit the database once.

      All of this is caused by object_already_related?:

      def object_already_related?(document)
        _target.any? { |existing| existing.id && existing === document }
      end
      

      The above method is called once for every document being assigned. Since _target grows by one after each addition, each call scans a longer array, so assigning n documents takes n(n-1)/2 comparisons in total:

      docs    comparisons
      1,000       499,500
      2,000     1,999,000
      3,000     4,498,500
      4,000     7,998,000
      5,000    12,497,500
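
      These counts can be reproduced with a plain-Ruby model of the scan. This is only a sketch, not Mongoid's code: count_comparisons is a hypothetical helper, integers stand in for documents, and it assumes no new document ever matches an existing one, so any? always walks the whole array.

```ruby
# Plain-Ruby model of the linear scan; no Mongoid required.
# `count_comparisons` is a hypothetical helper, not part of Mongoid.
def count_comparisons(n)
  target = []
  comparisons = 0
  n.times do |doc|
    # Mirrors `_target.any? { ... }`: nothing matches, so every
    # existing element is visited before the new one is appended.
    target.any? do |existing|
      comparisons += 1
      existing == doc
    end
    target << doc
  end
  comparisons
end

[1000, 2000, 3000, 4000, 5000].each do |n|
  # The measured count should equal the closed form n(n-1)/2.
  puts format('%5d docs: %10d comparisons (n(n-1)/2 = %d)',
              n, count_comparisons(n), n * (n - 1) / 2)
end
```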

      I'm actually dealing with that many (and more) embedded documents. It would be great if you could fix this.

      In the meantime, I'd be deeply grateful if you could provide a work-around.

            Assignee:
            Unassigned
            Reporter:
            Stefan Schüßler (mail@stefanschuessler.de)