- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Associations

I've encountered a severe performance problem when assigning many embedded documents. Here's a small example benchmarking the assignment of 1,000 to 5,000 embedded documents:
```ruby
require 'mongoid'

class Foo
  include Mongoid::Document
  embeds_many :bars
end

class Bar
  include Mongoid::Document
  embedded_in :foo
end

require 'benchmark'

array_1k = Array.new(1000) { Bar.new }
array_2k = Array.new(2000) { Bar.new }
array_3k = Array.new(3000) { Bar.new }
array_4k = Array.new(4000) { Bar.new }
array_5k = Array.new(5000) { Bar.new }

Benchmark.bm do |x|
  x.report('1k') { Foo.new.bars = array_1k }
  x.report('2k') { Foo.new.bars = array_2k }
  x.report('3k') { Foo.new.bars = array_3k }
  x.report('4k') { Foo.new.bars = array_4k }
  x.report('5k') { Foo.new.bars = array_5k }
end
```
Results:
docs | user | system | total | real |
---|---|---|---|---|
1k | 0.854672 | 0.004126 | 0.858798 | (0.862249) |
2k | 2.736819 | 0.006788 | 2.743607 | (2.750780) |
3k | 5.965530 | 0.005802 | 5.971332 | (5.978580) |
4k | 11.478882 | 0.023406 | 11.502288 | (11.528631) |
5k | 19.855050 | 0.048544 | 19.903594 | (19.958878) |
As you can see, assigning 5,000 embedded documents takes almost 20 seconds!
Note that all of this happens within Ruby; the code never hits the database.
All of this is caused by object_already_related?:
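```ruby
def object_already_related?(document)
  _target.any? { |existing| existing.id && existing === document }
end
```
This method is called once for every document being assigned. Because _target grows by one after each addition, the call for the k-th document scans the k - 1 documents already present, so assigning n documents costs n(n - 1)/2 comparisons in total. The number of comparisons therefore grows quadratically: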
docs | comparisons |
---|---|
1,000 | 499,500 |
2,000 | 1,999,000 |
3,000 | 4,498,500 |
4,000 | 7,998,000 |
5,000 | 12,497,500 |
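
To double-check the figures in the table, here is a small sketch that reproduces the scan pattern with plain Ruby objects (no Mongoid involved). The helper name `count_comparisons` is made up for this illustration, and the block assumes every check scans the whole target because a freshly created document never matches an existing one.

```ruby
# Simulates the linear scan performed by object_already_related? using plain
# Ruby objects and counts how many comparisons are made in total.
# `count_comparisons` is a hypothetical helper, not part of Mongoid.
def count_comparisons(n)
  target = []
  comparisons = 0
  n.times do
    doc = Object.new
    # Mirrors `_target.any? { ... }`: a new object never matches,
    # so every existing element is visited.
    target.any? do |existing|
      comparisons += 1
      existing.equal?(doc)
    end
    target << doc
  end
  comparisons
end

[1000, 2000].each do |n|
  expected = n * (n - 1) / 2
  puts "#{n} docs: #{count_comparisons(n)} comparisons (n(n-1)/2 = #{expected})"
end
```

The counted values agree with the closed form n(n - 1)/2 used for the table.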
I'm actually dealing with that many (and more) embedded documents. It would be great if you could fix this.
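
One possible direction for a fix, as a rough sketch only: keep a set of the ids already in the target so the "already related?" check becomes a hash lookup instead of a scan. Everything below is hypothetical plain Ruby (`Doc`, `IndexedTarget`, `already_related?` and `push` are invented names, not Mongoid classes or methods), and it assumes that two documents count as the same when their ids match and that documents without an id are never treated as related.

```ruby
require 'set'

# Hypothetical stand-in for an embedded document; not a Mongoid class.
Doc = Struct.new(:id)

# Hypothetical container that remembers the ids it has seen, so membership
# checks take constant time on average instead of scanning the whole target.
class IndexedTarget
  def initialize
    @target = []
    @seen_ids = Set.new
  end

  # Assumption: documents without an id are never considered related.
  def already_related?(doc)
    !doc.id.nil? && @seen_ids.include?(doc.id)
  end

  # Assumption: a document that is already related is skipped.
  def push(doc)
    return self if already_related?(doc)

    @target << doc
    @seen_ids << doc.id unless doc.id.nil?
    self
  end

  def size
    @target.size
  end
end

target = IndexedTarget.new
docs = Array.new(5000) { |i| Doc.new(i) }
docs.each { |doc| target.push(doc) }
target.push(docs.first) # same id as an existing entry, so it is skipped
puts target.size
```

With this layout, adding n documents performs n set lookups rather than n(n - 1)/2 comparisons. Whether the real association can maintain such an index safely (for example when ids change or documents are removed) is something I have not checked.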
In the meantime, I'd be deeply grateful if you could provide a work-around.