[SERVER-6204] make $pullAll more efficient Created: 25/Jun/12  Updated: 11/Jul/16  Resolved: 29/Jun/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.1.1
Fix Version/s: 2.2.0-rc0

Type: Bug Priority: Minor - P4
Reporter: Kevin Matulef Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Participants:

 Description   

It appears that the $pullAll operator is implemented inefficiently. The current implementation loops over the target array and for each element, loops over the set of things to pull. So if you are pulling m elements from an array of size n, this takes O(n*m) time. It is possible to do this in O(n+m) by using a hash table to loop once over things to pull, then once over the target array. The difference might be noticeable for large values of m and n.



 Comments   
Comment by Kevin Matulef [ 29/Jun/12 ]

Technically the performance is now O(m + n*log m), but the difference is marginal for most practical purposes. Achieving O(n+m) is a bigger change, since we'd need to use the TR1 unordered_set type and provide a hash function that correctly squashes numbers.

Comment by auto [ 29/Jun/12 ]

Author:

{u'date': u'2012-06-29T08:06:46-07:00', u'email': u'matulef@10gen.com', u'name': u'Kevin Matulef'}

Message: SERVER-6204 fix inefficiency in pullAll
Branch: master
https://github.com/mongodb/mongo/commit/eb7e375dd3c6331f0cc30a42fb6cb86d557b68ac

Comment by Kevin Matulef [ 28/Jun/12 ]

I'm classifying this as a performance "bug", even though functionality is OK.

Generated at Thu Feb 08 03:11:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.