[SERVER-6204] make $pullAll more efficient Created: 25/Jun/12 Updated: 11/Jul/16 Resolved: 29/Jun/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 2.1.1 |
| Fix Version/s: | 2.2.0-rc0 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Kevin Matulef | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Participants: | |||||
| Description |
|
It appears that the $pullAll operator is implemented inefficiently. The current implementation loops over the target array and for each element, loops over the set of things to pull. So if you are pulling m elements from an array of size n, this takes O(n*m) time. It is possible to do this in O(n+m) by using a hash table to loop once over things to pull, then once over the target array. The difference might be noticeable for large values of m and n. |
| Comments |
| Comment by Kevin Matulef [ 29/Jun/12 ] |
|
Technically the performance is now O(m + n*log m), but the difference is marginal for most practical purposes. Achieving O(n+m) is a bigger change, since we'd need to use the TR1 unordered_set type and provide a hash function that correctly squashes numbers. |
| Comment by auto [ 29/Jun/12 ] |
|
Author: {u'date': u'2012-06-29T08:06:46-07:00', u'email': u'matulef@10gen.com', u'name': u'Kevin Matulef'}Message: |
| Comment by Kevin Matulef [ 28/Jun/12 ] |
|
I'm classifying this as a performance "bug", even though functionality is OK. |