[SERVER-8192] Optimize btree key generation Created: 16/Jan/13 Updated: 08/Dec/15 Resolved: 31/Mar/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance, Write Ops |
| Affects Version/s: | None |
| Fix Version/s: | 3.1.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Aaron Staple | Assignee: | David Storch |
| Resolution: | Done | Votes: | 32 |
| Labels: | graphquery | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Sprint: | Quint 3.1.0, Quint Iteration 3.1.1 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Description |
|
IndexSpec::getKeys() finds the set of index keys for a given document and index key spec. It's used when inserting / updating / deleting a document to update the index entries, and also for performing in memory sorts, deduping $or clauses and for other purposes. Right now extracting 10k elements from a nested object field within an array takes on the order of seconds on a decently fast machine. We could see how much we can optimize the implementation. |
| Comments |
| Comment by Githook User [ 31/Mar/15 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 31/Mar/15 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 17/Mar/15 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 27/May/14 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 24/May/14 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: Revert " This reverts commit 441b3c183c76399f248205989dec757708601394. | ||||||||||||||||||||||||
| Comment by Githook User [ 13/May/14 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 21/Apr/14 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 02/Apr/13 ] | ||||||||||||||||||||||||
|
The above experiment in 2.4.0-rc0, on my Mac:
Where "ms" is the time, and "predicted" is just 0.000082*(number of elements squared). I'm claiming that since "ms" is close to "predicted", there's a quadratic algorithm in the code path. In 2.0.9, the performance seems slightly better but still quadratic:
In this case the constant factor is .00006416 (about 20% better), but the shape of the curve is the same. | ||||||||||||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 02/Apr/13 ] | ||||||||||||||||||||||||
|
Some code to test the duration of an update with two indexes and N subdocuments:
| ||||||||||||||||||||||||
| Comment by Asya Kamsky [ 20/Mar/13 ] | ||||||||||||||||||||||||
|
Related tests and comments in On my Mac update of a different indexed field (in place) takes 45 seconds when there is an indexed array with 20,000 entries that getKeys() traverses. |