[SERVER-29817] Optimize incremental update performance of ChunkManager and CollectionMetadata Created: 23/Jun/17 Updated: 30/Oct/23 Resolved: 21/Jul/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.4.7, 3.5.11 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Andy Schwerin | Assignee: | Andy Schwerin |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Backport Requested: |
v3.4
|
||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||
| Description |
|
The ChunkManager and CollectionMetadata types are effectively in-memory routing tables used by map shard key values to chunks and shards. These data structures are read only, and updated copies must be made after chunk migrations. Because the data structures are updated by copy, their update time is proportional to the total number of chunks in a given collection. When the number of chunks in a collection is large (10s of thousands), the update time is significant, and increases the latency of chunk migration commit. This ticket tracks improvements to the constant factors affecting update time. Prototyping indicates that a speedup of >4x is possible by replacing woCompare with KeyString comparison, and by replacing all but one instance of std::map with instances of std::vector that are constructed in sorted order. |
| Comments |
| Comment by Githook User [ 26/Jul/17 ] | |||||||||
|
Author: {'email': 'schwerin@mongodb.com', 'username': 'andy10gen', 'name': 'Andy Schwerin'}Message: | |||||||||
| Comment by Githook User [ 26/Jul/17 ] | |||||||||
|
Author: {'email': 'schwerin@mongodb.com', 'username': 'andy10gen', 'name': 'Andy Schwerin'}Message: | |||||||||
| Comment by Andy Schwerin [ 21/Jul/17 ] | |||||||||
|
I'm going to wrap this ticket up with these minor optimizations, which are suitable for backport and provide substantial speed up to incremental routing table refresh in systems with thousands of chunks. The following summarizes the time to refresh the routing data structures in a system with 50,000 chunks. "rebuild" is for a full build of the tables from scratch, and "move 1" is for updating the structures after moving 1 chunk.
Most of the structures are completely rebuilt after even incremental updates, and that process takes time proportional to the number of chunks. You can see the cost of those changes in the "move 1" column. The changes in this ticket transform those rebuilds from O(n log n) to O(n) by leveraging the fact that we construct those structures in sorted-order. Further optimization is possible, but they are more invasive than these changes. | |||||||||
| Comment by Githook User [ 21/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: | |||||||||
| Comment by Githook User [ 21/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: | |||||||||
| Comment by Githook User [ 12/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: | |||||||||
| Comment by Githook User [ 12/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: | |||||||||
| Comment by Githook User [ 05/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: | |||||||||
| Comment by Githook User [ 05/Jul/17 ] | |||||||||
|
Author: {u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@mongodb.com'}Message: Previously, it was initialized to a pseudorandom value that was less than |