[SERVER-29817] Optimize incremental update performance of ChunkManager and CollectionMetadata
Created: 23/Jun/17  Updated: 30/Oct/23  Resolved: 21/Jul/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.4.7, 3.5.11

Type: Improvement
Priority: Major - P3
Reporter: Andy Schwerin
Assignee: Andy Schwerin
Resolution: Fixed
Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-19295 Chunk::mkDataWritten() should use a P... Closed
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.4
Participants:
Case:

 Description   

The ChunkManager and CollectionMetadata types are effectively in-memory routing tables used to map shard key values to chunks and shards. These data structures are read-only, so an updated copy must be made after each chunk migration. Because the data structures are updated by copy, their update time is proportional to the total number of chunks in a given collection. When the number of chunks in a collection is large (tens of thousands), the update time is significant and increases the latency of chunk migration commit.
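To illustrate why update-by-copy costs time proportional to the number of chunks, here is a minimal sketch. It is not the server's code: `RoutingTable` and `withChunkMoved` are hypothetical names, and plain strings stand in for shard keys and shard names.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a routing table: chunk max key -> owning shard.
using RoutingTable = std::map<std::string, std::string>;

// Returns a new read-only table in which one chunk has a new owner.
// The existing table is never modified, so readers holding the old
// pointer keep a consistent view. The full copy is what makes the
// update cost proportional to the total number of chunks.
std::shared_ptr<const RoutingTable> withChunkMoved(
    const std::shared_ptr<const RoutingTable>& current,
    const std::string& chunkMaxKey,
    const std::string& newShard) {
    auto updated = std::make_shared<RoutingTable>(*current);  // O(n) copy
    (*updated)[chunkMaxKey] = newShard;                       // single change
    return updated;
}
```

Moving one chunk still copies every entry, which is the cost this ticket aims to reduce.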

This ticket tracks improvements to the constant factors affecting update time. Prototyping indicates that a speedup of >4x is possible by replacing woCompare with KeyString comparison, and by replacing all but one instance of std::map with instances of std::vector that are constructed in sorted order.
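The sorted-vector idea from the prototype can be sketched as follows. This is an illustration under assumptions, not the prototype itself: `ChunkEntry` and `findChunk` are hypothetical names, and `std::string` comparison stands in for a byte-comparable key encoding such as KeyString.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical chunk record: the chunk covers keys up to, but not
// including, maxKey. A std::string stands in for an encoded shard key.
struct ChunkEntry {
    std::string maxKey;
    std::string shard;
};

// Looks up the chunk that owns `key` in a vector sorted ascending by maxKey.
// Returns nullptr when the key lies at or beyond the last chunk's bound.
const ChunkEntry* findChunk(const std::vector<ChunkEntry>& sortedChunks,
                            const std::string& key) {
    // First chunk whose exclusive upper bound is greater than the key.
    auto it = std::upper_bound(
        sortedChunks.begin(), sortedChunks.end(), key,
        [](const std::string& k, const ChunkEntry& c) { return k < c.maxKey; });
    return it == sortedChunks.end() ? nullptr : &*it;
}
```

A vector built in sorted order is filled with simple appends and stores its entries contiguously, while lookups remain O(log n) by binary search.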



 Comments   
Comment by Githook User [ 26/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Use hinted insert when building CollectionMetadata for performance.
Branch: v3.4
https://github.com/mongodb/mongo/commit/84a658596bc62107539a07556b9c066af2fec683

Comment by Githook User [ 26/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Use hinted insert when building chunkRangeMap
Branch: v3.4
https://github.com/mongodb/mongo/commit/01a38984d5c7b134b177d1b78023ed779c8b1631

Comment by Andy Schwerin [ 21/Jul/17 ]

I'm going to wrap this ticket up with these minor optimizations, which are suitable for backport and provide a substantial speedup for incremental routing table refresh in systems with thousands of chunks.

The following summarizes the time to refresh the routing data structures in a system with 50,000 chunks. "rebuild" is for a full build of the tables from scratch, and "move 1" is for updating the structures after moving 1 chunk.

                        rebuild    move 1
without optimization    451ms      189ms
this ticket             312ms      47ms

Most of the structures are completely rebuilt after even incremental updates, and that process takes time proportional to the number of chunks. You can see the cost of those rebuilds in the "move 1" column. The changes in this ticket transform those rebuilds from O(n log n) to O(n) by leveraging the fact that we construct those structures in sorted order.

Further optimizations are possible, but they are more invasive than these changes.
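The hinted-insert technique can be sketched as below. This is a minimal illustration, not the committed change: `RangeMap` and `buildFromSorted` are hypothetical names, and strings stand in for chunk keys and shard names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunk range map: chunk max key -> shard.
using RangeMap = std::map<std::string, std::string>;

// Builds the map from entries that are already sorted by key, ascending.
// Each new key belongs at the end of the map, so passing end() as the
// hint makes every insertion amortized constant time. The whole build
// is then O(n) instead of the O(n log n) of unhinted inserts.
RangeMap buildFromSorted(
    const std::vector<std::pair<std::string, std::string>>& sorted) {
    RangeMap result;
    for (const auto& entry : sorted) {
        // Precondition for the hint to pay off: keys strictly ascending.
        assert(result.empty() || result.rbegin()->first < entry.first);
        result.emplace_hint(result.end(), entry.first, entry.second);
    }
    return result;
}
```

If the input were not sorted, the result would still be correct in a build without the assertion, but each insertion would fall back to a logarithmic search.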

Comment by Githook User [ 21/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Use hinted insert when building chunkRangeMap
Branch: master
https://github.com/mongodb/mongo/commit/43441676092ae87f4d0cc1bf81877f9610149454

Comment by Githook User [ 21/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Use hinted insert when building CollectionMetadata for performance.
Branch: master
https://github.com/mongodb/mongo/commit/a045e38ca392f8354ea85ec5cebfb6d52892f444

Comment by Githook User [ 12/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Move construction of CollectionMetadata data structures out of ShardingState.
Branch: master
https://github.com/mongodb/mongo/commit/4b9d69eb00361083ce835d42c4107a4caa52f6fc

Comment by Githook User [ 12/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Move all ChunkManager construction logic into chunk_manager.cpp; hide implementation details.
Branch: master
https://github.com/mongodb/mongo/commit/4c15828d7bd7222fbcb5dc5b3c2060ea2c136dc7

Comment by Githook User [ 05/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Do not use chunk map data structure in ChunkManager interface.
Branch: master
https://github.com/mongodb/mongo/commit/74ad9fa5538c82478aced69a1969266752b1d7a8

Comment by Githook User [ 05/Jul/17 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-29817 Always initialize bytesWritten to 0 for chunks.

Previously, it was initialized to a pseudorandom value that was less than
some fraction of the maximum chunk size. This introduced a dependency between
the chunk constructor and the balancer, and provided no particular value.
Branch: master
https://github.com/mongodb/mongo/commit/9d803b41497eb54864361a3877097d8e1dd55dc2
