[SERVER-56786] There are three routing info refreshes and two chunk scans on the mergeChunks path
Created: 10/May/21  Updated: 29/Oct/23  Resolved: 02/Jun/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.0.24, 4.2.14, 4.4.6
Fix Version/s: 4.2.15, 4.4.7, 5.0.0-rc1, 4.0.26, 5.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: Paolo Polato
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
causes SERVER-59120 Create unit tests for commitChunksMerge Closed
Related
related to SERVER-58109 The new '_configsvrMergeChunks' path ... Closed
related to SERVER-56779 Do not use the collection distributed... Closed
related to SERVER-57057 Reduce routing info refreshes on the ... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0, v4.4, v4.2, v4.0
Sprint: Sharding EMEA 2021-05-31, Sharding EMEA 2021-06-14
Participants:

 Description   

The current mergeChunks path is very inefficient because:

  1. It performs three sequential refreshes (router, shard-pre-merge and shard-post-merge)
  2. It performs a sequential scan of the merge bounds on the cached routing info on the shard, only to generate a config server command whose size is proportional to the number of chunks being merged (and which can theoretically exceed the max BSON size)
  3. The config server repeats the chunk scan that the shard did, this time directly against config.chunks, just to check that all the bounds that the shard sent match.

All this makes the mergeChunks command very expensive, both in terms of latency and in terms of load on the config server.
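
To make point 2 concrete, here is a minimal Python sketch of how the size of the command grows when every chunk boundary is listed, compared with a command that carries only the two ends of the range. The field names (`boundaries`, `range`), the namespace `test.coll`, the shard key `x` and the helper functions are all hypothetical, and the length of the JSON encoding is used only as a rough stand-in for BSON size.

```python
import json


def boundaries_payload(split_points):
    # Illustrative "list everything" shape: one entry per chunk boundary,
    # so the document grows with the number of chunks being merged.
    return {"merge": "test.coll", "boundaries": [{"x": p} for p in split_points]}


def range_payload(split_points):
    # Illustrative "ends only" shape: just the min and max of the merged range,
    # regardless of how many chunks lie in between.
    return {
        "merge": "test.coll",
        "range": {"min": {"x": split_points[0]}, "max": {"x": split_points[-1]}},
    }


def encoded_size(doc):
    # JSON length as a rough proxy for the encoded command size (assumption).
    return len(json.dumps(doc))


few_chunks = list(range(0, 11))       # boundaries of 10 adjacent chunks
many_chunks = list(range(0, 10001))   # boundaries of 10000 adjacent chunks

for label, points in (("10 chunks", few_chunks), ("10000 chunks", many_chunks)):
    print(label,
          "boundaries:", encoded_size(boundaries_payload(points)),
          "range:", encoded_size(range_payload(points)))
```

In this sketch the first shape grows linearly with the number of chunks, while the second stays essentially constant.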

It would be much better if:

  • The router command:
    • Didn't do a refresh on entry, but relied on the cached information and the shardVersion (this has backwards compatibility implications)
  • The shard command (in order of importance):
    • Sent just the ends of the range to be merged, instead of all the chunks which fall within it, so that the size of the command to the config server is constant and the scan is done just once
    • Just checked the major shardVersion (for routing correctness, i.e., to make sure this shard owns that range)
    • Only did a refresh on the shard if the chunk bounds that the router sent didn't match the cached info (this can only happen if a previous merge committed against the config server, but failed to refresh)
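
The shard-side proposals above can be sketched roughly as follows, in Python rather than the server's own code. This is an illustration under stated assumptions, not the committed implementation: `Chunk`, `chunks_in_range`, `validate_merge` and `needs_refresh` are hypothetical names, shard keys are plain integers, and the rule that a merge must cover at least two chunks is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    min_key: int
    max_key: int
    shard: str


def chunks_in_range(chunks, range_min, range_max):
    # Single scan: collect the chunks that lie fully inside [range_min, range_max).
    inside = (c for c in chunks if c.min_key >= range_min and c.max_key <= range_max)
    return sorted(inside, key=lambda c: c.min_key)


def validate_merge(chunks, range_min, range_max, shard):
    # Config-server side (hypothetical): receives only the two ends of the range
    # and performs the one and only scan over the chunk metadata.
    selected = chunks_in_range(chunks, range_min, range_max)
    if len(selected) < 2:
        raise ValueError("range must cover at least two chunks")
    if selected[0].min_key != range_min or selected[-1].max_key != range_max:
        raise ValueError("range ends do not align with chunk bounds")
    for prev, cur in zip(selected, selected[1:]):
        if prev.max_key != cur.min_key:
            raise ValueError("chunks in range are not contiguous")
    if any(c.shard != shard for c in selected):
        raise ValueError("range is not fully owned by the requesting shard")
    return Chunk(range_min, range_max, shard)


def needs_refresh(cached_chunks, range_min, range_max):
    # Shard side (hypothetical): refresh only when the bounds sent by the router
    # do not match chunk bounds in the cached routing info.
    mins = {c.min_key for c in cached_chunks}
    maxs = {c.max_key for c in cached_chunks}
    return range_min not in mins or range_max not in maxs


chunks = [Chunk(0, 10, "A"), Chunk(10, 20, "A"), Chunk(20, 30, "A"), Chunk(30, 40, "B")]
merged = validate_merge(chunks, 0, 30, "A")
```

Because the request carries only `range_min` and `range_max`, its size does not depend on how many chunks are merged, and the bounds are checked in one place instead of being scanned once on the shard and again on the config server.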


 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 18/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 expand the bounds parameter of mergeChunk in the config server

(cherry picked from commit 93c36da06fa36454a7f8ff77ce2c86a07ba97e4f)
Branch: v4.0
https://github.com/mongodb/mongo/commit/15df22d6ba6fd42d4ac869767f7cfb0770cc79d0

Comment by Githook User [ 11/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 expand the bounds parameter of mergeChunk in the config server
Branch: v4.2
https://github.com/mongodb/mongo/commit/93c36da06fa36454a7f8ff77ce2c86a07ba97e4f

Comment by Githook User [ 04/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 expand the bounds parameter of mergeChunk in the config server
Branch: v4.4
https://github.com/mongodb/mongo/commit/4d58bea897115e86b17c20a808bd392f663c2552

Comment by Githook User [ 03/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 expand the bounds parameter of mergeChunk in the config server (BACKPORT-9250)
Branch: v5.0
https://github.com/mongodb/mongo/commit/f9d83b669cc832802319005c5b2a5caf2ab18591

Comment by Githook User [ 02/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 Ensure that the query on config.collections succeeds
Branch: master
https://github.com/mongodb/mongo/commit/8dc7a2363f71c00f79354ca961197692201a5100

Comment by Githook User [ 02/Jun/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786 expand the bounds parameter of mergeChunk in the config server
Branch: master
https://github.com/mongodb/mongo/commit/469ba14218e31ce9756888593d88fada001bf6f0

Comment by Githook User [ 20/May/21 ]

Author:

{'name': 'Paolo Polato', 'email': 'paolo.polato@mongodb.com', 'username': 'ppolato'}

Message: SERVER-56786: defining a separate function to invoke mergeChunks in shard
Branch: master
https://github.com/mongodb/mongo/commit/54c1d66aa860dd94ec0508c8ca104bb7b3b1224a
