[SERVER-42141] Implement sharded metadata updates for refineCollectionShardKey without using a transaction
Created: 10/Jul/19  Updated: 29/Oct/23  Resolved: 25/Jul/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.3.1

Type: Task
Priority: Major - P3
Reporter: Jack Mulrow
Assignee: James Heppenstall (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done before SERVER-42142 Targeted test that a jumbo chunk that... Closed
has to be done before SERVER-42143 Convert refineCollectionShardKey meta... Closed
Problem/Incident
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2019-07-15, Sharding 2019-07-29, Sharding 2019-08-12
Participants:
Linked BF Score: 50

 Description   

Implement the sharded metadata changes needed to refine a shard key during the refineCollectionShardKeyCommand, without using a transaction. This enables testing in the absence of failovers or concurrent refreshes.

Add linear jstests that verify the new sharded collection works as expected.

Proposed Implementation

  1. Add a new method to ShardingCatalogManager called refineCollectionShardKey in sharding_catalog_manager_collection_operations.cpp and call it at the end of _configsvrRefineCollectionShardKey
  2. This method should:
    1. Lock the chunk and zone op resource mutexes
    2. Update the config.collections entry for the given namespace by setting its shard key to the refined key and its epoch to a newly generated object id.
    3. Use a multi-update on config.chunks that, for every chunk of the given namespace, sets a default value of MinKey in both bounds for each new field in the refined key, sets the chunk's epoch to the new collection epoch, and unsets the jumbo field.
      1. e.g. when refining the key from {a: 1} to {a: 1, b: 1, c: 1}:

         config.chunks.update({ns: <given ns>}, {$set: {lastmodEpoch: <new epoch>}, $max: {"min.b": MinKey, "min.c": MinKey, "max.b": MinKey, "max.c": MinKey}, $unset: {jumbo: ""}}, {multi: true});

    4. Fix the off-by-one error left by the previous update by setting the max bound values for the new shard key fields of the global max chunk to MaxKey (i.e. the chunk whose max bound had MaxKey for every shard key field before the refine).
    5. Update the bounds of each zone document for the given collection in config.tags using a similar query to the one from step 3.
    6. If the max bound for a zone range is the global max, set the default values for its max bound to MaxKey.
  3. Add integration testing that:
    1. Before and after a refine, CRUD operations work as expected and no data is lost.
      1. Including versioned reads against secondaries.
    2. Before and after a refine, move, split, and merge operations work as expected and require the full shard key if the user manually specifies bounds.
    3. After a refine, forcing a refresh on each shard (to simulate the asynchronous setShardVersion task completing) does not corrupt data, and inserts of documents without the full shard key are rejected.
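
The bound handling in steps 2.3 and 2.4 can be sketched on plain in-memory documents. This is a minimal illustration, not the server implementation: the MinKey and MaxKey symbols are hypothetical stand-ins for the BSON types, and refineBound and refineChunks are names invented for this example. It assumes a max bound with MaxKey on every old shard key field is the global max.

```javascript
// Hypothetical stand-ins for BSON MinKey/MaxKey so the sketch runs without a driver.
const MinKey = Symbol("MinKey");
const MaxKey = Symbol("MaxKey");

// Extend one bound document with the new shard key fields.
// A max bound that is MaxKey on every old field is treated as the global max
// and defaults to MaxKey; every other bound defaults to MinKey.
function refineBound(bound, oldFields, newFields, isMaxBound) {
  const isGlobalMax = isMaxBound && oldFields.every((f) => bound[f] === MaxKey);
  const refined = { ...bound };
  for (const f of newFields) {
    refined[f] = isGlobalMax ? MaxKey : MinKey;
  }
  return refined;
}

// Return new chunk documents with refined bounds, the new epoch, and no jumbo flag.
// The input array and its documents are left unmodified.
function refineChunks(chunks, oldFields, newFields, newEpoch) {
  return chunks.map((chunk) => {
    const { jumbo, ...rest } = chunk; // drop the jumbo flag if present
    return {
      ...rest,
      lastmodEpoch: newEpoch,
      min: refineBound(chunk.min, oldFields, newFields, false),
      max: refineBound(chunk.max, oldFields, newFields, true),
    };
  });
}

// Example: refining {a: 1} to {a: 1, b: 1, c: 1} for a two-chunk collection.
const chunks = [
  { ns: "test.coll", min: { a: MinKey }, max: { a: 0 }, lastmodEpoch: "old", jumbo: true },
  { ns: "test.coll", min: { a: 0 }, max: { a: MaxKey }, lastmodEpoch: "old" },
];
const refined = refineChunks(chunks, ["a"], ["b", "c"], "new");
```

Adjacent chunks stay contiguous because the max of one chunk and the min of the next receive the same MinKey defaults, and only the global max bound is raised to MaxKey. Under the same assumption, refineBound could also be applied to the zone ranges described in steps 2.5 and 2.6.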

 



 Comments   
Comment by Githook User [ 25/Jul/19 ]

Author:

{'name': 'Jamie Heppenstall', 'email': 'jamie.heppenstall@mongodb.com', 'username': 'JamesHeppenstall'}

Message: SERVER-42141 Implement sharded metadata updates for refineCollectionShardKey without using a transaction
Branch: master
https://github.com/mongodb/mongo/commit/d3d8a901b72c3088c360d9c72fad1b4fc08e5eda
