[SERVER-54279] Primary shard may end up with inconsistent collection catalog entry after resharding Created: 04/Feb/21  Updated: 29/Oct/23  Resolved: 26/Mar/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Alexander Taskov (Inactive)
Resolution: Fixed Votes: 0
Labels: PM-234-M3, PM-234-T-lifecycle
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-54231 Resharding can leave behind local col... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

python buildscripts/resmoke.py run --suite=sharding jstests/sharding/resharding_allowMigrations.js

diff --git a/jstests/sharding/libs/resharding_test_fixture.js b/jstests/sharding/libs/resharding_test_fixture.js
index d445b42fdf..396f06c1c7 100644
--- a/jstests/sharding/libs/resharding_test_fixture.js
+++ b/jstests/sharding/libs/resharding_test_fixture.js
@@ -151,11 +151,10 @@ var ReshardingTest = class {
 
         this._tempNs = `${sourceDB.getName()}.system.resharding.${sourceCollectionUUIDString}`;
 
-        // mongos won't know about the temporary resharding collection and will therefore assume the
-        // collection is unsharded. We configure one of the recipient shards to be the primary shard
-        // for the database so mongos still ends up routing operations to a shard which owns the
-        // temporary resharding collection.
-        this._st.ensurePrimaryShard(sourceDB.getName(), this.recipientShardNames[0]);
+        // XXX: Force one of the non-recipient shards to be the primary shard for the database to
+        // demonstrate the issue. resharding_allowMigrations.js doesn't attempt to read from the
+        // temporary resharding collection within the duringReshardingFn() callback anyway.
+        this._st.ensurePrimaryShard(sourceDB.getName(), this.donorShardNames[0]);
 
         return sourceCollection;
     }
diff --git a/jstests/sharding/resharding_allowMigrations.js b/jstests/sharding/resharding_allowMigrations.js
index c61a946cc4..06da3a5bff 100644
--- a/jstests/sharding/resharding_allowMigrations.js
+++ b/jstests/sharding/resharding_allowMigrations.js
@@ -22,6 +22,9 @@ const sourceCollection = reshardingTest.createShardedCollection({
     chunks: [{min: {oldKey: MinKey}, max: {oldKey: MaxKey}, shard: donorShardNames[0]}],
 });
 
+const originalCollInfo = sourceCollection.exists();
+assert.neq(originalCollInfo, null, "failed to find sharded collection before resharding");
+
 const recipientShardNames = reshardingTest.recipientShardNames;
 reshardingTest.withReshardingInBackground(
     {
@@ -46,5 +49,9 @@ reshardingTest.withReshardingInBackground(
             ErrorCodes.ConflictingOperationInProgress);
     });
 
+const newCollInfo = sourceCollection.exists();
+assert.neq(newCollInfo, null, "failed to find sharded collection after resharding");
+assert.neq(newCollInfo.info.uuid, originalCollInfo.info.uuid, {newCollInfo, originalCollInfo});
+
 reshardingTest.teardown();
 })();

Sprint: Sharding 2021-04-05
Participants:
Story Points: 1

 Description   

Only recipient shards construct the temporary resharding collection and replace the collection catalog entry when the resharding operation succeeds. If the primary shard of the database containing the collection being resharded isn't also a recipient shard, then its collection catalog entry won't be consistent with the one on the recipient shards. The primary shard will either

  • (a) not have a collection catalog entry whatsoever if the primary shard was a donor shard, or
  • (b) have an inconsistent collection UUID because it will have retained the one from before the resharding operation started.
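
The two outcomes can be modeled with a small sketch. This is a hypothetical illustration, not server code: the function name, the argument shape, and the returned labels are all assumptions made for this example, and only the shard roles come from the description above.

```javascript
// Hypothetical model, not server code: predicts what the database's primary
// shard holds for the resharded collection after resharding succeeds.
function primaryCatalogStateAfterResharding({primary, donors, recipients}) {
    if (recipients.includes(primary)) {
        // Recipients replace the catalog entry, so the primary has the new UUID.
        return "consistent";
    }
    if (donors.includes(primary)) {
        // Case (a): no collection catalog entry whatsoever.
        return "missing";
    }
    // Case (b): the entry still carries the pre-resharding UUID.
    return "stale-uuid";
}

// Mirrors the reproduction: the primary shard is a donor but not a recipient.
const state = primaryCatalogStateAfterResharding({
    primary: "shard0",
    donors: ["shard0"],
    recipients: ["shard1"],
});
console.log(state);
```

With the modified fixture the primary shard is `donorShardNames[0]`, which corresponds to case (a) and matches the "failed to find sharded collection after resharding" assertion in the log.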

One solution would be to have the coordinator also treat the primary shard of the database containing the collection being resharded as a recipient shard.
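
A minimal sketch of that idea follows, assuming the coordinator already knows the recipient shard names and the database's primary shard. The helper `recipientsIncludingPrimary` is hypothetical and only illustrates the set logic; it is not the coordinator's actual implementation.

```javascript
// Hypothetical sketch of the proposed fix, not the coordinator's actual code.
// Returns the recipient list with the database's primary shard always included.
function recipientsIncludingPrimary(recipientShardNames, primaryShardName) {
    // A Set drops the duplicate when the primary shard is already a recipient.
    return [...new Set([...recipientShardNames, primaryShardName])];
}

// Primary shard is not yet a recipient, so it is appended.
const withPrimaryAdded = recipientsIncludingPrimary(["shard1", "shard2"], "shard0");

// Primary shard is already a recipient, so the list is unchanged.
const alreadyIncluded = recipientsIncludingPrimary(["shard0", "shard1"], "shard0");

console.log(withPrimaryAdded, alreadyIncluded);
```

Under this assumption the primary shard goes through the same catalog replacement as every other recipient, so it ends up with the new collection UUID whether or not it receives any chunks.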

[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.179+0000 uncaught exception: Error: [null] != [null] are equal : failed to find sharded collection after resharding :
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 doassert@src/mongo/shell/assert.js:20:14
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 assert.neq@src/mongo/shell/assert.js:270:9
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 @jstests/sharding/resharding_allowMigrations.js:53:1
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 @jstests/sharding/resharding_allowMigrations.js:10:2
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 failed to load: jstests/sharding/resharding_allowMigrations.js
[js_test:resharding_allowMigrations] 2021-02-04T02:30:11.180+0000 exiting with code -3



 Comments   
Comment by Githook User [ 26/Mar/21 ]

Author: Alex Taskov <alex.taskov@mongodb.com> (alextaskov)

Message: SERVER-54279 Add database primary as recipient in resharding
Branch: master
https://github.com/mongodb/mongo/commit/28c0c1be5617a6c806593b5ac0ad4f90d2544985
