Details
Bug
Resolution: Done
Major - P3
Description
The downgrade procedure from CSRS to SCCC is missing a step to clear the 'minOpTimeRecovery' document.
The following two steps should be added:
1. Following step 1:
On the primary of each shard, find the minOpTimeRecovery document and ensure that its minOpTimeUpdaters value is zero. The minOpTimeRecovery document is what ensures that shards running against a CSRS config server always see the latest chunk metadata after a migration. A non-zero minOpTimeUpdaters value means that either a migration is active, in which case wait for it to complete, or a previous migration failed midway, in which case the shard must be recovered. To recover, step down the primary; the newly elected primary should then clear the minOpTimeUpdaters field successfully.
This step must be done on each shard's primary.
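The check described above could be sketched roughly as follows. This is only an illustration: the helper name classifyMinOpTimeRecovery and its return labels are hypothetical, and it assumes the document lives in admin.system.version (the collection used by the remove command in this ticket) and that minOpTimeUpdaters converts cleanly to a plain number.

```javascript
// Hypothetical helper (not part of MongoDB): classify the minOpTimeRecovery
// document fetched from one shard primary.
//   "absent"  - no document found, nothing to wait for
//   "safe"    - minOpTimeUpdaters is zero
//   "busy"    - non-zero: a migration is active or a previous one failed midway
//   "unknown" - the field is missing or not numeric; inspect manually
function classifyMinOpTimeRecovery(doc) {
  if (doc === null || doc === undefined) {
    return "absent";
  }
  // Assumption: the stored value converts to a plain number via Number().
  var updaters = Number(doc.minOpTimeUpdaters);
  if (isNaN(updaters)) {
    return "unknown";
  }
  return updaters === 0 ? "safe" : "busy";
}

// Usage in the mongo shell, connected to a shard primary:
//   var doc = db.getSiblingDB('admin').system.version.findOne({_id: 'minOpTimeRecovery'});
//   print(classifyMinOpTimeRecovery(doc));
```

A result of "busy" corresponds to the two cases above: wait for the active migration to finish, or step down the primary so that the newly elected primary can recover.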
2. As part of step 6:
Before stepping down a 3.2 shard with the intention of downgrading it, the minOpTimeRecovery document mentioned in step 1 must be cleared from it. If this step is not performed, the newly elected 3.2 primary will try to contact the already downgraded CSRS config server, fail to become primary, and leave the shard's replica set stuck.
To do this, on the primary of each shard, remove the minOpTimeRecovery document with the following command:
use admin;
db.system.version.remove({_id: 'minOpTimeRecovery'}, {writeConcern: {w: 'majority', wtimeout: 30000}});
... the rest of step 6 ...