- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Cluster Scalability
- ALL

The current setting is a majority write concern with a 60-second wtimeout. However, the clone can generate a large number of writes and index builds, which can cause it to time out while waiting for replication. On the current master, _movePrimary will attempt a retry because write concern errors are treated as retryable. Since the collections were already cloned by the earlier attempt, the retry gets a "namespace already exists" error, which is not retryable and causes the entire _movePrimary command to fail. This can leave the cloned data behind as orphans and eventually cause the issue described in SERVER-32142.
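
The failure sequence described above can be reproduced with a small toy model. This is only a sketch and not MongoDB code: `ToyRecipient`, `move_primary`, the exception classes, and the replication-lag parameter are all hypothetical names made up for illustration. It assumes that the clone applies its writes locally before waiting for majority replication, so a timeout leaves the cloned collections in place.

```python
class WriteConcernTimeout(Exception):
    """Treated as retryable: writes were applied, but replication timed out."""


class NamespaceExists(Exception):
    """Not retryable: the target collection is already present."""


RETRYABLE = (WriteConcernTimeout,)


class ToyRecipient:
    """Hypothetical stand-in for the shard that receives the cloned data."""

    def __init__(self, replication_lag_secs, wtimeout_secs=60):
        self.collections = set()
        self.replication_lag_secs = replication_lag_secs
        self.wtimeout_secs = wtimeout_secs

    def clone(self, names):
        # A second clone of the same collections is rejected outright.
        for name in names:
            if name in self.collections:
                raise NamespaceExists(name)
        # The writes land locally first...
        self.collections.update(names)
        # ...and only afterwards does the clone wait for majority replication.
        if self.replication_lag_secs > self.wtimeout_secs:
            raise WriteConcernTimeout("waiting for replication timed out")


def move_primary(recipient, names, max_attempts=3):
    """Retry the clone on retryable errors; give up on anything else."""
    errors = []
    for _ in range(max_attempts):
        try:
            recipient.clone(names)
            return "ok", errors
        except RETRYABLE as exc:
            errors.append(type(exc).__name__)  # retry on the next iteration
        except NamespaceExists as exc:
            errors.append(type(exc).__name__)
            return "failed", errors
    return "failed", errors


recipient = ToyRecipient(replication_lag_secs=90)
status, errors = move_primary(recipient, ["db.a", "db.b"])
print(status, errors, sorted(recipient.collections))
```

In this model the first attempt times out after the collections have been created, the retry fails immediately with the non-retryable error, and the cloned collections remain on the recipient even though the command as a whole failed.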
is related to:
- SERVER-32142 `movePrimary` can leave orphaned data when it aborts after cloning (Closed)
- SERVER-46424 _cloneCatalogData remote call is labeled as idempotent and retryable although it isn't (Closed)