Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels: None
Assigned Teams: Cluster Scalability
Operating System: ALL
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

The current setting is a majority write concern with a 60-second wtimeout. However, the clone can generate many writes and index builds, which can cause it to time out while waiting for replication. In the current master, _movePrimary retries because write concern errors are treated as retryable, but since the collections were already cloned by the earlier attempt, the retry gets a "namespace already exists" error, which is not retryable and causes the entire _movePrimary command to fail. This can leave the cloned data behind as orphans and eventually cause the issue described in SERVER-32142.
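The failure sequence described above can be sketched with a toy model. This is only an illustration, not MongoDB server code: `ToyRecipient`, `clone_catalog_data`, `move_primary`, and both exception classes are hypothetical names, and the assumption that the timeout happens exactly once is made up for the example.

```python
class WriteConcernTimeout(Exception):
    """Retryable in this toy model: replication did not catch up in time."""


class NamespaceExists(Exception):
    """Not retryable in this toy model: the collection was already created."""


class ToyRecipient:
    """Hypothetical stand-in for the shard that receives the cloned data."""

    def __init__(self, fail_first_wait=True):
        self.collections = set()
        self._fail_next_wait = fail_first_wait

    def clone_catalog_data(self, names):
        for name in names:
            if name in self.collections:
                raise NamespaceExists(name)
            self.collections.add(name)
        # The writes are applied locally before the majority wait begins,
        # so a timeout here leaves the cloned collections in place.
        if self._fail_next_wait:
            self._fail_next_wait = False
            raise WriteConcernTimeout("wtimeout waiting for majority")


def move_primary(recipient, names, max_attempts=3):
    """Retry only on the error this toy model treats as retryable."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            recipient.clone_catalog_data(names)
            return ("ok", attempts)
        except WriteConcernTimeout:
            continue  # retry the clone from scratch
        except NamespaceExists:
            return ("failed: namespace exists", attempts)
    return ("failed: retries exhausted", attempts)


recipient = ToyRecipient()
result = move_primary(recipient, ["db.a", "db.b"])
```

In this sketch the first attempt creates both collections and then times out, the retry hits the already-created namespace, and the command fails while `recipient.collections` still holds the cloned data, which corresponds to the orphaned state the description warns about.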

is related to

SERVER-32142 `movePrimary` can leave orphaned data when it aborts after cloning

Closed

SERVER-46424 _cloneCatalogData remote call is labeled as idempotent and retryable although it isn't

Closed

Assignee: [DO NOT USE] Backlog - Cluster Scalability
Reporter: Randolph Tan
Participants: [DO NOT USE] Backlog - Cluster Scalability, Randolph Tan
Votes: 0
Watchers: 7

Created: Feb 26 2020 04:58:42 PM UTC
Updated: Dec 12 2023 03:49:26 PM UTC
