-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Cluster Scalability
-
Fully Compatible
-
ALL
-
v8.1, v8.0
-
ClusterScalability Mar31-Apr14
A successful resharding operation records the following fields in Atlas log ingestion:
numberofdestinationshards, numberofindexes, numberofsourceshards, numberoftotaldocuments, Operation duration (min), total collec size (gb), totalappliedtime, totalapplytimeelapsedsecs, totalcopytimeelapsedsecs, totalcriticalsectiontimeelapsedsecs, totalindexbuildtimeelapsedsecs
When resharding fails or is manually aborted by a user, however, we write zeroes into those fields. We should record the correct values in the failure case as well.
On failure, recapture the stats inside the failure path and ingest them through the Atlas log ingestion rule.
In addition to verifying the issue, we should also enhance the logs to include the last coordinator phase and investigate how hard it would be to include the failure code.
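The requested behavior could look roughly like the sketch below. It is illustrative only and is not the server's implementation: `ReshardingMetrics`, `build_completion_log`, the snake_case field names, the `lastCoordinatorPhase` and `errorCode` keys, and the sample values are all hypothetical names chosen for this example. The point it shows is that the completion record is built from the metrics accumulated so far regardless of outcome, instead of falling back to zeroes on failure or abort.

```python
from dataclasses import dataclass, asdict
from typing import Any, Dict, Optional


@dataclass
class ReshardingMetrics:
    """Hypothetical snapshot of the stats accumulated so far."""
    number_of_source_shards: int = 0
    number_of_destination_shards: int = 0
    number_of_indexes: int = 0
    number_of_total_documents: int = 0
    total_copy_time_elapsed_secs: float = 0.0
    total_apply_time_elapsed_secs: float = 0.0
    total_index_build_time_elapsed_secs: float = 0.0
    total_critical_section_time_elapsed_secs: float = 0.0


def build_completion_log(
    metrics: ReshardingMetrics,
    status: str,
    last_phase: str,
    error_code: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the completion record from the current metrics snapshot.

    The same snapshot is used for success, failure, and abort, so a failed
    operation reports whatever was measured before it stopped.
    """
    record: Dict[str, Any] = {
        "status": status,
        "lastCoordinatorPhase": last_phase,  # hypothetical key name
    }
    record.update(asdict(metrics))
    if error_code is not None:
        record["errorCode"] = error_code  # hypothetical key name
    return record


# Example: an operation aborted after cloning finished (made-up values).
snapshot = ReshardingMetrics(
    number_of_source_shards=2,
    number_of_destination_shards=3,
    number_of_indexes=4,
    number_of_total_documents=1000,
    total_copy_time_elapsed_secs=12.5,
)
failed_record = build_completion_log(
    snapshot, status="aborted", last_phase="applying", error_code=1234
)
```

Fields for phases that never ran stay at zero here, which is accurate, while fields for phases that did run keep their measured values.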
causes:
- SERVER-103191 getCoordinatorDoc May Fail If Called From Retry Loop Which Deletes It (Closed)
duplicates:
- SERVER-102045 Add 'AbortReason' to resharding complete log (Closed)
is related to:
- SERVER-102183 Include provenance info in resharding coordinator completion logs (Closed)
- SERVER-103172 Log Critical Section Start and End Times on Resharding Completion (Open)
- SERVER-103174 Verify WritesDuringCriticalSection Is Properly Tracked During Resharding (Needs Scheduling)