- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Cluster Scalability
- ALL
- 3
- TBD
Resharding donors and recipients start change streams monitors but do not handle monitor failures immediately. Instead, errors remain unchecked until the participants later call awaitChangeStreamsMonitorCompleted(). This delays error detection and can allow operations to continue, only to fail later.
Current Behavior:
- When a participant starts monitoring, the changeStreamsMonitor returns a semi-future.
- The returned future is stored but not immediately checked for errors.
- If the monitor fails (e.g., due to network issues or cursor problems), the error sits unobserved in the completed future.
- The donor or recipient continues with other operations, unaware of the monitor failure.
- The error only surfaces much later, when awaitChangeStreamsMonitorCompleted() is called.
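The sequence above can be illustrated with a minimal sketch. It is not the server code: it uses Python's standard `concurrent.futures.Future` as a stand-in for the semi-future, and `MonitorError`, `start_monitor`, and `run_participant` are hypothetical names introduced only for this illustration.

```python
from concurrent.futures import Future


class MonitorError(Exception):
    """Hypothetical stand-in for a change streams monitor failure."""


def start_monitor() -> Future:
    # Hypothetical stand-in for starting the monitor. The returned future
    # already holds a failure, as if the cursor died right after startup.
    future = Future()
    future.set_exception(MonitorError("cursor killed"))
    return future


def run_participant() -> list:
    events = []
    monitor_future = start_monitor()  # stored, but not checked for errors
    events.append("started monitor")
    events.append("did other work")  # continues, unaware of the failure
    try:
        # Analogous to the later awaitChangeStreamsMonitorCompleted() call:
        # this is the first point at which the stored error is observed.
        monitor_future.result()
    except MonitorError as exc:
        events.append(f"error surfaced late: {exc}")
    return events


events = run_participant()
```

In this sketch the failure exists before any other work starts, yet nothing observes it until the final wait.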
SERVER-104946 will commit a unit test demonstrating this behavior on the donor. A possible solution is to add a background task that watches the change streams monitor future and handles errors as soon as they occur.
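One way to picture the proposed fix is sketched below. It is only an analogue in Python's standard `concurrent.futures`, not the server's future API, and it uses a completion callback rather than a dedicated background task. The names `MonitorError` and `watch_monitor` and the `on_error` handler are hypothetical.

```python
from concurrent.futures import Future


class MonitorError(Exception):
    """Hypothetical stand-in for a change streams monitor failure."""


def watch_monitor(monitor_future: Future, on_error) -> None:
    """Invoke on_error as soon as the monitor future completes with an error."""

    def _check(done: Future) -> None:
        if done.cancelled():
            return
        exc = done.exception()  # does not block: the future is already done
        if exc is not None:
            on_error(exc)

    # The callback runs when the future completes, or immediately if it
    # has already completed by the time it is registered.
    monitor_future.add_done_callback(_check)


events = []
monitor_future = Future()
watch_monitor(monitor_future, lambda exc: events.append(f"abort: {exc}"))

events.append("doing other work")
monitor_future.set_exception(MonitorError("cursor killed"))  # monitor fails
events.append("next step")
```

Here the handler fires at the moment the monitor fails, before the participant reaches its next step, instead of waiting for a later explicit wait on the future.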
- is related to: SERVER-104946 Create unit tests to trigger an unrecoverable error at every phase of resharding (Closed)