[SERVER-65991] Improve observability of imminent initial sync failure on the source node
Created: 26/Apr/22
Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Edwin Zhou
Assignee: Backlog - Replication Team
Resolution: Unresolved
Votes: 0
Labels: former-quick-wins
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Replication
Participants:

 Description   

Initial sync can time out when the syncing node is repeatedly unable to open a cursor for an oplog fetcher on the source node.

When initial sync times out because the syncing node cannot create a cursor for the oplog fetcher, it is the source node that is preventing the cursor from being created. We want the source node to surface when an initial sync is at risk of failing, so that we can collect stack traces on the source while the problem is still occurring. Currently, it is difficult to diagnose why the cursor cannot be created on the source node.
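
One way to picture the requested signal is a minimal sketch of a source-side tracker. It is not the server's implementation. Every name here (CursorOpenFailureTracker, at_risk_after_secs, on_at_risk) is hypothetical, and the threshold is an assumed value rather than a real server parameter. The idea is that the source remembers when a syncing node first failed to open its oplog fetcher cursor, and fires a hook once the failures have persisted long enough that the initial sync is likely to time out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class CursorOpenFailureTracker:
    """Hypothetical source-side tracker; not part of the MongoDB server."""

    # Assumed threshold: how long failures may persist before we flag risk.
    at_risk_after_secs: float
    # Hypothetical hook, e.g. log a warning or trigger stack trace collection.
    on_at_risk: Callable[[str, float], None]
    _first_failure: Dict[str, float] = field(default_factory=dict)
    _reported: Set[str] = field(default_factory=set)

    def record_failure(self, sync_node: str, now: float) -> bool:
        """Record a failed cursor open; return True if this call fired the hook."""
        first = self._first_failure.setdefault(sync_node, now)
        failing_for = now - first
        if failing_for >= self.at_risk_after_secs and sync_node not in self._reported:
            self._reported.add(sync_node)  # fire at most once per failure streak
            self.on_at_risk(sync_node, failing_for)
            return True
        return False

    def record_success(self, sync_node: str) -> None:
        """A successful cursor open ends the failure streak for that node."""
        self._first_failure.pop(sync_node, None)
        self._reported.discard(sync_node)


# Usage: timestamps are passed in explicitly so the sketch is easy to test.
alerts = []
tracker = CursorOpenFailureTracker(
    at_risk_after_secs=30.0,
    on_at_risk=lambda node, secs: alerts.append((node, secs)),
)
tracker.record_failure("node-b", now=0.0)
tracker.record_failure("node-b", now=35.0)  # streak has lasted 35s, hook fires
```

Firing the hook only once per failure streak keeps the signal usable as a trigger for stack trace collection without flooding the log on every retry.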

