- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Replication
- ALL
The OplogCapMaintainerThread deletes oplog entries by calling reclaimOplog() while holding both the global lock and the RSTL (replication state transition lock). When there is a lot of oplog to reclaim, this call can run for more than 30 seconds. A stepup or stepdown that starts during that time cannot acquire the RSTL, and it crashes.
Possible fixes: make reclaimOplog() interruptible, take the global lock without the RSTL if that is safe, or both.
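To make the "interruptible" option concrete, here is a minimal standalone sketch of the idea. It is not MongoDB server code. The names `reclaimOplogInBatches`, `ReclaimResult`, and `betweenBatches` are hypothetical, a plain `std::mutex` stands in for the RSTL, a `std::deque<int>` stands in for the oplog, and an atomic flag stands in for whatever interruption mechanism the server would use. The sketch deletes in bounded batches, drops the lock after each batch, and checks for an interrupt before taking the lock again, so a waiter is never blocked for longer than one batch.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical result type for this sketch.
struct ReclaimResult {
    std::size_t reclaimed = 0;  // number of entries deleted
    bool interrupted = false;   // true if we stopped because of an interrupt
};

// Trims `oplog` down to at most `cap` entries, oldest first.
// `rstl` is a plain mutex standing in for the real RSTL (assumption).
// `betweenBatches` is a test hook that runs while the lock is released.
ReclaimResult reclaimOplogInBatches(std::deque<int>& oplog,
                                    std::size_t cap,
                                    std::size_t batchSize,
                                    std::mutex& rstl,
                                    const std::atomic<bool>& interruptRequested,
                                    const std::function<void()>& betweenBatches = {}) {
    assert(batchSize > 0);
    ReclaimResult result;
    while (true) {
        // Check for an interrupt before (re)acquiring the lock.
        if (interruptRequested.load()) {
            result.interrupted = true;
            return result;
        }
        {
            // Hold the lock only for one bounded batch.
            std::lock_guard<std::mutex> lk(rstl);
            if (oplog.size() <= cap) {
                return result;  // nothing left to reclaim
            }
            const std::size_t n = std::min(batchSize, oplog.size() - cap);
            oplog.erase(oplog.begin(),
                        oplog.begin() + static_cast<std::ptrdiff_t>(n));
            result.reclaimed += n;
        }
        // Lock is released here; a state transition could acquire it now.
        if (betweenBatches) {
            betweenBatches();
        }
    }
}
```

Whether releasing and reacquiring the locks between batches is safe for the real reclaimOplog() is exactly the open question in this ticket; the sketch only shows the control flow.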
is related to:
- SERVER-104856 Move oplog truncation/sampling/reclaiming code to replication (Backlog)
related to:
- SERVER-104441 Revisit decision to crash on RSTL timeout (Needs Scheduling)