[SERVER-42175] WiredTigerRecordStore::reclaimOplog can block for extended periods of time Created: 11/Jul/19 Updated: 27/Oct/23 Resolved: 03/Feb/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | 3.4.3, 3.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Storage Execution |
| Operating System: | ALL |
| Sprint: | Execution Team 2019-09-09 |
| Description |
|
When truncating a very large number of oplog entries, WiredTigerRecordStore::reclaimOplog may hold its locks (Global IX since 3.6, DB/Collection IX for the oplog in 3.4) for an extended period of time, blocking operations such as stepdown, which may require a global X lock. One potential solution is for reclaimOplog to yield all locks periodically. |
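A minimal sketch of the "yield periodically" idea, not the server's actual code. Everything here is an assumption made for illustration: `ToyOplog`, `truncateInBatches`, and the `batchSize` parameter are hypothetical names, and a plain `std::mutex` stands in for the real lock manager, which has intent modes (IX, X) that this toy does not model. The only point shown is that releasing and re-acquiring the lock between batches gives a waiter, such as a stepdown, a chance to get in.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the oplog: entry ids, oldest first.
using ToyOplog = std::deque<int>;

// Removes up to `toRemove` of the oldest entries, holding `lock` for only one
// batch at a time. Returns how many separate times the lock was acquired.
// Assumes a single truncating thread, as in this toy model.
std::size_t truncateInBatches(std::mutex& lock,
                              ToyOplog& oplog,
                              std::size_t toRemove,
                              std::size_t batchSize) {
    assert(batchSize > 0);
    std::size_t acquisitions = 0;
    while (toRemove > 0 && !oplog.empty()) {
        // Acquire the lock for this batch only.
        std::lock_guard<std::mutex> guard(lock);
        ++acquisitions;

        const std::size_t n = std::min({batchSize, toRemove, oplog.size()});
        oplog.erase(oplog.begin(),
                    oplog.begin() + static_cast<std::ptrdiff_t>(n));
        toRemove -= n;
        // `guard` goes out of scope here: this is the yield point where
        // another thread waiting on `lock` can proceed.
    }
    return acquisitions;
}
```

With ten entries, `toRemove = 7`, and `batchSize = 3`, the lock is taken three times (batches of 3, 3, and 1) instead of once for the whole truncation. The trade-off is more lock traffic in exchange for a bounded hold time per acquisition.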
| Comments |
| Comment by Geert Bosch [ 31/Jan/20 ] |
|
Given that we believe this to be fixed, shall we close this ticket? |
| Comment by Maria van Keulen [ 20/Sep/19 ] |
|
Given the performance improvements to oplog truncation in I am putting this ticket back in Needs Scheduling for it to be triaged as a pre-3.6.4 issue. We will file additional tickets as necessary if there are issues post-3.6.4. |
| Comment by Maria van Keulen [ 16/Sep/19 ] |
|
This issue may be the root cause of |
| Comment by Maria van Keulen [ 13/Sep/19 ] |
|
Putting this ticket on hold in favor of doing |
| Comment by Maria van Keulen [ 06/Sep/19 ] |
|
Per the discussions in geert.bosch and I brainstormed alternate solutions to this issue, one of which is to use a larger total number of oplog stones when the oplog is very large. This way, individual stones will be smaller, so truncating the contents of one stone would not require holding the global IX lock for as long. |
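The arithmetic behind the stone-count idea can be sketched as follows. This assumes, purely for illustration, that the oplog is divided evenly among its stones; `bytesPerStone` is a hypothetical helper, not a server function, and the sizes used below are made-up numbers rather than real defaults.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: approximate bytes covered by one stone when an oplog
// of `oplogMaxBytes` is split evenly across `numStones` stones.
std::int64_t bytesPerStone(std::int64_t oplogMaxBytes, std::int64_t numStones) {
    assert(numStones > 0);
    return oplogMaxBytes / numStones;
}
```

For a fixed oplog size, raising the stone count by a factor of ten shrinks each stone by the same factor, so each truncation pass has roughly a tenth as much data to remove while the lock is held.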