[SERVER-80140] Use the $currentOp to verify that fsyncLockWorker threads are waiting for the lock
Created: 16/Aug/23 | Updated: 29/Oct/23 | Resolved: 01/Sep/23 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0, 4.4.25, 7.0.2, 5.0.22, 6.0.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Nandini Bhartiya | Assignee: | Nandini Bhartiya |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-nyc-subteam1 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.1, v7.0, v6.0, v5.0, v4.4 |
| Sprint: | Sharding NYC 2023-09-18 |
| Participants: | |
| Linked BF Score: | 29 |
| Story Points: | 1 |
| Description |
|
Use the $currentOp output to verify that fsyncLockWorker threads are waiting for the lock (enqueued in the conflict queue) before unpausing the coordinator. |
| Comments |
| Comment by Githook User [ 17/Aug/23 ] | |
|
Author: {'name': 'Nandini Bhartiya', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'} Message: | |
| Comment by Nandini Bhartiya [ 16/Aug/23 ] | |
|
max.hirschhorn@mongodb.com suggested setting idleConnections to false in the $currentOp query. The fsyncLockWorker threads are now visible in the $currentOp output.
| |
| Comment by Nandini Bhartiya [ 16/Aug/23 ] | |
|
We experimented with checking the $currentOp output, but the fsyncLockWorker thread did not show up in it. We also tried adding a fail point in the concurrency code (so that we would know when the fsyncLockWorker threads are in the conflict queue) instead of using a sleep(), but were discouraged from doing so because of possible performance issues (Slack thread). | |
| Comment by Max Hirschhorn [ 16/Aug/23 ] | |
|
Can we instead use $currentOp within an assert.soon() callback to explicitly wait for the fsyncLockWorker thread to have enqueued the lock (for example)? Generally, a fixed-duration sleep isn't a good synchronization tactic. |
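
The polling approach discussed in this ticket can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual fix: the helper names (`waitingFsyncLockWorkers`, `pollUntil`, `fetchOps`) are hypothetical, the snapshots are simulated data, and matching on a `desc` that starts with "fsyncLockWorker" together with `waitingForLock: true` is an assumption about how these threads appear in the $currentOp output. In a real jstest, the operations would come from a $currentOp aggregation against the admin database and the retry loop would be assert.soon() rather than the stand-in below.

```javascript
// Sketch only. In a jstest, `ops` would be fetched with something like
//   adminDB.aggregate([{$currentOp: {allUsers: true, idleConnections: false}}]).toArray()
// and the retry loop would be assert.soon(). Everything below is plain
// JavaScript so that it runs without a mongo shell.

// Returns the $currentOp entries that look like fsyncLockWorker threads
// blocked on a lock. The `desc` prefix and the `waitingForLock` check are
// assumptions about the output shape, not taken from the ticket.
function waitingFsyncLockWorkers(ops) {
    return ops.filter(op => typeof op.desc === "string" &&
                            op.desc.startsWith("fsyncLockWorker") &&
                            op.waitingForLock === true);
}

// Hypothetical synchronous stand-in for assert.soon(): re-evaluates
// `predicate` until it returns true or `maxAttempts` is exhausted, and
// returns the number of attempts that were needed.
function pollUntil(predicate, maxAttempts) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (predicate()) {
            return attempt;
        }
    }
    throw new Error("condition not met after " + maxAttempts + " attempts");
}

// Simulated successive $currentOp snapshots: the worker is absent at first,
// then present but not yet blocked, then waiting for the lock.
const snapshots = [
    [{desc: "conn12", waitingForLock: false}],
    [{desc: "fsyncLockWorker", waitingForLock: false}],
    [{desc: "fsyncLockWorker", waitingForLock: true},
     {desc: "conn12", waitingForLock: false}],
];
let nextSnapshot = 0;
function fetchOps() {
    const ops = snapshots[Math.min(nextSnapshot, snapshots.length - 1)];
    nextSnapshot++;
    return ops;
}

// Only proceed (for example, to unpause the coordinator) once at least one
// worker is observed waiting for the lock.
const attemptsNeeded =
    pollUntil(() => waitingFsyncLockWorkers(fetchOps()).length >= 1, 10);
console.log("worker seen waiting after " + attemptsNeeded + " attempt(s)");
```

Compared with a fixed sleep(), the test proceeds as soon as the condition holds and fails loudly if it never does, instead of silently racing with the worker threads.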