[SERVER-80140] Use the $currentOp to verify that fsyncLockWorker threads are waiting for the lock Created: 16/Aug/23  Updated: 29/Oct/23  Resolved: 01/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0, 4.4.25, 7.0.2, 5.0.22, 6.0.11

Type: Bug Priority: Major - P3
Reporter: Nandini Bhartiya Assignee: Nandini Bhartiya
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0, v5.0, v4.4
Sprint: Sharding NYC 2023-09-18
Participants:
Linked BF Score: 29
Story Points: 1

 Description   

Increase the sleep duration before invoking the fsync (lock:true) command, so that the fsyncLockWorker threads are enqueued in the lock conflict queue before the coordinator is unpaused.

Use the $currentOp output to verify that fsyncLockWorker threads are waiting for the lock (enqueued in the conflict queue) before unpausing the coordinator.
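
A minimal sketch of the check described above, written as a plain function over $currentOp documents so it can be tried outside a cluster. It assumes the worker threads are reported with a `desc` beginning with "fsyncLockWorker" and that a queued thread has `waitingForLock: true`; the function name, the `expectedCount` parameter, and the sample documents are hypothetical and not taken from the actual test.

```javascript
// Sketch only. Assumptions: $currentOp reports the worker under a `desc`
// starting with "fsyncLockWorker", and `waitingForLock` is true while the
// thread sits in the lock conflict queue.
function fsyncLockWorkersWaiting(currentOpDocs, expectedCount = 1) {
    const waiting = currentOpDocs.filter(
        (op) => String(op.desc).startsWith("fsyncLockWorker") &&
                op.waitingForLock === true);
    return waiting.length >= expectedCount;
}

// Hand-written sample documents, not captured server output.
const sampleOps = [
    {desc: "fsyncLockWorker", waitingForLock: true},
    {desc: "conn12", waitingForLock: false},
];
console.log(fsyncLockWorkersWaiting(sampleOps));

// In the mongo shell, the input could come from the query in this ticket:
//   db.aggregate([{$currentOp: {allUsers: true, idleConnections: true}}]).toArray()
```

In a jstest, a predicate like this would presumably be evaluated repeatedly (for instance inside assert.soon()) before unpausing the coordinator, rather than once after a fixed sleep.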



 Comments   
Comment by Githook User [ 17/Aug/23 ]

Author: Nandini Bhartiya <nandini.bhartiya@mongodb.com> (nandinibhartiyaMDB)

Message: SERVER-80140: Use $currentOp to ensure fsyncLockWorker threads are in the conflict queue.
Branch: master
https://github.com/mongodb/mongo/commit/ce447db31ad61f18f45502e728b4ef4fc138b44f

Comment by Nandini Bhartiya [ 16/Aug/23 ]

max.hirschhorn@mongodb.com suggested setting idleConnections to true in the $currentOp query. With that change, the fsyncLockWorker threads are visible in the $currentOp output.

db.aggregate([{$currentOp: {allUsers: true, idleConnections: true}}]).toArray()

Comment by Nandini Bhartiya [ 16/Aug/23 ]

We experimented with checking the $currentOp output, but the fsyncLockWorker thread did not show up in it. We also tried adding a fail point in the concurrency code (to signal when the fsyncLockWorker threads are in the conflict queue) instead of using a sleep(), but were discouraged from doing so because of possible performance issues (Slack thread).

Comment by Max Hirschhorn [ 16/Aug/23 ]

Can we instead use $currentOp within an assert.soon() callback to explicitly wait on the fsyncLockWorker thread having enqueued the lock (for example)? Generally, sleeping for a fixed duration isn't a good synchronization tactic.
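
To illustrate the difference between a fixed sleep and waiting on a condition, here is a rough sketch of a polling loop. `pollUntil` and its options are hypothetical names invented for this example; it is not the implementation of assert.soon(), only the same general idea of re-checking until the condition holds or a limit is reached.

```javascript
// Hypothetical helper: re-check a condition until it holds, instead of
// sleeping once for a guessed duration. `sleepFn` is injectable so the loop
// can run without real delays; in the mongo shell one might pass `sleep`.
function pollUntil(predicate,
                   {maxAttempts = 50, intervalMs = 100, sleepFn = () => {}} = {}) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (predicate()) {
            return attempt;  // number of checks it took
        }
        sleepFn(intervalMs);
    }
    throw new Error(`condition not met after ${maxAttempts} attempts`);
}

// Stand-in condition that becomes true on the third check.
let checks = 0;
const attempts = pollUntil(() => ++checks >= 3);
console.log(attempts);  // prints 3
```

The loop returns as soon as the condition is observed and fails loudly if it never is, whereas a fixed sleep either waits longer than needed or not long enough.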

Generated at Thu Feb 08 06:42:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.