[SERVER-33888] Enabling fsyncLock on the config server primary may cause operations to block behind the Balancer thread Created: 14/Mar/18 Updated: 27/Oct/23 Resolved: 07/Jun/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.7.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Sara Golemon | Assignee: | Marcos José Grillo Ramirez |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Sharding EMEA |
| Sprint: | Sharding EMEA 2023-05-29, Sharding EMEA 2023-06-12 |
| Participants: |
| Description |
|
Per conversation with Kal, I've been running into deadlocks while trying to replace our TLS transport, specifically during the ReplSetTest shutdown sequence. The fsync lock is set, and shortly thereafter the Balancer attempts to start a round: https://github.com/mongodb/mongo/blob/cdb8f2f7ad472416c579c6c13292d3fb361d94cb/src/mongo/db/s/balancer/balancer.cpp#L347
Meanwhile, the ReplSetTest shutdown sequence gets stuck waiting for a read lock while attempting to fetch collStats; the read lock cannot be granted because the Balancer's write lock is still pending: https://github.com/mongodb/mongo/blob/cdb8f2f7ad472416c579c6c13292d3fb361d94cb/src/mongo/shell/replsettest.js#L1633
See also the following stack: https://gist.github.com/sgolemon/f957e2e2f38e14c0d3a0a661991c7a94 |
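The blocking order described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not MongoDB's actual lock manager: the `FairLock` class, the mode names, and the compatibility table are hypothetical and exist only for illustration. It assumes that the fsync lock behaves like a shared (S) holder, that the Balancer's write needs an intent-exclusive (IX) mode, that collStats needs an intent-shared (IS) mode, and that requests are queued in FIFO order so a new request waits behind anything already pending.

```python
# Toy model of a FIFO-fair lock queue. Illustrative only; the class, the mode
# names, and the compatibility table are assumptions, not MongoDB code.

# For each requested mode, the set of held modes it can coexist with.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


class FairLock:
    def __init__(self):
        self.granted = []  # list of (owner, mode) currently holding the lock
        self.pending = []  # FIFO queue of (owner, mode) still waiting

    def _compatible(self, mode):
        # True when the requested mode can coexist with every current holder.
        return all(held in COMPATIBLE[mode] for _, held in self.granted)

    def request(self, owner, mode):
        # Fair policy: a new request waits whenever something is already
        # queued, even if it is compatible with the current holders.
        if not self.pending and self._compatible(mode):
            self.granted.append((owner, mode))
            return "granted"
        self.pending.append((owner, mode))
        return "waiting"

    def release(self, owner):
        self.granted = [(o, m) for o, m in self.granted if o != owner]
        # Grant queued requests in arrival order for as long as the head of
        # the queue is compatible with the current holders.
        while self.pending and self._compatible(self.pending[0][1]):
            self.granted.append(self.pending.pop(0))


lock = FairLock()
results = [
    lock.request("fsyncLock", "S"),   # the fsync lock is taken first
    lock.request("Balancer", "IX"),   # the write conflicts with S and queues
    lock.request("collStats", "IS"),  # compatible with S, but queues behind IX
]
print(results)

lock.release("fsyncLock")  # stands in for fsyncUnlock
print(lock.granted)
```

In this model the collStats read is compatible with the fsync lock holder, yet it still waits because the Balancer's write request is ahead of it in the queue. Once the fsync lock is released, both queued requests are granted in order, so under these assumptions the situation is a stall for as long as the fsync lock is held rather than a permanent deadlock.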
| Comments |
| Comment by Garaudy Etienne [ 31/May/23 ] |
|
nandini.bhartiya@mongodb.com, jack.mulrow@mongodb.com: isn't this what we're going to do for usable backups for sharding on community? lol. cc ratika.gandhi@mongodb.com |
| Comment by Steve Briskin (Inactive) [ 21/Mar/18 ] |
|
alyson.cabral, Backup doesn't use fsyncLock, so there is no impact. Thanks for checking! |
| Comment by Alyson Cabral (Inactive) [ 21/Mar/18 ] |
|
steve.briskin Is this important for backup? |
| Comment by Kaloian Manassiev [ 16/Mar/18 ] |
|
Marking it 3.7 Desired, because it is not a deadlock. |