[SERVER-41613] FSyncLockThread::run() should not hold a lock while calling waitUntilDurable() Created: 10/Jun/19 Updated: 29/Oct/23 Resolved: 27/Sep/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | execution_intern_2019 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Sprint: | Execution Team 2019-09-09, Execution Team 2019-09-23, Execution Team 2019-10-07 | ||||||||
| Participants: | |||||||||
| Description |
|
The GlobalRead lock is acquired in fsync. However, we only call waitUntilDurable() if _allowFsyncFailure is set. _allowFsyncFailure is only set if getTestCommandsEnabled() and the "allowFsyncFailure" flag was set on the cmd request, which was added to make fsync+lock block writes against the server regardless of fsync success. That was added in this commit for the backup work. |
| Comments |
| Comment by Githook User [ 27/Sep/19 ] |
|
Author: {'name': 'Daniel Gottlieb', 'username': 'dgottlieb', 'email': 'daniel.gottlieb@mongodb.com'}Message: . |
| Comment by Dianna Hohensee (Inactive) [ 20/Sep/19 ] |
|
This ticket was spun off of Looking at the code, it seems like WiredTigerSessionCache has a shuttingDown() function wherein we will wait until there are no more callers, just spinning until that is the case before returning. waitUntilDurable() makes sure that WiredTigerSessionCache knows it's running (registers itself) with this check. Most callers of waitUntilDurable() don't hold locks today, I believe: there were only 3 places that |
| Comment by Daniel Gottlieb (Inactive) [ 11/Sep/19 ] |
|
Maybe I'm missing something. If code calls waitUntilDurable without locks, what's protecting the storage engine from being closed? Additionally, there are two other spots in fsync that call WiredTigerSessionCache::waitUntilDurable: here and here. It just happens that those bypass RecoveryUnit::waitUntilDurable. Where RecoveryUnit::waitUntilDurable just forwards to the session cache implementation. |
| Comment by Eric Milkie [ 28/Jun/19 ] |
|
After some investigation, I think it's ok to simply remove the waitUntilDurable() call. The only jstest that is using the allowFsyncFailure flag is only calling fsyncLock to get the lock behavior, not the fsync behavior, and we already called flushAllFiles() in that code path anyway. |
| Comment by Dianna Hohensee (Inactive) [ 14/Jun/19 ] |
|
Done. |
| Comment by Eric Milkie [ 14/Jun/19 ] |
|
We should link here what the code location is that is locking and calling waitUntilDurable(). |