[SERVER-59005] Storage engine clean shutdown can race with startup Created: 02/Aug/21 Updated: 29/Oct/23 Resolved: 27/Aug/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Benety Goh |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Execution Team 2021-09-06 |
| Participants: | |
| Linked BF Score: | 10 |
| Description |
|
In certain circumstances, storage engine startup can race with clean shutdown and lead to the following invariant failure:
The shutdown task, which the signal handler calls to cleanly shut down the storage engine, holds a Global X lock. The initAndListen thread, which initializes the storage engine and registers the TimestampMonitor listener, does not hold this lock. The shutdown path assumes that the storage engine has been completely initialized, which is not guaranteed, so the server can crash if it is shut down cleanly before the storage engine finishes starting up. I'm surprised we don't already hold the Global X lock during storage engine initialization; perhaps we should. An alternative to taking a global lock would be to keep shutdown expeditious and permit this type of race by relaxing the existing invariant. |
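The two options described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: `ToyMonitor`, `ToyServer`, `removeListenerIfPresent()` and the plain `std::mutex` standing in for the Global X lock are all hypothetical names invented for the example. In the real server, startup and shutdown run on different threads; here they are called in a fixed order so that the problematic interleaving (shutdown first) is reproducible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a timestamp listener.
struct Listener {};

// Toy registry of listeners (an assumption, not the real TimestampMonitor).
class ToyMonitor {
public:
    void addListener(Listener* l) { _listeners.push_back(l); }

    // Relaxed removal: reports "not found" to the caller instead of
    // treating a missing listener as an invariant failure.
    bool removeListenerIfPresent(Listener* l) {
        auto it = std::find(_listeners.begin(), _listeners.end(), l);
        if (it == _listeners.end()) {
            return false;
        }
        _listeners.erase(it);
        return true;
    }

    std::size_t size() const { return _listeners.size(); }

private:
    std::vector<Listener*> _listeners;
};

class ToyServer {
public:
    // Option 1: startup takes the same lock that shutdown holds, so the
    // two paths can no longer interleave.
    void startup() {
        std::lock_guard<std::mutex> lk(_globalLock);
        if (_shutdownStarted) {
            return;  // Shutdown won the race; do not register anything.
        }
        _monitor.addListener(&_listener);
        _initialized = true;
    }

    // Option 2: shutdown tolerates a storage engine that never finished
    // initializing. Returns true only if a listener was actually removed.
    bool shutdown() {
        std::lock_guard<std::mutex> lk(_globalLock);
        _shutdownStarted = true;
        if (!_initialized) {
            return false;  // Nothing was registered; exit quickly.
        }
        _initialized = false;
        return _monitor.removeListenerIfPresent(&_listener);
    }

    std::size_t listenerCount() {
        std::lock_guard<std::mutex> lk(_globalLock);
        return _monitor.size();
    }

private:
    std::mutex _globalLock;  // Hypothetical stand-in for the Global X lock.
    ToyMonitor _monitor;
    Listener _listener;
    bool _initialized = false;
    bool _shutdownStarted = false;
};
```

Calling `shutdown()` on a fresh `ToyServer` before `startup()` returns `false` rather than crashing, and a later `startup()` registers nothing. Calling them in the normal order registers one listener and then removes it. The sketch combines both ideas for brevity; the ticket presents them as alternatives.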
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 27/Aug/21 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 27/Aug/21 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 27/Aug/21 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 26/Aug/21 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Benety Goh [ 24/Aug/21 ] |
|
|
| Comment by Benety Goh [ 24/Aug/21 ] |
|
Each server instance registers a single TimestampListener to observe changes in TimestampMonitor::TimestampType::kMinOfCheckpointAndOldest. We register the listener at process startup and remove it at shutdown. This was a new TimestampType constant introduced in |
| Comment by Benety Goh [ 24/Aug/21 ] |
|
The StorageEngineImpl::removeListener() invariant was added in |