[SERVER-41937] Add a try-catch block in TimestampMonitor::startup() or notifyAll() to suppress exceptions Created: 26/Jun/19 Updated: 29/Oct/23 Resolved: 09/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.1, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Xiangyu Yao (Inactive) | Assignee: | Gregory Wlodarek |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.2 | ||||||||
| Sprint: | Execution Team 2019-08-12 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 61 | ||||||||
| Description |
|
On stepdown, methods called by notifyAll() could throw exceptions. However, there is no try-catch block at any level for TimestampMonitor. |
| Comments |
| Comment by Githook User [ 14/Aug/19 ] |
|
Author: {'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'} Message: (cherry picked from commit 181ad8eeaaf0a0c636713699f8e110a3e94af125) |
| Comment by Githook User [ 09/Aug/19 ] |
|
Author: {'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'} Message: |
| Comment by Githook User [ 09/Aug/19 ] |
|
Author: {'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'} Message: Revert " This reverts commit 1893c8771d00d27d6f4d4a3bf1e3193232d6672f. |
| Comment by Githook User [ 08/Aug/19 ] |
|
Author: {'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'} Message: |
| Comment by Dianna Hohensee (Inactive) [ 01/Aug/19 ] |
|
So the failure is an InterruptedAtShutdown error, and as noted in the diagnosis, we think it is thrown from here when we try to acquire a GlobalLock. We'll need a try-catch block that logs a message to the user on error saying that we are giving up; startup will then handle dropping the idents (which I think is true). We'll also want a unit test, which will require exploring GlobalLock and figuring out how to make it throw. I'd also like to verify that the error is coming from there, which can be done by running the test without the fix to make sure it crashes as in the JS test failures. |
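
The approach described in the comment above (catch the interruption inside the notification path, log that we are giving up, and let nothing escape) might look roughly like the following. This is a minimal, self-contained sketch and not the actual server change: `MonitorSketch`, its `addListener()` method, and the local `InterruptedAtShutdown` type are hypothetical stand-ins for illustration only, and stopping at the first failing listener is an assumption based on the "giving up" wording.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the InterruptedAtShutdown error named in the ticket.
struct InterruptedAtShutdown : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Simplified, hypothetical monitor; it does not mirror the real TimestampMonitor.
class MonitorSketch {
public:
    using Listener = std::function<void()>;

    void addListener(Listener listener) {
        _listeners.push_back(std::move(listener));
    }

    // Calls each listener in order. If one is interrupted, logs a message,
    // gives up on the remaining listeners, and returns normally instead of
    // letting the exception propagate. Returns how many listeners completed.
    std::size_t notifyAll() {
        std::size_t completed = 0;
        try {
            for (auto& listener : _listeners) {
                listener();
                ++completed;
            }
        } catch (const InterruptedAtShutdown& ex) {
            // Assumption: later cleanup (e.g. at the next startup) handles
            // whatever the skipped listeners would have done.
            std::cerr << "Monitor interrupted, giving up: " << ex.what() << '\n';
        }
        return completed;
    }

private:
    std::vector<Listener> _listeners;
};

// Usage: the second listener throws, the third is skipped, and nothing
// escapes notifyAll().
inline void exampleUsage() {
    MonitorSketch monitor;
    int ran = 0;
    monitor.addListener([&] { ++ran; });
    monitor.addListener([] { throw InterruptedAtShutdown("interrupted at shutdown"); });
    monitor.addListener([&] { ++ran; });
    assert(monitor.notifyAll() == 1u);
    assert(ran == 1);
}
```

Only the interruption type is caught here, so any other exception would still propagate; whether the real fix should catch more broadly is a separate design decision that this sketch does not settle.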