[SERVER-48068] assert.soon() with hang analyzer enabled inside a try/finally can lead to an invariant failure in ProgramRegistry Created: 08/May/20 Updated: 29/Oct/23 Resolved: 26/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Shell, Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0, 4.4.13 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Richard Samuels (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | tig-hanganalyzer | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Backport Requested: |
v4.4
|
||||||||||||||||||||
| Steps To Reproduce: | The sleep(5000) is to ensure the parallel shell has already exited by the time MongoRunner.runningChildPids() is called.
|
||||||||||||||||||||
| Sprint: | STM 2020-06-29, STM 2020-07-13 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Linked BF Score: | 50 | ||||||||||||||||||||
| Description |
|
The MongoRunner.runningChildPids() function calls through to getRunningMongoChildProcessIds() in shell_utils_launcher.cpp, which in turn calls wait_for_pid() on all processes in ProgramRegistry::_registeredPids. A JavaScript test of the form
(for example) would end up calling wait_for_pid() multiple times. The first call to wait_for_pid() via getRunningMongoChildProcessIds() would unregister the pid from ProgramRegistry::_registeredPids and the second call would trigger this invariant. |
| Comments |
| Comment by Githook User [ 01/Feb/22 ] |
|
Author: {'name': 'Richard Samuels', 'email': 'richard.l.samuels@gmail.com', 'username': 'richardsamuels'}Message: Revert " This reverts commit dc354f0715784aff08d407f095eb9fd65217411f. |
| Comment by Githook User [ 31/Jan/22 ] |
|
Author: {'name': 'Richard Samuels', 'email': 'richard.samuels@mongodb.com', 'username': 'richardsamuels'}Message: (cherry picked from commit 1af88ecaa952995d44e545708633196b2cd2bbb2) |
| Comment by Githook User [ 26/Jun/20 ] |
|
Author: {'name': 'Richard Samuels', 'email': 'richard.samuels@mongodb.com', 'username': 'richardsamuels'}Message: |
| Comment by Max Hirschhorn [ 21/May/20 ] |
|
In the general case, calling wait_for_pid() more than once is problematic because once waitpid() has returned the child process's exit code to the parent process (or the last Handle object has been closed on Windows), it becomes possible for that pid to be reused. In this particular case, we're expecting the mongo shell process to exit shortly after assert.soon() has failed. I think it is acceptable to do as you've proposed. |
| Comment by Brooke Miller [ 18/May/20 ] |
|
Hey Max, Robert and I reviewed this. Does it seem appropriate to call a version of wait_for_pid() that doesn't un-register the pid? |