Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.4.1, 4.7.0
Affects Version/s: None
Component/s: Testing Infrastructure
Labels:
- bkp
- tig-resmoke

Backwards Compatibility: Fully Compatible
Backport Requested: v4.4
Sprint: STM 2020-03-23, STM 2020-04-20, STM 2020-05-04
Story Points: 1
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

resmoke.py ordinarily checks that a test didn't cause the server to crash by calling self.fixture.is_running() after the test finishes. However, the stepdown thread and the job thread are synchronized only through the call to ContinuousStepdown.after_test(), so it isn't safe to check whether the fixture is still running immediately after the test finishes.

# Don't check fixture.is_running() when using the ContinuousStepdown hook, which kills
# and restarts the primary. Even if the fixture is still running as expected, there is a
# race where fixture.is_running() could fail if called after the primary was killed but
# before it was restarted.
self._check_if_fixture_running = not any(
    isinstance(hook, stepdown.ContinuousStepdown) for hook in self.hooks)

Skipping this check causes resmoke.py to continue running the other data consistency checks even when a process in the MongoDB cluster has crashed. This misleads Server engineers into clicking on the "wrong" link in Evergreen for the task failure, and it also has a severe negative impact on our automated log extraction tool by preventing it from finding relevant information. We should ensure that process crashes in test suites using the ContinuousStepdown hook prevent other tests and hooks from running. I suspect having _StepdownThread.pause() check that the fixture is still running as the last thing it does would accomplish this.
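
A rough sketch of the suggestion above, not the actual resmoke.py code: pause() first waits for any in-progress stepdown to finish, and only then checks the fixture, so the check cannot land in the window between the primary being killed and restarted. FakeFixture, ServerCrashedError, StepdownThreadSketch, and the _is_idle_evt attribute are all hypothetical names used for illustration; only fixture.is_running() and the idea of a pause() method come from the description.

```python
import threading


class ServerCrashedError(Exception):
    """Hypothetical error raised when the fixture is found dead after pausing."""


class FakeFixture:
    """Hypothetical stand-in for a resmoke fixture; only is_running() is modeled."""

    def __init__(self):
        self._running = True

    def is_running(self):
        return self._running

    def crash(self):
        # Simulates a server process dying unexpectedly.
        self._running = False


class StepdownThreadSketch:
    """Illustrative stand-in for the stepdown thread (assumed structure)."""

    def __init__(self, fixture):
        self._fixture = fixture
        # Set whenever no kill/restart cycle is in progress (assumed mechanism).
        self._is_idle_evt = threading.Event()
        self._is_idle_evt.set()

    def pause(self):
        # Wait until any in-progress kill/restart of the primary has completed.
        self._is_idle_evt.wait()
        # Last step: the stepdown thread is idle, so every process is expected
        # to be up. A fixture that is not running now indicates a real crash.
        if not self._fixture.is_running():
            raise ServerCrashedError("fixture is not running after stepdown pause")
```

With this shape, an exception raised from pause() (for example, when it is called from ContinuousStepdown.after_test()) could be used to stop the remaining tests and hooks; how the error propagates through the job thread is left open here.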

Assignee: Mikhail Shchatko
Reporter: Max Hirschhorn
Participants: David Bradford, Githook User, Ian Whalen, Max Hirschhorn, Mikhail Shchatko, Robert Guo, Siyuan Zhou
Votes: 0
Watchers: 6

Created: Mar 13 2020 12:07:39 AM UTC
Updated: Oct 29 2023 10:10:50 PM UTC
Resolved: Apr 22 2020 12:46:01 PM UTC
Confidence Status Last Update: 06/Apr/20 2:47 PM
