Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.1.0-rc0
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: DevProd Test Infrastructure
Backwards Compatibility: Fully Compatible
Sprint: 2024-10-29
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Currently, after a test fails, we still attempt a clean shutdown of every server in the cluster under test. This slows turnaround and produces many useless log messages after the failure, which we have to scroll past to find the failure message. The problem is particularly bad for tests that use many servers, such as sharding tests. There is also a risk that the servers were left in a state where they hang during shutdown (e.g., if some failpoints were left active). Instead, once we know we have a definite failure, we should abort the servers as quickly as possible using SIGKILL.

I think this applies both to servers launched by the test itself (e.g., by ShardingTest) and to externally managed servers launched by resmoke.
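
As a rough illustration of the proposal (not the actual resmoke or MongoRunner code), the shutdown path could branch on whether the test has already failed. The sketch below is POSIX-only Python; `stop_servers`, `spawn_dummy_server`, and the `graceful_timeout` parameter are hypothetical names invented for this example, and the "servers" are just sleeping child processes.

```python
import signal
import subprocess
import sys
import time


def stop_servers(procs, test_failed, graceful_timeout=30.0):
    """Stop every process in procs and return their exit codes.

    Hypothetical helper for illustration; not resmoke's or MongoRunner's API.
    """
    for proc in procs:
        if proc.poll() is not None:
            continue  # Already exited; nothing to signal.
        if test_failed:
            # Definite failure: skip clean shutdown and abort immediately.
            proc.kill()  # Sends SIGKILL on POSIX.
        else:
            # Test passed: request a clean shutdown.
            proc.terminate()  # Sends SIGTERM on POSIX.

    exit_codes = []
    deadline = time.monotonic() + graceful_timeout
    for proc in procs:
        try:
            remaining = max(0.0, deadline - time.monotonic())
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # Clean shutdown hung (e.g., a failpoint left active): force it.
            proc.kill()
            proc.wait()
        exit_codes.append(proc.returncode)
    return exit_codes


def spawn_dummy_server():
    """Start a stand-in 'server' that just sleeps (assumption for the demo)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
```

With this shape, a failed test pays only the cost of delivering SIGKILL to each process, regardless of how many servers the cluster has, while a passing test keeps the existing clean-shutdown behavior with a bounded wait.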

is duplicated by

SERVER-39173 Remove unnecessary sleep when MongoRunner kills mongod node with SIGKILL (Closed)

related to

SERVER-122360 Remove unintended wait timeouts in tests (Backlog)
SERVER-45342 Send an abort signal instead of a kill signal when archiving (Closed)

Assignee: Mikhail Shchatko
Reporter: Mathias Stearn
Participants: Githook User, Mathias Stearn, Mikhail Shchatko, Steve McClure
Votes: 0
Watchers: 8

Created: Sep 26 2024 12:40:32 PM UTC
Updated: Dec 21 2024 03:57:14 PM UTC
Resolved: Oct 25 2024 01:16:37 PM UTC