[SERVER-43003] Add an option to archive data files when shell spawned processes fail consistency checks Created: 23/Aug/19  Updated: 06/Dec/22  Resolved: 03/Dec/21

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Robert Guo (Inactive) Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Done Votes: 0
Labels: tig-qwin-eligible
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Server Tooling & Methods
Participants:
Linked BF Score: 120
Story Points: 2

 Description   

The data archival hook currently has an option to archive data when data consistency checks fail, but it only works for resmoke spawned processes. We should have it also pick up shell spawned processes, which do consistency checks automatically during shutdown.
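The gap described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names (`should_archive`, the origin constants); it is not the actual hook implementation:

```python
# Hypothetical sketch of the behavior gap: the archival hook currently only
# reacts to consistency-check failures of processes resmoke itself spawned,
# so shell-spawned processes are missed even though the shell runs the same
# checks automatically at shutdown.

RESMOKE_SPAWNED = "resmoke"
SHELL_SPAWNED = "shell"


def should_archive(process_origin, consistency_check_passed):
    """Return (current behavior, proposed behavior) for a given process."""
    # Current: only resmoke-spawned processes trigger archival on failure.
    current = process_origin == RESMOKE_SPAWNED and not consistency_check_passed
    # Proposed: a failed consistency check triggers archival regardless of
    # which component spawned the process.
    proposed = not consistency_check_passed
    return current, proposed
```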



 Comments   
Comment by Brooke Miller [ 03/Dec/21 ]

We now archive all data files for all mainline tasks.

Comment by Raiden Worley (Inactive) [ 10/Mar/21 ]

One approach, considering that Python's unit-testing framework is built on a boolean pass/fail model, could be to establish a convention where resmoke passes a temporary path in `TestData`; after test execution, resmoke would read that file (if it exists) and construct an object containing metadata from the test run. One of these metadata fields could be a request from the test itself to perform archiving, orthogonal to whether the test passed or failed based on its return code. This approach would be extensible to any test-reported metadata.

We actually already do something almost identical to this in the resmoke end2end tests here.
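The convention above could be sketched roughly like this. All names here (`testMetadataPath`, `TestRunMetadata`, `requestArchival`) are hypothetical, chosen only to illustrate the shape of the idea:

```python
import json
import os
import tempfile


class TestRunMetadata:
    """Metadata reported by a test, independent of its pass/fail return code."""

    def __init__(self, request_archival=False, extra=None):
        self.request_archival = request_archival
        self.extra = extra or {}


def make_metadata_path(test_data):
    """Resmoke would put a fresh temp path into TestData before spawning the
    shell, so the test can write metadata to it."""
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    os.unlink(path)  # the test creates the file only if it has something to report
    test_data["testMetadataPath"] = path
    return path


def read_metadata(path):
    """After the test finishes, resmoke reads the file (if it exists) and
    builds a metadata object; a missing file means nothing was reported."""
    if not os.path.exists(path):
        return TestRunMetadata()
    with open(path) as f:
        raw = json.load(f)
    return TestRunMetadata(
        request_archival=raw.get("requestArchival", False),
        extra={k: v for k, v in raw.items() if k != "requestArchival"},
    )
```

The archival request field is just one example; any other test-reported metadata could travel over the same channel.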

Comment by Robert Guo (Inactive) [ 17/Oct/19 ]

After going through the archival code, this ticket is more complex than I originally thought. At the moment, the archival hook communicates with the shell through the success boolean, and there are established conventions in the Python unit-testing framework for treating the result of every test as either pass or fail. Building out a notion of failure mode for a test case is going to be more than a ticket's worth of work. At a high level, I think the following pieces are necessary.

1. Allow test cases to handle user-configurable error conditions beyond the "Errors" and "Failures" categories prescribed by Python's unit-testing framework. Based on discussions in SERVER-19895, that distinction is already confusing.

2. Generalize report.py to support user-configurable failure modes. With the way resmoke.py is currently integrated with Evergreen, there needs to be a mapping from test-case failures to Evergreen statuses (i.e. resmoke.py return codes).

3. Hardcode a map of error codes for which the archival hook should run. This is a bit ugly, but we don't have a use case for a more generalized solution at the moment.
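Item 3 could look something like the sketch below. The exit-code values and names are assumptions for illustration, not the real shell/server codes:

```python
# Hypothetical exit codes; the real values would come from the shell/server.
EXIT_CLEAN = 0
EXIT_TEST_FAILURE = 1
EXIT_DATA_CONSISTENCY_FAILURE = 70  # assumed code for a failed consistency check

# The hardcoded map from exit codes to whether the archival hook runs.
ARCHIVE_ON_EXIT_CODE = {
    EXIT_CLEAN: False,
    EXIT_TEST_FAILURE: False,
    EXIT_DATA_CONSISTENCY_FAILURE: True,
}


def should_run_archival_hook(exit_code):
    """Run the hook only for codes explicitly mapped to True; unknown
    (unexpected) codes archive defensively so no data is lost."""
    return ARCHIVE_ON_EXIT_CODE.get(exit_code, True)
```

Defaulting unknown codes to `True` is one possible design choice; the opposite default would trade safety for less disk usage on unexpected exits.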

Generated at Thu Feb 08 05:01:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.