[SERVER-43003] Add an option to archive data files when shell spawned processes fail consistency checks Created: 23/Aug/19 Updated: 06/Dec/22 Resolved: 03/Dec/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Robert Guo (Inactive) | Assignee: | Backlog - Server Tooling and Methods (STM) (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | tig-qwin-eligible | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Server Tooling & Methods
|
||||
| Participants: | |||||
| Linked BF Score: | 120 | ||||
| Story Points: | 2 | ||||
| Description |
|
The data archival hook currently has an option to archive data when data consistency checks fail, but it only works for resmoke-spawned processes. It should also pick up shell-spawned processes, which run consistency checks automatically during shutdown. |
| Comments |
| Comment by Brooke Miller [ 03/Dec/21 ] |
|
We now archive all data files for all mainline tasks. |
| Comment by Raiden Worley (Inactive) [ 10/Mar/21 ] |
|
One approach, considering that Python's unit-testing framework is built on a boolean pass/fail model, could be to create a convention for resmoke to pass in a temporary path in `TestData`. After test execution, resmoke would read that file (if it exists) and construct an object containing metadata from the test run. One of these metadata fields could be a request from the test itself to perform archiving, orthogonal to whether the test passed or failed based on return code. This approach would be extensible to any test-reported metadata. We actually already do something almost identical to this in the resmoke end2end tests here. |
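The runner side of that convention could be sketched roughly as follows. This is only an illustration under assumptions: the JSON file format, the `archive_requested` field, and the names `TestRunMetadata`, `read_test_metadata`, and `should_archive` are all hypothetical and are not taken from resmoke.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TestRunMetadata:
    """Hypothetical container for metadata a test reports back to the runner."""

    archive_requested: bool = False
    extra: Dict[str, Any] = field(default_factory=dict)


def read_test_metadata(path: str) -> Optional[TestRunMetadata]:
    """Read the metadata file the test may have written; None if it is absent."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as fh:
        raw = json.load(fh)
    if not isinstance(raw, dict):
        # Assumed convention: the file holds a single JSON object.
        return None
    archive = bool(raw.pop("archive_requested", False))
    return TestRunMetadata(archive_requested=archive, extra=raw)


def should_archive(test_passed: bool, metadata: Optional[TestRunMetadata]) -> bool:
    """Archive on failure, or when the test explicitly asked for it."""
    requested = metadata is not None and metadata.archive_requested
    return (not test_passed) or requested


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        # The runner would hand this path to the test (e.g. via TestData).
        metadata_path = os.path.join(tmp_dir, "test_metadata.json")
        # Stand-in for the test writing its request during execution.
        with open(metadata_path, "w", encoding="utf-8") as fh:
            json.dump({"archive_requested": True}, fh)
        metadata = read_test_metadata(metadata_path)
        print(should_archive(test_passed=True, metadata=metadata))
```

Keeping the archive decision separate from the pass/fail result means the boolean model of the unit-testing framework does not need to change; the metadata file is simply an extra, optional channel.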
| Comment by Robert Guo (Inactive) [ 17/Oct/19 ] |
|
This ticket is more complex than I originally thought after going through the archival code. At the moment, the archival hook communicates with the shell through the success boolean, and there are established conventions in the Python unit-testing framework for treating the result of every test as either pass or fail. Building out a notion of failure mode for a test case is going to be more than a ticket's worth of work. At a high level, I think the following pieces are necessary.
1. Allow test cases to handle user-configurable error conditions, beyond "Errors" and "Failures" that are based on guidelines provided by Python's unit testing framework. Based on discussions in
2. Generalize report.py to support user-configurable failure modes. With the way resmoke.py is currently integrated with Evergreen, there needs to be a mapping of testcase failures to statuses in Evergreen (i.e. resmoke.py return codes).
3. Hardcode a map of error codes for when the archival hook should run. This is a bit ugly but we don't have a use case for a more generalized solution at the moment. |
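As a rough sketch of how the second and third pieces might fit together, the snippet below models failure modes as an enum, maps them to return codes, and hardcodes the set that triggers archival. Every name and number here (`FailureMode`, `RETURN_CODE_BY_MODE`, `ARCHIVE_MODES`, the specific codes) is hypothetical and chosen only for illustration; none of them reflects actual resmoke.py or Evergreen values.

```python
from enum import Enum


class FailureMode(Enum):
    """Hypothetical test outcomes beyond the plain pass/fail boolean."""

    PASS = "pass"
    FAILURE = "failure"
    ERROR = "error"
    CONSISTENCY_CHECK_FAILED = "consistency_check_failed"


# Hypothetical mapping from failure mode to process return code.
# The numbers are placeholders, not real resmoke.py return codes.
RETURN_CODE_BY_MODE = {
    FailureMode.PASS: 0,
    FailureMode.FAILURE: 1,
    FailureMode.ERROR: 2,
    FailureMode.CONSISTENCY_CHECK_FAILED: 3,
}

# Hardcoded set of modes for which the archival hook should run.
ARCHIVE_MODES = frozenset({FailureMode.CONSISTENCY_CHECK_FAILED})


def should_run_archival(mode: FailureMode) -> bool:
    """Return True when data files should be archived for this outcome."""
    return mode in ARCHIVE_MODES


if __name__ == "__main__":
    for mode in FailureMode:
        print(mode.value, RETURN_CODE_BY_MODE[mode], should_run_archival(mode))
```

A hardcoded set is the simplest form of the map described in the third item; it could later be replaced by configuration if a more general use case appears.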