[SERVER-29522] exit code reported by resmoke.py for concurrent fuzzer should be more meaningful Created: 08/Jun/17  Updated: 30/Oct/23  Resolved: 13/Oct/17

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 3.5.8
Fix Version/s: 3.6.0-rc1

Type: Improvement Priority: Minor - P4
Reporter: Kimberly Hou Assignee: Ian Boros
Resolution: Fixed Votes: 0
Labels: tig-resmoke
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-26443 Clean up error reporting in jstestfuz... Closed
Backwards Compatibility: Fully Compatible
Sprint: TIG 2017-09-11, TIG 2017-10-02, TIG 2017-10-23
Participants:

 Description   

When running multi-threaded tests with the JSTestCase class, resmoke.py returns a single exit code for the test case. That exit code is currently the return code of whichever process finishes last. Instead, if any thread fails (not necessarily the last one to finish), the final exit code from resmoke.py should be nonzero, regardless of what the last thread returned.
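
To illustrate the requested behavior, here is a minimal sketch that is not taken from resmoke.py. The helper names run_copies and combined_exit_code are hypothetical, and each "test" is a plain Python callable returning an integer rather than a real mongo shell process.

```python
import threading


def run_copies(commands):
    """Run each callable in its own thread and collect every return code.

    Hypothetical helper: 'commands' stands in for the per-thread test runs.
    """
    return_codes = [None] * len(commands)

    def worker(index, command):
        # Each thread records its own result, so no result overwrites another.
        return_codes[index] = command()

    threads = [
        threading.Thread(target=worker, args=(index, command))
        for index, command in enumerate(commands)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return return_codes


def combined_exit_code(return_codes):
    """Return the first nonzero code (in thread order), or 0 if all passed."""
    for code in return_codes:
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    codes = run_copies([lambda: 0, lambda: 7, lambda: 0])
    print(combined_exit_code(codes))
```

With this approach a failure in any thread yields a nonzero result even when the last thread to finish succeeded.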



 Comments   
Comment by Githook User [ 13/Oct/17 ]

Author:

{'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}

Message: SERVER-29522 better error codes when running concurrent tests in resmoke
Branch: master
https://github.com/mongodb/mongo/commit/a7b2cf5c4b8202eb5344939302fc8ef5f088611f

Comment by Ian Boros [ 11/Oct/17 ]

Oh, wow. Thanks for catching this!

I started a patch build for a fix here:
https://evergreen.mongodb.com/version/59de2cc32a60ed58210023de

The ETA for the test that was timing out is back to ~50 minutes based on the progress it's made so far.

If it works, I'll post another code review.

Comment by Max Hirschhorn [ 11/Oct/17 ]

ian.boros, I've reverted the changes from 1e26264 because the concurrent fuzzer is timing out after the 4-hour exec_timeout_secs time limit. I suspect things are taking longer to run because we're implicitly running _config.NUM_CLIENTS_PER_FIXTURE copies of the data consistency checks, since jsfile.py uses JSTestCase.

https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_jstestfuzz_concurrent_replication_1e2626463b5a7c22484c4556b77da149f4ad1ef9_17_10_10_14_11_44

Comment by Githook User [ 11/Oct/17 ]

Author:

{'email': 'max.hirschhorn@mongodb.com', 'name': 'Max Hirschhorn', 'username': 'visemet'}

Message: Revert "SERVER-29522 better error codes when running concurrent tests in resmoke"

This reverts commit 1e2626463b5a7c22484c4556b77da149f4ad1ef9.
Branch: master
https://github.com/mongodb/mongo/commit/8b3694d704d4c472adba87e8fb0827372324c215

Comment by Githook User [ 10/Oct/17 ]

Author:

{'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}

Message: SERVER-29522 better error codes when running concurrent tests in resmoke
Branch: master
https://github.com/mongodb/mongo/commit/1e2626463b5a7c22484c4556b77da149f4ad1ef9

Comment by Max Hirschhorn [ 05/Sep/17 ]

I think we should use this ticket as an opportunity to define a separate class from JSTestCase (e.g. called MultipleCopyJSTestCase) for running multiple copies of the same JavaScript test in different threads. The MultipleCopyJSTestCase#run_test() method should create multiple JSTestCase instances (still configuring TestData.isMainTest and TestData.numTestClients as happens now) and set its MultiCopyJSTestCase#return_code attribute to the first non-zero value from any of the underlying JSTestCase instances.

I think we could also remove the --numClientsPerFixture command line option in favor of a num_copies parameter to MultiCopyJSTestCase#__init__().
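
A rough sketch of the shape this proposal could take, under stated assumptions: StubJSTestCase is a hypothetical stand-in for a single JSTestCase run (it does not start a shell), the simulated_codes parameter exists only so the example can run, and treating copy 0 as the main test is an assumption rather than confirmed resmoke behavior.

```python
import threading


class StubJSTestCase:
    """Hypothetical stand-in for one JSTestCase run; not the resmoke class."""

    def __init__(self, js_filename, test_data, simulated_code=0):
        self.js_filename = js_filename
        self.test_data = test_data
        self.simulated_code = simulated_code
        self.return_code = None

    def run_test(self):
        # A real test case would run a mongo shell process here.
        self.return_code = self.simulated_code


class MultipleCopyJSTestCase:
    """Runs num_copies copies of one test in threads and aggregates the result."""

    def __init__(self, js_filename, num_copies=1, simulated_codes=None):
        codes = simulated_codes or [0] * num_copies
        self.copies = [
            StubJSTestCase(
                js_filename,
                # Assumption: copy 0 is the "main" test; every copy is told
                # how many clients run in total.
                {"isMainTest": index == 0, "numTestClients": num_copies},
                simulated_code=codes[index],
            )
            for index in range(num_copies)
        ]
        self.return_code = None

    def run_test(self):
        threads = [threading.Thread(target=copy.run_test) for copy in self.copies]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        # First non-zero return code in copy order, else 0.
        self.return_code = next(
            (copy.return_code for copy in self.copies if copy.return_code != 0), 0
        )


if __name__ == "__main__":
    case = MultipleCopyJSTestCase("example.js", num_copies=3, simulated_codes=[0, 5, 0])
    case.run_test()
    print(case.return_code)
```

Passing num_copies to the constructor keeps the copy count local to the test case instead of relying on a global command line option.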
