[SERVER-58200] Asserting clauses do not cause a jstest to fail when it is run through startParallelShell() Created: 01/Jul/21  Updated: 29/Oct/23  Resolved: 31/Aug/21

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: None
Fix Version/s: 5.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: Paolo Polato
Assignee: Richard Samuels (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done before SERVER-59686 Investigate failing parallel shell in... Backlog
has to be done before SERVER-59685 Investigate failing parallel shell in... Closed
has to be done before SERVER-59687 Investigate failing parallel shell in... Closed
has to be done before SERVER-59688 Investigate failing parallel shell in... Closed
Problem/Incident
Related
related to SERVER-58325 Failures in txn_two_phase_commit_coor... Backlog
Backwards Compatibility: Fully Compatible
Sprint: STM 2021-09-06
Participants:
Linked BF Score: 142
Story Points: 2

 Description   

Updated Description:
startParallelShell returns a function that needs to be called to assert the output of the shell. This is unintuitive and error-prone. We should be able to enforce that the returned function is called by storing each spawned shell's pid in a dict, removing it when the returned function is called, and checking that the dict is empty on shell shutdown.

See impl sketch here: https://github.com/mongodb/mongo/commit/bd70fe9b24fbefbb2d7a483d15a5a7d3a326b315
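
Independently of the linked commit, the bookkeeping described above could look roughly like the following. This is a minimal sketch in plain JavaScript, not the shell's actual implementation: startTrackedShell, assertAllShellsAwaited, and the fake pid counter are hypothetical names invented for illustration, and the child process is stood in for by an ordinary function.

```javascript
// Hypothetical stand-in for the shell's bookkeeping; none of these names
// come from the MongoDB code base.
const unawaitedPids = new Map();  // pid -> description of the spawned shell
let nextFakePid = 1;              // fake pids; a real shell would get them from the OS

// Registers a "parallel shell" and returns the callback that must be called.
function startTrackedShell(description, run) {
    const pid = nextFakePid++;
    unawaitedPids.set(pid, description);
    return function awaitShell() {
        unawaitedPids.delete(pid);  // calling the callback clears the entry
        return run();               // stands in for waiting on the child process
    };
}

// Intended to run at shutdown: any remaining entry is a programmer error.
function assertAllShellsAwaited() {
    if (unawaitedPids.size > 0) {
        const leaked = [...unawaitedPids.entries()]
            .map(([pid, desc]) => `${pid} (${desc})`)
            .join(", ");
        throw new Error(`parallel shell callback never called for pid(s): ${leaked}`);
    }
}

// Usage: the first shell is awaited, the second one is forgotten.
const awaitFirst = startTrackedShell("insert workload", () => 0);
startTrackedShell("forgotten workload", () => 0);
awaitFirst();

let shutdownError = null;
try {
    assertAllShellsAwaited();
} catch (e) {
    shutdownError = e;
}
console.log(shutdownError.message);
```

The point of keying on the pid is that the check does not depend on whether the child is still running; it only records whether the test ever collected the result.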

Original Description:
When executing an asserting function through startParallelShell(), the failure will appear in tests.log as an uncaught exception, but the test will still pass.



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 31/Aug/21 ]

Author: Richard Samuels (richardsamuels) <richard.l.samuels@gmail.com>

Message: SERVER-58200 Require calling parallel shell callback for every spawned parallel shell
Branch: master
https://github.com/mongodb/mongo/commit/61efca02fadc86178a0effd8ad6d60a77b265dca

Comment by Robert Guo (Inactive) [ 07/Jul/21 ]

Got it! Thanks for the clarification Max and Paolo.

Comment by Max Hirschhorn [ 07/Jul/21 ]

Just to clarify - yes, there is a problem in the txn_two_phase_commit_coordinator_shutdown_and_restart.js test in that it ignores the return value of the startParallelShell() function. This issue in the txn_two_phase_commit_coordinator_shutdown_and_restart.js test should be fixed separately from SERVER-58200.

I'd like SERVER-58200 to make ignoring the return value of the startParallelShell() function a programmer error that causes the test to fail. We already do something like this for MongoRunner.stopMongod(), ReplSetTest#stopSet(), and ShardingTest#stop() through TestData.failIfUnterminatedProcesses to guarantee that we run the validation and data consistency checks in every test. TestData.failIfUnterminatedProcesses doesn't work as-is for startParallelShell() because the mongo shell will have already exited by the time the main shell process exits.
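
To illustrate the last point, here is a small simulation of why a liveness-based check can miss a forgotten parallel shell. It is a sketch in plain JavaScript under stated assumptions: the FakeProcess class and both check functions are hypothetical and only model the behavior described in this comment; they are not how TestData.failIfUnterminatedProcesses is implemented.

```javascript
// Hypothetical model of a spawned process; not a real shell API.
class FakeProcess {
    constructor(name) {
        this.name = name;
        this.running = true;   // is the process still alive?
        this.awaited = false;  // did the test ever collect its result?
    }
    exit() { this.running = false; }
    markAwaited() { this.awaited = true; }
}

// Liveness-based check: flags processes that are still running at shutdown.
function findUnterminated(processes) {
    return processes.filter(p => p.running).map(p => p.name);
}

// Callback-based check: flags shells whose result was never collected.
function findUnawaited(shells) {
    return shells.filter(p => !p.awaited).map(p => p.name);
}

const mongod = new FakeProcess("mongod");  // the test forgot to stop it
const parallelShell = new FakeProcess("parallel shell");
parallelShell.exit();  // finishes on its own before the main shell exits

const servers = [mongod];
const shells = [parallelShell];

// Only the long-lived server is still running, so only it is reported.
const unterminated = findUnterminated([...servers, ...shells]);
// The parallel shell already exited, but nobody collected its result.
const unawaited = findUnawaited(shells);

console.log(unterminated, unawaited);
```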

Comment by Paolo Polato [ 07/Jul/21 ]

Hi Robert,

Thanks for the explanation.

I was not familiar with the contract defined by startParallelShell() - and based on the implementation of jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js, I had just assumed that it would raise an exception by default when executing asserting code.

If my understanding is correct, the issue actually lies in the jstest - and the outcome of the function run through startParallelShell() should be verified in a way similar to this other example... Am I right?

Comment by Robert Guo (Inactive) [ 06/Jul/21 ]

Hi paolo.polato, would you mind clarifying your expected outcome here? As you had mentioned above, you need to set checkExitSuccess on the return object of startParallelShell to assert the exit code of the child shell process. This would be a change in runCommitThroughMongosInParallelShellExpectTimeOut(), not in the shell code.
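
For readers unfamiliar with that contract, the sketch below models how an exit-code check on the returned function might behave. It is plain JavaScript with a hypothetical makeAwaitFunction helper and a hard-coded exit code standing in for the child shell; the option name checkExitSuccess comes from the comment above, but the default and the exact semantics shown here are assumptions rather than documented shell behavior.

```javascript
// Hypothetical helper: builds the function a caller would use to wait on a child shell.
function makeAwaitFunction(exitCode) {
    return function awaitShell(options = {checkExitSuccess: true}) {
        // Assumption: the exit code is only asserted when checkExitSuccess is set.
        if (options.checkExitSuccess && exitCode !== 0) {
            throw new Error(`parallel shell exited with code ${exitCode}`);
        }
        return exitCode;
    };
}

// A child whose assertion failed is modeled as a non-zero exit code.
const awaitFailedShell = makeAwaitFunction(1);

// Opting out of the check returns the exit code for manual inspection.
const rawExitCode = awaitFailedShell({checkExitSuccess: false});

// With the check enabled, the failure propagates to the caller.
let propagated = false;
try {
    awaitFailedShell({checkExitSuccess: true});
} catch (e) {
    propagated = true;
}

console.log(rawExitCode, propagated);
```

In this model, a test that never calls the returned function never observes the non-zero exit code, which matches the symptom in the original description.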
