[SERVER-59685] Investigate failing parallel shell in rollback index tests Created: 31/Aug/21  Updated: 29/Oct/23  Resolved: 04/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Richard Samuels (Inactive) Assignee: Adi Zaimi
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done after SERVER-58200 Asserting clauses do not cause a jste... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Remove checkExitSuccess from jstests/replsets/libs/rollback_index_builds_test.js:138 and run it.

Sprint: Repl 2021-10-04, Repl 2021-10-18
Participants:

 Description   

SERVER-58200 modified startParallelShell so that whenever a parallel shell is started, the cleanup function returned by startParallelShell must be called for that shell; otherwise the whole test fails.

As part of that ticket, we modified the rollback index tests to call the cleanup function, which checks the exit code of the parallel shell and throws an error if it is not 0.
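For readers unfamiliar with the cleanup function, here is a minimal sketch of the behavior described above. It is not the shell's code: startParallelShellSketch is a hypothetical stand-in that simulates a child shell's exit code instead of spawning a process, and treating an omitted options argument as "check the exit code" is an assumption based on this description.

```javascript
// Hypothetical stand-in for startParallelShell. It does not spawn anything;
// `simulatedExitCode` plays the role of the child shell's exit code.
function startParallelShellSketch(simulatedExitCode) {
    // The returned function plays the role of the cleanup function.
    return function cleanup(options) {
        // The real shell would wait for the child process to exit here.
        const exitCode = simulatedExitCode;
        // Assumption: the check runs unless checkExitSuccess is explicitly false.
        const check = options === undefined || options.checkExitSuccess !== false;
        if (check && exitCode !== 0) {
            throw new Error("[0] != [" + exitCode +
                            "] are not equal : encountered an error in the parallel shell");
        }
        return exitCode;
    };
}

// A parallel shell that exits with 252, as in the assertion reported in this ticket.
let checkedCallThrew = false;
try {
    startParallelShellSketch(252)();
} catch (e) {
    checkedCallThrew = true;
}

// With the check disabled, the exit code is returned to the caller instead.
const toleratedExitCode = startParallelShellSketch(252)({checkExitSuccess: false});
```

With the check disabled, the test can still inspect the returned exit code itself, which is why disabling it preserves the earlier behavior of these tests.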

The following tests began to fail:

	modified:   jstests/replsets/rollback_index_build_and_create.js
	modified:   jstests/replsets/rollback_index_build_start.js
	modified:   jstests/replsets/rollback_index_build_start_abort.js
	modified:   jstests/replsets/rollback_index_build_start_abort_not_create.js
	modified:   jstests/replsets/rollback_waits_for_bgindex_completion.js

with the following assertion:

[js_test:rollback_index_build_and_create] Error: [0] != [252] are not equal : encountered an error in the parallel shell :
[js_test:rollback_index_build_and_create] doassert@src/mongo/shell/assert.js:20:14
[js_test:rollback_index_build_and_create] assert.eq@src/mongo/shell/assert.js:179:9
[js_test:rollback_index_build_and_create] startParallelShell/<@src/mongo/shell/servers_misc.js:182:13
[js_test:rollback_index_build_and_create] runSchedules/</<@jstests/replsets/libs/rollback_index_builds_test.js:138:47
[js_test:rollback_index_build_and_create] runSchedules/<@jstests/replsets/libs/rollback_index_builds_test.js:138:13
[js_test:rollback_index_build_and_create] runSchedules@jstests/replsets/libs/rollback_index_builds_test.js:61:9
[js_test:rollback_index_build_and_create] @jstests/replsets/rollback_index_build_and_create.js:26:1
[js_test:rollback_index_build_and_create] @jstests/replsets/rollback_index_build_and_create.js:4:2
[js_test:rollback_index_build_and_create] failed to load: jstests/replsets/rollback_index_build_and_create.js

The parallel shell is launched at jstests/replsets/libs/rollback_index_builds_test.js:116 (as of commit e502f2d3965ac4147d303e956a582b7c4eef8232). Here is the full stack trace from the point where the parallel shell is launched:

[js_test:rollback_index_build_and_create] printStackTrace@src/mongo/shell/utils.js:138:15
[js_test:rollback_index_build_and_create] startParallelShell@src/mongo/shell/servers_misc.js:114:5
[js_test:rollback_index_build_and_create] startIndexBuild@jstests/noPassthrough/libs/index_build.js:39:16
[js_test:rollback_index_build_and_create] runSchedules/</<@jstests/replsets/libs/rollback_index_builds_test.js:116:42
[js_test:rollback_index_build_and_create] runSchedules/<@jstests/replsets/libs/rollback_index_builds_test.js:76:13
[js_test:rollback_index_build_and_create] runSchedules@jstests/replsets/libs/rollback_index_builds_test.js:61:9
[js_test:rollback_index_build_and_create] @jstests/replsets/rollback_index_build_and_create.js:26:1
[js_test:rollback_index_build_and_create] @jstests/replsets/rollback_index_build_and_create.js:4:2

 

STM has set the checkExitSuccess flag to false on the cleanup function of the parallel shell to prevent the error from causing these tests to go red and to preserve existing semantics. We'd like someone to investigate whether the parallel shell failure is expected (in which case checkExitSuccess should remain false) or unexpected (in which case the test needs to be modified).



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 30/Sep/21 ]

Author:

{'name': 'Adi Zaimi', 'email': 'adizaimi@yahoo.com', 'username': 'adizaimi'}

Message: SERVER-59685 Expect appropriate errors for index build
Branch: master
https://github.com/mongodb/mongo/commit/74d89fadc189076d9de10349e1ed11cf5b66dab9

Comment by Adi Zaimi [ 30/Sep/21 ]

Some of the JS test files do not fail, so the expected error has to be fine-tuned for each test.

Comment by Adi Zaimi [ 29/Sep/21 ]

Error from shell:

[js_test:rollback_index_build_and_create] sh30290| uncaught exception: Error: command did not fail with any of the following codes [ ] {
[js_test:rollback_index_build_and_create] sh30290|      "topologyVersion" : {
[js_test:rollback_index_build_and_create] sh30290|              "processId" : ObjectId("61549ea2985549ff966159e5"),
[js_test:rollback_index_build_and_create] sh30290|              "counter" : NumberLong(12)
[js_test:rollback_index_build_and_create] sh30290|      },
[js_test:rollback_index_build_and_create] sh30290|      "ok" : 0,
[js_test:rollback_index_build_and_create] sh30290|      "errmsg" : "Index build failed: d8c86e00-c323-4808-b202-880d044b8e48: Collection test.coll_0 ( 6564baf6-915a-4a2b-8790-f686d1054213 ) :: caused by :: operation was interrupted",
[js_test:rollback_index_build_and_create] sh30290|      "code" : 11602,
[js_test:rollback_index_build_and_create] sh30290|      "codeName" : "InterruptedDueToReplStateChange",
[js_test:rollback_index_build_and_create] sh30290|      "$clusterTime" : {
[js_test:rollback_index_build_and_create] sh30290|              "clusterTime" : Timestamp(1632935619, 4),
[js_test:rollback_index_build_and_create] sh30290|              "signature" : {
[js_test:rollback_index_build_and_create] sh30290|                      "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:rollback_index_build_and_create] sh30290|                      "keyId" : NumberLong(0)
[js_test:rollback_index_build_and_create] sh30290|              }
[js_test:rollback_index_build_and_create] sh30290|      },
[js_test:rollback_index_build_and_create] sh30290|      "operationTime" : Timestamp(1632935619, 4)
[js_test:rollback_index_build_and_create] sh30290| } :
[js_test:rollback_index_build_and_create] sh30290| _getErrorWithCode@src/mongo/shell/utils.js:24:13
[js_test:rollback_index_build_and_create] sh30290| doassert@src/mongo/shell/assert.js:18:14
[js_test:rollback_index_build_and_create] sh30290| _assertCommandFailed@src/mongo/shell/assert.js:805:21
[js_test:rollback_index_build_and_create] sh30290| assert.commandFailedWithCode@src/mongo/shell/assert.js:851:16
[js_test:rollback_index_build_and_create] sh30290| commandWorkedOrFailedWithCode@src/mongo/shell/assert.js:822:20
[js_test:rollback_index_build_and_create] sh30290| @(shell eval):62:17

This is the same failure as SERVER-59687, where the fix was for jstests/replsets/rollback_waits_for_bgindex_completion.js to expect the 'InterruptedDueToReplStateChange' error when launching the parallel shell.
I believe we should expect the same error in the tests covered by this ticket.
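To illustrate why the shell log above fails with an empty list of codes, and how listing the expected code changes the outcome, here is a rough sketch. workedOrFailedWithCode is a hypothetical helper written for this example, not the shell's assert implementation; the code 11602 and its name are taken from the log above.

```javascript
// Code reported in the log above for InterruptedDueToReplStateChange.
const INTERRUPTED_DUE_TO_REPL_STATE_CHANGE = 11602;

// Hypothetical helper: accepts a reply that succeeded, or one that failed
// with one of the expected codes, and throws for any other failure.
function workedOrFailedWithCode(res, expectedCodes) {
    if (res.ok === 1) {
        return res;
    }
    if (expectedCodes.includes(res.code)) {
        return res;
    }
    throw new Error("command did not fail with any of the following codes " +
                    JSON.stringify(expectedCodes) + " " + JSON.stringify(res));
}

// Trimmed-down version of the reply shown in the log above.
const interruptedReply = {
    ok: 0,
    code: INTERRUPTED_DUE_TO_REPL_STATE_CHANGE,
    codeName: "InterruptedDueToReplStateChange",
};

// With no expected codes, the interrupted index build is treated as a failure.
let emptyListThrew = false;
try {
    workedOrFailedWithCode(interruptedReply, []);
} catch (e) {
    emptyListThrew = true;
}

// Listing the code makes the interruption an accepted outcome.
const accepted = workedOrFailedWithCode(interruptedReply,
                                        [INTERRUPTED_DUE_TO_REPL_STATE_CHANGE]);
```

Because some of the tests do not hit the interruption at all, a check of this shape accepts both success and the expected failure, while still rejecting any other error code.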
