[SERVER-45986] index_shutdown_cmd_primary.js runs shutdown command without waiting for IndexBuildsCoordinator to start index build Created: 05/Feb/20  Updated: 29/Oct/23  Resolved: 12/Feb/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Bug Priority: Major - P3
Reporter: Benety Goh Assignee: Benety Goh
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-02-10, Execution Team 2020-02-24
Participants:
Linked BF Score: 21

 Description   

The index_shutdown_cmd_primary.js test is supposed to wait for the createIndexes command to start the index build in the IndexBuildsCoordinator before invoking the shutdown command. However, this test uses the IndexBuildTest.waitForIndexBuildToStart() test helper which may return early if it observes the createIndexes client thread in the db.currentOp() results. This may cause the test to invoke the shutdown command before the createIndexes command has a chance to hand over control of the index build to the IndexBulidsCoordinator. In this scenario, the shutdown command logic does not detect any active builds in the IndexBuildsCoordinator and proceeds to step down the node, leading to the test failure observed below.

In a production setting, this shutdown command behavior should be acceptable because it would have interrupted the two-phased index build before it has a chance to replicate the startIndexBuild command to the secondaries.

[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.720+0000 [jsTest] ----
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.720+0000 [jsTest] found in progress index build: \{ "key" : { "a" : 1 }, "name" : "a_1" } on namespace test.$cmd opi
d: 241
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.720+0000 [jsTest] ----
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.720+0000 
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.720+0000 
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.733+0000 d20770| 2020-01-25T15:34:28.734+0000 I REPL [RstlKillOpThread] Starting to kill user operations
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.733+0000 d20770| 2020-01-25T15:34:28.734+0000 I REPL [RstlKillOpThread] Stopped killing user operations
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.733+0000 d20770| 2020-01-25T15:34:28.734+0000 I REPL [RstlKillOpThread] State transition ops metrics: { lastSt
ateTransition: "stepDown", userOpsKilled: 1, userOpsRunning: 1 }
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| assert: command failed: {
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| "topologyVersion" : {
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| "processId" : ObjectId("5e2c5fff805bdae6fea48f93"),
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| "counter" : NumberLong(7)
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| },
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.736+0000 sh5080| "operationTime" : Timestamp(1579966468, 1),
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "ok" : 0,
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "errmsg" : "operation was interrupted",
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "code" : 11602,
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "codeName" : "InterruptedDueToReplStateChange",
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "$clusterTime" : {
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "clusterTime" : Timestamp(1579966468, 1),
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "signature" : {
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| "keyId" : NumberLong(0)
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| }
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| }
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.737+0000 sh5080| }
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.738+0000 sh5080| _getErrorWithCode@src/mongo/shell/utils.js:25:13
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.738+0000 sh5080| doassert@src/mongo/shell/assert.js:18:14
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.738+0000 sh5080| _assertCommandWorked@src/mongo/shell/assert.js:612:17
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.738+0000 sh5080| assert.commandWorked@src/mongo/shell/assert.js:702:16
[js_test:index_shutdown_cmd_primary] 2020-01-25T15:34:28.738+0000 sh5080| @(shell eval):40:57



 Comments   
Comment by Githook User [ 12/Feb/20 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-45986 index_shutdown_cmd_primary.js waits for IndexBuildsCoordinator before invoking shutdown command
Branch: master
https://github.com/mongodb/mongo/commit/ecd815e6d3d09bfffbbc4bf0c8a57c8edb54a0a7

Generated at Thu Feb 08 05:10:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.