[SERVER-34597] shardedcluster.py does not wait correctly on shards initialization Created: 20/Apr/18  Updated: 29/Oct/23  Resolved: 15/Mar/21

Status: Closed
Project: Core Server
Component/s: Sharding, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 5.0.0, 4.4.11, 4.4.10

Type: Bug Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Kshitij Gupta
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-32572 Run causally consistent resmoke suite... Closed
Related
is related to SERVER-32927 Assert sharded commands can be accept... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.9, v4.4
Sprint: Sharding 2020-09-07, Sharding 2021-01-11, Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08, Sharding 2021-03-22
Participants:
Linked BF Score: 129

 Description   

This test

var t = db.count_test;
t.drop();
assert.eq(0, t.find().count());

fails when run as

resmoke.py --suites causally_consistent_jscore_passthrough test.js

and the suite's fixture is configured with

shard_options:
      voting_secondaries: false

This happens because the fixture code at https://github.com/mongodb/mongo/blob/r3.7.5/buildscripts/resmokelib/testing/fixtures/replicaset.py#L214 does not wait for sharding initialization to complete.
Currently it only waits for the isMaster command to return. Instead, it should wait for the getShardVersion command to return.
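
A minimal sketch of the proposed wait, assuming getShardVersion keeps failing with ShardingStateNotInitialized (code 203, as in the log below) until the shard has processed its shardIdentity document. The function name, the run_command callable, the timeout and polling values, and the test.count_test namespace are illustrative assumptions, not the actual resmoke.py code.

```python
import time

# Error code taken from the log in this ticket (ShardingStateNotInitialized).
SHARDING_STATE_NOT_INITIALIZED = 203


def wait_for_sharding_initialization(run_command, namespace="test.count_test",
                                     timeout_secs=60.0, interval_secs=0.1,
                                     clock=time.monotonic, sleep=time.sleep):
    """Poll getShardVersion on one shard node until it succeeds.

    run_command is a hypothetical callable(command_name, value) that returns
    the server reply as a dict and does not raise on {"ok": 0} replies.
    """
    deadline = clock() + timeout_secs
    while True:
        reply = run_command("getShardVersion", namespace)
        if reply.get("ok"):
            # Sharding state is initialized; the node can accept sharded commands.
            return reply
        if reply.get("code") != SHARDING_STATE_NOT_INITIALIZED:
            # Any other failure is unexpected, so surface it instead of retrying.
            raise RuntimeError("getShardVersion failed: %r" % (reply,))
        if clock() >= deadline:
            raise TimeoutError("sharding state was not initialized within "
                               "%s seconds" % timeout_secs)
        sleep(interval_secs)
```

With PyMongo, run_command could be something like `lambda name, value: client.admin.command(name, value, check=False)`, so that a failed reply is returned rather than raised. The fixture would call this once per shard node after the existing isMaster wait.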

The error usually manifests as the following output in a log:

[js_test:test] 2018-04-20T14:53:45.437-0400 2018-04-20T14:53:45.436-0400 E QUERY    [js] Error: count failed: {
[js_test:test] 2018-04-20T14:53:45.437-0400 	"shards" : {
[js_test:test] 2018-04-20T14:53:45.437-0400 
[js_test:test] 2018-04-20T14:53:45.437-0400 	},
[js_test:test] 2018-04-20T14:53:45.437-0400 	"ok" : 0,
[ShardedClusterFixture:job0:shard0:secondary] 2018-04-20T14:53:45.435-0400 I COMMAND  [conn16] command test.$cmd appName: "MongoDB Shell" command: count { count: "count11", query: {}, readConcern: { afterClusterTime: Timestamp(1524250425, 13) }, shardVersion: [ Timestamp(0, 0), ObjectId('000000000000000000000000') ], databaseVersion: { uuid: UUID("ca2ebfda-a474-45a6-be00-e34b1fd3b8d3"), lastMod: 1 }, allowImplicitCollectionCreation: false, $readPreference: { mode: "secondary" }, $clusterTime: { clusterTime: Timestamp(1524250425, 13), signature: { hash: BinData(0, 2F88278346C7769F7FDC385B699AF171BA769BF4), keyId: 6546605683339427852 } }, $client: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "0.0.0" }, os: { type: "Linux", name: "LinuxMint", architecture: "x86_64", version: "18.1" }, mongos: { host: "greyparrot:20003", client: "127.0.0.1:40580", version: "0.0.0" } }, $configServerState: { opTime: { ts: Timestamp(1524250425, 13), t: 1 } }, $db: "test" } numYields:0 ok:0 errMsg:"Cannot accept sharding commands if sharding state has not been initialized with a shardIdentity document" errName:ShardingStateNotInitialized errCode:203 reslen:332 locks:{} protocol:op_msg 0ms
[js_test:test] 2018-04-20T14:53:45.437-0400 	"errmsg" : "failed on: shard-rs0 :: caused by :: Cannot accept sharding commands if sharding state has not been initialized with a shardIdentity document",
[js_test:test] 2018-04-20T14:53:45.437-0400 	"code" : 203,
[js_test:test] 2018-04-20T14:53:45.438-0400 	"codeName" : "ShardingStateNotInitialized",
[js_test:test] 2018-04-20T14:53:45.438-0400 	"$clusterTime" : {
[js_test:test] 2018-04-20T14:53:45.438-0400 		"clusterTime" : Timestamp(1524250425, 13),
[ShardedClusterFixture:job0:mongos] 2018-04-20T14:53:45.438-0400 I NETWORK  [conn13] end connection 127.0.0.1:40580 (0 connections now open)
[js_test:test] 2018-04-20T14:53:45.438-0400 		"signature" : {
[js_test:test] 2018-04-20T14:53:45.438-0400 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:test] 2018-04-20T14:53:45.438-0400 			"keyId" : NumberLong(0)
[js_test:test] 2018-04-20T14:53:45.438-0400 		}
[js_test:test] 2018-04-20T14:53:45.439-0400 	},
[js_test:test] 2018-04-20T14:53:45.439-0400 	"operationTime" : Timestamp(1524250420, 3)
[js_test:test] 2018-04-20T14:53:45.439-0400 } :
[js_test:test] 2018-04-20T14:53:45.439-0400 _getErrorWithCode@src/mongo/shell/utils.js:25:13
[js_test:test] 2018-04-20T14:53:45.439-0400 DBQuery.prototype.count@src/mongo/shell/query.js:375:11
[js_test:test] 2018-04-20T14:53:45.439-0400 @test.js:10:14
[js_test:test] 2018-04-20T14:53:45.439-0400 failed to load: test.js



 Comments   
Comment by Githook User [ 08/Oct/21 ]

Author:

{'name': 'Kshitij Gupta', 'email': 'kshitij.gupta@mongodb.com', 'username': 'kshitijng'}

Message: SERVER-34597 Wait for sharding initialization in ShardingTest

(cherry picked from commit 1f84ef5e4ca8b5be22d038ea2a6cc3e5e6863194)
Branch: v4.4
https://github.com/mongodb/mongo/commit/ed680e1843fb4fd2e1b1249dc6c0d897ac66ca9a

Comment by Githook User [ 10/Mar/21 ]

Author:

{'name': 'Kshitij Gupta', 'email': 'kshitij.gupta@mongodb.com', 'username': 'kshitijng'}

Message: SERVER-34597 Wait for sharding initialization in ShardingTest
Branch: master
https://github.com/mongodb/mongo/commit/1f84ef5e4ca8b5be22d038ea2a6cc3e5e6863194

Comment by Lamont Nelson [ 13/Nov/20 ]

Code review: https://mongodbcr.appspot.com/726130001

Comment by Misha Tyulenev [ 24/Apr/20 ]

The issue has reoccurred in BF-16730.

Comment by Lamont Nelson [ 12/Dec/19 ]

The error was not duplicated with the latest code.

Comment by Lamont Nelson [ 12/Dec/19 ]

I tested this with the latest code, and the error does not occur.

Comment by Max Hirschhorn [ 27/Apr/18 ]

kaloian.manassiev, to put it another way, there should be a single code review that simultaneously updates the mongo shell's ShardingTest and resmoke.py's ShardedClusterFixture to use the "getShardVersion" command (or whatever the mechanism is going to be to wait for the sharding state to have been initialized).

Comment by Kaloian Manassiev [ 27/Apr/18 ]

Assigning to Esha to figure out with Max whether we should copy the changes that are made to the fixture, or whether TIG is waiting for us to make changes that they would then emulate.

Comment by Max Hirschhorn [ 20/Apr/18 ]

We should make the equivalent changes to the mongo shell's ShardingTest if we are going to update resmoke.py's ShardedClusterFixture.
