[SERVER-27077] Do not attempt to reload ShardRegistry on CSRS until after replset is initialized
Created: 16/Nov/16  Updated: 06/Dec/22  Resolved: 21/Oct/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.14, 3.6.23, 4.0.27, 5.0.3, 4.4.9, 4.2.17
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Esha Maharishi (Inactive)
Assignee: [DO NOT USE] Backlog - Sharding EMEA
Resolution: Won't Fix
Votes: 0
Labels: ShardingRoughEdges
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Sharding EMEA
Sprint: Sharding 2017-01-02
Participants:

 Description   

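On a CSRS node whose replica set has not been initiated, the periodic ShardRegistry reload fails and logs the failure on every retry, once every 30 seconds per node (see the example output below).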

When reloading the shards through ShardLocal, RecoveryUnit::setReadFromMajorityCommittedSnapshot() returns ErrorCodes::ReadConcernMajorityNotAvailableYet:
https://github.com/mongodb/mongo/blob/r3.4.0-rc3/src/mongo/s/client/shard_local.cpp#L162

This happens even after replSetInitiate has been called, as long as it has not yet completed, for example as in BF-3957.
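
The behaviour requested in the title can be illustrated with a small standalone model. This is only a sketch in plain JavaScript, not server code: ModelConfigServer, readShardsMajority, reloadOnce and the guard option are hypothetical names invented for illustration, and the real reload path is C++ inside mongod. The error code 134 and the message text are taken from the example output in this ticket.

```javascript
// Hypothetical model of the reload path; none of these names are MongoDB APIs.
const READ_CONCERN_MAJORITY_NOT_AVAILABLE_YET = 134; // code shown in the ticket's log output

class ModelConfigServer {
    constructor() {
        this.replSetInitialized = false; // assumption: flips to true once replSetInitiate completes
        this.shards = [];
    }

    // Mimics a majority read that fails while no majority-committed snapshot exists.
    readShardsMajority() {
        if (!this.replSetInitialized) {
            const err = new Error("Read concern majority reads are currently not possible.");
            err.code = READ_CONCERN_MAJORITY_NOT_AVAILABLE_YET;
            throw err;
        }
        return this.shards.slice();
    }
}

// Runs one reload attempt and returns "skipped", "reloaded" or "failed".
// With guard enabled, the attempt is skipped quietly until the set is initialized.
function reloadOnce(server, log, {guard}) {
    if (guard && !server.replSetInitialized) {
        return "skipped";
    }
    try {
        server.readShardsMajority();
        return "reloaded";
    } catch (err) {
        log.push(`Periodic reload of shard registry failed :: caused by :: ${err.code} ${err.message}`);
        return "failed";
    }
}
```

In this model, an unguarded attempt against an uninitialized server appends one failure line to the log, while a guarded attempt returns "skipped" and logs nothing. How the server would actually learn that initiation has completed is outside the scope of the sketch.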

Steps to repro:

Run:

var configRS = new ReplSetTest({nodes: 3});
configRS.startSet({configsvr: '', storageEngine: 'wiredTiger'});

Example output:

ReplSetTest starting set
...
2016-11-16T16:34:07.529-0500 I -        [main] shell: started program (sh21675):  /home/eshamaharishi/code/mongo/mongod --oplogSize 40 --port 20000 --noprealloc --smallfiles --replSet testReplSet --dbpath /data/db/testReplSet-0 --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncAttempts=1 --setParameter numInitialSyncConnectAttempts=60 --setParameter enableTestCommands=1 --setParameter logComponentVerbosity={tracking:1}
...
2016-11-16T16:34:10.805-0500 I -        [main] shell: started program (sh21753):  /home/eshamaharishi/code/mongo/mongod --oplogSize 40 --port 20001 --noprealloc --smallfiles --replSet testReplSet --dbpath /data/db/testReplSet-1 --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncAttempts=1 --setParameter numInitialSyncConnectAttempts=60 --setParameter enableTestCommands=1 --setParameter logComponentVerbosity={tracking:1}
...
2016-11-16T16:34:13.452-0500 I -        [main] shell: started program (sh21829):  /home/eshamaharishi/code/mongo/mongod --oplogSize 40 --port 20002 --noprealloc --smallfiles --replSet testReplSet --dbpath /data/db/testReplSet-2 --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncAttempts=1 --setParameter numInitialSyncConnectAttempts=60 --setParameter enableTestCommands=1 --setParameter logComponentVerbosity={tracking:1}
...
c20000| 2016-11-16T16:34:40.493-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
c20001| 2016-11-16T16:34:43.214-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
c20002| 2016-11-16T16:34:46.090-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
c20000| 2016-11-16T16:35:10.494-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
c20001| 2016-11-16T16:35:13.215-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
c20002| 2016-11-16T16:35:16.090-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed  :: caused by :: 134 could not get updated shard list from config server due to Read concern majority reads are currently not possible.; will retry after 30s
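
To check how often each node repeats the message, the log lines above can be grouped by node prefix and the gaps between consecutive timestamps computed. The following is a rough sketch in plain JavaScript (for example under Node.js); FAILURE_RE, toIsoOffset and retryIntervalsByNode are hypothetical helper names, and the regular expression assumes the exact line layout shown in this ticket.

```javascript
// Matches lines such as:
// c20000| 2016-11-16T16:34:40.493-0500 I SHARDING [shard registry reload] Periodic reload of shard registry failed ...
const FAILURE_RE =
    /^(\w+)\|\s+(\S+)\s+I\s+SHARDING\s+\[shard registry reload\] Periodic reload of shard registry failed/;

// Converts a "-0500" style UTC offset to "-05:00" so that Date.parse accepts the timestamp.
function toIsoOffset(timestamp) {
    return timestamp.replace(/([+-]\d{2})(\d{2})$/, "$1:$2");
}

// Returns an object mapping each node prefix to the gaps, in seconds,
// between its consecutive reload failure messages.
function retryIntervalsByNode(lines) {
    const timesByNode = new Map();
    for (const line of lines) {
        const match = FAILURE_RE.exec(line);
        if (!match) {
            continue; // not a reload failure line
        }
        const [, node, timestamp] = match;
        if (!timesByNode.has(node)) {
            timesByNode.set(node, []);
        }
        timesByNode.get(node).push(Date.parse(toIsoOffset(timestamp)));
    }
    const intervals = {};
    for (const [node, times] of timesByNode) {
        intervals[node] = times.slice(1).map((t, i) => (t - times[i]) / 1000);
    }
    return intervals;
}
```

Applied to the six failure lines above, each of c20000, c20001 and c20002 yields a single gap of roughly 30 seconds, which matches the "will retry after 30s" text in the message.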



 Comments   
Comment by Kaloian Manassiev [ 21/Oct/21 ]

The impact of this ticket is very small and the fix is not trivial, so we are closing it.

Comment by Kaloian Manassiev [ 12/Oct/21 ]

From looking at one of the latest evergreen runs, it looks like the logging is still happening. Putting this ticket back in Needs Triage, so we can decide whether it is worth investing the time to improve the behaviour.

Generated at Thu Feb 08 04:14:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.