[SERVER-43093] Concurrent calls to ShardingReplicaSetChangeListener::onConfirmedSet can cause starvation in the fixed executor Created: 30/Aug/19  Updated: 29/Oct/23  Resolved: 25/Sep/19

Status: Closed
Project: Core Server
Component/s: Upgrade/Downgrade
Affects Version/s: 4.2.0
Fix Version/s: 4.2.1, 4.3.1

Type: Bug Priority: Critical - P2
Reporter: Rui Ribeiro Assignee: Randolph Tan
Resolution: Fixed Votes: 2
Labels: KP42
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongo_shell.txt     Text File mongod_config_cmodb803.log     Zip Archive mongod_config_cmodb803.zip     Text File mongos.log     File mongos.log.2019-08-30T11-20-05     Text File mongoshell.txt    
Issue Links:
Backports
Duplicate
is duplicated by SERVER-43094 Problem to set up a Cluster with Mong... Closed
Related
related to SERVER-43581 Create sharding passthrough with fixe... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Participants:

 Description   

The issue is that the callback runs in the fixed executor and then also makes a blocking call inside the fixed executor here. So if the change callbacks for different replica sets happen at roughly the same time, all of them block waiting for a worker thread to become available (and none ever does, since every worker is waiting for the same thing).
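For illustration, here is a minimal, self-contained C++ sketch of this failure mode (an illustrative toy only, not MongoDB's actual executor or ThreadPool code; the 4-thread pool size is taken from the Sharding-Fixed log lines in the comments below): each callback scheduled on a bounded pool blocks on a sub-task that can only run on the same pool, so once every worker is occupied by a callback, nothing can ever make progress.

// Toy fixed-size thread pool; NOT MongoDB's ThreadPool, just a sketch of the
// starvation pattern described above.
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <future>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class FixedPool {
public:
    explicit FixedPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            _workers.emplace_back([this] { run(); });
    }
    ~FixedPool() {
        {
            std::lock_guard<std::mutex> lk(_mu);
            _done = true;
        }
        _cv.notify_all();
        for (auto& w : _workers)
            w.join();
    }
    void schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(_mu);
            _queue.push(std::move(task));
        }
        _cv.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(_mu);
                _cv.wait(lk, [this] { return _done || !_queue.empty(); });
                if (_done && _queue.empty())
                    return;
                task = std::move(_queue.front());
                _queue.pop();
            }
            task();  // a blocking task ties up this worker until it returns
        }
    }

    std::mutex _mu;
    std::condition_variable _cv;
    std::queue<std::function<void()>> _queue;
    std::vector<std::thread> _workers;
    bool _done = false;
};

int main() {
    FixedPool pool(4);  // analogous to the 4-thread Sharding-Fixed pool
    std::atomic<int> finished{0};

    // Each "replica set changed" callback blocks on work it schedules on the SAME pool.
    for (int rs = 0; rs < 4; ++rs) {
        pool.schedule([&pool, &finished] {
            std::promise<void> inner;
            auto innerDone = inner.get_future();
            pool.schedule([&inner] { inner.set_value(); });  // needs a free worker...
            innerDone.wait();                                // ...but every worker is blocked right here
            ++finished;
        });
    }

    std::this_thread::sleep_for(std::chrono::seconds(2));
    // Prints 0: all four workers wait on each other forever. With 3 or fewer
    // callbacks, everything completes, loosely mirroring the "fine with few
    // shards, hangs with more" symptom reported below.
    std::cout << "finished callbacks: " << finished << '\n';
    std::_Exit(0);  // the workers are permanently stuck, so skip joining them in ~FixedPool
}

This is also why the change recorded in the commits further down ("Temporarily change back fixed executor to have unlimited threads") removes the hang: with no cap on worker threads, the blocking sub-task can always get a thread.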

Original description:

Hi

I set up the cluster with MongoDB 4.0 without problems, but I am struggling to set up a cluster with MongoDB 4.2.

As soon as I add more than 3 shards to the cluster (I would like to have up to 24), I am not able to run a mongos anymore. It looks fine until the first restart of the mongos process. When I remove shards and keep only 3 of them, everything is fine again. It does not matter which shards I keep; it seems to be related to the number of shards being greater than 3.

I tried a lot of things, removed auth, removed encryption, removed compression: no luck.

I cannot believe that this is a bug in 4.2.0, because even at the highest log level with exception tracing I cannot see any errors; the mongos just does not start up to the point where it should start listening on all interfaces (with net.bindIpAll true) or on localhost (when started with no special interface binding).

Can you give me some help with this problem I am facing?

Thank you

 



 Comments   
Comment by Cezary Bartosiak [ 02/Oct/19 ]

Thank you so much for the fix! May I ask when you expect version 4.2.1 to be released?

Comment by Githook User [ 26/Sep/19 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@mongodb.com'}

Message: SERVER-43093 Temporarily change back fixed executor to have unlimited threads

(cherry picked from commit a944732ce9a31d68b54c9617c307dd868e3343ec)
Branch: v4.2
https://github.com/mongodb/mongo/commit/aac9c1975c93db073e1cf397c7e9f4d370bf4735

Comment by Githook User [ 25/Sep/19 ]

Author:

{'email': 'randolph@mongodb.com', 'name': 'Randolph Tan'}

Message: SERVER-43093 Temporarily change back fixed executor to have unlimited threads
Branch: master
https://github.com/mongodb/mongo/commit/a944732ce9a31d68b54c9617c307dd868e3343ec

Comment by Randolph Tan [ 23/Sep/19 ]

esha.maharishi was right. The ReplicaSetMonitor's first scan was causing the thread pools to get "starved". What was happening was that the initial scan can cause the replica set change callbacks for the config server and the shards to fire at roughly the same time. The issue is that the callback runs in the fixed executor and then also makes a blocking call inside the fixed executor here. So if the change callbacks for different replica sets happen at roughly the same time, all of them block waiting for a worker thread to become available (and none ever does, since every worker is waiting for the same thing). That is why this happens more often when a mongos is restarted.

Comment by Esha Maharishi (Inactive) [ 20/Sep/19 ]

Reassigning to renctan.

One thing I noticed was that the mongos starts only 4 threads in the fixed executor's thread pool:

$ egrep -n "starting thread in pool Sharding-Fixed|RESTARTED" mongos.log
20:2019-09-04T06:33:47.185+0000 D1 EXECUTOR [Sharding-Fixed-0] starting thread in pool Sharding-Fixed
84:2019-09-04T06:33:47.188+0000 D1 EXECUTOR [Sharding-Fixed-1] starting thread in pool Sharding-Fixed
153:2019-09-04T06:33:47.190+0000 D1 EXECUTOR [Sharding-Fixed-2] starting thread in pool Sharding-Fixed
164:2019-09-04T06:33:47.191+0000 D1 EXECUTOR [Sharding-Fixed-3] starting thread in pool Sharding-Fixed
978:2019-09-04T06:34:50.842+0000 I  CONTROL  [main] ***** SERVER RESTARTED *****
999:2019-09-04T06:34:50.848+0000 D1 EXECUTOR [Sharding-Fixed-0] starting thread in pool Sharding-Fixed
1131:2019-09-04T06:34:50.853+0000 D1 EXECUTOR [Sharding-Fixed-1] starting thread in pool Sharding-Fixed
1145:2019-09-04T06:34:50.853+0000 D1 EXECUTOR [Sharding-Fixed-2] starting thread in pool Sharding-Fixed
1148:2019-09-04T06:34:50.853+0000 D1 EXECUTOR [Sharding-Fixed-3] starting thread in pool Sharding-Fixed

This may be related to this change.

Since the issue only happens after restarting the mongos, maybe something in the ReplicaSetMonitor's first scan is clogging up all 4 threads in the fixed executor simultaneously. It might help to see if the mongos responds to a request that uses one of the arbitrary executors...

Comment by Pawel Gawel [ 20/Sep/19 ]

THANK GOD THAT I FOUND THIS ISSUE! I was losing my mind for a few days!

 

1. I thought that I was doing something wrong when deploying a new sharded cluster (7 nodes on AWS). I'm having the same results as Mr Martin Gasser and Mr Cezary Bartosiak. It seems that the mongos 4.2 binary is broken. It won't bind on the configured port and I cannot turn it off. Additionally, sh.status() freezes!

 

2. Moreover, I have a production cluster which I recently upgraded to 4.2 (and it works now, but I haven't restarted the mongoses yet!), so I'm thinking about downgrading this cluster to 4.0.12.

 

This issue is really critical; please, Mongo team, investigate it.

Comment by Cezary Bartosiak [ 19/Sep/19 ]

Once it is restarted, it does not work anymore until the number of shards is below 4.

To be honest I was able to make it work again (with 5 shards) by issuing kill -9 and starting again. It helped in some cases and in some it didn't. This behavior was random to me.

Comment by Martin Gasser [ 19/Sep/19 ]

Hi all
We could nail this down a little bit further:

Every new router joining the cluster (even if the cluster has more than 4 shards) works as expected as long as it is not restarted.
Once it is restarted, it does not work anymore until the number of shards is below 4.
It does not matter whether it is a Windows- or Linux-based router, but it must be fresh (hostname, id, ???).

Hope this will help.
Martin

Comment by Cezary Bartosiak [ 19/Sep/19 ]

I was able to reproduce this issue on GCP (using ubuntu-1604-xenial-v20190913 OS image). I created the following cluster (using version 4.2.0):

1. One config server with mongod running on port 27017.
2. Five shard servers - each one with mongod running on port 27018 and mongos running on port 27017.

Each server was a single member replica set.

Everything was fine until I started messing things up by restarting VM instances and/or mongoses. I observed the following (it was RANDOM - sometimes everything worked as expected!):

1. Mongos was running according to systemctl.
2. It was not listening on the configured port 27017.
3. It couldn't be restarted. It hung on closing I/O operations related to a connection to the config server, even after no connections were reported by netstat.
4. I was able to open the mongos shell without problems, but sh.status() was not responsive (even before issuing a restart).
5. I set the log verbosity level to 5, but nothing interesting was logged.

Sorry about not attaching any logs, but I've already deleted the cluster and created a new one using version 4.0.12 and it is now working like a charm! It is able to survive killing VM instances randomly, restarting mongoses and so on.

Thus, I believe there is a critical bug in version 4.2.0 and, since it's so easy to reproduce using virtual machines, I hope to see some progress in fixing this issue soon...

Comment by Martin Gasser [ 08/Sep/19 ]

Aeh, sorry, I was too quick with the good news...
This worked only for the first start of the router running on Windows. After a second start, it acts the same way as the router on Linux, not starting in a cluster with more than 3 shards.
BR,
Martin

Comment by Martin Gasser [ 08/Sep/19 ]

Hi all

After trying a lot of things, I realized that the Windows version of the 4.2 router is working fine in our environment! The setup is the same (config servers and shards running CentOS 7); only the router running on Windows makes the difference.
Does this make sense to you?

Regards
Martin

Comment by Martin Gasser [ 05/Sep/19 ]

Please let us know if we can support you with additional information or log files.
BTW: the shards are in PSA mode:

/* 1 */
{
    "set" : "shard0003",
    "date" : ISODate("2019-09-05T08:36:50.943Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
        "lastCommittedWallTime" : ISODate("2019-09-05T08:36:48.660Z"),
        "readConcernMajorityOpTime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
        "readConcernMajorityWallTime" : ISODate("2019-09-05T08:36:48.660Z"),
        "appliedOpTime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
        "durableOpTime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
        "lastAppliedWallTime" : ISODate("2019-09-05T08:36:48.660Z"),
        "lastDurableWallTime" : ISODate("2019-09-05T08:36:48.660Z")
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "cmodb812.togewa.com:27018",
            "ip" : "10.108.2.42",
            "health" : 1.0,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1210127,
            "optime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
            "optimeDate" : ISODate("2019-09-05T08:36:48.000Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1566462529, 1),
            "electionDate" : ISODate("2019-08-22T08:28:49.000Z"),
            "configVersion" : 2,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 1,
            "name" : "cmodb813.togewa.com:27018",
            "ip" : "10.108.2.43",
            "health" : 1.0,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1210092,
            "optime" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
            "optimeDurable" : { "ts" : Timestamp(1567672608, 1), "t" : NumberLong(1) },
            "optimeDate" : ISODate("2019-09-05T08:36:48.000Z"),
            "optimeDurableDate" : ISODate("2019-09-05T08:36:48.000Z"),
            "lastHeartbeat" : ISODate("2019-09-05T08:36:50.708Z"),
            "lastHeartbeatRecv" : ISODate("2019-09-05T08:36:49.104Z"),
            "pingMs" : NumberLong(3),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "cmodb812.togewa.com:27018",
            "syncSourceHost" : "cmodb812.togewa.com:27018",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 2
        },
        {
            "_id" : 2,
            "name" : "cmodb805.togewa.com:27025",
            "ip" : "10.108.2.35",
            "health" : 1.0,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 1210092,
            "lastHeartbeat" : ISODate("2019-09-05T08:36:49.979Z"),
            "lastHeartbeatRecv" : ISODate("2019-09-05T08:36:49.855Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : 2
        }
    ],
    "ok" : 1.0,
    "$gleStats" : { "lastOpTime" : Timestamp(0, 0), "electionId" : ObjectId("7fffffff0000000000000001") },
    "lastCommittedOpTime" : Timestamp(1567672608, 1),
    "$configServerState" : {
        "opTime" : { "ts" : Timestamp(1567672605, 1), "t" : NumberLong(1) }
    },
    "$clusterTime" : {
        "clusterTime" : Timestamp(1567672608, 1),
        "signature" : {
            "hash" : { "$binary" : "AAAAAAAAAAAAAAAAAAAAAAAAAAA=", "$type" : "00" },
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1567672608, 1)
}

Comment by Benjamin Caimano (Inactive) [ 04/Sep/19 ]

This doesn't look like a network or topology failure to me. I'm handing it back to kaloian.manassiev to analyze how far the mongos got into its initialization.

Comment by Benjamin Caimano (Inactive) [ 04/Sep/19 ]

Thanks for those details, martin.gasser@comfone.com!

The following line is the start of mongos accepting connections. It shows up once before the restart at 06:34:50 GMT. This means the mongos doesn't finish its initialization after the restart.

2019-09-04T06:33:49.192+0000 I  NETWORK  [mongosMain] waiting for connections on port 27017

We noticeably don't get to FTDC init (example log line below) which means we haven't made it through sharding initialization.

2019-09-04T06:33:49.191+0000 I  FTDC     [mongosMain] Initializing full-time diagnostic data capture with directory '/var/log/mongodb/mongos.diagnostic.data'

I do see that we have lines like:

2019-09-04T06:35:20.851+0000 D1 NETWORK  [ReplicaSetMonitor-TaskExecutor] Refreshing replica set shard0000 took 0ms
2019-09-04T06:35:20.851+0000 D1 NETWORK  [ReplicaSetMonitor-TaskExecutor] Refreshing replica set shard0001 took 0ms
2019-09-04T06:35:20.856+0000 D1 NETWORK  [ReplicaSetMonitor-TaskExecutor] Refreshing replica set shard0003 took 4ms
2019-09-04T06:35:20.856+0000 D1 NETWORK  [ReplicaSetMonitor-TaskExecutor] Refreshing replica set shard0002 took 5ms
2019-09-04T06:35:21.348+0000 D1 NETWORK  [ReplicaSetMonitor-TaskExecutor] Refreshing replica set configrs took 0ms

These continue well after the restart, so the topology scanning is in good shape.

Comment by Martin Gasser [ 04/Sep/19 ]

found this:
[root@cmodb801 admin]# cat /var/log/mongodb/mongos.log | grep maximum
2019-09-04T06:33:47.192+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:33:47.192+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:50.856+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:50.856+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:50.856+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:50.856+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:50.856+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum
2019-09-04T06:34:52.852+0000 D2 EXECUTOR [ShardRegistry] Not starting new thread in pool Sharding-Fixed because it already has 4, its maximum

But this is probably not an error.
Cheers
Martin

Comment by Martin Gasser [ 04/Sep/19 ]

Added the config server log mongod_config_cmodb803.zip

Comment by Martin Gasser [ 04/Sep/19 ]

Hi Benjamin
Thanks for investigating on this issue.
This is correct; I was not aware I was running with the logAppend: false option. Everything is now in the mongos log.

Now I repeated the steps with the logging options you suggested:

  • Started the mongos
  • Connected with the mongo shell and executed sh.status()
  • Added shard0003: sh.addShard("shard0003/cmodb812.togewa.com:27018,cmodb813.togewa.com:27018")
  • Ran sh.status() again, which gives the correct output with 4 shards
  • Quit the shell and restarted the mongos service
  • Not able to connect to the router anymore

You can find attached the logs of the mongos router (cmodb801), the commands from the shell, and the log of one of the config servers during that time (cmodb803).

mongos.log mongoshell.txt mongod_config_cmodb803.log

Maybe this will help you see some hints in these files.
Thanks,
Martin

Comment by Benjamin Caimano (Inactive) [ 03/Sep/19 ]

Hey martin.gasser@comfone.com,

I only see one startup event and one shutdown event in that mongos log. Given that there is a "command: addShard" line in mongos.log.2019-08-30T11-20-05, I'm assuming that is a before restart log. I'm particularly interested in the log for mongos after your restart in this procedure. You should see a sequence of messages about it contacting each shard and updating its topology information.

On the note of the "D3 CONNPOOL [ShardRegistry] ..." lines, I personally added these to track our internal connection pool when at a high verbosity level. They actually indicate that your mongos is maintaining healthy connections to cmodb802:27019 and cmodb804:27019 during that period. Fascinatingly, cmodb803:27019 isn't listed there. If it happens to be inaccessible, then your issue may be SERVER-43015.

SERVER-42790 restricts these log lines further to D4. D3 and above for NETWORK, ASIO, and CONNPOOL are very verbose, to help us diagnose complicated network behavior. If you want to avoid seeing these lines, you can modify your mongos.conf to be less verbose on network events:

systemLog:
  destination: file
  logAppend: false
  path: /var/log/mongodb/mongos.log
  traceAllExceptions: true
  verbosity: 5
  component:
    network:
      verbosity: 2

Comment by Martin Gasser [ 30/Aug/19 ]

Hi Kal
Thanks for the lightning fast answer.

Attached you will find
a) the shell log
b) the mongos log

In the shell log you can see the steps done

Add shard does start in the logs at line
2019-08-30T11:19:56.595+0000 D1 TRACKING [conn29] Cmd: addShard, TrackingId: 5d69065c3c00392d854ff620

Thanks a lot!
Martin

Comment by Kaloian Manassiev [ 30/Aug/19 ]

Hi rui.ribeiro@comfone.com and martin.gasser@comfone.com,

Thank you for the report and for including the repro steps. I don't have access to 3 different machines, but I tried a quick repro on my laptop and wasn't able to reproduce this problem.

In order to help us investigate what is going on, would it be possible to attach the complete mongos log, covering the entire timeline of when you added the shards and also after the restart?

Best regards,
-Kal.

Comment by Martin Gasser [ 30/Aug/19 ]

I can provide additional info for this setup. No special configs done; I just installed 3 config servers (CentOS 7.6), started with:

 

/etc/mongod.conf:

net:
  bindIp: 0.0.0.0
  port: 27019

processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod.pid
  timeZoneInfo: /usr/share/zoneinfo

storage:
  dbPath: /data/db
  directoryPerDB: true
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

security:
  authorization: disabled

replication:
  oplogSizeMB: 1024
  replSetName: configrs

sharding:
  clusterRole: configsvr

 

Configured config RS:

rs.initiate(
 {
 _id: "configrs",
 configsvr: true,
 members: [
   { _id : 0, host : "cmodb802.togewa.com:27019" },
   { _id : 1, host : "cmodb803.togewa.com:27019" },
   { _id : 2, host : "cmodb804.togewa.com:27019" }
  ] }
)
 
var cfg = rs.conf();
cfg.members[0].priority = 2
cfg.members[1].priority = 1;
cfg.members[2].priority = 1;
cfg.members[0].votes = 1;
cfg.members[1].votes = 1;
cfg.members[2].votes = 1;
rs.reconfig(cfg);

 
Then started mongos with /etc/mongos.conf:

# where to write logging data.
systemLog:
  destination: file
  logAppend: false
  path: /var/log/mongodb/mongos.log
  verbosity: 5
  traceAllExceptions: true
 
# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongos.pid  # location of pidfile
 
# network interfaces
net:
  port: 27017
  bindIp: localhost, 10.108.2.15
 
sharding:
  configDB: configrs/cmodb802.togewa.com:27019,cmodb803.togewa.com:27019,cmodb804.togewa.com:27019

Added the shards:

sh.addShard("shard0000/cmodb806.togewa.com:27018,cmodb807.togewa.com:27018")
..
(4 shards more)

And after adding more than three shards, the mongos does not bind to its IP anymore. No hints or messages in the log.

After adding shard no4 (shard0003):

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5d5e45a7ac9313827bdd8ab9")
  }
  shards:
        {  "_id" : "shard0000",  "host" : "shard0000/cmodb806.togewa.com:27018,cmodb807.togewa.com:27018",  "state" : 1 }
        {  "_id" : "shard0001",  "host" : "shard0001/cmodb808.togewa.com:27018,cmodb809.togewa.com:27018",  "state" : 1 }
        {  "_id" : "shard0002",  "host" : "shard0002/cmodb810.togewa.com:27018,cmodb811.togewa.com:27018",  "state" : 1 }
        {  "_id" : "shard0003",  "host" : "shard0003/cmodb812.togewa.com:27018,cmodb813.togewa.com:27018",  "state" : 1 }
  active mongoses:
        "4.2.0" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard0000       1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)

Then do:

[root@cmodb801 mongodb]# systemctl restart mongos
[root@cmodb801 mongodb]# systemctl status mongos
● mongos.service - High-performance, schema-free document-oriented database
   Loaded: loaded (/usr/lib/systemd/system/mongos.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-08-30 10:54:47 UTC; 6s ago
     Docs: https://docs.mongodb.org/manual
  Process: 28418 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 28414 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 28412 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
 Main PID: 28421 (mongos)
   CGroup: /system.slice/mongos.service
           ├─28421 /usr/bin/mongos -f /etc/mongos.conf
           ├─28423 /usr/bin/mongos -f /etc/mongos.conf
           └─28424 /usr/bin/mongos -f /etc/mongos.conf
 
Aug 30 10:54:47 cmodb801.togewa.com systemd[1]: Starting High-performance, schema-free document-oriented database...
Aug 30 10:54:47 cmodb801.togewa.com systemd[1]: Started High-performance, schema-free document-oriented database.
Aug 30 10:54:47 cmodb801.togewa.com mongos[28421]: about to fork child process, waiting until server is ready for connections.
Aug 30 10:54:47 cmodb801.togewa.com mongos[28421]: forked process: 28424

And I am not able to connect to the mongos anymore.

Log of mongos only repeating:

2019-08-30T10:55:48.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:48.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:49.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:49.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:49.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:49.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Triggered refresh timeout for cmodb804.togewa.com:27019
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Refreshing connection to cmodb804.togewa.com:27019
2019-08-30T10:55:49.155+0000 D3 NETWORK  [ShardRegistry] Compressing message with snappy
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 0, pending: 1, active: 0, isExpired: false }
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:49.155+0000 D3 NETWORK  [ShardRegistry] Decompressing message with snappy
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Finishing connection refresh for cmodb804.togewa.com:27019
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:49.155+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:50.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:50.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:50.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:50.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:51.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:51.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:51.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:51.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:52.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:52.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:52.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:52.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:53.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb804.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:53.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb804.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }
2019-08-30T10:55:53.153+0000 D3 CONNPOOL [ShardRegistry] Updating controller for cmodb802.togewa.com:27019 with State: { requests: 0, ready: 1, pending: 0, active: 0, isExpired: false }
2019-08-30T10:55:53.153+0000 D3 CONNPOOL [ShardRegistry] Comparing connection state for cmodb802.togewa.com:27019 to Controls: { maxPending: 2, target: 1, }

 
