Core Server / SERVER-26701

mongos stalls when it cannot access one of the CSRS servers

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.2.5
    • Component/s: Sharding
    • None
    • ALL

      We are running a sharded cluster on MongoDB 3.2.5 with WiredTiger and, of late, the mongos on some machines simply stalls when it cannot access one of the CSRS servers.

      The CSRS is a 3-node replica set, and only a couple of mongos processes get stuck in this scenario. For this particular problem, we can ignore the reason one CSRS server was unavailable.

      Here is the mongos log snippet from when this happened. The mongos tried to switch the order of the CSRS servers and then, after a couple of attempts, it just hung. Any queries using that mongos process would not return.

      2016-10-19T09:02:30.381-0400 I SHARDING [Balancer] distributed lock 'balancer' acquired for 'doing balance round', ts : 58076ee6087c1d793ad3b986
      2016-10-19T09:02:32.657-0400 I SHARDING [Balancer] distributed lock with ts: 58076ee6087c1d793ad3b986' unlocked.
      2016-10-19T09:03:47.065-0400 I SHARDING [Balancer] distributed lock 'balancer' acquired for 'doing balance round', ts : 58076f33087c1d793ad3b98d
      2016-10-19T09:03:49.341-0400 I SHARDING [Balancer] distributed lock with ts: 58076f33087c1d793ad3b98d' unlocked.
      2016-10-19T09:05:28.947-0400 I NETWORK  [ReplicaSetMonitorWatcher] changing hosts to csReplSet/mongoconfigserver1:29102,mongoconfigserver3:29102,mongoconfigserver2:29102 from csReplSet/mongoconfigserver1:29102,mongoconfigserver2:29102
      2016-10-19T09:05:28.947-0400 I SHARDING [ReplicaSetMonitorWatcher] Updating config server connection string to: csReplSet/mongoconfigserver1:29102,mongoconfigserver3:29102,mongoconfigserver2:29102
      2016-10-19T09:05:28.947-0400 I SHARDING [ReplicaSetMonitorWatcher] Updating ShardRegistry connection string for shard config from: csReplSet/mongoconfigserver1:29102,mongoconfigserver2:29102 to: csReplSet/mongoconfigserver1:29102,mongoconfigserver3:29102,mongoconfigserver2:29102
      2016-10-19T09:06:15.520-0400 W SHARDING [Balancer] ExceededTimeLimit: Couldn't get a connection within the time limit
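      For anyone triaging similar logs, here is a rough sketch of how the gap between the config server connection string change and the first ExceededTimeLimit warning could be measured. It assumes the log line layout shown in the snippet above (timestamp, severity, component, then the bracketed context and message). The helper names `parse_line` and `seconds_until_timeout` are made up for illustration and are not part of any MongoDB tooling.

```python
from datetime import datetime

# Two lines taken from the snippet above, with the host lists elided.
SAMPLE = [
    "2016-10-19T09:05:28.947-0400 I NETWORK  [ReplicaSetMonitorWatcher] changing hosts to ...",
    "2016-10-19T09:06:15.520-0400 W SHARDING [Balancer] ExceededTimeLimit: Couldn't get a connection within the time limit",
]


def parse_line(line):
    """Split one log line into (timestamp, severity, component, rest)."""
    ts, severity, component, rest = line.split(None, 3)
    # The timestamp carries milliseconds and a numeric UTC offset such as -0400.
    when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")
    return when, severity, component, rest


def seconds_until_timeout(lines):
    """Seconds from the first 'changing hosts' line to the next ExceededTimeLimit line.

    Returns None if either event is missing.
    """
    changed_at = None
    for line in lines:
        when, _severity, _component, rest = parse_line(line)
        if changed_at is None and "changing hosts to" in rest:
            changed_at = when
        elif changed_at is not None and "ExceededTimeLimit" in rest:
            return (when - changed_at).total_seconds()
    return None


if __name__ == "__main__":
    print(seconds_until_timeout(SAMPLE))
```

      This only measures the delay between the two events in the log. It says nothing about why the mongos could not get a connection.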
      

            Assignee:
            Unassigned
            Reporter:
            Darshan Shah (darshan.shah@interactivedata.com)
            Votes:
            0
            Watchers:
            5
