Core Server / SERVER-4821

Mongos signal 11 (SIGSEGV) after DBClientCursor::init failure

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 2.0.2
    • Component/s: None
    • Environment:
      Debian Linux with official 10gen mongodb 2.0.2 debs
    • Operating System: Linux

      The setup is 3 shards. One collection was rebalancing to, or had just been moved to, the 3rd shard (xxxB2).
      The primary of this shard had been stepped down with rs.stepDown() and no new primary had been elected yet.
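      The scenario described above can be sketched with the following mongo shell steps. This is a hypothetical reconstruction, not commands from the report: the database and collection names are placeholders, and only the replica set name xxxB2 comes from the log.

      ```javascript
      // Sketch of the reported scenario (hypothetical names except xxxB2).

      // 1. On mongos: shard a collection so the balancer migrates chunks
      //    onto the 3rd shard, whose replica set is xxxB2.
      sh.enableSharding("mydb")                      // "mydb" is a placeholder
      sh.shardCollection("mydb.coll", { _id: 1 })    // "coll" is a placeholder

      // 2. While a migration to xxxB2 is in progress, connect to that
      //    shard's primary and step it down, leaving the set without a
      //    primary until a new election completes:
      rs.stepDown(60)

      // 3. Keep issuing queries through mongos. In 2.0.2 this could end
      //    with "DBClientCursor::init call() failed" followed by the
      //    mongos process receiving signal 11, as in the log below.
      db.getSiblingDB("mydb").coll.find().itcount()
      ```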

      Mongos crashed on one machine leaving this in the log:

      Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027

      { setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], me: "mongo1:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

      Mon Jan 30 16:30:28 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027

      { setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

      Mon Jan 30 16:30:30 [conn38697] end connection 172.16.49.111:35486
      Mon Jan 30 16:30:30 [mongosMain] connection accepted from 172.16.49.111:35489 #38698
      Mon Jan 30 16:30:34 [conn38698] end connection 172.16.49.111:35489
      Mon Jan 30 16:30:34 [mongosMain] connection accepted from 172.16.49.111:35492 #38699
      Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config1:27019]
      Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config2:27019]
      Mon Jan 30 16:30:38 [Balancer] SyncClusterConnection connecting to [config3:27019]
      Mon Jan 30 16:30:38 [Balancer] could not acquire lock 'balancer/mongo2:27020:1327595773:1804289383' (another update won)
      Mon Jan 30 16:30:38 [Balancer] distributed lock 'balancer/mongo2:27020:1327595773:1804289383' was not acquired.
      Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo1:27027

      { setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], me: "mongo1:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

      Mon Jan 30 16:30:40 [ReplicaSetMonitorWatcher] ReplicaSetMonitor::_checkConnection: mongo4:27027

      { setName: "xxxB2", ismaster: false, secondary: true, hosts: [ "mongo1:27027" ], passives: [ "mongo4:27027" ], arbiters: [ "task2:27027" ], passive: true, me: "mongo4:27027", maxBsonObjectSize: 16777216, ok: 1.0 }

      Mon Jan 30 16:30:45 [conn38699] ~ScopedDbConnection: _conn != null
      Mon Jan 30 16:30:45 [conn38699] DBException in process: socket exception
      Mon Jan 30 16:30:45 [conn38699] SocketException handling request, closing client connection: 9001 socket exception [2] server [172.16.49.111:35492]
      Mon Jan 30 16:30:46 [mongosMain] connection accepted from 172.16.49.111:35496 #38700
      Mon Jan 30 16:30:47 [conn38700] end connection 172.16.49.111:35496
      Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35497 #38701
      Mon Jan 30 16:30:47 [conn38701] Socket recv() errno:104 Connection reset by peer 172.16.49.111:27027
      Mon Jan 30 16:30:47 [conn38701] SocketException: remote: 172.16.49.111:27027 error: 9001 socket exception [1] server [172.16.49.111:27027]
      Mon Jan 30 16:30:47 [conn38701] DBClientCursor::init call() failed
      Mon Jan 30 16:30:47 [conn38701] end connection 172.16.49.111:35497
      Mon Jan 30 16:30:47 [mongosMain] connection accepted from 172.16.49.111:35498 #38702
      Mon Jan 30 16:30:47 [conn38702] got not master for: mongo1:27027
      Received signal 11
      Backtrace:

            Assignee:
            Unassigned
            Reporter:
            balboah (Johnny Boy)
            Votes:
            0
            Watchers:
            0
