Core Server / SERVER-6839

SECONDARY keeps crashing

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.0.6
    • Component/s: Replication
    • Labels: None
    • Linux

      A SECONDARY in our replica set keeps crashing for an unknown reason.

      Our setup looks like this:

      • PRIMARY (2.0.6)
      • SECONDARY (2.0.6)
      • SECONDARY (2.2RC0)
      • ARBITER (2.0.6)

      The first secondary (2.0.6) keeps crashing at seemingly random moments. The log file shows no obvious error; these are its final lines:

      set1
      Thu Aug 23 13:18:44 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 5ms
      Thu Aug 23 13:18:45 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 48ms
      Thu Aug 23 13:18:45 [conn295] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 130ms
      Thu Aug 23 13:18:45 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 130ms
      Thu Aug 23 13:18:45 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 5ms
      Thu Aug 23 13:18:50 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 229ms
      Thu Aug 23 13:18:50 [conn295] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 229ms
      Thu Aug 23 13:18:50 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 229ms
      Thu Aug 23 13:18:52 [initandlisten] connection accepted from 10.59.62.168:47834 #298
      Thu Aug 23 13:18:53 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 340ms
      Thu Aug 23 13:18:53 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 340ms
      Thu Aug 23 13:18:53 [conn295] end connection 10.59.62.168:47831
      Thu Aug 23 13:18:56 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 372ms
      Thu Aug 23 13:18:56 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 372ms
      Thu Aug 23 13:18:57 [conn298] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 326ms
      Thu Aug 23 13:19:00 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 522ms
      Thu Aug 23 13:19:00 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 522ms
      Thu Aug 23 13:19:00 [conn298] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 522ms
      		42625600/44589575	95%
      Thu Aug 23 13:19:04 [conn296] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 142ms
      Thu Aug 23 13:19:04 [conn298] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 142ms
      Thu Aug 23 13:19:04 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 142ms
      Thu Aug 23 13:19:07 [initandlisten] connection accepted from 10.230.41.239:52398 #299
      Thu Aug 23 13:19:07 [conn296] end connection 10.230.41.239:52393
      Thu Aug 23 13:19:07 [conn298] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db3.set1:27018", $auth: {} } ntoreturn:1 reslen:120 101ms
      Thu Aug 23 13:19:07 [conn297] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db0.set1:27018" } ntoreturn:1 reslen:120 101ms
      Thu Aug 23 13:19:08 [conn299] command admin.$cmd command: { replSetHeartbeat: "set1", v: 9, pv: 1, checkEmpty: false, from: "db1.set1:27018" } ntoreturn:1 reslen:120 326ms
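
The excerpt above contains no fatal message, so one way to look for a pattern ahead of a crash is to pull the replSetHeartbeat command durations out of the log and see whether they climb. The following is a minimal sketch, assuming the 2.0-style log line format shown above. The names `heartbeat_durations` and `slow_heartbeats` and the 300 ms threshold are illustrative choices, not part of MongoDB.

```python
import re
from collections import defaultdict

# Assumed line shape (taken from the excerpt above):
#   <Day Mon DD HH:MM:SS> [connN] command admin.$cmd command:
#   { replSetHeartbeat: "set1", ..., from: "host:port", ... } ... <N>ms
HEARTBEAT_RE = re.compile(
    r'^\s*(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2}) '
    r'\[conn\d+\] .*replSetHeartbeat.*'
    r'from: "(?P<peer>[^"]+)".* (?P<ms>\d+)ms\s*$'
)


def heartbeat_durations(lines):
    """Group (timestamp, duration_ms) pairs by the peer that sent the heartbeat."""
    by_peer = defaultdict(list)
    for line in lines:
        match = HEARTBEAT_RE.match(line)
        if match:  # non-heartbeat lines are skipped
            by_peer[match.group("peer")].append(
                (match.group("ts"), int(match.group("ms")))
            )
    return dict(by_peer)


def slow_heartbeats(lines, threshold_ms=300):
    """Return (timestamp, peer, duration_ms) for heartbeats slower than the threshold."""
    slow = []
    for peer, samples in heartbeat_durations(lines).items():
        for ts, ms in samples:
            if ms > threshold_ms:
                slow.append((ts, peer, ms))
    return sorted(slow)


if __name__ == "__main__":
    import sys

    # Usage: python heartbeats.py mongod-shardsvr.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for ts, peer, ms in slow_heartbeats(log):
            print(f"{ts}  {peer}  {ms}ms")
```

Run against the attached log, this would list only the heartbeats that took longer than the chosen threshold. Slow heartbeats alone do not explain a crash, so the output is a starting point for correlating with system metrics rather than a diagnosis.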
      

      Here's some more information:

      • all instances run on EC2 with a couple of extra volumes each for storage
      • all MongoDB installs (except 2.2) were done through the official YUM repo
      • nothing in our config has changed in the past few days
      • the volume of data we process has not increased in the past few days

        1. mongod-shardsvr.log
          1016 kB
        2. mongod-shardsvr.log
          76 kB

            Assignee: Unassigned
            Reporter: Kerim Satirli (k.satirli)
            Votes: 0
            Watchers: 4
