Core Server / SERVER-9868

heartbeats not responded to during mmap flushing on Windows

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 2.4.3
    • Fix Version/s: None
    • Component/s: Replication

      Description

      We have an environment with 2 nodes and 1 arbiter, configured as follows:

      {
        "_id" : "rssc01",
        "version" : 3,
        "members" : [
          { "_id" : 0, "host" : "LOG-MNGSC11:27017" },
          { "_id" : 1, "host" : "log-mngsc21:27017" },
          { "_id" : 2, "host" : "log-mngsc22:27018", "arbiterOnly" : true }
        ]
      }

      LOG-MNGSC11 is the primary and LOG-MNGSC21 is the secondary.

      Suddenly, replication fails with the following messages on the secondary:

      Thu Jun 06 11:56:50.656 [rsHealthPoll] replset info LOG-MNGSC11:27017 thinks that we are down
      Thu Jun 06 11:56:52.310 [rsHealthPoll] replset info log-mngsc22:27018 thinks that we are down
      Thu Jun 06 11:56:52.481 [conn12160] command admin.$cmd command: { writebacklisten: ObjectId('51afe4406127103ae52a9fc0') } ntoreturn:1 keyUpdates:0 reslen:44 300005ms
      Thu Jun 06 11:56:52.668 [rsHealthPoll] replset info LOG-MNGSC11:27017 thinks that we are down
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] Socket recv() timeout 172.29.106.92:27017
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] SocketException: remote: 172.29.106.92:27017 error: 9001 socket exception [3] server [172.29.106.92:27017]
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] DBClientCursor::init call() failed

      On the primary, I see the following messages:

      Thu Jun 06 11:56:52.524 [initandlisten] connection accepted from 172.29.106.95:56714 #16239 (66 connections now open)
      Thu Jun 06 11:56:55.847 [rsHealthPoll] DBClientCursor::init call() failed
      Thu Jun 06 11:56:57.329 [conn16236] query local.oplog.rs query: { ts: { $gte: Timestamp 1370511563000|895 } } cursorid:479480611067557781 ntoreturn:0 ntoskip:0 nscanned:102 keyUpdates:0 numYields: 2264 locks(micros) r:727945 nreturned:101 reslen:12039 35319ms
      Thu Jun 06 11:56:57.329 [conn16236] end connection 172.29.106.95:56704 (65 connections now open)
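
Both excerpts contain operations whose logged duration is unusually long (300005ms on the secondary, 35319ms on the primary). The following is a minimal sketch for pulling such lines out of the attached logs. It assumes the log line layout seen in the excerpts above (timestamp, bracketed thread name, message, optional trailing duration in ms); the function name `slow_ops` and the 1000 ms threshold are illustrative choices, not part of MongoDB.

```python
import re

# Assumed line layout, taken from the excerpts in this report:
#   "Thu Jun 06 11:56:52.481 [conn12160] <message> 300005ms"
# The trailing "<digits>ms" is only present on lines that log an operation duration.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<msg>.*?)(?: (?P<ms>\d+)ms)?$"
)


def slow_ops(lines, threshold_ms=1000):
    """Return (timestamp, thread, duration_ms) for lines at or above threshold_ms.

    threshold_ms is a hypothetical cutoff chosen for illustration.
    """
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("ms") is None:
            continue  # not a log line in the assumed layout, or no duration logged
        duration = int(match.group("ms"))
        if duration >= threshold_ms:
            found.append((match.group("ts"), match.group("thread"), duration))
    return found


# Sample lines copied from the secondary and primary excerpts above.
SAMPLE = [
    "Thu Jun 06 11:56:50.656 [rsHealthPoll] replset info LOG-MNGSC11:27017 thinks that we are down",
    "Thu Jun 06 11:56:52.481 [conn12160] command admin.$cmd command: "
    "{ writebacklisten: ObjectId('51afe4406127103ae52a9fc0') } "
    "ntoreturn:1 keyUpdates:0 reslen:44 300005ms",
    "Thu Jun 06 11:56:57.329 [conn16236] query local.oplog.rs query: "
    "{ ts: { $gte: Timestamp 1370511563000|895 } } cursorid:479480611067557781 "
    "ntoreturn:0 ntoskip:0 nscanned:102 keyUpdates:0 numYields: 2264 "
    "locks(micros) r:727945 nreturned:101 reslen:12039 35319ms",
]

if __name__ == "__main__":
    for ts, thread, ms in slow_ops(SAMPLE):
        print(ts, thread, ms)
```

Running the same function over the full log files (one line per list element) would show whether the long-running operations cluster around the times the other members report this node as down.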

        Attachments

        1. mongo_logmngsc11.zip
          4.73 MB
        2. mongo_logmngsc21.zip
          205 kB
