Core Server / SERVER-9868

heartbeats not responded to during mmap flushing on Windows

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.4.3
    • Component/s: Replication
    • Environment: Windows

      We have an environment with 2 nodes and 1 arbiter, using the following replica set configuration:

      {
        "_id" : "rssc01",
        "version" : 3,
        "members" : [
          { "_id" : 0, "host" : "LOG-MNGSC11:27017" },
          { "_id" : 1, "host" : "log-mngsc21:27017" },
          { "_id" : 2, "host" : "log-mngsc22:27018", "arbiterOnly" : true }
        ]
      }
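
For readers less familiar with this topology, here is a minimal sketch of how the configuration above breaks down into voting and data-bearing members. It assumes every member has the default single vote (the configuration shows no "votes" field); the variable names are illustrative only and not part of any MongoDB API.

```python
# Replica set configuration as reported in this ticket.
config = {
    "_id": "rssc01",
    "version": 3,
    "members": [
        {"_id": 0, "host": "LOG-MNGSC11:27017"},
        {"_id": 1, "host": "log-mngsc21:27017"},
        {"_id": 2, "host": "log-mngsc22:27018", "arbiterOnly": True},
    ],
}

# Assumption: each member carries one vote, since no "votes" field is set.
voting_members = len(config["members"])
majority = voting_members // 2 + 1

# Arbiters vote in elections but hold no data.
data_bearing = [m["host"] for m in config["members"] if not m.get("arbiterOnly")]
arbiters = [m["host"] for m in config["members"] if m.get("arbiterOnly")]

print("voting members:", voting_members)
print("votes needed for a majority:", majority)
print("data-bearing:", data_bearing)
print("arbiters:", arbiters)
```

With three voting members, two must be able to see each other for the set to keep or elect a primary, which is why missed heartbeats on a single node matter here.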

      LOG-MNGSC11 is the primary and LOG-MNGSC21 is the secondary.

      Suddenly, replication fails with the following messages on the secondary:

      Thu Jun 06 11:56:50.656 [rsHealthPoll] replset info LOG-MNGSC11:27017 thinks that we are down
      Thu Jun 06 11:56:52.310 [rsHealthPoll] replset info log-mngsc22:27018 thinks that we are down
      Thu Jun 06 11:56:52.481 [conn12160] command admin.$cmd command: { writebacklisten: ObjectId('51afe4406127103ae52a9fc0') } ntoreturn:1 keyUpdates:0 reslen:44 300005ms
      Thu Jun 06 11:56:52.668 [rsHealthPoll] replset info LOG-MNGSC11:27017 thinks that we are down
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] Socket recv() timeout 172.29.106.92:27017
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] SocketException: remote: 172.29.106.92:27017 error: 9001 socket exception [3] server [172.29.106.92:27017]
      Thu Jun 06 11:56:52.793 [rsBackgroundSync] DBClientCursor::init call() failed

      On the primary, I see the following messages:

      Thu Jun 06 11:56:52.524 [initandlisten] connection accepted from 172.29.106.95:56714 #16239 (66 connections now open)
      Thu Jun 06 11:56:55.847 [rsHealthPoll] DBClientCursor::init call() failed
      Thu Jun 06 11:56:57.329 [conn16236] query local.oplog.rs query: { ts: { $gte: Timestamp 1370511563000|895 } } cursorid:479480611067557781 ntoreturn:0 ntoskip:0 nscanned:102 keyUpdates:0 numYields: 2264 locks(micros) r:727945 nreturned:101 reslen:12039 35319ms
      Thu Jun 06 11:56:57.329 [conn16236] end connection 172.29.106.95:56704 (65 connections now open)
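
Several of the log lines above end with an operation duration in milliseconds (for example 35319ms for the oplog query). The attached logs could be scanned for such entries along these lines. This is only a sketch: it assumes the duration is always the last token on the line, and the function name and threshold are hypothetical choices made for illustration.

```python
import re

# Assumption: slow-operation lines end with "<N>ms" as their last token.
DURATION_RE = re.compile(r"\s(\d+)ms\s*$")


def slow_operations(lines, threshold_ms=1000):
    """Return (duration_ms, line) pairs at or above threshold_ms, slowest first."""
    hits = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            hits.append((int(match.group(1)), line.strip()))
    return sorted(hits, reverse=True)


# Lines taken from the logs quoted in this ticket.
sample = [
    "Thu Jun 06 11:56:52.481 [conn12160] command admin.$cmd command: "
    "{ writebacklisten: ObjectId('51afe4406127103ae52a9fc0') } "
    "ntoreturn:1 keyUpdates:0 reslen:44 300005ms",
    "Thu Jun 06 11:56:57.329 [conn16236] query local.oplog.rs query: "
    "{ ts: { $gte: Timestamp 1370511563000|895 } } "
    "cursorid:479480611067557781 ntoreturn:0 ntoskip:0 nscanned:102 "
    "keyUpdates:0 numYields: 2264 locks(micros) r:727945 nreturned:101 "
    "reslen:12039 35319ms",
    "Thu Jun 06 11:56:55.847 [rsHealthPoll] DBClientCursor::init call() failed",
]

for duration_ms, line in slow_operations(sample):
    print(duration_ms, line[:60])
```

A long duration is not necessarily an error in itself. The writebacklisten entry, for instance, is likely a long-polling command, so the 35-second oplog query is probably the more telling line.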

      Attachments:
        1. mongo_logmngsc11.zip (4.73 MB)
        2. mongo_logmngsc21.zip (205 kB)

            Assignee: Mark Benvenuto (mark.benvenuto@mongodb.com)
            Reporter: David Verdejo (david.verdejo@logitravel.com)
            Votes: 2
            Watchers: 9
