Core Server / SERVER-6762

Assertion failure cursor.get() db/repl/../oplogreader.h 93

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
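    • Fix Version/s: 2.3.0
    • Affects Version/s: 2.0.6
    • Component/s: Replication
    • Labels: None
    • Environment: SLES 11
    • Backwards Compatibility: Fully Compatible
    • Operating System: Linux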

    Description

      We run a 5-server replica set in our production environment. Recently it has frequently developed a large replication lag.
      I checked the mongod log file; each time the replication lag occurs, the log contains the following:

      Mon Aug 13 22:46:36 [rsSync] Socket recv() timeout 10.20.1.18:27017
      Mon Aug 13 22:46:36 [rsSync] SocketException: remote: 10.20.1.18:27017 error: 9001 socket exception [3] server [10.20.1.18:27017]
      Mon Aug 13 22:46:36 [rsSync] DBClientCursor::init call() failed
      Mon Aug 13 22:46:37 [rsSync] replSet syncing to: 10.20.1.18:27017
      Mon Aug 13 22:46:49 [rsGhostSync] Socket recv() timeout 10.20.1.18:27017
      Mon Aug 13 22:46:49 [rsGhostSync] SocketException: remote: 10.20.1.18:27017 error: 9001 socket exception [3] server [10.20.1.18:27017]
      Mon Aug 13 22:46:49 [rsGhostSync] DBClientCursor::init call() failed
      Mon Aug 13 22:46:49 [rsGhostSync] Assertion failure cursor.get() db/repl/../oplogreader.h 93
      0x57a8a6 0x5853eb 0x8254f1 0x58fc23 0x58d7f4 0x58ce23 0x5742ef 0x576664 0xaabca0 0x7f9d48025070 0x7f9d4761e10d
      /usr/local/mongodb/bin/mongod(_ZN5mongo12sayDbContextEPKc+0x96) [0x57a8a6]
      /usr/local/mongodb/bin/mongod(_ZN5mongo8assertedEPKcS1_j+0xfb) [0x5853eb]
      /usr/local/mongodb/bin/mongod(_ZN5mongo9GhostSync9percolateERKNS_7BSONObjERKNS_6OpTimeE+0xbb1) [0x8254f1]
      /usr/local/mongodb/bin/mongod(_ZNK5boost9function0IvEclEv+0x243) [0x58fc23]
      /usr/local/mongodb/bin/mongod(_ZN5mongo4task6Server6doWorkEv+0x254) [0x58d7f4]
      /usr/local/mongodb/bin/mongod(_ZN5mongo4task4Task3runEv+0x33) [0x58ce23]
      /usr/local/mongodb/bin/mongod(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0xbf) [0x5742ef]
      /usr/local/mongodb/bin/mongod(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74) [0x576664]
      /usr/local/mongodb/bin/mongod(thread_proxy+0x80) [0xaabca0]
      /lib64/libpthread.so.0 [0x7f9d48025070]
      /lib64/libc.so.6(clone+0x6d) [0x7f9d4761e10d]
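
To gauge how often this pattern occurs, the log can be scanned for the two symptoms shown above: `Socket recv() timeout` and `Assertion failure`, grouped by thread name. The following is only a minimal sketch; the function name `summarize_sync_errors` and the regular expression are assumptions written against the line format in this excerpt, not part of any MongoDB tooling.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above, e.g.
#   Mon Aug 13 22:46:49 [rsGhostSync] Socket recv() timeout 10.20.1.18:27017
LINE_RE = re.compile(r"^\s*\w{3} \w{3} +\d+ (\d\d:\d\d:\d\d) \[(\w+)\] (.*)$")


def summarize_sync_errors(lines):
    """Count recv() timeouts and assertion failures per logging thread."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # stack-trace frames and blank lines carry no thread tag
        _time, thread, message = match.groups()
        if message.startswith("Socket recv() timeout"):
            counts[(thread, "recv_timeout")] += 1
        elif message.startswith("Assertion failure"):
            counts[(thread, "assertion")] += 1
    return counts


# A few lines copied from the excerpt above.
sample_log = [
    "Mon Aug 13 22:46:36 [rsSync] Socket recv() timeout 10.20.1.18:27017",
    "Mon Aug 13 22:46:36 [rsSync] DBClientCursor::init call() failed",
    "Mon Aug 13 22:46:49 [rsGhostSync] Socket recv() timeout 10.20.1.18:27017",
    "Mon Aug 13 22:46:49 [rsGhostSync] Assertion failure cursor.get() db/repl/../oplogreader.h 93",
    "/lib64/libpthread.so.0 [0x7f9d48025070]",
]
summary = summarize_sync_errors(sample_log)
```

Run over the full log file (for example `summarize_sync_errors(open(path))`, where `path` is wherever mongod writes its log), the counts show whether each assertion is preceded by a timeout on the same thread.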

      Once the connection times out, mongod stops pulling the oplog from the primary node until the connection is re-established. Because the replica set uses chained replication, when the first secondary in the chain loses its connection to the primary, every secondary syncing from it falls behind as well.
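
The lag described here can be quantified per member from the output of the `replSetGetStatus` command, whose `members` array reports `name`, `stateStr` and `optimeDate` for each node. Below is a rough sketch under stated assumptions: the helper `member_lags` and the sample document (host names and timestamps) are hypothetical and only illustrate the shape of the calculation.

```python
from datetime import datetime


def member_lags(status):
    """Return {member name: seconds behind the primary} for each secondary."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical status document: node-b syncs from the primary and node-c
# syncs from node-b, so a stall on node-b also shows up as lag on node-c.
sample_status = {
    "members": [
        {"name": "node-a:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2012, 8, 13, 22, 47, 0)},
        {"name": "node-b:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2012, 8, 13, 22, 46, 30)},
        {"name": "node-c:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2012, 8, 13, 22, 46, 28)},
    ]
}
lags = member_lags(sample_status)
```

With a driver such as PyMongo, a real status document can be obtained with `MongoClient(...).admin.command("replSetGetStatus")` and passed to the same function. Polling it periodically would show whether the downstream secondaries fall behind at the same moments the first secondary logs a timeout.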

      This happens often and we have not been able to find any information about it. Could you help us determine whether this is a bug?

          People

            kristina Kristina Chodorow (Inactive)
            rodericliu Roderic Liu
            Votes: 0
            Watchers: 3
