Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-31283

Unable to complete rollback after falling off the oplog

    XMLWordPrintableJSON

Details

    • Icon: Bug Bug
    • Resolution: Works as Designed
    • Icon: Major - P3 Major - P3
    • None
    • None
    • Replication
    • None
    • ALL

    Description

      I have mongo replication set. One of the servers crashed with the next info in log file:

      2017-09-26T11:44:51.714-0700 I INDEX    [repl index builder 132373] build index done.  scanned 0 total records. 0 secs
      2017-09-26T11:44:51.718-0700 I INDEX    [repl index builder 132374] build index on: site_7053.visitor_offers properties: { v: 2, key: { origin_id: 1 }, name: "origin_id_1", ns: "site_7053.visitor_offers", background: true }
      2017-09-26T11:44:51.718-0700 I INDEX    [repl index builder 132374] build index done.  scanned 0 total records. 0 secs
      2017-09-26T11:44:51.724-0700 I INDEX    [repl index builder 132375] build index on: site_7053.visitor_offers properties: { v: 2, key: { offer_id: 1 }, name: "offer_id_1", ns: "site_7053.visitor_offers", background: true }
      2017-09-26T11:44:51.724-0700 I INDEX    [repl index builder 132375] build index done.  scanned 0 total records. 0 secs
      2017-09-26T11:44:52.027-0700 I REPL     [rsBackgroundSync] rollback 0
      2017-09-26T11:44:52.029-0700 I REPL     [rsBackgroundSync] transition to ROLLBACK
      2017-09-26T11:44:52.029-0700 I NETWORK  [rsBackgroundSync] legacy transport layer closing all connections
      2017-09-26T11:44:52.034-0700 I NETWORK  [rsBackgroundSync] Skip closing connection for connection # 55139
      2017-09-26T11:44:52.036-0700 I NETWORK  [rsBackgroundSync] Skip closing connection for connection # 54921
      2017-09-26T11:44:52.039-0700 I -        [conn52753] end connection 10.0.6.27:39961 (967 connections now open)
      2017-09-26T11:44:52.039-0700 I -        [conn52775] end connection 10.0.6.141:55674 (967 connections now open)
      2017-09-26T11:44:52.039-0700 I -        [conn55717] end connection 10.0.2.59:60926 (967 connections now open)
      ...
      2017-09-26T11:44:52.042-0700 I -        [conn54912] end connection 10.0.2.59:51880 (724 connections now open)
      2017-09-26T11:44:52.042-0700 I -        [conn54923] end connection 10.0.2.59:51613 (716 connections now open)
      2017-09-26T11:44:52.042-0700 I -        [conn54891] end connection 10.0.6.141:44850 (713 connections now open)
      2017-09-26T11:44:52.094-0700 I REPL     [rsBackgroundSync] beginning rollback
      2017-09-26T11:44:52.094-0700 I REPL     [rsBackgroundSync] rollback 1
      2017-09-26T11:44:52.102-0700 I REPL     [rsBackgroundSync] rollback 2 FindCommonPoint
      2017-09-26T11:44:52.106-0700 I -        [conn55037] AssertionException handling request, closing client connection: 172 Operation attempted on a closed transport Session.
      2017-09-26T11:44:52.106-0700 I -        [conn55028] AssertionException handling request, closing client connection: 172 Operation attempted on a closed transport Session.
      2017-09-26T11:44:52.106-0700 I -        [conn55037] end connection 10.0.6.141:44957 (4 connections now open)
      2017-09-26T11:44:52.106-0700 I -        [conn55028] end connection 10.0.6.141:44944 (4 connections now open)
      2017-09-26T11:44:52.108-0700 I REPL     [rsBackgroundSync] rollback our last optime:   Sep 26 09:00:18:7cd
      2017-09-26T11:44:52.108-0700 I REPL     [rsBackgroundSync] rollback their last optime: Sep 26 11:44:51:3e8
      2017-09-26T11:44:52.108-0700 I REPL     [rsBackgroundSync] rollback diff in end of log times: -9873 seconds
      2017-09-26T11:44:58.672-0700 F REPL     [rsBackgroundSync] rollback error RS100 reached beginning of remote oplog
      2017-09-26T11:44:58.673-0700 I REPL     [rsBackgroundSync]   them:      mng01.production:27017 (10.0.6.103)
      2017-09-26T11:44:58.673-0700 I REPL     [rsBackgroundSync]   theirTime: Sep 26 10:51:58 59ca93be:341
      2017-09-26T11:44:58.676-0700 I REPL     [rsBackgroundSync] rollback finished
      2017-09-26T11:44:58.676-0700 F REPL     [rsBackgroundSync] Unable to complete rollback. A full resync may be needed: UnrecoverableRollbackError: need to rollback, but unable to determine common point between local and remote oplog: NoMatchingDocument: RS100 reached beginning of remote oplog [1] @ 18752
      2017-09-26T11:44:58.676-0700 I -        [rsBackgroundSync] Fatal Assertion 28723 at src/mongo/db/repl/rs_rollback.cpp 915
      2017-09-26T11:44:58.676-0700 I -        [rsBackgroundSync]
       
      ***aborting after fassert() failure
      
      

      Attachments

        Activity

          People

            kelsey.schubert@mongodb.com Kelsey Schubert
            bguban Bogdan Guban
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: