Core Server / SERVER-23841

mongod always fails with "Fatal assertion 18750 UnrecoverableRollbackError" after abnormal termination

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.2.5
    • Fix Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:
      1) Kill the primary mongod of a replica set during CRUD operations.
      2) Restart the killed mongod.
    • Sprint:
      Repl 15 (06/03/16), Repl 16 (06/24/16)
    • Case:

      Description

      The MongoDB server always reports "Fatal assertion 18750" after an abnormal mongod termination.

      Sometimes the mongod server hangs and barely responds (SERVER-23778), so I cannot shut it down normally and have to kill it with SIGKILL. At other times Linux kills mongod because of excessive memory usage.
      Either way, after an abnormal shutdown mongod always hits this assertion during startup and stops replicating.
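
      A minimal sketch of the abrupt termination described above, assuming a POSIX system. The helper names `hard_kill` and `spawn_stand_in` are hypothetical, and a sleeping child process stands in for mongod so the snippet can run without a replica set. SIGKILL cannot be caught, so the target gets no chance to shut down cleanly.

      ```python
      import signal
      import subprocess
      import sys


      def spawn_stand_in() -> subprocess.Popen:
          """Start a harmless sleeping child (assumption: a real test would start mongod)."""
          return subprocess.Popen(
              [sys.executable, "-c", "import time; time.sleep(60)"]
          )


      def hard_kill(proc: subprocess.Popen) -> int:
          """Send SIGKILL, which the process cannot handle, and return its exit status."""
          proc.send_signal(signal.SIGKILL)
          return proc.wait()


      if __name__ == "__main__":
          child = spawn_stand_in()
          status = hard_kill(child)
          # On POSIX, a return code of -N means the child was terminated by signal N.
          print(status == -signal.SIGKILL)
      ```

      A kill by the Linux out-of-memory killer looks the same from the process's point of view: it is also delivered as SIGKILL, with no clean shutdown.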

      2016-04-20T22:51:04.192+0900 I REPL     [ReplicationExecutor] This node is test02-mongo3:27017 in the config
      2016-04-20T22:51:04.193+0900 I REPL     [ReplicationExecutor] Member test02-mongo1:27017 is now in state SECONDARY
      2016-04-20T22:51:04.193+0900 I REPL     [ReplicationExecutor] Member test02-mongo2:27017 is now in state PRIMARY
      2016-04-20T22:51:05.183+0900 I REPL     [ReplicationExecutor] syncing from: test02-mongo2:27017
      2016-04-20T22:51:05.184+0900 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to test02-mongo2:27017
      2016-04-20T22:51:05.185+0900 I ASIO     [NetworkInterfaceASIO-BGSync-0] Successfully connected to test02-mongo2:27017
      2016-04-20T22:51:05.187+0900 I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: 15, timestamp: Apr 20 10:20:00:7). source's GTE: (term: 16, timestamp: Apr 20 10:20:00:7) hashes: (-157997914050892151/-8410528485294631657)
      2016-04-20T22:51:05.187+0900 I -        [rsBackgroundSync] Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, but in inconsistent state. minvalid: (term: 16, timestamp: Apr 20 10:20:01:2) > our last optime: (term: 15, timestamp: Apr 20 10:20:00:7)
      2016-04-20T22:51:05.187+0900 I -        [rsBackgroundSync]
       
      ***aborting after fassert() failure
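
      The comparison reported in the fassert line above can be reproduced with a short sketch. It assumes that optimes are ordered by term first and then by timestamp, and that the last field of the printed timestamp is the increment in hexadecimal. Both are assumptions about the log format, not something the log itself states. The names `OpTime` and `parse_optimes` are hypothetical helpers, not MongoDB APIs.

      ```python
      import re
      from typing import List, NamedTuple, Tuple

      MONTHS = {
          name: index
          for index, name in enumerate(
              ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
              start=1,
          )
      }


      class OpTime(NamedTuple):
          # Tuple ordering compares term first, then timestamp (assumed ordering).
          term: int
          timestamp: Tuple[int, ...]  # (month, day, hour, minute, second, increment)


      OPTIME_RE = re.compile(
          r"\(term: (\d+), timestamp: (\w{3}) (\d+) (\d+):(\d+):(\d+):([0-9a-fA-F]+)\)"
      )


      def parse_optimes(line: str) -> List[OpTime]:
          """Return every (term, timestamp) pair found in a log line, in order."""
          found = []
          for term, mon, day, hh, mm, ss, inc in OPTIME_RE.findall(line):
              # The log prints no year, so ordering is only meaningful within one year.
              ts = (MONTHS[mon], int(day), int(hh), int(mm), int(ss), int(inc, 16))
              found.append(OpTime(int(term), ts))
          return found


      FASSERT_LINE = (
          "Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, "
          "but in inconsistent state. "
          "minvalid: (term: 16, timestamp: Apr 20 10:20:01:2) > "
          "our last optime: (term: 15, timestamp: Apr 20 10:20:00:7)"
      )

      min_valid, last_optime = parse_optimes(FASSERT_LINE)
      # True means minvalid is ahead of the node's last applied optime,
      # which is the condition the assertion message describes.
      print(min_valid > last_optime)
      ```

      Under these assumptions, minvalid (term 16) is ahead of the node's last optime (term 15), which matches the "inconsistent state" condition named in the message.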
      

      Is this normal behavior, or might something be wrong with my configuration?
      Is there a fast way to recover without a full re-sync?

      I found a few similar bug reports in JIRA, but they say the problem is already fixed,
      so I am reporting it again.

      Thanks.

              People

              • Votes: 3
              • Watchers: 23