Core Server / SERVER-23841

mongod always aborts with "Fatal assertion 18750 UnrecoverableRollbackError" after abnormal termination

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.2.5
    • Component/s: Replication
    • Labels: None
    • Replication
    • Fully Compatible
    • ALL
    • Steps to reproduce:

      1) Kill the primary mongod of a replica set during CRUD operations.
      2) Restart the primary mongod.
    • Repl 15 (06/03/16), Repl 16 (06/24/16)

      The MongoDB server always reports "Fatal assertion 18750" after an abnormal mongod termination.

      Sometimes the mongod server hangs and becomes almost completely unresponsive (SERVER-23778), so I can't shut it down normally and have to kill it with SIGKILL. At other times Linux kills mongod because of excessive memory usage. Either way, after an abnormal shutdown mongod always reports this assertion error during startup and stops replicating.

      2016-04-20T22:51:04.192+0900 I REPL     [ReplicationExecutor] This node is test02-mongo3:27017 in the config
      2016-04-20T22:51:04.193+0900 I REPL     [ReplicationExecutor] Member test02-mongo1:27017 is now in state SECONDARY
      2016-04-20T22:51:04.193+0900 I REPL     [ReplicationExecutor] Member test02-mongo2:27017 is now in state PRIMARY
      2016-04-20T22:51:05.183+0900 I REPL     [ReplicationExecutor] syncing from: test02-mongo2:27017
      2016-04-20T22:51:05.184+0900 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to test02-mongo2:27017
      2016-04-20T22:51:05.185+0900 I ASIO     [NetworkInterfaceASIO-BGSync-0] Successfully connected to test02-mongo2:27017
      2016-04-20T22:51:05.187+0900 I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: 15, timestamp: Apr 20 10:20:00:7). source's GTE: (term: 16, timestamp: Apr 20 10:20:00:7) hashes: (-157997914050892151/-8410528485294631657)
      2016-04-20T22:51:05.187+0900 I -        [rsBackgroundSync] Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, but in inconsistent state. minvalid: (term: 16, timestamp: Apr 20 10:20:01:2) > our last optime: (term: 15, timestamp: Apr 20 10:20:00:7)
      2016-04-20T22:51:05.187+0900 I -        [rsBackgroundSync]
      
      ***aborting after fassert() failure
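
To make the failing condition in the log above easier to see, here is a minimal Python sketch that pulls the two optimes out of the fatal assertion line and compares them. It is only an illustration, not the server's implementation: the names `OpTime`, `OPTIME_RE`, and `parse_optimes` are hypothetical, and the sketch assumes optimes are ordered by term first and then by timestamp. For the values in this log, both components point the same way, so the result does not depend on that assumption.

```python
import re
from typing import NamedTuple

# Month abbreviations as they appear in the log timestamps.
MONTHS = {name: index for index, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


class OpTime(NamedTuple):
    """Hypothetical stand-in for an optime: compared by term, then timestamp."""
    term: int
    timestamp: tuple  # (month, day, hour, minute, second, increment)


# Matches "(term: 16, timestamp: Apr 20 10:20:01:2)" as printed in the log.
OPTIME_RE = re.compile(
    r"\(term: (\d+), timestamp: (\w{3}) +(\d+) (\d+):(\d+):(\d+):(\d+)\)")


def parse_optimes(line):
    """Return every optime found in a log line, in order of appearance."""
    optimes = []
    for term, month, *rest in OPTIME_RE.findall(line):
        optimes.append(OpTime(int(term), (MONTHS[month], *map(int, rest))))
    return optimes


# The fatal assertion message copied from the log above.
LINE = ("Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, "
        "but in inconsistent state. minvalid: (term: 16, timestamp: Apr 20 "
        "10:20:01:2) > our last optime: (term: 15, timestamp: Apr 20 10:20:00:7)")

minvalid, last_optime = parse_optimes(LINE)

# The node refuses to roll back because minvalid is ahead of its last optime.
inconsistent = minvalid > last_optime
print(minvalid, last_optime, inconsistent)
```

In this log, minvalid has both a higher term (16 vs. 15) and a later timestamp than the node's last optime, which is the condition the assertion message describes.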
      

      Is this normal behavior, or might something be wrong with my configuration?
      And is there any fast way to recover without a full resync?

      I found a few similar bug reports in JIRA, but they say the problem has already been fixed,
      so I am reporting this issue again.

      Thanks.

            Assignee: backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter: sunguck.lee@gmail.com 아나 하리
            Votes: 3
            Watchers: 22
