Core Server / SERVER-23366

rsBackgroundSync failure

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.2.4
    • Component/s: Admin

      Hi,

      After one of my replica set members ended up with corrupted data (WiredTiger), I started mongod with the --repair option, as suggested here and in other places. The repair took 2 days because the database is large. Now, when I try to start the member, I get the error below. I saw that it was fixed in 3.2, but I am already on the latest version:

      2016-03-27T23:25:21.371+0200 I NETWORK  [conn43] end connection 10.0.0.6:59190 (3 connections now open)
      2016-03-27T23:25:21.372+0200 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 468949, protocolVersion: 1, members: [ { _id: 2, host: "mongodb-replica2:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "mongodb-arbiter:30000", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 4, host: "mongodb-replica1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
      2016-03-27T23:25:21.372+0200 I REPL     [ReplicationExecutor] This node is mongodb-replica1:27017 in the config
      2016-03-27T23:25:21.372+0200 I REPL     [ReplicationExecutor] transition to RECOVERING
      2016-03-27T23:25:21.374+0200 I REPL     [ReplicationExecutor] Member mongodb-replica2:27017 is now in state PRIMARY
      2016-03-27T23:25:21.376+0200 I NETWORK  [initandlisten] connection accepted from 10.0.0.6:59191 #44 (4 connections now open)
      2016-03-27T23:25:21.377+0200 I ASIO     [NetworkInterfaceASIO-Replication-0] Successfully connected to mongodb-arbiter:30000
      2016-03-27T23:25:21.380+0200 I REPL     [ReplicationExecutor] Member mongodb-arbiter:30000 is now in state ARBITER
      2016-03-27T23:25:21.509+0200 I REPL     [ReplicationExecutor] syncing from: mongodb-replica2:27017
      2016-03-27T23:25:21.515+0200 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to mongodb-replica2:27017
      2016-03-27T23:25:21.519+0200 I ASIO     [NetworkInterfaceASIO-BGSync-0] Successfully connected to mongodb-replica2:27017
      2016-03-27T23:25:21.525+0200 I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: (term: 22, timestamp: Mar 25 05:28:38:29). source's GTE: (term: 21, timestamp: Mar 25 05:28:38:29) hashes: (-484014505077360402/3917758058131207127)
      2016-03-27T23:25:21.525+0200 I -        [rsBackgroundSync] Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, but in inconsistent state. minvalid: (term: 23, timestamp: Mar 25 05:29:07:22) > our last optime: (term: 22, timestamp: Mar 25 05:28:38:29)
      2016-03-27T23:25:21.525+0200 I -        [rsBackgroundSync]
      
      ***aborting after fassert() failure
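      The two [rsBackgroundSync] lines above each compare a pair of optimes. As a rough aid for reading them, here is a minimal Python sketch that pulls the optimes out of those messages and compares them. It assumes optimes order by term first and then by timestamp, and it is not the server's own code; `parse_optimes` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Matches "(term: 22, timestamp: Mar 25 05:28:38:29)" as printed in the log above.
OPTIME_RE = re.compile(
    r"\(term: (-?\d+), timestamp: (\w{3} +\d+ \d\d:\d\d:\d\d):(\d+)\)"
)

def parse_optimes(line):
    """Return every optime in a log line as (term, wall_clock, increment)."""
    optimes = []
    for term, stamp, inc in OPTIME_RE.findall(line):
        # The log omits the year, so strptime falls back to 1900. That is
        # fine for comparing entries that belong to the same year.
        when = datetime.strptime(stamp, "%b %d %H:%M:%S")
        optimes.append((int(term), when, int(inc)))
    return optimes

rollback_line = (
    "Starting rollback due to OplogStartMissing: our last op time fetched: "
    "(term: 22, timestamp: Mar 25 05:28:38:29). source's GTE: "
    "(term: 21, timestamp: Mar 25 05:28:38:29)"
)
fassert_line = (
    "Fatal assertion 18750 UnrecoverableRollbackError: need to rollback, "
    "but in inconsistent state. minvalid: (term: 23, timestamp: "
    "Mar 25 05:29:07:22) > our last optime: (term: 22, timestamp: "
    "Mar 25 05:28:38:29)"
)

last_fetched, source_gte = parse_optimes(rollback_line)
minvalid, last_optime = parse_optimes(fassert_line)

# Same timestamp but different terms: the local oplog and the sync
# source disagree about the entry at that point.
same_timestamp = last_fetched[1:] == source_gte[1:]
different_term = last_fetched[0] != source_gte[0]

# Tuple comparison orders by term, then wall clock, then increment
# (an assumption made for this sketch).
behind_minvalid = minvalid > last_optime

print(same_timestamp, different_term, behind_minvalid)
```

      Read this way, the first message reports two entries with the same timestamp but different terms, and the second reports a minvalid that is ahead of the member's last optime, which is the condition named in the fatal assertion.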
      

      Any idea why this is happening? I am still within my oplogWindow.

      Should I do a full resync, or is there a way to bring this member back to life?

      Many thanks,
      Maziyar

        1. WiredTiger.turtle
          0.9 kB
        2. WiredTiger.wt
          760 kB

            Assignee: Kelsey Schubert (kelsey.schubert@mongodb.com)
            Reporter: Maziyar Panahi (maziyar)
            Votes: 0
            Watchers: 7
