Core Server / SERVER-17055

Secondary cannot keep up once the oplog hits its cap with WiredTiger


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0-rc6
    • Fix Version/s: None
    • Component/s: Replication, WiredTiger
    • Labels: None
    • Operating System: ALL

    Description

      During a long concurrent write workload, the secondary cannot keep up once the oplog hits its cap. Eventually the secondary falls into recovery mode and stops replicating.

      Some log output from before the oplog hits its cap:

      +++++++
      ts: Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
              0 secs (0 hrs) behind the primary      <=== secondary is keeping up
      configured oplog size:   6515.767382621765MB
      log length start to end: 358secs (0.1hrs)     <==== oplog window, still growing
      oplog first event time:  Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
       
      ...
       
      +++++++
      ts: Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:13 GMT+0000 (UTC)
              1 secs (0 hrs) behind the primary       <=== secondary lag still minimal
      configured oplog size:   6515.767382621765MB
      log length start to end: 1091secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
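
      The "log length start to end" figures in the two samples above can be checked against the first and last oplog event times. The following is a minimal Python sketch, not part of the original report; the helper names `parse_shell_time` and `oplog_window_secs` are made up for illustration, and the timestamps are copied from the samples. Because the first event time is unchanged between the two samples, the window growing from 358 to 1091 seconds indicates the oplog has not yet started to roll over.

```python
from datetime import datetime


def parse_shell_time(text):
    """Parse a timestamp such as 'Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)'.

    Everything from ' GMT' onward is dropped; all samples in this report are UTC.
    """
    return datetime.strptime(text.split(" GMT")[0], "%a %b %d %Y %H:%M:%S")


def oplog_window_secs(first_event, last_event):
    """Seconds between the oldest and newest oplog entries."""
    delta = parse_shell_time(last_event) - parse_shell_time(first_event)
    return int(delta.total_seconds())


# First pre-cap sample (17:54:01).
window_1 = oplog_window_secs(
    "Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)",
    "Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)",
)

# Second pre-cap sample (18:06:14); same first event, so nothing was truncated yet.
window_2 = oplog_window_secs(
    "Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)",
    "Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)",
)

print(window_1, window_2)  # 358 1091
```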
       

      Once the oplog hits its cap, the secondary cannot keep up and eventually enters recovery mode:

      +++++++
      ts: Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:52 GMT+0000 (UTC)
              2 secs (0 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1073secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:01 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
       
      +++++++
      ts: Sun Jan 25 2015 18:07:04 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
              10 secs (0 hrs) behind the primary   <=== secondary starts to fall behind
      configured oplog size:   6515.767382621765MB
      log length start to end: 1081secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:16 GMT+0000 (UTC)  <<== oplog hits cap, starts rolling over
      oplog last event time:   Sun Jan 25 2015 18:07:17 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:07:17 GMT+0000 (UTC)
       
      +++++++
      ts: Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:58 GMT+0000 (UTC)
              29 secs (0.01 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1082secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:25 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
       
      ...
       
      +++++++
      ts: Sun Jan 25 2015 18:17:29 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:08:22 GMT+0000 (UTC)
              547 secs (0.15 hrs) behind the primary  <<=== secondary lag grows fast
      configured oplog size:   6515.767382621765MB
      log length start to end: 1131secs (0.31hrs)
      oplog first event time:  Sun Jan 25 2015 17:58:53 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:17:44 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:17:44 GMT+0000 (UTC)
       
      +++++++
      ts: Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:08:24 GMT+0000 (UTC)
              570 secs (0.16 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1133secs (0.31hrs)
      oplog first event time:  Sun Jan 25 2015 17:59:01 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
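
      To put a number on how fast the secondary falls behind, the sketch below compares the 18:07:27 and 18:17:29 samples. It is an illustrative calculation only, not part of the original report: the helper `parse_shell_time` is a made-up name, and reading the result as a throughput ratio assumes the primary's write rate stayed roughly constant over the interval, which the report does not state.

```python
from datetime import datetime


def parse_shell_time(text):
    """Parse a timestamp such as 'Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)' (UTC)."""
    return datetime.strptime(text.split(" GMT")[0], "%a %b %d %Y %H:%M:%S")


# (ts, syncedTo) pairs copied from two samples taken after the oplog hit its cap.
ts_a = parse_shell_time("Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)")
synced_a = parse_shell_time("Sun Jan 25 2015 18:06:58 GMT+0000 (UTC)")
ts_b = parse_shell_time("Sun Jan 25 2015 18:17:29 GMT+0000 (UTC)")
synced_b = parse_shell_time("Sun Jan 25 2015 18:08:22 GMT+0000 (UTC)")

# Lag at each sample; these match the "secs behind the primary" lines.
lag_a = (ts_a - synced_a).total_seconds()
lag_b = (ts_b - synced_b).total_seconds()

# Wall-clock time between the samples versus how far syncedTo advanced.
wall_elapsed = (ts_b - ts_a).total_seconds()
applied = (synced_b - synced_a).total_seconds()

# Fraction of the primary's oplog time the secondary applied per wall-clock second
# (assumption: steady primary write rate), and lag growth per wall-clock second.
apply_ratio = applied / wall_elapsed
lag_growth = (lag_b - lag_a) / wall_elapsed

print(lag_a, lag_b, wall_elapsed, applied)
print(round(apply_ratio, 2), round(lag_growth, 2))
```

      Under that assumption, the secondary applied about 84 seconds of primary oplog time over roughly 602 seconds of wall-clock time, so the lag grows by close to one second per second, consistent with the secondary eventually falling off the oplog.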

People

    • Assignee: Unassigned
    • Reporter: Rui Zhang (rui.zhang)
    • Votes: 0
    • Watchers: 4
