  1. Core Server
  2. SERVER-17055

Secondary cannot keep up once the oplog hits its cap with WiredTiger

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • None
    • Affects Version/s: 3.0.0-rc6
    • Component/s: Replication, WiredTiger
    • Labels:
      None
    • ALL

      During a long concurrent write workload, the secondary cannot keep up once the oplog hits its cap. Eventually the secondary falls into recovery mode and stops replicating.

      Log output from before the oplog hits its cap:

      +++++++
      ts: Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
              0 secs (0 hrs) behind the primary      <=== secondary is keeping up
      configured oplog size:   6515.767382621765MB
      log length start to end: 358secs (0.1hrs)     <==== oplog
      oplog first event time:  Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 17:54:01 GMT+0000 (UTC)
      
      ...
      
      +++++++
      ts: Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:13 GMT+0000 (UTC)
              1 secs (0 hrs) behind the primary       <=== secondary lag still minimal
      configured oplog size:   6515.767382621765MB
      log length start to end: 1091secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)
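
A minimal sketch of how the two figures in each sample relate to its timestamps, using the 18:06:14 sample above. The helper names `parse_ts` and `lag_and_window` are hypothetical, and measuring lag against the primary's last oplog event from the same sample is an assumption about how the numbers were produced.

```python
from datetime import datetime

# Timestamp layout used by the samples, without the "GMT+0000 (UTC)" suffix.
FMT = "%a %b %d %Y %H:%M:%S"


def parse_ts(text):
    """Parse a timestamp such as 'Sun Jan 25 2015 18:06:13 GMT+0000 (UTC)'."""
    # Every sample in this report is in UTC, so the suffix is simply dropped.
    return datetime.strptime(text.split(" GMT")[0], FMT)


def lag_and_window(synced_to, first_event, last_event):
    """Return (secondary lag, oplog window) in seconds for one sample."""
    synced = parse_ts(synced_to)
    first = parse_ts(first_event)
    last = parse_ts(last_event)
    lag = (last - synced).total_seconds()    # how far the secondary trails
    window = (last - first).total_seconds()  # time span the oplog covers
    return lag, window


lag, window = lag_and_window(
    "Sun Jan 25 2015 18:06:13 GMT+0000 (UTC)",  # syncedTo
    "Sun Jan 25 2015 17:48:03 GMT+0000 (UTC)",  # oplog first event time
    "Sun Jan 25 2015 18:06:14 GMT+0000 (UTC)",  # oplog last event time
)
print(lag, window)  # 1.0 1091.0, matching "1 secs behind" and "1091secs"
```

While the oplog is still filling, the first event time stays fixed at 17:48:03 and the window simply grows with the run time.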
      
      

      Once the oplog hits its cap, the secondary cannot keep up and eventually enters recovery mode:

      +++++++
      ts: Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:52 GMT+0000 (UTC)
              2 secs (0 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1073secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:01 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
      
      +++++++
      ts: Sun Jan 25 2015 18:07:04 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:54 GMT+0000 (UTC)
              10 secs (0 hrs) behind the primary   <=== secondary starts to fall behind
      configured oplog size:   6515.767382621765MB
      log length start to end: 1081secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:16 GMT+0000 (UTC)  <<== oplog hits cap, starts rolling over
      oplog last event time:   Sun Jan 25 2015 18:07:17 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:07:17 GMT+0000 (UTC)
      
      +++++++
      ts: Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:06:58 GMT+0000 (UTC)
              29 secs (0.01 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1082secs (0.3hrs)
      oplog first event time:  Sun Jan 25 2015 17:49:25 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:07:27 GMT+0000 (UTC)
      
      ...
      
      +++++++
      ts: Sun Jan 25 2015 18:17:29 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:08:22 GMT+0000 (UTC)
              547 secs (0.15 hrs) behind the primary  <<=== secondary lag grows fast
      configured oplog size:   6515.767382621765MB
      log length start to end: 1131secs (0.31hrs)
      oplog first event time:  Sun Jan 25 2015 17:58:53 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:17:44 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:17:44 GMT+0000 (UTC)
      
      +++++++
      ts: Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
      source: 172.31.35.229:27017
              syncedTo: Sun Jan 25 2015 18:08:24 GMT+0000 (UTC)
              570 secs (0.16 hrs) behind the primary
      configured oplog size:   6515.767382621765MB
      log length start to end: 1133secs (0.31hrs)
      oplog first event time:  Sun Jan 25 2015 17:59:01 GMT+0000 (UTC)
      oplog last event time:   Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
      now:                     Sun Jan 25 2015 18:17:54 GMT+0000 (UTC)
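
A rough back-of-the-envelope sketch of how fast the secondary is applying operations after the rollover, based on the 18:07:27 and 18:17:54 samples above. It assumes both rates stay constant (a simple linear extrapolation, not a measurement), and the helper `t` and the tuple layout are hypothetical.

```python
from datetime import datetime


def t(hms):
    """Build a datetime on 2015-01-25 (the day of the test run) from HH:MM:SS."""
    return datetime.strptime("2015-01-25 " + hms, "%Y-%m-%d %H:%M:%S")


# (sample time, secondary syncedTo, oplog first event time)
a = (t("18:07:27"), t("18:06:58"), t("17:49:25"))
b = (t("18:17:54"), t("18:08:24"), t("17:59:01"))

elapsed = (b[0] - a[0]).total_seconds()  # wall-clock time between samples
applied = (b[1] - a[1]).total_seconds()  # oplog time the secondary applied
apply_ratio = applied / elapsed          # 1.0 would mean "keeping up"

# Margin: how far the secondary's position is ahead of the oldest oplog entry.
margin_a = (a[1] - a[2]).total_seconds()
margin_b = (b[1] - b[2]).total_seconds()
shrink_rate = (margin_a - margin_b) / elapsed  # margin lost per second
seconds_left = margin_b / shrink_rate          # linear extrapolation only

print(f"applied {applied:.0f}s of oplog in {elapsed:.0f}s "
      f"(ratio {apply_ratio:.2f})")
print(f"margin {margin_a:.0f}s -> {margin_b:.0f}s, "
      f"about {seconds_left:.0f}s until it reaches zero")
```

Under these assumptions the secondary applies about 86 seconds of oplog in 627 seconds of wall-clock time, and its margin over the oldest oplog entry shrinks from 1053 to 563 seconds. At that pace the margin would run out roughly 12 minutes after the last sample, which is consistent with the secondary eventually entering recovery mode.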
      

            Assignee:
            Unassigned
            Reporter:
            rui.zhang Rui Zhang (Inactive)
            Votes:
            0
            Watchers:
            4
