Core Server / SERVER-33812

The first oplog batch fetched during initial sync may be empty; do not treat this as an error.

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 3.6.6, 3.7.6
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6, v3.4
    • Steps to reproduce:
      diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      index f8c6c70196..13330a526e 100644
      --- a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      @@ -198,6 +198,7 @@ void WiredTigerOplogManager::_oplogJournalThreadLoop(WiredTigerSessionCache* ses
               // Publish the new timestamp value.
               _setOplogReadTimestamp(lk, newTimestamp);
               lk.unlock();
      +        sleepmillis(1500);
       
               // Wake up any await_data cursors and tell them more data might be visible now.
               oplogRecordStore->notifyCappedWaitersIfNeeded();
      

      Cherry-picking f23bcbfa6d08c24b5570b3b29641f96babfc6a34 onto v3.6 also reproduces the bug with the RHEL-62 enterprise builder (the required one on evergreen), though I haven't been able to reproduce locally without inserting extra delays.
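
      For context, here is the patched region again with explanatory comments (paraphrased from the diff above; the comment on the added sleepmillis line describes the intent of the repro change as I read it and is not part of the original patch):

          // Inside WiredTigerOplogManager::_oplogJournalThreadLoop:

          // Publish the new timestamp value.
          _setOplogReadTimestamp(lk, newTimestamp);
          lk.unlock();

          // Repro aid only: delay waking await_data waiters after the new oplog
          // read timestamp has been published. Per this ticket, the extra delay
          // makes it much more likely that the first oplog batch fetched by an
          // initial-syncing node comes back empty.
          sleepmillis(1500);

          // Wake up any await_data cursors and tell them more data might be visible now.
          oplogRecordStore->notifyCappedWaitersIfNeeded();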

    • Sprint: Repl 2018-04-23
    • 52

      Currently, initial sync depends on the first oplog batch it fetches not being empty. However, an empty first batch is possible, depending on timing in _oplogJournalThreadLoop. The fix for SERVER-31679 makes an empty first batch more likely, so this issue is blocking the backport of that fix to 3.6.
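
      A minimal sketch of the intended handling (illustrative only; the types, names, and callback below are assumptions for clarity, not the actual OplogFetcher interface):

          #include <string>
          #include <vector>

          struct OplogEntry {
              unsigned long long ts;  // timestamp of the entry
              std::string op;         // operation type
          };

          enum class BatchResult { kKeepFetching, kFail };

          // Called for each batch returned by the tailing oplog cursor on the
          // sync source during initial sync. 'isFirstBatch' is true for the
          // reply to the initial find command.
          BatchResult onOplogBatch(const std::vector<OplogEntry>& batch, bool isFirstBatch) {
              if (batch.empty()) {
                  // An empty batch -- including an empty *first* batch -- simply
                  // means no oplog entries were visible yet (e.g. due to timing
                  // in _oplogJournalThreadLoop on the sync source). Keep tailing
                  // the cursor rather than failing initial sync.
                  return BatchResult::kKeepFetching;
              }
              if (isFirstBatch) {
                  // With a non-empty first batch, the first entry would normally
                  // be checked against the requested start point (elided here).
              }
              // Enqueue the entries for the oplog applier (elided here).
              return BatchResult::kKeepFetching;
          }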

            Assignee: Benety Goh (benety.goh@mongodb.com)
            Reporter: Geert Bosch (geert.bosch@mongodb.com)
            Votes: 0
            Watchers: 9

              Created:
              Updated:
              Resolved: