Core Server / SERVER-33812

The first oplog read batch fetched during initial sync may be empty; do not treat this as an error.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.6, 3.7.6
    • Component/s: Replication
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6, v3.4

      diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      index f8c6c70196..13330a526e 100644
      --- a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp
      @@ -198,6 +198,7 @@ void WiredTigerOplogManager::_oplogJournalThreadLoop(WiredTigerSessionCache* ses
               // Publish the new timestamp value.
               _setOplogReadTimestamp(lk, newTimestamp);
               lk.unlock();
      +        sleepmillis(1500);
       
               // Wake up any await_data cursors and tell them more data might be visible now.
               oplogRecordStore->notifyCappedWaitersIfNeeded();
      

Cherry-picking f23bcbfa6d08c24b5570b3b29641f96babfc6a34 onto v3.6 also reproduces the bug on the RHEL-62 enterprise builder (the required one in Evergreen), though I haven't been able to reproduce it locally without inserting extra delays.

    • Sprint: Repl 2018-04-23
    • 52

    Description

      Initial sync currently depends on the first oplog read batch it fetches not being empty. However, an empty first batch is possible, depending on timing in _oplogJournalThreadLoop. The fix for SERVER-31679 exacerbates the problem, so this issue is currently blocking that ticket's backport to 3.6.
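
      The fix the title describes amounts to retrying when the first batch comes back empty, rather than treating it as a failure. The sketch below illustrates that retry pattern only; `FakeOplogSource` and `fetchFirstNonEmptyBatch` are hypothetical stand-ins for illustration, not the actual server API.

      ```cpp
      #include <cassert>
      #include <deque>
      #include <iostream>
      #include <optional>
      #include <vector>

      // Hypothetical stand-in for an oplog batch source. The first batch may
      // legitimately be empty if the reader races with the oplog journal
      // thread between publishing the new read timestamp and notifying
      // capped waiters (the window the repro diff widens with sleepmillis).
      struct FakeOplogSource {
          std::deque<std::vector<int>> batches;
          std::optional<std::vector<int>> next() {
              if (batches.empty()) return std::nullopt;  // source exhausted
              auto batch = batches.front();
              batches.pop_front();
              return batch;
          }
      };

      // Fetch the first non-empty batch, retrying on empty batches instead
      // of treating them as an error -- the behavior this ticket asks for.
      std::vector<int> fetchFirstNonEmptyBatch(FakeOplogSource& src, int maxRetries) {
          for (int attempt = 0; attempt <= maxRetries; ++attempt) {
              auto batch = src.next();
              if (!batch) break;                 // nothing more to read
              if (!batch->empty()) return *batch;
              // Empty batch: more data may become visible shortly; retry.
          }
          return {};
      }

      int main() {
          // First batch empty (racy visibility); the entries arrive in the second.
          FakeOplogSource src{{{}, {101, 102, 103}}};
          auto batch = fetchFirstNonEmptyBatch(src, 3);
          std::cout << batch.size() << "\n";  // prints 3
          assert(batch.size() == 3 && batch[0] == 101);
          return 0;
      }
      ```

      The key design point is that emptiness of a single batch carries no information about the oplog itself, only about the visibility race, so the caller loops rather than aborting.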

      Attachments

        Issue Links

          Activity

            People

              Assignee: benety.goh@mongodb.com Benety Goh
              Reporter: geert.bosch@mongodb.com Geert Bosch
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: