Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: 3.6.6, 3.7.6
Affects Version/s: None
Component/s: Replication
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v3.6, v3.4
Steps To Reproduce:
Hide

diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp index f8c6c70196..13330a526e 100644 --- a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp @@ -198,6 +198,7 @@ void WiredTigerOplogManager::_oplogJournalThreadLoop(WiredTigerSessionCache* ses // Publish the new timestamp value. _setOplogReadTimestamp(lk, newTimestamp); lk.unlock(); + sleepmillis(1500); // Wake up any await_data cursors and tell them more data might be visible now. oplogRecordStore->notifyCappedWaitersIfNeeded();

Cherry-picking f23bcbfa6d08c24b5570b3b29641f96babfc6a34 onto v3.6 also reproduces the bug with the RHEL-62 enterprise builder (the required one on evergreen), though I haven't been able to reproduce locally without inserting extra delays.
Show
diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp index f8c6c70196..13330a526e 100644 --- a/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp @@ -198,6 +198,7 @@ void WiredTigerOplogManager::_oplogJournalThreadLoop(WiredTigerSessionCache* ses // Publish the new timestamp value. _setOplogReadTimestamp(lk, newTimestamp); lk.unlock(); + sleepmillis(1500); // Wake up any await_data cursors and tell them more data might be visible now. oplogRecordStore->notifyCappedWaitersIfNeeded(); Cherry-picking f23bcbfa6d08c24b5570b3b29641f96babfc6a34 onto v3.6 also reproduces the bug with the RHEL-62 enterprise builder (the required one on evergreen), though I haven't been able to reproduce locally without inserting extra delays.
Sprint:
Repl 2018-04-23
Linked BF Score:
52
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Currently we depend on an initial sync not receiving an initial batch that is empty. However, this is a possibility, depending on timing in the _oplogJournalThreadLoop. The fix for ~~SERVER-31679~~ exacerbates this issue, blocking its backport to 3.6. So, currently this issue is blocking the backport of that ticket.

is related to

SERVER-30927 Use readConcern afterClusterTime for initsync oplog queries

Closed

SERVER-30977 Need to sign cluster times for unsharded replica sets

Closed

SERVER-31007 Calculate rollback time limit correctly

Closed

SERVER-34768 Rollback can fail if run against a lagged node that catches up

Closed

SERVER-35256 Do not treat it as an error if the first batch returned by an oplog query comes back empty in master-slave

Closed

SERVER-42910 Oplog query with higher timestamp but lower term than the sync source shouldn't time out due to afterClusterTime

Closed

SERVER-29213 Have KVWiredTigerEngine implement StorageEngine::recoverToStableTimestamp

Closed

related to

SERVER-31679 Increase in disk i/o for writes to replica set

Closed

(2 is related to, 1 related to)

Assignee:: Benety Goh
Reporter:: Geert Bosch
Participants:: Benety Goh, Daniel Gottlieb, Eric Milkie, Geert Bosch, Githook User, Siyuan Zhou, Spencer Brody
Votes:: 0 Vote for this issue
Watchers:: 9 Start watching this issue

Created:: Mar 12 2018 01:58:38 PM UTC
Updated:: Oct 29 2023 10:33:53 PM UTC
Resolved:: Apr 18 2018 01:05:46 PM UTC

Details

Description

Attachments

Issue Links

Activity

People

Dates