Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.3.1
Affects Version/s: None
Component/s: Replication
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2019-10-07, Repl 2019-10-21
Linked BF Score: 13
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

A hang was observed in the DeleteOpIsIdBased unit test in repltests.cpp. The test performs several deletes (which create delete oplog entries) and then immediately queries the oplog, which triggers a call to waitForAllEarlierOplogWritesToBeVisible. The stack trace is approximately:

Thread 1: "testsuite" (Thread 0x7fdf3e0f7ac0 (LWP 67799))
 .
#10 0x000055f6e9f94225 in mongo::Interruptible::waitForConditionOrInterrupt
#11 mongo::WiredTigerOplogManager::waitForAllEarlierOplogWritesToBeVisible
 .
#16 0x000055f6eb3e54fb in mongo::(anonymous namespace)::FindCmd::Invocation::run
 .
#28 0x000055f6e97c822d in ReplTests::Base::applyAllOperations
#29 0x000055f6e9824b59 in ReplTests::DeleteOpIsIdBased::run

This hang was observed approximately once in Evergreen. It seems likely to be a race between the WTOplogJournalThread and the main thread: the main thread expects the WTOplogJournalThread to call _setOplogReadTimestamp, but that thread either already has or never does. As lingzhi.deng showed me, this may be because waitForAllEarlierOplogWritesToBeVisible increments _opsWaitingForVisibility to tell the journal thread that someone is waiting for it, while the journal thread checks a different member, _opsWaitingForJournal, to determine whether there are any waiters.
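
The suspected mismatch can be sketched in isolation. The following is a minimal, hypothetical model, not the actual WiredTigerOplogManager code: the struct name, the method names, and the "consult both counters" predicate are assumptions made for illustration, and all locking, condition variables, and the journal thread itself are omitted. It only shows that a waiter registered through one counter is invisible to a check that reads the other.

```cpp
#include <cassert>

// Hypothetical, simplified model of the two counters named in this report.
// It is NOT the real WiredTigerOplogManager.
struct OplogManagerModel {
    int opsWaitingForJournal = 0;     // stands in for _opsWaitingForJournal
    int opsWaitingForVisibility = 0;  // stands in for _opsWaitingForVisibility

    // What the waiting reader does, per the description above.
    void registerVisibilityWaiter() { ++opsWaitingForVisibility; }

    // The check as suspected in the report: only one counter is consulted.
    bool journalThreadSeesWaiter() const { return opsWaitingForJournal > 0; }

    // Illustrative alternative (an assumption, not the committed fix):
    // consult both counters.
    bool anyWaiterRegistered() const {
        return opsWaitingForJournal > 0 || opsWaitingForVisibility > 0;
    }
};

// Walks through the scenario: a reader registers, the one-counter check
// misses it, and the two-counter check notices it.
inline void demonstrateMissedWaiter() {
    OplogManagerModel m;
    m.registerVisibilityWaiter();
    assert(!m.journalThreadSeesWaiter());  // waiter is invisible to this check
    assert(m.anyWaiterRegistered());       // visible once both are consulted
}
```

In the real server a missed waiter would mean the journal thread keeps sleeping while the reader blocks in waitForConditionOrInterrupt, which matches the frames in the stack trace above; whether that is exactly what happened here is the hypothesis of this ticket, not something the sketch proves.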

related to

SERVER-44196 Complete TODO listed in SERVER-43399

Closed

Assignee: A. Jesse Jiryu Davis
Reporter: A. Jesse Jiryu Davis
Participants: A. Jesse Jiryu Davis, Daniel Gottlieb, Githook User, Matthew Russotto
Votes: 0
Watchers: 7

Created: Sep 20 2019 06:47:15 PM UTC
Updated: Oct 29 2023 10:17:04 PM UTC
Resolved: Oct 17 2019 04:09:54 PM UTC
Confidence Status Last Update: 01/Oct/19 7:25 PM
