Core Server / SERVER-66250

WT_PANIC when non-fsynced writes are lost

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 4.4.9
    • Component/s: None
    • Labels: None

      I'm afraid a straightforward repro case will have to wait for the release of LazyFS (hopefully in a few weeks), but if you have the opportunity to stub out the loss of un-fsynced writes yourself, that might be one possible way to reproduce this!

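      To make "stubbing out the loss of un-fsynced writes" concrete, here's a toy sketch of the fault being injected (hypothetical Python, not LazyFS or mongod code): writes are buffered in memory and only become durable once fsynced, and a simulated power loss discards everything else. LazyFS does essentially this at the FUSE layer against the real data directory.

      # Toy illustration only -- LossyFile is a hypothetical stand-in for the
      # LazyFS page cache, which makes writes durable only when fsync is called.
      import os

      class LossyFile:
          def __init__(self, path):
              self.path = path
              self.durable = b""    # bytes that survived an fsync
              self.buffered = b""   # bytes written but never fsynced

          def write(self, data):
              self.buffered += data

          def fsync(self):
              # Flush the "page cache" to the backing store.
              self.durable += self.buffered
              self.buffered = b""

          def power_loss(self):
              # Simulated crash: un-fsynced writes vanish, durable bytes remain.
              self.buffered = b""
              with open(self.path, "wb") as f:
                  f.write(self.durable)
                  f.flush()
                  os.fsync(f.fileno())

      f = LossyFile("/tmp/lossy.dat")
      f.write(b"checkpointed data\n")
      f.fsync()                              # survives the crash
      f.write(b"recent, un-fsynced data\n")
      f.power_loss()                         # the second write is gone
      print(open("/tmp/lossy.dat", "rb").read())   # b'checkpointed data\n'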

      When mongod 4.4.9's data directory loses writes which were not explicitly fsynced, attempts to restart mongod can throw WT_PANIC during the WAL recovery process.

      LazyFS is a new FUSE filesystem (presently unreleased, but we hope to publish it shortly) which simulates the effects of power loss by maintaining an in-memory page cache for all writes, and only flushing pages to its backing store when explicitly asked to fsync. We've integrated LazyFS into Jepsen by mounting /var/lib/mongodb on LazyFS, and after killing mongod, we ask LazyFS to drop any unfsynced writes. Here's an example of this test in action. Nodes n1, n3, and n6 all panicked on restart. Here's n3 beginning recovery:

      {"t":{"$date":"2022-05-05T10:59:26.822-04:00"},"s":"W",  "c":"STORAGE",  "id":22302,   "ctx":"initandlisten","msg":"Recovering data from the last clean checkpoint."}
      ...
      {"t":{"$date":"2022-05-05T10:59:59.581-04:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1651762799:581115][1200823:0x7f257df12cc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 1 through 2"}}
      {"t":{"$date":"2022-05-05T11:00:00.043-04:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1651762800:43098][1200823:0x7f257df12cc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 2 through 2"}}
      {"t":{"$date":"2022-05-05T11:00:01.001-04:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1651762801:623][1200823:0x7f257df12cc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Main recovery loop: starting at 1/92800 to 2/256"}}

      Roughly 35 seconds later, the process crashed, complaining that pread had failed to read a byte at a specific offset in the WiredTiger.turtle file:

      {"t":{"$date":"2022-05-05T11:00:01.475-04:00"},"s":"E",  "c":"STORAGE",  "id":22435,   "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":-31802,"message":"[1651762801:475745][1200823:0x7f257df12cc0], file:WiredTiger.wt, WT_CURSOR.next: __posix_file_read, 428: /var/lib/mongodb/WiredTiger.turtle: handle-read: pread: failed to read 1 bytes at offset 1456: WT_ERROR: non-specific WiredTiger error"}}
      {"t":{"$date":"2022-05-05T11:00:01.475-04:00"},"s":"E",  "c":"STORAGE",  "id":22435,   "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":-31809,"message":"[1651762801:475901][1200823:0x7f257df12cc0], file:WiredTiger.wt, WT_CURSOR.next: __wt_turtle_read, 387: WiredTiger.turtle: fatal turtle file read error: WT_TRY_SALVAGE: database corruption detected"}}
      {"t":{"$date":"2022-05-05T11:00:01.475-04:00"},"s":"E",  "c":"STORAGE",  "id":22435,   "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":-31804,"message":"[1651762801:475935][1200823:0x7f257df12cc0], file:WiredTiger.wt, WT_CURSOR.next: __wt_turtle_read, 387: the process must exit and restart: WT_PANIC: WiredTiger library panic"}} 

      Since this is a new filesystem and we haven't sanded off all the bugs yet, I can't rule out that LazyFS itself introduced some corruption beyond what the loss of unsynced writes alone would cause, but it does seem odd that the whole process had to crash given that at least some of the WAL was apparently readable. Does this sound like expected behavior?
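      For reference, the crash/restart cycle described above is roughly the following (a hypothetical sketch, not the actual Jepsen or LazyFS tooling; lazyfs-drop-unsynced is a stand-in name for whatever control interface LazyFS ends up shipping):

      # Hypothetical sketch of the fault-injection loop. Assumes /var/lib/mongodb
      # is mounted on LazyFS and mongod runs as a systemd unit named "mongod".
      import subprocess
      import time

      def sh(cmd):
          # Stand-in for Jepsen's remote command execution on a test node.
          subprocess.run(cmd, shell=True, check=True)

      def crash_and_lose_unsynced_writes():
          sh("pkill -9 mongod")                        # kill mongod as if the power failed
          sh("lazyfs-drop-unsynced /var/lib/mongodb")  # hypothetical command: discard un-fsynced pages
          sh("systemctl start mongod")                 # restart; watch the log for WT_PANIC

      for _ in range(10):
          time.sleep(60)                               # let the workload run for a while
          crash_and_lose_unsynced_writes()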

            Assignee: Keith Smith (keith.smith@mongodb.com)
            Reporter: Kyle Kingsbury (aphyr@jepsen.io)
            Votes: 0
            Watchers: 17
