[SERVER-77937] Wiredtiger panic error in batch_writes.js under FCV upgrade/downgrade suite (TSAN variant) Created: 09/Jun/23 Updated: 05/Feb/24 Resolved: 05/Feb/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adi Zaimi | Assignee: | Adi Zaimi |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Sprint: | Sharding NYC 2023-06-12, Sharding NYC 2023-06-26, Sharding NYC 2023-07-10, Sharding NYC 2023-07-24, Sharding NYC 2023-08-07, Sharding NYC 2023-08-21, Sharding NYC 2023-09-04, Sharding NYC 2023-09-18, Sharding NYC 2023-10-02, Sharding NYC 2023-10-16, Sharding NYC 2023-10-30, Cluster Scalability 2023-11-13, Cluster Scalability 2023-11-27, Cluster Scalability 2023-12-11, Cluster Scalability 2023-12-25, Cluster Scalability 2024-1-8, Cluster Scalability 2024-1-22, Cluster Scalability 2024-2-5, Cluster Scalability 2024-2-19 |
| Participants: |
| Description |
|
batch_writes.js fails with a WiredTiger error message: [j6:prim] | 2023-06-02T16:00:41.567+00:00 E WT 22435 [thread341] "WiredTiger error message","attr": |
| Comments |
| Comment by Alexander Gorrod [ 16/Jun/23 ] | ||||||||||||||||||||
|
WiredTiger has many different things that are called "pages", and I think your definition of a page being 4k is probably based on a disk page concept. WiredTiger generally uses variable sized pages, with configuration options that give a guideline about how big they should ideally be. The particular page type I'm referring to here is an in-memory page. WiredTiger by default allows in-memory pages to grow up to 10MB before they are split into multiple pages. I say "by default", since for applications that create large documents, the page size can grow larger than that, since we also ensure that there is a minimum number of documents on each page. To further confuse things - WiredTiger is an MVCC engine, so each time a document is updated, a new copy of that document is kept. So: a workload that creates a single (large) document and then makes updates to that document might be using a lot of space for that in-memory page. And one more bit of relevant information - I believe that the collection size reported by coll.stats() corresponds to the on-disk space used for the collection, which will be the size needed after compression is applied. Compression is usually effective with large documents - so the on-disk size might not be reflective of the data volume in the collection. | ||||||||||||||||||||
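As a rough illustration of the MVCC point above (all numbers here are hypothetical assumptions, not measurements from this failure): each retained update keeps another copy of the document on the in-memory page, so a page holding one large, repeatedly updated document can occupy far more cache than the compressed on-disk "size" that coll.stats() reports.

```python
# Hypothetical back-of-envelope model of MVCC update copies pinned in cache.
# DOC_SIZE_MB matches the 15MB documents this test inserts; the update count
# and compression ratio are illustrative assumptions, not from this ticket.
DOC_SIZE_MB = 15
NUM_UPDATES = 7          # assumed number of update copies retained on the page
COMPRESSION_RATIO = 0.3  # assumed on-disk compression for large documents

# Each update keeps another copy on the in-memory page, so the page grows
# roughly linearly with the number of retained updates.
in_memory_mb = DOC_SIZE_MB * (1 + NUM_UPDATES)

# coll.stats() "size" reflects on-disk space after compression, which can be
# far smaller than what the same data occupies in cache.
on_disk_mb = DOC_SIZE_MB * COMPRESSION_RATIO

print(in_memory_mb)  # 120
print(on_disk_mb)    # 4.5
```

With these assumed numbers, the in-memory page footprint (~120MB) dwarfs the on-disk size (~4.5MB), which is the shape of the discrepancy being discussed.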
| Comment by Adi Zaimi [ 15/Jun/23 ] | ||||||||||||||||||||
|
Thank you alexander.gorrod@mongodb.com – I have been running the test on repeat for about 20 hours and it has not failed in the same place. I just realized, though, that the variant that failed here is 'TSAN Enterprise RHEL 8.0 DEBUG', so I should really retry with that variant. One interesting fact is that the test in question should not cause the collections to be that large – dumping coll.stats() for the two collections used by this test reports "size" : 31457330 (31MB) for both. To understand better, when you say 'there is a single dirty page with 107MB of updates on it', does that mean there are several updates (or simply 107MB of updates) pending on that one page? (I'm confused, since I thought a page was small – 4k? ) Thanks. | ||||||||||||||||||||
| Comment by Alexander Gorrod [ 14/Jun/23 ] | ||||||||||||||||||||
|
adi.zaimi@mongodb.com - the dump from WiredTiger when the failure happens indicates that there are 5 active sessions - each of those is completing a cursor operation of some sort (search, next, search_near or get_value). I'm not sure where those sessions are coming from. If you can reproduce locally in a debugger you might be able to set a breakpoint where the failure happens in WiredTiger and figure out what those sessions are doing in MongoDB? | ||||||||||||||||||||
| Comment by Adi Zaimi [ 13/Jun/23 ] | ||||||||||||||||||||
|
Hi alexander.gorrod@mongodb.com , I'm trying to understand what might be going on here – under what circumstances would the pages be pinned and unable to be unpinned so the workload can proceed? Are we running out of space? Does the drop not clean up? This test is supposed to downgrade the FCV, which causes all MongoDB transactions to be aborted. I'm not sure how that might interact with WiredTiger in this test. | ||||||||||||||||||||
| Comment by Adi Zaimi [ 13/Jun/23 ] | ||||||||||||||||||||
|
It seems there are 6 drops on collection "test.batch_writes_out", but in my testing I get about twice that number, suggesting that we are stuck inside the aggregation stage in the test. It is possible that the aggregation stage wrote 107MB of data, as Alexander states above. I don't understand how that may be related to the "setFeatureCompatibilityVersion command called" log received at '2023-06-02T15:54:35.115+00:00', about 6 minutes before the WiredTiger event (2023-06-02T16:00:41.495+00:00) – maybe that corresponds to the timeout after which WiredTiger fails. | ||||||||||||||||||||
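The gap between the two timestamps quoted above can be checked directly; it comes out to just over 6 minutes, which is consistent with eviction having been stuck for longer than the 5-minute default timeout mentioned below before WiredTiger gave up:

```python
from datetime import datetime

# Timestamps taken verbatim from the log lines quoted in this ticket.
setfcv = datetime.fromisoformat("2023-06-02T15:54:35.115+00:00")
wt_event = datetime.fromisoformat("2023-06-02T16:00:41.495+00:00")

gap = wt_event - setfcv
print(gap.total_seconds())  # 366.38 seconds, i.e. about 6 minutes
```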
| Comment by Adi Zaimi [ 13/Jun/23 ] | ||||||||||||||||||||
|
Thank you alexander.gorrod@mongodb.com for the explanation above. The test in question (./jstests/aggregation/sources/merge/batch_writes.js) inserts two large (15MB) documents into a collection, then performs an aggregation into an output collection several times (dropping that output collection after each aggregation), and then, after dropping all collections in the test, inserts a few tiny entries into the original collection. | ||||||||||||||||||||
| Comment by Alexander Gorrod [ 12/Jun/23 ] | ||||||||||||||||||||
|
That error message means that WiredTiger has been unable to find any pages to evict for a configured timeout value. The default wait time is 5 minutes - so when it fires the cache is usually genuinely stuck. There are three different types of content that can cause the WiredTiger cache to get stuck: clean, dirty, updates. In this case it's the updates threshold that is causing the problem:
The default threshold for updates is half the dirty threshold, which is 20% of the cache size. Content is usually pinned by active transactions/operations, in this case there aren't many:
None of them have done any updates (by this time WiredTiger has usually forced update transactions to abort). There are 5 threads, and it's unusual that they pin 105MB worth of non-dirty content in cache. The two sources of pinned content tend to be uncommitted transactions and sessions with positioned cursors. There is one collection that has the relevant content:
So there is a single dirty page with 107MB of updates on it. That page probably can't be evicted for some reason (likely one of the active sessions has a reference to it). Could the test/workload be generating that situation? |
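The threshold arithmetic described above can be sketched for an assumed cache size (the excerpt doesn't show the actual cache size for this run). With the defaults quoted in this comment, the updates trigger is half the 20% dirty trigger, i.e. 10% of cache; for a hypothetical 1GB cache that is roughly 100MB, which a single page carrying 107MB of updates would exceed on its own:

```python
# Hedged sketch of the eviction-trigger arithmetic described above.
# cache_mb is an assumption; the ticket excerpt doesn't show the real value.
cache_mb = 1024

dirty_trigger_mb = 0.20 * cache_mb           # dirty threshold: 20% of cache
updates_trigger_mb = dirty_trigger_mb / 2    # updates threshold: half of dirty

page_updates_mb = 107  # the single dirty page reported in the dump

print(dirty_trigger_mb)    # 204.8
print(updates_trigger_mb)  # 102.4
print(page_updates_mb > updates_trigger_mb)  # True: one page alone is over
```

Under that assumption, the one page's update volume already exceeds the updates trigger, so eviction has nothing else it can make progress on if that page can't be evicted.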