[SERVER-61909] Hang inserting or deleting document with large number of index entries Created: 03/Dec/21  Updated: 29/Oct/23  Resolved: 22/Nov/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.2.0-rc3, 6.3.0-rc0, 6.0.5, 5.0.16

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Yujin Kang Park
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on WT-9879 Fix overcounting of session txn dirty... Closed
depends on WT-10027 Session txn dirty statistic is incorrect Closed
depends on SERVER-68739 Add WiredTiger session statistic with... Closed
depends on WT-8848 Add API to roll back and indicate tha... Closed
is depended on by SERVER-69480 TransientTransactionError label poten... Closed
is depended on by COMPASS-6311 Investigate changes in SERVER-61909: ... Closed
is depended on by TOOLS-3223 Investigate changes in SERVER-61909: ... Closed
Documented
is documented by DOCS-15751 [Server] Hang when inserting or delet... Closed
Problem/Incident
Related
related to SERVER-71750 Revert refactor into handleWriteConfl... Closed
related to SERVER-71751 Skip transaction_too_large_for_cache.... Closed
related to WT-8290 Adding a new API to the session to re... Closed
is related to SERVER-60839 Introduce a TemporarilyUnavailable er... Closed
is related to SERVER-61454 Change retry policy when txns are rol... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v6.2, v6.0, v5.0
Sprint: Execution Team 2022-10-17, Execution Team 2022-10-31, Execution Team 2022-11-14, Execution Team 2022-11-28
Participants:
Case:
Linked BF Score: 164

 Description   

Inserting a document that generates a large number of index entries can produce a large amount of dirty data in a single transaction, causing the transaction to be rolled back and retried indefinitely, resulting in a hang.

For example, on a node with a 256 MB cache, create a text index and then insert a document with a large string to be indexed, or equivalently a large number of terms to be indexed:

function repro() {

    db.c.drop()
    printjson(db.c.createIndex({x: "text"}))

    // Build a document whose text index will generate ~50,000 index entries.
    var doc = {x: []}
    for (var j = 0; j < 50000; j++)
        doc.x.push("" + Math.random() + Math.random())

    // Each insert dirties a large number of index pages in one transaction.
    for (var i = 0; i < 20; i++) {
        var start = new Date()
        db.c.insert(doc)
        print(new Date() - start, "ms")
    }
}
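
(To reproduce, start mongod with --wiredTigerCacheSizeGB 0.25 to get the 256 MB cache mentioned above, then call repro() from the shell.)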

This hangs after a few documents, with high cache pressure and the following message emitted repeatedly in the log:

{"t":\{"$date":"2021-12-03T11:43:20.820-05:00"},"s":"I", "c":"STORAGE", "id":22430, "ctx":"conn21","msg":"WiredTiger message","attr":\{"message":"oldest pinned transaction ID rolled back for eviction"}}

This effectively makes the server inoperable due to cache pressure. If it occurs on secondaries, they will stall because it prevents completion of the current batch.
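
While the repro runs, the cache pressure can be observed from another shell by polling the WiredTiger cache statistics in serverStatus. This is an illustrative sketch, not part of the original report; the statistic names are assumed to be the standard WiredTiger cache counters:

function watchCache(iterations) {
    // Poll dirty-cache usage roughly once per second.
    for (var i = 0; i < iterations; i++) {
        var cache = db.serverStatus().wiredTiger.cache
        print("dirty bytes:", cache["tracked dirty bytes in the cache"],
              "/ max bytes:", cache["maximum bytes configured"])
        sleep(1000)
    }
}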

This is a regression as these inserts complete successfully (even if somewhat slowly) in 4.2.

I think this is related to SERVER-61454, but I'm opening this as a distinct ticket because:

  • This is a somewhat different use case as the issue can be reliably created with single inserts.
  • I don't think the change described in SERVER-61454 would apply here, as the insert is the only transaction running so delaying retries would have no effect, and the issue is not related to CPU resource starvation as far as I can tell.
  • It's not clear to me where the appropriate fix would lie - query layer, retry behavior, storage engine behavior.


 Comments   
Comment by Githook User [ 20/Mar/23 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-61909 Abort operation if it cannot ever fit in WT cache

(cherry picked from commit 9ddbd3512044ca6508cb72d9a321391d45c18140)
Branch: v5.0
https://github.com/mongodb/mongo/commit/07cdae59ad2a613acc3a77e9491fa56ddcb78df3

Comment by Githook User [ 06/Feb/23 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-61909 Abort operation if it cannot ever fit in WT cache

(cherry picked from commit 9ddbd3512044ca6508cb72d9a321391d45c18140)
Branch: v6.0
https://github.com/mongodb/mongo/commit/05c9f930eaaae18b8abb7058aacc93fe0549aadf

Comment by Githook User [ 12/Dec/22 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-61909 Fix transaction_too_large_for_cache.js to requires_fcv_62
Branch: v6.2
https://github.com/mongodb/mongo/commit/5f3ed635a7232f47bdcc877ff75a0c5895f0625c

Comment by Githook User [ 02/Dec/22 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-61909 Abort operation if it cannot ever fit in WT cache

(cherry picked from commit 9ddbd3512044ca6508cb72d9a321391d45c18140)
Branch: v6.2
https://github.com/mongodb/mongo/commit/9ca8158531bf786e374275598cde7e972910827e

Comment by Yujin Kang Park [ 22/Nov/22 ]

SERVER-61909 has been completed, and a non-retryable TransactionTooLargeForCache error has been introduced. This error is thrown when a transaction rolls back and is found to be using a large portion (75% by default) of the total dirty cache limit. Since the dirty cache limit is 20% of the total cache, this means that the largest transactions may occupy at most 15% of the total size of the storage engine cache.

Some more details can be found here:
https://github.com/10gen/mongo/blob/master/src/mongo/db/catalog/README.md#transactiontoolargeforcacheexception
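
For illustration only (not stated in the ticket), the thresholds above work out as follows for the 256 MB cache used in the repro, and a client could detect the new error roughly as sketched below. Matching on the error name in errmsg is an assumption, since the ticket does not quote an error code:

// Illustrative arithmetic using the defaults quoted above.
var cacheBytes = 256 * 1024 * 1024     // total storage engine cache
var dirtyLimit = 0.20 * cacheBytes     // ~51 MB dirty-cache limit (20%)
var txnCeiling = 0.75 * dirtyLimit     // ~38 MB per transaction (15% of the cache)

// Hedged sketch of client-side handling. TransactionTooLargeForCache is
// non-retryable, so resubmitting the same write will not help.
var res = db.c.insert(doc)             // doc as built in the repro above
if (res.hasWriteError() && /TransactionTooLargeForCache/.test(res.getWriteError().errmsg)) {
    // Shrink the write (fewer index entries per document) or grow the
    // WiredTiger cache instead of retrying.
    print("insert too large for cache: " + res.getWriteError().errmsg)
}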

Comment by Githook User [ 21/Nov/22 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-61909 Abort operation if it cannot ever fit in WT cache
Branch: master
https://github.com/mongodb/mongo/commit/9ddbd3512044ca6508cb72d9a321391d45c18140

Comment by Yujin Kang Park [ 28/Oct/22 ]

Adding a dependency on SERVER-68739: because session statistics are reset after each operation within a multi-document transaction, checking the txn bytes dirty statistic on WT_ROLLBACK would report only the dirty bytes of the last operation, not of the whole transaction.

Comment by Josef Ahmad [ 20/Sep/22 ]

I'm going to re-evaluate this ticket once WT-9879 is implemented, which should resolve a source of over-reporting in the API introduced by WT-8848.

Comment by Josef Ahmad [ 17/Feb/22 ]

I'm making this ticket dependent on WT-8848, which proposes a guard-rail on the maximum amount of dirty data that can be pinned by a transaction and a specific code for rolling back due to exceeding that limit. This specific WT code will inform the server that the operation just won't fit in cache - and there's no point in retrying it - and will allow the server to return an informative MongoDB error code to the client (ExceededMemoryLimit may be a candidate). On secondaries, we might consider retrying internally with a slightly higher dirty data ceiling to cater for asymmetrically sized replica sets.

Comment by Bruce Lucas (Inactive) [ 09/Dec/21 ]

This can also occur deleting such a document.

Comment by Eric Milkie [ 03/Dec/21 ]

Ultimately, the project to permit uncommitted writes to be evicted will solve this. I don't have good ideas on how to avoid this problem in the interim, other than turning off or tuning the oldest-transaction-rolled-back behavior. The issue here is that because that rollback behavior is asynchronous and non-deterministic, it allows situations where the transaction manages to commit on a primary node but the equivalent replicated transactions on secondaries always get rolled back.
