[SERVER-57737] The index builds code path can throw WCEs and invalidate an active cursor by calling abandonSnapshot without calling save/restore cursor. Created: 16/Jun/21  Updated: 29/Oct/23  Resolved: 09/Aug/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.3, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
causes SERVER-60451 Index builds code can access an inval... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.4
Sprint: Execution Team 2021-07-12, Execution Team 2021-07-26, Execution Team 2021-08-09, Execution Team 2021-08-23
Participants:
Linked BF Score: 118

 Description   

TL;DR: MultiIndexBlock::_doCollectionScan runs a while-loop that iterates a collection cursor and inserts one document at a time into the index build bulk loaders. The write can throw a WriteConflictException (WCE), and handling it calls abandonSnapshot, which resets the active cursor. The cursor must be saved and restored around WCE handling. To limit the performance cost, we can hopefully restrict save/restore to the case where this code handles constraint violation writes, which is where the WCE is occurring. We should also check that there aren't any other write paths in there that can throw.

----------------------------------------------

Copied from the test failure diagnosis:

----------------------------------------------

The invariant stack trace goes through the MultiIndexBlock::_doCollectionScan() function, which calls getNext. The cursor reads from the collection in a while-loop, and each document it reads is inserted into the index builders' tables. The inserts go through the bulk loader, via AbstractIndexAccessMethod::BulkBuilderImpl::insert. The BulkBuilder has special handling for documents that violate index constraints: it writes them to a side table. That write to the constraint violations side table can throw a WCE and retry, and throwing the WCE causes abandonSnapshot() to be called. Lastly, Louis and I think that, since the cursor is not saved and restored around the insert, the abandonSnapshot() call must reset the cursor internally in WT.
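A minimal sketch of the failure mode described above, using toy stand-ins rather than the real server or WT types. Every name here (ToyCursor, ToySideTable, ToyWriteConflict, scanHitsResetCursor) is hypothetical, and the "invariant" is modeled as a thrown std::logic_error. It only illustrates the sequence: read a document, hit a write conflict on the side-table write, abandon the snapshot in the retry loop, then call getNext on a cursor that was never saved or restored.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-ins only; these are not the server's or WiredTiger's real types.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

struct ToyCursor {
    explicit ToyCursor(std::vector<int> d) : docs(std::move(d)) {}

    // Models getNext(): using a cursor that was reset trips an "invariant".
    std::optional<int> getNext() {
        if (!positioned) throw std::logic_error("invariant: cursor was reset");
        if (next == docs.size()) return std::nullopt;
        return docs[next++];
    }
    // Models abandonSnapshot() resetting the active cursor.
    void abandonSnapshot() { positioned = false; }

    std::vector<int> docs;
    std::size_t next = 0;
    bool positioned = true;
};

// Models the side-table write: it conflicts once, then succeeds on retry.
struct ToySideTable {
    void write(int /*doc*/) {
        if (conflictPending) {
            conflictPending = false;
            throw ToyWriteConflict();
        }
        ++writes;
    }
    bool conflictPending = true;
    int writes = 0;
};

// Models the scan loop with a WCE retry loop but no cursor save/restore.
// Returns true if the scan ended up using the reset cursor.
bool scanHitsResetCursor() {
    ToyCursor cursor({1, 2, 3});
    ToySideTable sideTable;
    try {
        while (auto doc = cursor.getNext()) {
            for (;;) {  // retry loop around the write
                try {
                    sideTable.write(*doc);
                    break;
                } catch (const ToyWriteConflict&) {
                    cursor.abandonSnapshot();  // cursor is now unpositioned
                }
            }
        }
    } catch (const std::logic_error&) {
        assert(sideTable.writes == 1);  // only the first document was written
        return true;
    }
    return false;
}
```

In this toy model the retried write itself succeeds; the problem only surfaces on the next getNext call, which matches where the invariant fired in the stack trace.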

The hybrid index build logic doesn't do a save/restore cursor around the inserts, unlike the old index build code. Dropping those calls was a significant performance improvement, as saving and restoring the cursor is very costly. We clearly need some of the saves/restores back, but Louis suggested limiting those calls to only when writing to the constraint violations side table. We do not expect many constraint violations during an index build, except in pathological cases.

We could pass the cursor down to the logic that writes out to the side table, or somewhere lower, even. Louis suggested maybe passing down a lambda so as to avoid adding a linking or code-layering violation to the lower-level code. Given the perf gains when the save/restore calls were removed with hybrid builds, we should probably sacrifice code niceties for performance. Another idea is to pass flags down to recognize when the WCE + abandonSnapshot() has triggered, so that we could save the high-level record ID before this call and use it afterwards to forward the cursor back to where we need it. That is what save/restore does anyway: save the record ID on save, and seek to that record ID on restore.
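A rough sketch of the lambda idea, again with toy stand-ins: ToyRecoveryUnit, ToyCursor, ToyBulkBuilder, scanWithHooks and the hook parameter names are all hypothetical and are not the server's real types or signatures. It assumes save() remembers the last returned record ID and restore() seeks past it, as described above, and that duplicate keys stand in for constraint violations. The hooks run only on the side-table path, so the common insert path pays nothing.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <utility>

// Toy stand-ins only; none of these are the server's real types or signatures.
using ToyRecordId = long long;

struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

struct ToyRecoveryUnit {
    void abandonSnapshot() { ++epoch; }  // invalidates cursors from older epochs
    int epoch = 0;
};

class ToyCursor {
public:
    ToyCursor(const std::map<ToyRecordId, int>& coll, const ToyRecoveryUnit& ru)
        : _coll(coll), _ru(ru), _it(coll.begin()), _epoch(ru.epoch) {}

    // Returns the next (record id, key) pair, or nullopt at the end.
    std::optional<std::pair<ToyRecordId, int>> getNext() {
        if (_epoch != _ru.epoch) throw std::logic_error("invariant: cursor was reset");
        if (_it == _coll.end()) return std::nullopt;
        auto rec = std::make_pair(_it->first, _it->second);
        _last = rec.first;
        ++_it;
        return rec;
    }
    // save: remember the record ID of the last returned document.
    void save() { _saved = _last; }
    // restore: seek to just past the saved record ID on the current snapshot.
    void restore() {
        _it = _saved ? _coll.upper_bound(*_saved) : _coll.begin();
        _epoch = _ru.epoch;
    }

private:
    const std::map<ToyRecordId, int>& _coll;
    const ToyRecoveryUnit& _ru;
    std::map<ToyRecordId, int>::const_iterator _it;
    int _epoch;
    std::optional<ToyRecordId> _last, _saved;
};

// Lower-level code: it never sees the cursor, only the two hooks.
struct ToyBulkBuilder {
    explicit ToyBulkBuilder(ToyRecoveryUnit& r) : ru(r) {}

    void insert(int key,
                const std::function<void()>& saveCursor,
                const std::function<void()>& restoreCursor) {
        if (seenKeys.insert(key).second) return;  // common case: no hooks run
        saveCursor();
        for (;;) {  // WCE retry loop around the side-table write
            try {
                if (conflictPending) {
                    conflictPending = false;
                    throw ToyWriteConflict();
                }
                ++sideTableWrites;
                break;
            } catch (const ToyWriteConflict&) {
                ru.abandonSnapshot();
            }
        }
        restoreCursor();
    }

    ToyRecoveryUnit& ru;
    std::set<int> seenKeys;
    int sideTableWrites = 0;
    bool conflictPending = true;  // the first side-table write conflicts once
};

struct ToyScanResult {
    int docsScanned = 0, saves = 0, restores = 0, sideTableWrites = 0;
};

ToyScanResult scanWithHooks() {
    // Record 3 repeats key 10, so it takes the side-table path.
    const std::map<ToyRecordId, int> coll{{1, 10}, {2, 20}, {3, 10}, {4, 30}};
    ToyRecoveryUnit ru;
    ToyCursor cursor(coll, ru);
    ToyBulkBuilder builder(ru);
    ToyScanResult result;
    while (auto rec = cursor.getNext()) {
        ++result.docsScanned;
        builder.insert(rec->second,
                       [&] { cursor.save(); ++result.saves; },
                       [&] { cursor.restore(); ++result.restores; });
    }
    result.sideTableWrites = builder.sideTableWrites;
    assert(result.saves == result.restores);
    return result;
}
```

With four documents and one duplicate key, the scan in this model completes all four reads while saving and restoring the cursor exactly once, which is the trade-off being proposed: pay the save/restore cost only for violating documents.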



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 20/Aug/21 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'}

Message: SERVER-57737 Index builds must save and restore the collection scanning cursor around writes to the
violated index key constraints side table in case the write throws and unpositions the read cursor.

(cherry picked from commit 05f055aad47b6afa3902ec4c11925c848f9cd534)
Branch: v5.0
https://github.com/mongodb/mongo/commit/853ef0de7a7adfe2b7e36d96d1ae89791ef41832

Comment by Githook User [ 09/Aug/21 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'}

Message: SERVER-57737 Index builds must save and restore the collection scanning cursor around writes to the
violated index key constraints side table in case the write throws and unpositions the read cursor.
Branch: master
https://github.com/mongodb/mongo/commit/05f055aad47b6afa3902ec4c11925c848f9cd534

Generated at Thu Feb 08 05:42:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.