[SERVER-29011] Compact Calls to WiredTiger take multiple overlapping WT_SESSION objects Created: 28/Apr/17  Updated: 30/Oct/23  Resolved: 04/May/17

Status: Closed
Project: Core Server
Component/s: Storage, WiredTiger
Affects Version/s: 3.5.7
Fix Version/s: 3.4.6, 3.5.7

Type: Bug Priority: Major - P3
Reporter: David Hows Assignee: David Hows
Resolution: Fixed Votes: 0
Labels: bkp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.4
Steps To Reproduce:

Instrument code to show all session take/return calls.
Run a compact.
Track when each session is taken vs returned.
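
The tracking step above can be sketched as a small script over the instrumentation output. This is only an illustration: the event format (`"take"`/`"return"` plus a session id) and the sample trace are assumptions, not the output of any real server instrumentation.

```python
from typing import Iterable, List, Tuple

# Hypothetical log format: ("take" | "return", session id).
Event = Tuple[str, int]


def overlapping_sessions(events: Iterable[Event]) -> List[Tuple[int, int]]:
    """Return (held, taken) pairs where a session was taken while another was still held."""
    held: List[int] = []
    overlaps: List[Tuple[int, int]] = []
    for kind, session_id in events:
        if kind == "take":
            # Every session still held at this point overlaps with the new one.
            overlaps.extend((other, session_id) for other in held)
            held.append(session_id)
        elif kind == "return":
            held.remove(session_id)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return overlaps


# Made-up trace of one compact: session 1 is taken early and held throughout,
# sessions 2 and 3 are taken for the record store and one index.
trace = [
    ("take", 1),
    ("take", 2), ("return", 2),
    ("take", 3), ("return", 3),
    ("return", 1),
]
print(overlapping_sessions(trace))  # [(1, 2), (1, 3)]
```

An empty result would mean that no two sessions were ever held at the same time during the compact.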

Sprint: Storage 2017-05-08, Storage 2017-05-29
Participants:
Linked BF Score: 0

 Description   

We have seen build failures with a stuck cache when running the FSM suite and compact tasks are in flight.

Diving into the issue, it appears that the compact operation runs over multiple WT_SESSION objects. A first session, with an "empty" transaction, is opened in the early stages of the command; subsequent sessions are then taken from the session cache to run compact on the record store and each index.

This can cause problems in testing because a single transaction stays open for the duration of all the compact operations.

There may also be scope for a fuller review of the WiredTiger KV Engine to find other places that exhibit this behaviour: opening sessions with transactions that are never used, or taking sessions directly from the session cache.



 Comments   
Comment by Githook User [ 15/Jun/17 ]

Author: David Hows (daveh86) <howsdav@gmail.com>

Message: SERVER-29011 Don't use side sessions during compacts in the WT KV Engine

(cherry picked from commit 584d4a6a25ce56b07f13247b3ce7fe298b4a111e)
Branch: v3.4
https://github.com/mongodb/mongo/commit/4a92de28ed34f85e190744bda3930f3cdbc85e75

Comment by Githook User [ 02/May/17 ]

Author: David Hows (daveh86) <howsdav@gmail.com>

Message: SERVER-29011 Don't use side sessions during compacts in the WT KV Engine
Branch: master
https://github.com/mongodb/mongo/commit/584d4a6a25ce56b07f13247b3ce7fe298b4a111e

Comment by David Hows [ 28/Apr/17 ]

After some discussion I have limited the scope to the slower WiredTiger session methods, compact and truncate.

I had initially considered create and drop, but create operations (on a record store at least) are always within a WUOW. Drops face similar issues, as the opCtx is not currently plumbed down to the level where we perform all the drop operations.

I had also considered looking at salvage, verify and checkpoint. These three also had issues with access to opCtx objects: salvage and verify can be used at the instantiation of the WT KV Engine, and checkpoint is run by durability code.

Comment by David Hows [ 28/Apr/17 ]

As noted, I found that we take extra sessions from the WiredTiger session cache to perform compact operations. I'm currently testing a change where we would take these sessions from the opCtx/recoveryUnit and then close the automatically opened txn (with abandonSnapshot).
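
A toy model of the change described above, for illustration only. `ToySession`, `ToyRecoveryUnit`, `compact_all`, and the `table:...` names are all hypothetical stand-ins and do not reflect the server's or WiredTiger's actual classes or signatures; the only behaviour modelled is "get the session from the operation's recovery unit, abandon the automatically opened transaction, then run every compact on that one session".

```python
from typing import List, Tuple


class ToySession:
    """Stand-in for a WT_SESSION; only tracks whether a transaction is open."""

    def __init__(self, session_id: int) -> None:
        self.session_id = session_id
        self.in_transaction = False
        # Each entry records (target, was a transaction open at call time).
        self.compact_calls: List[Tuple[str, bool]] = []

    def compact(self, target: str) -> None:
        self.compact_calls.append((target, self.in_transaction))


class ToyRecoveryUnit:
    """Stand-in for the per-operation recovery unit that owns one session."""

    def __init__(self, session: ToySession) -> None:
        self._session = session

    def get_session(self) -> ToySession:
        # Models the "automatically opened txn" mentioned in the comment above.
        self._session.in_transaction = True
        return self._session

    def abandon_snapshot(self) -> None:
        # Models closing that transaction before the long-running work starts.
        self._session.in_transaction = False


def compact_all(recovery_unit: ToyRecoveryUnit, targets: List[str]) -> ToySession:
    """Run every compact on the operation's own session, with no open transaction."""
    session = recovery_unit.get_session()
    recovery_unit.abandon_snapshot()
    for target in targets:
        session.compact(target)
    return session


session = compact_all(
    ToyRecoveryUnit(ToySession(1)),
    ["table:collection", "table:index-1", "table:index-2"],  # made-up names
)
print(session.compact_calls)
```

In this model every compact call is recorded against the same session and with no transaction open, which is the property the instrumented reproduction would be expected to show after the change.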
