[SERVER-27345] reclaim inactive WT sessions on calling closeAllCursors Created: 08/Dec/16  Updated: 06/Feb/17  Resolved: 17/Jan/17

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.4.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Geert Bosch Assignee: Geert Bosch
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-23131 validate() can return EBUSY on WiredT... Closed
is depended on by SERVER-27347 Only close idle cached cursors on the... Closed
Duplicate
duplicates WT-1228 Improve performance of WT_SESSION::op... Closed
Related
is related to SERVER-21004 Implement wait objects that support o... Closed
Operating System: ALL
Sprint: Storage 2017-01-23
Participants:

 Description   

There is a problem in the way we cache WiredTigerCursor and WiredTigerSession objects. We cache cursors in each session, and we also cache sessions globally. Most operations in MongoDB are short-lived, so this tends not to be a big problem: when we get a WT_BUSY, we go through the cached WiredTigerSession objects and close their cached cursors. We also increment the cursor epoch, so that sessions returned to the cache afterwards have their cursors closed as well. However, some internal clients have very long-running operations that could run forever. Such clients may hold cached cursors that they never give back to the cache, and those cursors will never be freed. This is especially an issue if the session such a client gets was previously used by a client, such as "initandlisten", that may have cursors for thousands of tables. See SERVER-26870.
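
To make the failure mode concrete, here is a toy Python model of the caching scheme described above. It is a sketch only, not the server's C++ code: the class and method names (Session, SessionCache, get_session, release_session, close_all_cursors) are hypothetical stand-ins, and cursors are plain placeholder objects.

```python
class Session:
    """Toy stand-in for a WiredTigerSession with its per-session cursor cache."""

    def __init__(self, epoch):
        self.cursors = {}    # table URI -> placeholder cursor object
        self.epoch = epoch   # cursor epoch this session last synchronized with

    def get_cursor(self, uri):
        # Reuse a cached cursor if present, otherwise "open" a new one.
        return self.cursors.setdefault(uri, object())

    def close_all_cursors(self):
        self.cursors.clear()


class SessionCache:
    """Toy stand-in for the global session cache (hypothetical API)."""

    def __init__(self):
        self.idle = []   # sessions currently sitting in the cache
        self.epoch = 0   # global cursor epoch

    def get_session(self):
        return self.idle.pop() if self.idle else Session(self.epoch)

    def release_session(self, session):
        # A session from an older epoch has its cursors closed on return.
        if session.epoch != self.epoch:
            session.close_all_cursors()
            session.epoch = self.epoch
        self.idle.append(session)

    def close_all_cursors(self):
        # Modeled reaction to WT_BUSY: bump the epoch and close the cursors
        # of cached sessions. Checked-out sessions are not reachable here.
        self.epoch += 1
        for session in self.idle:
            session.close_all_cursors()
            session.epoch = self.epoch


cache = SessionCache()

# A startup-style client opens cursors on many tables, then returns its session.
startup = cache.get_session()
for i in range(1000):
    startup.get_cursor(f"table:{i}")
cache.release_session(startup)

# A long-running client picks up that same session and never returns it.
long_running = cache.get_session()

# A short-lived client uses a fresh session and returns it.
short_lived = cache.get_session()
short_lived.get_cursor("table:0")
cache.release_session(short_lived)

cache.close_all_cursors()
```

In this model the short-lived client's cursors are closed by close_all_cursors, while the long-running client still holds all 1000 inherited cursors until it releases its session, which may never happen.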

This problem is exacerbated by the interruptible sleep improvements (SERVER-21004), as those result in more OperationContext objects, each potentially holding a WiredTigerSession object, being in use while sleeping.

So, implement a way for the WiredTigerSessionCache to reclaim cursors from inactive sessions still owned by a WiredTigerRecoveryUnit. Locally cached MODE_IX and MODE_IS locks have a similar issue, so we may be able to use the same solution.
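
One possible shape for such a reclaim path is sketched below in Python. The ticket does not specify a mechanism, so everything here is an assumption made for illustration: the cache tracks checked-out sessions, each session carries a lock that its owner holds only while actively using it, and the names (Session, ReclaimingSessionCache, checked_out) are hypothetical.

```python
import threading


class Session:
    """Toy session; the owner is assumed to hold `lock` only while active."""

    def __init__(self):
        self.cursors = {}              # table URI -> placeholder cursor object
        self.lock = threading.Lock()   # assumption: marks the session as in use

    def get_cursor(self, uri):
        return self.cursors.setdefault(uri, object())


class ReclaimingSessionCache:
    """Toy cache that can also reach sessions that are still checked out."""

    def __init__(self):
        self.idle = []
        self.checked_out = set()

    def get_session(self):
        session = self.idle.pop() if self.idle else Session()
        self.checked_out.add(session)
        return session

    def release_session(self, session):
        self.checked_out.discard(session)
        self.idle.append(session)

    def close_all_cursors(self):
        """Close cursors in every session that is not active right now.

        Returns the number of cursors reclaimed.
        """
        reclaimed = 0
        for session in self.idle + list(self.checked_out):
            # Skip sessions whose owner is in the middle of an operation.
            if session.lock.acquire(blocking=False):
                try:
                    reclaimed += len(session.cursors)
                    session.cursors.clear()
                finally:
                    session.lock.release()
        return reclaimed


cache = ReclaimingSessionCache()

# An owner that is sleeping: session checked out, but not in active use.
sleeper = cache.get_session()
for uri in ("table:a", "table:b", "table:c"):
    sleeper.get_cursor(uri)

# An owner that is mid-operation while the reclaim runs.
busy = cache.get_session()
busy.get_cursor("table:a")
with busy.lock:
    reclaimed = cache.close_all_cursors()
```

Here the sleeping owner's three cursors are reclaimed while the busy owner's cursor is left alone. A real implementation would also need owners to cope with their cached cursors disappearing between operations, which this sketch does not model.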



 Comments   
Comment by Geert Bosch [ 17/Jan/17 ]

The motivating issue for this ticket, avoiding holding on to cursors that may interfere with drop and verify operations, will likely be resolved more completely by WT-1228. So closing this as duplicate.

Comment by Eric Milkie [ 12/Jan/17 ]

Note that validate() would particularly benefit from improvements here (SERVER-23131). Right now, because of this issue, some of our tests don't necessarily check all of the data.
