[SERVER-38779] Build a mechanism to periodically cleanup old WT sessions from session cache Created: 27/Dec/18  Updated: 29/Oct/23  Resolved: 16/Jan/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.6.11, 4.0.6, 4.1.7

Type: Improvement Priority: Major - P3
Reporter: Sulabh Mahajan Assignee: Sulabh Mahajan
Resolution: Fixed Votes: 4
Labels: RF
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-39355 Collection drops can block the server... Closed
is related to WT-4336 With MongoDB, sweep attempts after ta... Closed
is related to WT-4513 Investigate improvements in session's... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.0, v3.6
Sprint: Storage Engines 2018-12-31, Storage Engines 2019-01-14, Storage Engines 2019-01-28
Participants:
Case:
Story Points: 8

 Description   

Because of the way the session cache is maintained, idle sessions keep accumulating in it. If the workload doesn't use all the idle sessions, the oldest sessions stay open forever. In some cases these sessions hold resources inside WiredTiger, which can cause problems, e.g. dhandles that never close.

This ticket is to build a mechanism around the session cache to clean up old sessions that have been idle for too long.

More details in the linked tickets.



 Comments   
Comment by Sulabh Mahajan [ 05/Apr/19 ]

bruce.lucas, WT session cache is NOT related to the WT cursor cache. I looked at 3.4 and we do have a WT session cache in there. The change here introduces a mechanism to close WT sessions when they have been idle for some time, mainly to encourage closing of active dhandles.

Comment by Bruce Lucas (Inactive) [ 04/Apr/19 ]

sulabh.mahajan, can you clarify - is the wt session cache mentioned here related to the wt cursor caching mechanism that was introduced into 4.0 (and then backported to 3.6), or is this something different? In particular, could the session cache issue described here affect 3.4?

Comment by Sulabh Mahajan [ 03/Mar/19 ]

Note:
As per the final change, the default for the parameter wiredTigerSessionCloseIdleTimeSecs is 5 mins.

Comment by Sulabh Mahajan [ 31/Jan/19 ]

Update on backports:

Backport to 4.0: Completed
Backport to 3.6: Completed

Comment by Githook User [ 31/Jan/19 ]

Author:

{'name': 'Sulabh Mahajan', 'email': 'sulabh.mahajan@mongodb.com', 'username': 'sulabhM'}

Message: SERVER-38779 Have a session sweep job to close old idle WT sessions

(cherry picked from commit 97de2142f89ab280a4d0b2ddf168248c79f741d0)
Branch: v3.6
https://github.com/mongodb/mongo/commit/49cbe21f9a7cd0175aa6db3cd82035c44c7b97cd

Comment by Githook User [ 31/Jan/19 ]

Author:

{'name': 'Sulabh Mahajan', 'email': 'sulabh.mahajan@mongodb.com', 'username': 'sulabhM'}

Message: SERVER-38779 Add clocksource to the encryptdb unit test

(cherry picked from commit fb7a077a7a7d70aba2ca28564af0b0ccbee1f7fb)
Branch: v3.6
https://github.com/10gen/mongo-enterprise-modules/commit/5254eb29755a5d976ba2de8b96597ec99e749a5b

Comment by Githook User [ 23/Jan/19 ]

Author:

{'email': 'sulabh.mahajan@mongodb.com', 'name': 'Sulabh Mahajan', 'username': 'sulabhM'}

Message: SERVER-38779 Have a session sweep job to close old idle WT sessions

(cherry picked from commit 97de2142f89ab280a4d0b2ddf168248c79f741d0)
Branch: v4.0
https://github.com/mongodb/mongo/commit/b87f723dceb3f2b1bbfe12ebf23b6cae21144e9e

Comment by Githook User [ 23/Jan/19 ]

Author:

{'username': 'sulabhM', 'email': 'sulabh.mahajan@mongodb.com', 'name': 'Sulabh Mahajan'}

Message: SERVER-38779 Add clocksource to the encryptdb unit test

(cherry picked from commit fb7a077a7a7d70aba2ca28564af0b0ccbee1f7fb)
Branch: v4.0
https://github.com/10gen/mongo-enterprise-modules/commit/d517409e38a30b07564743eb879142dbc38109d5

Comment by Sulabh Mahajan [ 21/Jan/19 ]

Updates on backports:

Backport to 4.0: Approved - cherry picking and testing the backport now.
Backport to 3.6: Hold till we get results from backporting to 4.0

Comment by Jackie Chu [ 16/Jan/19 ]

Excellent! Thank you!

Comment by Sulabh Mahajan [ 16/Jan/19 ]

chujacky1128,

I have initiated the backport process to 4.0 and 3.6. The team will go over the change and, if there are no concerns about backporting, we will get it done.

"Also, would killing the session manually (https://docs.mongodb.com/manual/release-notes/3.6/#server-sessions) release the dhandles/memory?"

Killing the server sessions will not help here. This change handles cleanup of internal storage engine sessions that aren't exposed to the MongoDB user.

Comment by Githook User [ 16/Jan/19 ]

Author:

{'username': 'sulabhM', 'email': 'sulabh.mahajan@mongodb.com', 'name': 'Sulabh Mahajan'}

Message: SERVER-38779 Have a session sweep job to close old idle WT sessions
Branch: master
https://github.com/mongodb/mongo/commit/97de2142f89ab280a4d0b2ddf168248c79f741d0

Comment by Githook User [ 15/Jan/19 ]

Author:

{'username': 'sulabhM', 'email': 'sulabh.mahajan@mongodb.com', 'name': 'Sulabh Mahajan'}

Message: SERVER-38779 Add clocksource to the encryptdb unit test
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/fb7a077a7a7d70aba2ca28564af0b0ccbee1f7fb

Comment by Jackie Chu [ 04/Jan/19 ]

  1. My team's project has been impacted by this issue (dhandles leak after table drops), and we have been periodically restarting mongodb as a temporary solution. We are running version 3.4 (and planning to move to 3.6). I am wondering if there's a plan to also add this mechanism/fix to version 3?
  2. Also, would killing the session manually (https://docs.mongodb.com/manual/release-notes/3.6/#server-sessions) release the dhandles/memory?

Comment by Sulabh Mahajan [ 27/Dec/18 ]

Proposal:

  • Have a session sweeper background job that runs periodically (say every 10 secs) and closes WT sessions that have been idle longer than the idle timeout value (say a default of 300 secs = 5 mins, revised from 100000 secs = 27.78 hours).
  • Timeout will be configurable using --setParameter wiredTigerSessionCloseIdleTimeSecs=<value>
  • A value of 0 disables closing of idle sessions and keeps the current behavior.
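
The proposal above can be sketched roughly as follows. This is a minimal illustration in Python, not the server's actual implementation (which lives in the C++ storage layer and is not reproduced here). The names SessionCache, CachedSession, close_expired_idle_sessions and run_sweeper are hypothetical; only the 10 sec sweep interval, the idle timeout, and the "0 disables" rule come from the proposal.

```python
import threading
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class CachedSession:
    """Hypothetical stand-in for an idle storage-engine session."""
    session_id: int
    idle_since: float  # clock reading when the session was returned to the cache
    closed: bool = False

    def close(self) -> None:
        self.closed = True


class SessionCache:
    """Hypothetical cache of idle sessions; the newest sits at the left end."""

    def __init__(self, idle_timeout_secs: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout_secs = idle_timeout_secs
        self._clock = clock
        self._idle: Deque[CachedSession] = deque()
        self._lock = threading.Lock()

    def release(self, session_id: int) -> None:
        """Return a session to the cache and record when it became idle."""
        with self._lock:
            self._idle.appendleft(CachedSession(session_id, self._clock()))

    def close_expired_idle_sessions(self) -> int:
        """Close sessions idle for longer than the timeout; return the count."""
        if self.idle_timeout_secs == 0:
            return 0  # a timeout of 0 disables the sweep
        closed = 0
        with self._lock:
            now = self._clock()
            # The oldest sessions sit at the right end, so stop at the
            # first one that is still within the timeout.
            while self._idle and now - self._idle[-1].idle_since > self.idle_timeout_secs:
                self._idle.pop().close()
                closed += 1
        return closed

    def idle_count(self) -> int:
        with self._lock:
            return len(self._idle)


def run_sweeper(cache: SessionCache, stop: threading.Event,
                interval_secs: float = 10.0) -> None:
    """Background loop: sweep every interval_secs until stop is set."""
    while not stop.wait(interval_secs):
        cache.close_expired_idle_sessions()
```

In this sketch the sweeper would be started with something like threading.Thread(target=run_sweeper, args=(cache, stop), daemon=True).start() and ended by calling stop.set(). Passing the clock in as a parameter is a design choice for testability, so the timeout logic can be exercised without real waiting.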
Generated at Thu Feb 08 04:50:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.