[SERVER-37631] Disable logical sessions if FCV is 3.4 Created: 15/Oct/18  Updated: 08/Jan/24  Resolved: 01/Nov/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.8
Fix Version/s: 3.6.9

Type: Bug Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: SWCW
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-33763 3.6 drivers fail to communicate with ... Closed
depends on SERVER-36104 LogicalSessions should destroy cache ... Closed
Related
related to MONGOID-5113 Fix session specs on 3.6 sharded clus... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2018-10-22, Sharding 2018-11-05
Participants:
Case:

 Description   

The following scenario causes the server to stop accepting sessions.

  • the driver creates logical sessions implicitly, assuming the server supports them. Per an offline chat with behackett, the driver implicitly starts a session on the server iff the isMaster response has logicalSessionTimeoutMinutes set (https://github.com/mongodb/specifications/blob/master/source/sessions/driver-sessions.rst), and mongos 3.6 unconditionally returns that value in isMaster because it has no concept of FCV
  • the mongos passes the command to mongod
  • mongod creates the session in the cache but fails to add it to a collection because, with FCV 3.4, it does not have the config.system.sessions collection
  • without the config.system.sessions collection, sessions are never expired and eventually hit the activeSessionsCount maximum of 1,000,000
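
The failure mode above can be illustrated with a toy model. This is only a sketch: ToySessionCache, sessions_accepted and their parameters are hypothetical names and not the server's implementation; only the 1,000,000 cap comes from the description.

```python
from dataclasses import dataclass, field

MAX_ACTIVE_SESSIONS = 1_000_000  # cap quoted in the description


@dataclass
class ToySessionCache:
    """Toy model of a mongod session cache (hypothetical, not the server's class)."""
    has_sessions_collection: bool
    max_sessions: int = MAX_ACTIVE_SESSIONS
    active: set = field(default_factory=set)

    def start_session(self, session_id: str) -> bool:
        """Cache a session; return False once the cap has been reached."""
        if len(self.active) >= self.max_sessions:
            return False
        self.active.add(session_id)
        return True

    def refresh(self, expired_ids: set) -> None:
        """Drop expired sessions; without a backing collection nothing is dropped."""
        if not self.has_sessions_collection:
            return
        self.active -= expired_ids


def sessions_accepted(cache: ToySessionCache, total_requests: int, expire_every: int) -> int:
    """Count accepted sessions when every cached session expires periodically."""
    accepted = 0
    for i in range(total_requests):
        if cache.start_session(f"s{i}"):
            accepted += 1
        if (i + 1) % expire_every == 0:
            # Pass a copy so the cache can safely remove from its own set.
            cache.refresh(set(cache.active))
    return accepted
```

With a small cap, the cache that has no backing collection stops accepting sessions as soon as the cap is hit, while the cache that can expire sessions keeps accepting them.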

Suggested Fix

The gist of the fix is that mongos should rely on the existence of the sessions collection on the config server when deciding whether to return logicalSessionTimeoutMinutes in isMaster and whether to handle explicit session operations.

  • on every refreshSessions, SessionsCollection needs to detect whether the sessions collection exists and set the corresponding member
  • add a method LogicalSessionsCache::hasSessionsTable that returns SessionsCollection::hasSessionsTable
  • run LogicalSessionsCache::refreshNow on startup
  • add logicalSessionTimeoutMinutes to isMaster if (FCV == 3.6) && LogicalSessionsCache::hasSessionsTable()
  • reject explicit session operations if !LogicalSessionsCache::hasSessionsTable()
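
The gating described in the list above could look roughly like the following. This is a Python model for illustration only, not the C++ change that was committed: the class and function names are hypothetical, the exists_on_config_server callable stands in for the real collection lookup, and the timeout value of 30 is an assumed placeholder.

```python
from typing import Callable, Dict

LOGICAL_SESSION_TIMEOUT_MINUTES = 30  # assumed value, for illustration only


class ToySessionsCollection:
    """Hypothetical stand-in for SessionsCollection."""

    def __init__(self, exists_on_config_server: Callable[[], bool]):
        self._exists = exists_on_config_server
        self.has_sessions_table = False  # the "corresponding member"

    def refresh_sessions(self) -> None:
        # Every refresh re-detects whether the sessions collection exists.
        self.has_sessions_table = bool(self._exists())


class ToyLogicalSessionCache:
    """Hypothetical stand-in for the logical session cache on mongos."""

    def __init__(self, collection: ToySessionsCollection):
        self._collection = collection

    def refresh_now(self) -> None:
        self._collection.refresh_sessions()

    def has_sessions_table(self) -> bool:
        return self._collection.has_sessions_table


def build_is_master(cache: ToyLogicalSessionCache, fcv_is_36: bool = True) -> Dict:
    """Advertise session support only when the sessions table is known to exist."""
    reply: Dict = {"ismaster": True}
    if fcv_is_36 and cache.has_sessions_table():
        reply["logicalSessionTimeoutMinutes"] = LOGICAL_SESSION_TIMEOUT_MINUTES
    return reply


def check_explicit_session_op(cache: ToyLogicalSessionCache) -> None:
    """Reject explicit session operations while the sessions table is missing."""
    if not cache.has_sessions_table():
        raise RuntimeError("sessions are not supported: sessions collection not found")
```

In this model, calling refresh_now() once at startup corresponds to the third bullet, so the first isMaster reply already reflects whether the collection exists.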

FCV update 3.6 to 3.4



 Comments   
Comment by Githook User [ 19/Mar/19 ]

Author:

{'email': 'mbroadst@gmail.com', 'name': 'Matt Broadstone', 'username': 'mbroadst'}

Message: test: fix sessions tests on 3.6 sharded clusters

SERVER-37631 requires that 3.6 servers have a command run on them
to report `logicalSessionsTimeoutMinutes` on connected proxies.
Branch: master
https://github.com/mongodb/node-mongodb-native/commit/7d4c88adc22a3b5c212f47870b8a8783b12997c1

Comment by Githook User [ 19/Mar/19 ]

Author:

{'email': 'mbroadst@gmail.com', 'name': 'Matt Broadstone', 'username': 'mbroadst'}

Message: test: fix sessions tests on 3.6 sharded clusters

SERVER-37631 requires that 3.6 servers have a command run on them
to report `logicalSessionsTimeoutMinutes` on connected proxies.
Branch: fix-sharded-36-tests
https://github.com/mongodb/node-mongodb-native/commit/311c3fabc18a152e58810ba4ca4eeb0822dbd793

Comment by Matt Broadstone [ 18/Mar/19 ]

I'm also seeing the same behavior when running integration tests for the node driver. Is there any plan to correct this in future versions?

Comment by Shane Harvey [ 08/Feb/19 ]

When starting 3.6.9/3.6.10 sharded clusters I had to make a change to mongo-orchestration to run the refreshLogicalSessionCacheNow command on the config server and again on each mongos in order for the mongoses to correctly report logicalSessionTimeoutMinutes in their isMaster responses.

Without running refreshLogicalSessionCacheNow in this way, the mongoses only start reporting logicalSessionTimeoutMinutes 5 minutes after adding the shard. Is this working as designed? Why does it take 5 minutes for the cluster to decide it supports sessions?
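
A minimal sketch of the workaround described in this comment, assuming each client object exposes admin.command(name) the way a pymongo MongoClient does. The helper names, timeouts, and the choice of the admin database are assumptions; this is not the actual mongo-orchestration change.

```python
import time


def refresh_session_caches(config_client, mongos_clients) -> None:
    """Run refreshLogicalSessionCacheNow on the config server, then on each mongos."""
    config_client.admin.command("refreshLogicalSessionCacheNow")
    for client in mongos_clients:
        client.admin.command("refreshLogicalSessionCacheNow")


def wait_for_session_support(mongos_client, timeout_s: float = 10.0, poll_s: float = 0.5) -> bool:
    """Poll isMaster until logicalSessionTimeoutMinutes appears or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        reply = mongos_client.admin.command("isMaster")
        if "logicalSessionTimeoutMinutes" in reply:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A test harness could call refresh_session_caches right after adding the shard and then wait_for_session_support on each mongos before starting session tests.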

Comment by Githook User [ 01/Nov/18 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'}

Message: SERVER-37631: disable logical sessions if FCV is 3.4
Branch: v3.6
https://github.com/mongodb/mongo/commit/0d8a9736bc458eb8b523ed5c50c63c4ddb1e6b4e

Comment by Misha Tyulenev [ 22/Oct/18 ]

kaloian.manassiev thanks for the feedback:
1. Yes, it's mongos logic.
2. If the sessions table disappears, the behavior is unspecified: there is no code in mongodb that deletes the config.system.sessions collection. It will not be disruptive for the existing connections.
3. mongos waits for ShardRegistry and signing keys; I'll add the code to initMongosServer.
4. I think mongos uses the binary version for FCV, so it's always 3.6. Mongod does not need extra logic to handle the FCV settings.

Comment by Kaloian Manassiev [ 18/Oct/18 ]

The approach sounds good, but I want to ask just a couple of clarifications.

The logic that you are describing will only run on MongoS, right?

run LogicalSessionsCache::refreshNow on startup

This will run before MongoS starts accepting connections, right? Otherwise at MongoS restart, a driver which happened to work might suddenly stop and/or have undesired side effects. If for some reason refreshNow fails we will not fail startup, right - just continue trying?

Also, do we have a precedent for doing reads from MongoS before opening connections, maybe the ShardRegistry? Just asking because that would increase the MongoS startup time, but I think it is fine to do that. Just something to keep in mind with the overall push towards "less impactful restarts/upgrades".

add logicalSessionTimeoutMinutes to isMaster if (FCV == 3.6) && LogicalSessionsCache::hasSessionsTable()

MongoS doesn't have any concept of FCV. Do you mean that this would be the logic on the MongoD side? Because MongoD already has the first condition.

Generated at Thu Feb 08 04:46:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.