[SERVER-39420] Remove in-memory boolean to indicate config.server.sessions collection set up
Created: 07/Feb/19  Updated: 29/Oct/23  Resolved: 11/Apr/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.0.6
Fix Version/s: 3.6.13, 4.0.10, 4.1.11

Type: Bug
Priority: Major - P3
Reporter: Danny Hatcher (Inactive)
Assignee: Blake Oler
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Duplicate
is duplicated by SERVER-39044 Cannot add session into the cache err... Closed
is duplicated by SERVER-36904 Fuzzer drops config.system.sessions a... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Steps To Reproduce:

1. Launch a sharded cluster
2. Wait for the config.system.sessions collection to be created
3. Drop that collection

Sprint: Sharding 2019-04-08, Sharding 2019-04-22
Participants:
Case:
Linked BF Score: 52

 Description   

Problem Statement

config.system.sessions is not automatically recreated after a drop in 4.0.

Proposed Solution

  1. Create a method in SessionsCollection named onSessionsCollectionDropped. This method would be a no-op everywhere except for the config server child class, where we would flip the _collectionSetUp boolean to false.
  2. Expose the SessionsCollection to outside callers through a new LogicalSessionCache method. If we decide this isn't alright, we can just mirror the sessions collection with onSessionsCollectionDropped.
  3. Hook into the config server opObserver onCollectionDropped to get the SessionsCollection and call onSessionsCollectionDropped.
  4. Create a ticket to consider making the LogicalSessionCache or SessionsCollection its own opObserver.
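
The first three steps above can be sketched as a toy Python model. This is an illustration only, not the server's C++ code: the class layout, the snake_case method names, the sessions_collection() accessor, and the setup_attempts counter are all assumptions made for the example.

```python
SESSIONS_NS = "config.system.sessions"


class SessionsCollection:
    """Base class: the drop hook is a no-op everywhere by default (step 1)."""

    def setup_sessions_collection(self):
        pass

    def on_sessions_collection_dropped(self):
        pass


class ConfigServerSessionsCollection(SessionsCollection):
    """Config server child class that keeps the in-memory flag."""

    def __init__(self):
        self.collection_set_up = False  # stands in for _collectionSetUp
        self.setup_attempts = 0         # hypothetical counter, for illustration

    def setup_sessions_collection(self):
        if self.collection_set_up:
            return                      # flag set: skip set-up
        self.setup_attempts += 1        # stands in for creating the collection
        self.collection_set_up = True

    def on_sessions_collection_dropped(self):
        self.collection_set_up = False  # step 1: flip the flag back to false


class LogicalSessionCache:
    def __init__(self, sessions_collection):
        self._sessions_collection = sessions_collection

    def sessions_collection(self):
        """Step 2: hypothetical accessor exposing the SessionsCollection."""
        return self._sessions_collection

    def refresh(self):
        self._sessions_collection.setup_sessions_collection()


def on_collection_dropped(cache, namespace):
    """Step 3: op observer hook; reacts only to the sessions namespace."""
    if namespace == SESSIONS_NS:
        cache.sessions_collection().on_sessions_collection_dropped()
```

In this model, a refresh that follows on_collection_dropped(cache, SESSIONS_NS) runs set-up again, while a drop of any other namespace leaves the flag untouched.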

Rejected Solutions

  1. Create an opObserver for the SessionsCollection. We think that this is overkill for the current bugfix.


 Comments   
Comment by Githook User [ 16/Apr/19 ]

Author: Blake Oler (BlakeIsBlake, blake.oler@mongodb.com)

Message: SERVER-39420 Remove in-memory boolean to indicate config.server.sessions collection set up

(cherry picked from commit 2c20db31fcd6a2a9ac02506d55794f9b234af0a6)
Branch: v4.0
https://github.com/mongodb/mongo/commit/fbf88f57c0b356c665e8fe1a79d546bdebe4cc8d

Comment by Githook User [ 15/Apr/19 ]

Author: Blake Oler (BlakeIsBlake, blake.oler@mongodb.com)

Message: SERVER-39420 Remove in-memory boolean to indicate config.server.sessions collection set up

(cherry picked from commit 2c20db31fcd6a2a9ac02506d55794f9b234af0a6)
Branch: v3.6
https://github.com/mongodb/mongo/commit/e85bd255c5e0274c108b078dd210ad46e805ad97

Comment by Githook User [ 11/Apr/19 ]

Author: Blake Oler (BlakeIsBlake, blake.oler@mongodb.com)

Message: SERVER-39420 Remove in-memory boolean to indicate config.server.sessions collection set up
Branch: master
https://github.com/mongodb/mongo/commit/2c20db31fcd6a2a9ac02506d55794f9b234af0a6

Comment by Gregory McKeon (Inactive) [ 02/Apr/19 ]

misha.tyulenev

Comment by Blake Oler [ 02/Apr/19 ]

jack.mulrow 4misha@gmail.com Can I get an LGTM on this solution?

Comment by Kaloian Manassiev [ 15/Feb/19 ]

While dropping the config.system.sessions collection is not something we "support", we often ask customers to do it in order to work around bugs in the LogicalSessionCache.

Per jack.mulrow, we can fix this by listening for collection drop on the config server primary since it intercepts sharded collection drops.

Comment by Blake Oler [ 07/Feb/19 ]

Diagnosis

An in-memory boolean _collectionSetUp exists on the config server's session collection class. It becomes true the first time the sessions collection is set up. Whenever we run the logical session cache's periodic refresh, we attempt to set up the collection, in case it doesn't exist at the time of the refresh. Unfortunately, once the boolean is true that set-up attempt is skipped, so this same boolean prevents any recovery from a dropped sessions collection for as long as the config server process keeps running. In a single-node replica set config server, this is manageable: we only have to restart the config server, thus clearing the in-memory state, to have the collection recreated.

The problem compounds with a multi-node replica set: all nodes in a config server replica set will set _collectionSetUp to true, since they all run the refresh and all "see" that the sessions collection exists. If we restart the primary node in an attempt to reset the sessions collection, another node will take over as primary. The new primary had already seen that the sessions collection was set up before the drop, and will therefore never attempt to recreate it. Luckily, we have an escape hatch here: the restarted node, now a secondary, will correctly recognize that the sessions collection is not set up. However, because it is not primary, it will not be able to recreate the collection.
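
The failure mode described in this diagnosis can be reproduced with a small Python simulation. It is a toy model under stated assumptions, not the server's code: ConfigNode, the cluster dictionary, and run_scenario are hypothetical names, and a restarted node is assumed to rejoin as a secondary.

```python
class ConfigNode:
    """Toy model of one config server node (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.is_primary = False
        self.collection_set_up = False  # stands in for _collectionSetUp

    def restart(self):
        # A restart clears in-memory state; the node rejoins as a secondary.
        self.is_primary = False
        self.collection_set_up = False

    def refresh(self, cluster):
        # Periodic logical session cache refresh.
        if self.collection_set_up:
            return                             # flag short-circuits set-up
        if cluster["sessions_exists"]:
            self.collection_set_up = True      # node "sees" the collection
        elif self.is_primary:
            cluster["sessions_exists"] = True  # only a primary can create it
            self.collection_set_up = True


def run_scenario():
    cluster = {"sessions_exists": False}
    nodes = [ConfigNode("a"), ConfigNode("b"), ConfigNode("c")]
    nodes[0].is_primary = True

    for node in nodes:                  # initial refresh: every flag turns true
        node.refresh(cluster)

    cluster["sessions_exists"] = False  # the collection is dropped
    for node in nodes:
        node.refresh(cluster)
    after_drop = cluster["sessions_exists"]

    nodes[0].restart()                  # restart the primary ...
    nodes[1].is_primary = True          # ... and node "b" takes over
    for node in nodes:
        node.refresh(cluster)
    after_failover = cluster["sessions_exists"]

    return after_drop, after_failover
```

In this model run_scenario() returns (False, False): the collection stays missing after the drop and after the failover, because the new primary still holds a true flag and the restarted node is only a secondary.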

Support Solution

Config Server as Single-Node Replica Set

Simply restart the replica set. On the next refresh, the sessions collection will be recreated.

Config Server as Multi-Node Replica Set

  1. Restart any node in the replica set.
  2. Transition the restarted node to primary.
  3. On the next refresh, the sessions collection will be recreated.
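
The multi-node workaround can be checked against a compact Python model of the refresh logic. This is a sketch only: the dictionary-based node state, refresh, and apply_workaround are assumptions made for illustration and do not correspond to server functions.

```python
def refresh(node, cluster):
    """Toy version of the periodic refresh on one node."""
    if node["set_up"]:
        return                             # flag set: set-up is skipped
    if cluster["sessions_exists"]:
        node["set_up"] = True
    elif node["primary"]:
        cluster["sessions_exists"] = True  # only a primary can create it
        node["set_up"] = True


def apply_workaround(nodes, cluster, restarted):
    # Step 1: restarting a node clears its in-memory flag.
    nodes[restarted]["set_up"] = False
    # Step 2: transition the restarted node to primary.
    for name, node in nodes.items():
        node["primary"] = (name == restarted)
    # Step 3: the next refresh runs on every node.
    for node in nodes.values():
        refresh(node, cluster)
    return cluster["sessions_exists"]
```

Starting from three nodes that all have the flag set and a dropped collection, apply_workaround returns True in this model, because the restarted node is both primary and unaware of any earlier set-up.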

Affected Versions

The erroneous boolean _collectionSetUp exists and behaves the same way in sharded clusters on all MongoDB versions starting with 3.6.

Bug Solution

This will be assessed later this week or early next week.

Comment by Danny Hatcher (Inactive) [ 07/Feb/19 ]

blake.oler 4.0.0-4.0.3 crash the config server when the collection is dropped and then subsequently checked. 4.0.4-4.0.6 no longer crash the config server but still do not recreate the collection as 3.6 does.

Comment by Blake Oler [ 07/Feb/19 ]

Have we confirmed whether this bug exists pre-4.0.6? daniel.hatcher

Generated at Thu Feb 08 04:52:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.