[SERVER-39420] Remove in-memory boolean to indicate config.system.sessions collection set up Created: 07/Feb/19 Updated: 29/Oct/23 Resolved: 11/Apr/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 4.0.6 |
| Fix Version/s: | 3.6.13, 4.0.10, 4.1.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Danny Hatcher (Inactive) | Assignee: | Blake Oler |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Backport Requested: | v4.0, v3.6 |
| Steps To Reproduce: | 1. Launch a sharded cluster |
| Sprint: | Sharding 2019-04-08, Sharding 2019-04-22 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||
| Linked BF Score: | 52 | ||||||||||||||||
| Description |
| Comments |
| Comment by Githook User [ 16/Apr/19 ] |
|
Author: {'email': 'blake.oler@mongodb.com', 'name': 'Blake Oler', 'username': 'BlakeIsBlake'}
Message: (cherry picked from commit 2c20db31fcd6a2a9ac02506d55794f9b234af0a6) |
| Comment by Githook User [ 15/Apr/19 ] |
|
Author: {'email': 'blake.oler@mongodb.com', 'name': 'Blake Oler', 'username': 'BlakeIsBlake'}
Message: (cherry picked from commit 2c20db31fcd6a2a9ac02506d55794f9b234af0a6) |
| Comment by Githook User [ 11/Apr/19 ] |
|
Author: {'name': 'Blake Oler', 'username': 'BlakeIsBlake', 'email': 'blake.oler@mongodb.com'}
Message: |
| Comment by Gregory McKeon (Inactive) [ 02/Apr/19 ] |
| Comment by Blake Oler [ 02/Apr/19 ] |
|
jack.mulrow 4misha@gmail.com Can I get an LGTM on this solution? |
| Comment by Kaloian Manassiev [ 15/Feb/19 ] |
|
While dropping the config.system.sessions collection is not something we "support", we often ask customers to do it in order to work around bugs in the LogicalSessionCache. Per jack.mulrow, we can fix this by listening for the collection drop on the config server primary, since the primary intercepts sharded collection drops. |
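To illustrate the idea in the comment above, here is a minimal Python sketch of a drop listener that clears cached set-up state. It is a toy model only, not the server's C++ code: the class `ToyConfigPrimary` and its methods `refresh`, `drop_collection`, and `on_collection_dropped` are hypothetical names invented for this illustration.

```python
SESSIONS = "config.system.sessions"


class ToyConfigPrimary:
    """Hypothetical stand-in for a config server primary (illustration only)."""

    def __init__(self):
        self.catalog = set()            # names of collections that currently exist
        self.collection_set_up = False  # cached "sessions collection exists" state

    def refresh(self):
        """Periodic refresh: create the sessions collection unless cached as set up."""
        if self.collection_set_up:
            return
        self.catalog.add(SESSIONS)
        self.collection_set_up = True

    def drop_collection(self, name):
        """The primary handles the drop, so it can notify the listener."""
        self.catalog.discard(name)
        self.on_collection_dropped(name)

    def on_collection_dropped(self, name):
        """Listener: invalidate the cached state when the sessions collection goes away."""
        if name == SESSIONS:
            self.collection_set_up = False


primary = ToyConfigPrimary()
primary.refresh()                    # collection is created
primary.drop_collection(SESSIONS)    # drop resets the cached state
primary.refresh()                    # next refresh recreates the collection
```

In this sketch the listener only helps the node that handles the drop. Cached state on other nodes would still need to be invalidated by some other means.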
| Comment by Blake Oler [ 07/Feb/19 ] |
Diagnosis

An in-memory boolean, _collectionSetUp, exists on the config server's sessions collection class. It becomes true upon the first set-up of the sessions collection, indicating that the collection has been set up. Whenever we run the logical session cache's periodic refresh, we attempt to set up the collection in case it doesn't exist at the time of the refresh. Unfortunately, once it is true, this same boolean prevents any recovery from a dropped sessions collection for as long as the config server keeps running.

In a single-node replica set config server, this is fine. We only have to restart the config server, thus clearing the in-memory state, to have the collection recreated.

The problem compounds with a multi-node replica set. All nodes in a config server replica set will set _collectionSetUp to true, because they all run the refresh and all "see" that the sessions collection exists. If we restart the primary node in an attempt to reset the sessions collection, another node will take over as primary. The new primary saw earlier that the sessions collection was set up, and will therefore never attempt to recreate the collection. The restarted node, now a secondary, will correctly recognize that the sessions collection is not set up. However, because it is not primary, it will not be able to recreate the collection.

Support Solution

Config Server as Single-Node Replica Set

Simply restart the replica set. On the next refresh, the sessions collection will be recreated.

Config Server as Multi-Node Replica Set

Affected Versions

The erroneous boolean _collectionSetUp exists and behaves the same way on all versions of sharded clusters on MongoDB starting with 3.6.

Bug Solution

This will be assessed later this week or early next week. |
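The multi-node failure described in the diagnosis can be reproduced with a small toy model. The Python sketch below is an illustration under stated assumptions, not the server's implementation: `Node`, `refresh`, `use_flag`, and `simulate_drop_and_failover` are hypothetical names, and a shared Python set stands in for the replicated catalog.

```python
SESSIONS = "config.system.sessions"


class Node:
    """Toy config server node (hypothetical; not the real server class)."""

    def __init__(self, catalog, use_flag):
        self.catalog = catalog          # shared set standing in for replicated data
        self.use_flag = use_flag        # True models the buggy in-memory boolean
        self.collection_set_up = False  # plays the role of _collectionSetUp

    def refresh(self, is_primary):
        """Periodic refresh: set up the sessions collection if it is missing."""
        if self.use_flag and self.collection_set_up:
            return  # the cached boolean short-circuits the existence check
        if SESSIONS not in self.catalog:
            if not is_primary:
                return  # a secondary sees the gap but cannot create the collection
            self.catalog.add(SESSIONS)
        self.collection_set_up = True


def simulate_drop_and_failover(use_flag):
    """Return True if the collection exists again after a drop and a primary restart."""
    catalog = set()
    a = Node(catalog, use_flag)
    b = Node(catalog, use_flag)
    a.refresh(is_primary=True)    # primary creates the collection
    b.refresh(is_primary=False)   # secondary observes that it exists
    catalog.discard(SESSIONS)     # the collection is dropped
    a = Node(catalog, use_flag)   # restart the old primary: in-memory state cleared
    b.refresh(is_primary=True)    # b has taken over as primary
    a.refresh(is_primary=False)   # restarted node is now a secondary
    return SESSIONS in catalog


print(simulate_drop_and_failover(use_flag=True))
print(simulate_drop_and_failover(use_flag=False))
```

With the cached boolean enabled, the new primary skips the check and the restarted secondary cannot create the collection, so it stays missing. With the boolean removed, the new primary's next refresh recreates it.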
| Comment by Danny Hatcher (Inactive) [ 07/Feb/19 ] |
|
blake.oler Versions 4.0.0-4.0.3 crash the config server when the collection is dropped and then subsequently checked. Versions 4.0.4-4.0.6 no longer crash the config server, but still do not recreate the collection as 3.6 does. |
| Comment by Blake Oler [ 07/Feb/19 ] |
|
Have we confirmed whether this bug exists pre-4.0.6? daniel.hatcher |