[SERVER-42827] Allow sessions collection to return OK for creating indexes if at least one shard returns OK and others return CannotImplicitlyCreateCollection
Created: 15/Aug/19  Updated: 29/Oct/23  Resolved: 12/Sep/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.13, 4.2.0-rc8, 4.0.12
Fix Version/s: 4.3.1, 4.2.6

Type: Bug
Priority: Major - P3
Reporter: Blake Oler
Assignee: Janna Golden
Resolution: Fixed
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-46745 Failed to update config.system.sessio... Closed
Duplicate
is duplicated by SERVER-46745 Failed to update config.system.sessio... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Sprint: Sharding 2019-09-23
Participants:
Case:

 Description   
Issue Status as of April 13, 2020

ISSUE DESCRIPTION AND IMPACT

This bug prevents config server replica set primaries from creating new sessions, resulting in a loss of availability of the node once an internal limit of 1 million sessions is reached. Unfortunately, this limit is eventually reached regardless of the number of sessions active at any given point.

The impact occurs because the bug prevents the node from refreshing its in-memory logical session cache from the persisted config.system.sessions collection. This failure to refresh prevents the lastUsed TTL index on config.system.sessions from removing session records, and makes reaching 1 million in-memory sessions likely.

Ultimately, the underlying cause is that TTL index creation fails on shards that do not contain config.system.sessions chunks, which halts the synchronization process that refreshes the in-memory logical session cache.

DIAGNOSIS AND AFFECTED VERSIONS

MongoDB versions 4.2.0 to 4.2.5 are impacted by this bug. Signs that the issue is occurring include:

  • Sharding commands fail
  • Chunk migrations fail
  • Inability to access the config server primary
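
As a rough aid for triage, the version range stated above can be checked with a small helper. This is a minimal sketch that assumes only the 4.2.0 to 4.2.5 range given in this status summary; the function names are hypothetical, and pre-release suffixes such as "-rc8" are ignored as a simplifying assumption.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string like '4.2.3'.

    Assumption: any suffix after the patch number (e.g. '-rc8') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if the version falls in the 4.2.0 to 4.2.5 range
    that this issue's status summary lists as impacted."""
    return (4, 2, 0) <= parse_version(version) <= (4, 2, 5)
```

The version string would typically come from the server's reported build version; how it is obtained is outside the scope of this sketch.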

REMEDIATION AND WORKAROUNDS

To remediate loss of availability, kill and restart the config server primary and allow replica set failover to reset the session cache on the new primary.

Setting maxSessions to a higher number than the default of 1 million can delay the onset of this issue.

FIX VERSIONS

4.2.6

misha.tyulenev, LGTY?

original description

The cluster's createIndexes command accepts CannotImplicitlyCreateCollection as a valid response from a shard as long as at least one shard succeeds in creating the index. This is because shards that don't have the collection locally won't be able to create it. We should change the sessions collection's createIndexes call to match this behavior.
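
The rule described above can be illustrated with a short model. This is only a sketch of the aggregation logic in Python, not the server's C++ implementation; ShardResponse, aggregate_create_indexes, and the string code names are hypothetical stand-ins for the real per-shard responses.

```python
from dataclasses import dataclass
from typing import Iterable

OK = "OK"
CANNOT_IMPLICITLY_CREATE = "CannotImplicitlyCreateCollection"


@dataclass(frozen=True)
class ShardResponse:
    """Hypothetical per-shard result of a createIndexes request."""
    shard: str
    code_name: str  # "OK" or the name of an error code


def aggregate_create_indexes(responses: Iterable[ShardResponse]) -> str:
    """Combine per-shard results into a single status.

    CannotImplicitlyCreateCollection is tolerated only when at least one
    shard returned OK; any other error is returned as the overall result.
    """
    responses = list(responses)
    if not responses:
        raise ValueError("no shard responses to aggregate")

    any_ok = any(r.code_name == OK for r in responses)
    for r in responses:
        if r.code_name == OK:
            continue
        if r.code_name == CANNOT_IMPLICITLY_CREATE and any_ok:
            # This shard has no local copy of the collection; ignore it.
            continue
        return r.code_name  # first error that cannot be tolerated
    return OK
```

With this model, one OK plus any number of CannotImplicitlyCreateCollection responses yields OK, while a set of responses with no OK at all still reports the error.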



 Comments   
Comment by Githook User [ 08/Apr/20 ]

Author: Misha Tyulenev (mikety) <misha@mongodb.com>

Message: SERVER-42827 Allow sessions collection to return OK for creating indexes if at least one shard returns OK and others return CannotImplicitlyCreateCollection
Branch: v4.2
https://github.com/mongodb/mongo/commit/f8f5c3f65176a018aff085066cd60955e9a245c2

Comment by Misha Tyulenev [ 07/Apr/20 ]

This is needed to fix SERVER-46745

Comment by Githook User [ 12/Sep/19 ]

Author: Janna Golden (jannaerin) <janna.golden@mongodb.com>

Message: SERVER-42827 Allow sessions collection to return OK for creating indexes if at least one shard returns OK and others return CannotImplicitlyCreateCollection
Branch: master
https://github.com/mongodb/mongo/commit/b40e542972c082e85098c09298eb56436bf57abb

Comment by Blake Oler [ 16/Aug/19 ]

Theoretically, this will happen repeatedly until every shard has a notion of the config.system.sessions collection. It will prevent any refreshes from happening until all shards have a notion of said collection.

And yes, it exists solely because it doesn't match the mongos createIndex command path.

Comment by Kaloian Manassiev [ 16/Aug/19 ]

Does this bug exist because the SessionsCollection implementation for sharding does its own manual index creation, which doesn't match what the mongos createIndex command path does?

What's the effect if it happens? Is it just some nasty message in the logs and skipping a refresh round because setupCollection failed?
