[DOCS-12839] The Build Indexes on Replica Sets document must include the disableLogicalSessionCacheRefresh=true for standalone instances for MongoDB 3.6+ Created: 30/Jun/19  Updated: 30/Oct/23  Resolved: 08/Jul/19

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task Priority: Critical - P2
Reporter: Andrey Brindeyev Assignee: Kay Kim (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-12857 Audit and update starting repl as sta... Closed
is related to DOCS-8589 Comment on: "manual/reference/method/... Closed
Participants:
Days since reply: 4 years, 31 weeks, 2 days ago
Epic Link: DOCSP-1769

 Description   

Description

  • A sharded cluster.
  • The config.system.sessions collection is sharded and has only one chunk. The other shards do not have the config.system.sessions collection.
  • The customer is trying to perform the Build Indexes on Replica Sets procedure according to our documentation by taking out a single node at a time and building the indexes on it.
  • When they built all indexes on the secondaries and performed the rs.stepDown() command on the primary, the secondaries crashed with the DBException::toString(): NamespaceNotFound: Failed to apply operation due to missing collection (6e54d489-42cd-4cec-9aed-06e030cdbe3f) error.

Analysis:
1. When a secondary was restarted as a standalone, a new config.system.sessions collection was created with a new UUID, because the MongoDB shell creates a session implicitly.
2. Every secondary in the replica set ended up with a config.system.sessions collection with a different UUID.
3. When the primary stepped down, the other replica set members crashed with the Fatal assertion 16359 NamespaceNotFound error (only the arbiter and the new primary survived), making the shard read-only since the majority was lost.
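
The mismatch described in step 2 could be detected before stepping down the primary. The following is a minimal sketch, not part of the original report. It assumes the UUID of config.system.sessions has already been collected from each member as a string (for example, from the info.uuid field returned by db.getSiblingDB("config").getCollectionInfos({name: "system.sessions"}) on MongoDB 3.6+). The helper name findUuidMismatches, the host names, and the UUID values are hypothetical placeholders.

```javascript
// Hypothetical helper: list the members whose config.system.sessions UUID
// differs from the primary's. Input maps member name -> UUID string.
// A member with no recorded UUID (undefined) is also reported.
function findUuidMismatches(primaryName, uuidsByMember) {
  const expected = uuidsByMember[primaryName];
  if (expected === undefined) {
    throw new Error("no UUID recorded for primary " + primaryName);
  }
  return Object.keys(uuidsByMember)
    .filter((name) => name !== primaryName && uuidsByMember[name] !== expected)
    .sort();
}

// Placeholder data: host names and UUID strings are invented for illustration.
const observed = {
  "node1:27018": "uuid-A", // primary
  "node2:27018": "uuid-B", // recreated while running as a standalone
  "node3:27018": "uuid-A",
};
console.log(findUuidMismatches("node1:27018", observed));
```

Any member reported by such a check would be expected to fail with NamespaceNotFound once it has to apply an operation that references the primary's collection UUID.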

Resolution: The manual must include the disableLogicalSessionCacheRefresh=true parameter in the "Stop One Secondary and Restart as a Standalone" step, to prevent the standalone mongod from creating the config.system.sessions collection.
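
As an illustration of the proposed step, the sketch below assembles a standalone restart command line. The --port, --dbpath, and --setParameter options are standard mongod options. The port 27217, the data path, and the helper name standaloneArgs are assumptions made for this example. The --replSet option is deliberately left out so that the node starts as a standalone.

```javascript
// Hypothetical helper: build the mongod argument list for restarting a
// secondary as a standalone without refreshing the logical session cache.
function standaloneArgs(port, dbpath) {
  return [
    "mongod",
    "--port", String(port), // assumed: a port different from the replica set port
    "--dbpath", dbpath,     // assumed: the data directory the secondary already uses
    "--setParameter", "disableLogicalSessionCacheRefresh=true",
  ];
}

// Prints: mongod --port 27217 --dbpath /data/db --setParameter disableLogicalSessionCacheRefresh=true
console.log(standaloneArgs(27217, "/data/db").join(" "));
```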

Alternatively, the customer can split the existing chunk of the config.system.sessions collection and distribute the resulting chunks across all available shards. This pre-creates the config.system.sessions collection with the correct UUID on every shard and prevents future errors like this:

mongos> db.adminCommand({moveChunk: "config.system.sessions", find: {_id: MinKey},to:"shard03"})
{
	"ok" : 0,
	"errmsg" : "Data transfer error: Cannot receive chunk [{ _id: MinKey }, { _id: MaxKey }) for collection config.system.sessions because we already have an identically named collection with UUID 5127af6c-3a3f-4990-9669-ea307f795c92, which differs from the donor's UUID 25dfc80c-db79-4f94-81a9-c37c2b438fd6. Manually drop the collection on this shard if it contains data from a previous incarnation of config.system.sessions",
	"code" : 96,
	"codeName" : "OperationFailed",
	"operationTime" : Timestamp(1561927900, 9),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1561927900, 9),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

Scope of changes

Impact to Other Docs

MVP (Work and Date)

Resources (Scope or Design Docs, Invision, etc.)



 Comments   
Comment by Kay Kim (Inactive) [ 08/Jul/19 ]

DOCS-8589 is to add a blurb to the getShardDistribution page.

Comment by Kay Kim (Inactive) [ 08/Jul/19 ]

kaloian.manassiev – ultimately, I think we should fix it since I think it's more intuitive that a method called db.coll.getShardDistribution returns not-stale data instead of stating "Hey, to run db.coll.getShardDistribution, you should always have a pre-req step of refreshing the cache".
Until it's fixed, we can put a blurb on the page saying that, to avoid returning stale data, users should run the refresh first. (The page is a low-traffic page with ~615 views a month, if that info helps you all to prioritize.)

Comment by Githook User [ 08/Jul/19 ]

Author:

{'name': 'Kay Kim', 'email': 'kay.kim@10gen.com', 'username': 'kay-kim'}

Message: DOCS-12839: update to rolling index builds
Branch: v3.6
https://github.com/mongodb/docs/commit/70bc02377dd3dc2328b5e953f46928ba84d697ce

Comment by Githook User [ 08/Jul/19 ]

Author:

{'name': 'Kay Kim', 'email': 'kay.kim@10gen.com', 'username': 'kay-kim'}

Message: DOCS-12839: update to rolling index builds
Branch: v4.0
https://github.com/mongodb/docs/commit/6c486c5cb23ade76c4b016d34853dbef3a48dae6

Comment by Githook User [ 08/Jul/19 ]

Author:

{'name': 'Kay Kim', 'username': 'kay-kim', 'email': 'kay.kim@10gen.com'}

Message: DOCS-12839: update to rolling index builds
Branch: master
https://github.com/mongodb/docs/commit/9b936dd64cbbd565e3fdb0bcca45d7f1dbdc6ecd

Comment by Kay Kim (Inactive) [ 05/Jul/19 ]

Heads up – created a separate ticket for the perform maintenance on repl set to separate out into a repl page and sharded clusters page.

Generated at Thu Feb 08 08:06:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.