[SERVER-41005] Sharding initialization should not occur before replication recovery Created: 03/May/19  Updated: 06/Dec/22  Resolved: 21/Feb/20

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-41799 Update shard_aware_init to flush its ... Closed
related to SERVER-52989 Complete TODO listed in SERVER-41005 Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:
Linked BF Score: 16

 Description   

Consider the startup sequence of a mongod that is both a replica set member and a shard server (i.e., a member of a replica set that serves as a shard in a sharded cluster). When such a node starts up with the --shardsvr flag, it needs to initialize some aspects of the sharding subsystem. In particular, it needs to read the document with _id: "shardIdentity" from the admin.system.version collection in order to establish things like its shard id and the config server connection string.
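
For illustration (not part of the original report), the shard identity lookup can be reproduced from the mongo shell. The field names in the comment below are typical of a shard server's shardIdentity document but may vary by server version.

// Illustrative read of the shard identity document described above.
const identity = db.getSiblingDB("admin").system.version.findOne({_id: "shardIdentity"});
printjson(identity);
// A typical (hypothetical) result on a shard server:
// {
//   _id: "shardIdentity",
//   shardName: "shard01",
//   clusterId: ObjectId("5cccee5f8b2a8f3d9a1e0b42"),
//   configsvrConnectionString: "configRS/cfg1.example.net:27019,cfg2.example.net:27019"
// }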

This node also needs to perform initialization tasks for the replication subsystem, in particular replication recovery. Recovery replays the oplog to ensure that collections and indexes reflect all committed writes before the node services queries.
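
As a hedged aside, the point that recovery replays up to can be observed from the shell via replSetGetStatus; the sketch below assumes the lastStableRecoveryTimestamp field reported by recent server versions.

// Inspect the timestamps relevant to replication recovery.
const status = db.adminCommand({replSetGetStatus: 1});
// Recovery starts from the last stable checkpoint and replays oplog entries
// forward until the node is consistent with all committed writes.
print("last stable recovery ts: " + tojson(status.lastStableRecoveryTimestamp));
print("last applied optime:     " + tojson(status.optimes.appliedOpTime.ts));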

The problem observed in this ticket is that sharding initialization takes place before replication recovery. The sharding system may therefore attempt reads, at a minimum reads against admin.system.version, before recovery has replayed the oplog, and can fail to see committed data. For example, it could fail to see the shard identity document even though the write of that document was committed.
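
A hedged sketch of how this failure mode would surface, using the shardingState command (the exact behavior when the shardIdentity document is not visible depends on server version):

// If sharding initialization ran before recovery replayed the shardIdentity
// write, a --shardsvr node could report that it is not shard-aware even though
// the write was committed.
const state = db.adminCommand({shardingState: 1});
if (!state.enabled) {
    print("sharding not initialized; shardIdentity document was not visible at startup");
}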



 Comments   
Comment by Githook User [ 26/Jun/19 ]

Author: Jason Carey <jcarey@argv.me> (hanumantmk)

Message: SERVER-41799 await stable TS in shard_aware_init

We currently spin up sharding in advance of replication (see
SERVER-41005). Because of that, it is possible for sharding to miss out
on certain writes on startup (writes to admin.system.version that are
still in the oplog and haven't yet been recovered).

It's going to be quite difficult to untangle all the dependencies
between sharding and replication, and in the meantime shard_aware_init
has more failures than we'd like. See BF-12759. That particular test
specifically checks that corrupting our version (via a manual update to
admin.system.version) causes mongod to crash on startup. The problem is
that because we start sharding before replication (and also do a
complicated dance of restarting in standalone mode to corrupt the
document), we can perform an update when the document we want to modify
isn't present (because it's still in the oplog and we're in standalone
mode), and then fail to crash on startup.

So let's fix up that test by waiting to flush the oplog before shutting
down the node (when in replica set mode).

(cherry picked from commit 303adb5e50eb02d077b734aa27ae8d02a781d7a2)
Branch: v4.2
https://github.com/mongodb/mongo/commit/22c05b4a4fcc7b0213041067bd9539db9d4da8f5

Comment by Githook User [ 19/Jun/19 ]

Author: Jason Carey <jcarey@argv.me> (hanumantmk)

Message: SERVER-41799 await stable TS in shard_aware_init

We currently spin up sharding in advance of replication (see
SERVER-41005). Because of that, it is possible for sharding to miss out
on certain writes on startup (writes to admin.system.version that are
still in the oplog and haven't yet been recovered).

It's going to be quite difficult to untangle all the dependencies
between sharding and replication, and in the meantime shard_aware_init
has more failures than we'd like. See BF-12759. That particular test
specifically checks that corrupting our version (via a manual update to
admin.system.version) causes mongod to crash on startup. The problem is
that because we start sharding before replication (and also do a
complicated dance of restarting in standalone mode to corrupt the
document), we can perform an update when the document we want to modify
isn't present (because it's still in the oplog and we're in standalone
mode), and then fail to crash on startup.

So let's fix up that test by waiting to flush the oplog before shutting
down the node (when in replica set mode).
Branch: master
https://github.com/mongodb/mongo/commit/303adb5e50eb02d077b734aa27ae8d02a781d7a2
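
For context, a minimal sketch of the test-side workaround described in the commit messages above, in jstest style. It assumes a ReplSetTest fixture named rst and the awaitLastStableRecoveryTimestamp helper; the actual change to shard_aware_init may differ in detail.

// Wait until the stable recovery timestamp has caught up, so the shardIdentity
// write is no longer only in the oplog, then shut the node down cleanly before
// restarting it as a standalone to corrupt admin.system.version.
rst.awaitLastStableRecoveryTimestamp();
rst.stop(0);
// ...restart node 0 as a standalone and perform the corrupting update here...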
