[SERVER-34212] Config servers should only upconvert writeConcern to majority for requests on internal connections Created: 30/Mar/18  Updated: 29/Oct/23  Resolved: 22/Oct/18

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: None
Fix Version/s: 4.1.5

Type: Task Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Esha Maharishi (Inactive)
Resolution: Fixed Votes: 0
Labels: neweng, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-34501 Remove the config server code which u... Closed
Related
related to SERVER-67874 Remove config server code which upcon... Closed
is related to SERVER-28641 remove requirement that non-internal ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-10-08, Sharding 2018-11-05
Participants:
Linked BF Score: 17

 Description   

In MongoDB version 3.2 when CSRS was introduced, there were cases where commands coming from mongos did not specify write concern. For this purpose, the config server by default upconverts the writeConcern to majority. With the current codebase, this upconversion is no longer necessary because all mongos and shard commands against the config server should be including a write concern.

This ticket is to add testing that writeConcern:majority is always included with writes against the config server and to remove the upconversion code.



 Comments   
Comment by Githook User [ 22/Oct/18 ]

Author:

{'name': 'Esha Maharishi', 'email': 'esha.maharishi@mongodb.com', 'username': 'EshaMaharishi'}

Message: SERVER-34212 Config servers should only upconvert writeConcern to majority for requests on internal connections
Branch: master
https://github.com/mongodb/mongo/commit/6d9c01b93fb94caf1b6c57bb00bfceb3e329ede9

Comment by Esha Maharishi (Inactive) [ 22/Oct/18 ]

Git commit url: https://github.com/mongodb/mongo/commit/6d9c01b93fb94caf1b6c57bb00bfceb3e329ede9

Comment by Siyuan Zhou [ 06/Oct/18 ]

Upconversion is always safe, but I think the problem is it has an arbitrary small timeout, 30 seconds, much shorter than the default 10-minute timeout of awaitReplication() in testing. As proposed in BF-8529, a better solution could be only upconverting write concerns or error on unexpected write concerns for internal connections. From clients or test infrastructure's point of view, config servers should provide the same API as a normal mongod. Re-opening the ticket.

Comment by Esha Maharishi (Inactive) [ 24/Sep/18 ]

Spoke to kaloian.manassiev to try to (1) clarify and (2) be more succinct:

In most cases, there is no need for writes to the sharding catalog to be done with writeConcern: majority. All readers of the catalog read with readConcern: majority, so they will only ever see committed data. (It is a separate issue that they may see inconsistent or partial data, because catalog updates that involve multiple writes are not always wrapped in transactions. For this reason, readers of the catalog already handle seeing such inconsistent/partial data).

The places writeConcern: majority is required are places an action is taken on the config server's response (i.e., interpret a success response as meaning the catalog change became majority committed). In contrast, most callers ignore the response and do a refresh of their cache of the catalog to pick up the latest view.

However, there is no real harm in doing all sharding catalog writes with writeConcern: majority, and there may be edge cases that we haven't discovered that do depend on it (and that didn't show up as immediate failures in the Evergreen patch linked above).

So, we are closing this request as Works as Designed.

Comment by Esha Maharishi (Inactive) [ 10/Sep/18 ]

The only case where it's necessary to use majority writeConcern for a write against the config server is if the sending node will then take an action that assumes the write will not roll back.

These are all the collections on the config server:

  • config.databases
  • config.collections
  • config.chunks
  • config.shards
  • config.tags
  • config.version
  • config.locks
  • config.lockpings
  • config.settings
  • config.mongos
  • config.changelog

Here are some actions that might be taken after writing to one of those collections:

  • writing user data to the cluster
  • doing another write against the config server
  • sending setShardVersion to one or more shards

 

I am most concerned about the first kind of action, writing user data to the cluster. A user should supply majority writeConcern for metadata commands, but it seems easy for them to forget to do this and end up in a bad state.

About the second: We already have very weak guarantees around the consistency of the sharding catalog, particularly because the config server doesn't block nodes from reading the sharding catalog while the config server is writing to it. So, readers of the sharding catalog generally have to handle seeing an inconsistent view and retrying their read.

About the third: It is harmless to send an extra setShardVersion that ends up being unnecessary because the data it was trying to read rolled back.

 

We could, as the ticket suggests, add tests that all commands that write to the sharding catalog assert that the writeConcern is majority. We do in fact already have such a jstest.

But it feels highly brittle to me to rely on people remembering to update the jstest whenever they add a metadata command. The sharding team has plans to add metadata commands in the future, such as renameDatabase, renameCollection, and commands to clean up an inconsistent catalog. We have also added metadata commands in the recent past (for shardCollection for the "fast initial split" project in 4.2; for movePrimary for the "database versioning" project in 4.0).

 

I ran a patch removing the upconversion, and it came back mostly green (it just caught two places in mapReduce where we internally send _configsvrDropCollection and _configsvrShardCollection with default writeConcern instead of majority writeConcern).

However, after speaking to spencer, I think the safest solution is to continue upconverting writeConcern. Our test infrastructure (and our users) can always avoid the upconversion by supplying their own writeConcern.

Generated at Thu Feb 08 04:35:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.