[SERVER-34212] Config servers should only upconvert writeConcern to majority for requests on internal connections Created: 30/Mar/18 Updated: 29/Oct/23 Resolved: 22/Oct/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.5 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Esha Maharishi (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng, sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||
| Sprint: | Sharding 2018-10-08, Sharding 2018-11-05 | ||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Linked BF Score: | 17 | ||||||||||||||||||||||||
| Description |
|
In MongoDB version 3.2 when CSRS was introduced, there were cases where commands coming from mongos did not specify write concern. For this purpose, the config server by default upconverts the writeConcern to majority. With the current codebase, this upconversion is no longer necessary because all mongos and shard commands against the config server should be including a write concern. This ticket is to add testing that writeConcern:majority is always included with writes against the config server and to remove the upconversion code. |
| Comments |
| Comment by Githook User [ 22/Oct/18 ] |
|
Author: {'name': 'Esha Maharishi', 'email': 'esha.maharishi@mongodb.com', 'username': 'EshaMaharishi'}Message: |
| Comment by Esha Maharishi (Inactive) [ 22/Oct/18 ] |
|
Git commit url: https://github.com/mongodb/mongo/commit/6d9c01b93fb94caf1b6c57bb00bfceb3e329ede9 |
| Comment by Siyuan Zhou [ 06/Oct/18 ] |
|
Upconversion is always safe, but I think the problem is it has an arbitrary small timeout, 30 seconds, much shorter than the default 10-minute timeout of awaitReplication() in testing. As proposed in BF-8529, a better solution could be only upconverting write concerns or error on unexpected write concerns for internal connections. From clients or test infrastructure's point of view, config servers should provide the same API as a normal mongod. Re-opening the ticket. |
| Comment by Esha Maharishi (Inactive) [ 24/Sep/18 ] |
|
Spoke to kaloian.manassiev to try to (1) clarify and (2) be more succinct: In most cases, there is no need for writes to the sharding catalog to be done with writeConcern: majority. All readers of the catalog read with readConcern: majority, so they will only ever see committed data. (It is a separate issue that they may see inconsistent or partial data, because catalog updates that involve multiple writes are not always wrapped in transactions. For this reason, readers of the catalog already handle seeing such inconsistent/partial data). The places writeConcern: majority is required are places an action is taken on the config server's response (i.e., interpret a success response as meaning the catalog change became majority committed). In contrast, most callers ignore the response and do a refresh of their cache of the catalog to pick up the latest view. However, there is no real harm in doing all sharding catalog writes with writeConcern: majority, and there may be edge cases that we haven't discovered that do depend on it (and that didn't show up as immediate failures in the Evergreen patch linked above). So, we are closing this request as Works as Designed. |
| Comment by Esha Maharishi (Inactive) [ 10/Sep/18 ] |
|
The only case where it's necessary to use majority writeConcern for a write against the config server is if the sending node will then take an action that assumes the write will not roll back. These are all the collections on the config server:
Here are some actions that might be taken after writing to one of those collections:
I am most concerned about the first kind of action, writing user data to the cluster. A user should supply majority writeConcern for metadata commands, but it seems easy for them to forget to do this and end up in a bad state. About the second: We already have very weak guarantees around the consistency of the sharding catalog, particularly because the config server doesn't block nodes from reading the sharding catalog while the config server is writing to it. So, readers of the sharding catalog generally have to handle seeing an inconsistent view and retrying their read. About the third: It is harmless to send an extra setShardVersion that ends up being unnecessary because the data it was trying to read rolled back.
We could, as the ticket suggests, add tests that all commands that write to the sharding catalog assert that the writeConcern is majority. We do in fact already have such a jstest. But it feels highly brittle to me to rely on people remembering to update the jstest whenever they add a metadata command. The sharding team has plans to add metadata commands in the future, such as renameDatabase, renameCollection, and commands to clean up an inconsistent catalog. We have also added metadata commands in the recent past (for shardCollection for the "fast initial split" project in 4.2; for movePrimary for the "database versioning" project in 4.0).
I ran a patch removing the upconversion, and it came back mostly green (it just caught two places in mapReduce where we internally send _configsvrDropCollection and _configsvrShardCollection with default writeConcern instead of majority writeConcern). However, after speaking to spencer, I think the safest solution is to continue upconverting writeConcern. Our test infrastructure (and our users) can always avoid the upconversion by supplying their own writeConcern. |