[DOCS-15321] Document that writeConcern.wtimeout is not applicable to sharded cluster DDL operations Created: 04/May/22  Updated: 22/Jan/24

Status: Backlog
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Eric Sedor Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: backlog, feature, replication, sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-65340 Operations hang when re-using dropped... Closed
Participants:
Days since reply: 1 year, 39 weeks, 6 days ago
Epic Link: DOCSP-11702

 Description   

SERVER-65340 discusses the following behavior: if a rename operation in a sharded cluster reaches a point where it can no longer be rolled back, the cluster keeps retrying it indefinitely. For that reason, writeConcern.wtimeout should not be provided to renameCollection in a sharded cluster.

We can make this behavior clearer:

  • here
  • and in the writeConcern field description here

Additionally:

  • "Resource locking in sharded clusters" could be clarified to include a reference to majority write concern, as described in the writeConcern field description
  • There may be other operations that should be similarly clarified (cc pierlauro.sciarelli@mongodb.com can you suggest others?)


 Comments   
Comment by Tommaso Tocci [ 05/May/22 ]

We should also change:

https://www.mongodb.com/docs/manual/reference/command/setDefaultRWConcern/#sharding-administrative-commands-override-write-concern-settings

In fact, DDL operations in a sharded cluster do not use the write concern timeout at all. Once they start, they are guaranteed to complete eventually.

Comment by Tommaso Tocci [ 05/May/22 ]

Thanks pierlauro.sciarelli@mongodb.com for the clarification. I just wanted to summarize briefly the relation between Sharded DDL operations and WriteConcern:

  • All sharded DDL operations must be invoked with majority writeConcern, otherwise they will simply return an InvalidOptions error.
  • All sharded DDL operations ignore the given writeConcern timeout.
Comment by Pierlauro Sciarelli [ 05/May/22 ]

While write concerns for DDLs can be used in replica sets, no write concern other than majority may be provided for sharded DDLs. I would try to point that out explicitly in the documentation: metadata operations are not transactional within a sharded cluster (each shard executes them independently), so once they have all started they are all expected to finish (and that is guaranteed to happen for every DDL operation as soon as all shards can majority commit). The rationale is: if a sharded cluster cannot majority commit, the user had better resolve that, because all data operations may be rolled back at some point anyway.

Regarding the wtimeout, there seems to have been a lot of confusion about it in SERVER-65340: the "problem" (that is arguably a problem since the cluster self-recovers when majority commit is available) is also reproducible without setting wtimeout. And, in general, hitting a timeout does not necessarily mean that the operation failed, as the documentation also states: "MongoDB does not undo successful data modifications performed before the write concern exceeded the wtimeout time limit".

A question may arise at this point: why do we allow write concerns to be set for sharded DDLs? Since applications need to work seamlessly on both plain replica sets and sharded clusters, the public APIs still allow users to set write concerns. Any write concern that is set is then ignored and always "upgraded" to majority in a sharded cluster. (NB: this applies only to DDLs; write concerns are always respected for CRUD operations.)
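
For illustration only, here is a minimal client-side sketch of the guidance in this thread: send sharded DDL commands with majority write concern and without wtimeout. The helper name sharded_ddl_command and the namespaces test.orders / test.orders_archive are hypothetical and not part of any MongoDB API; the sketch only builds the command document. Whether the cluster rejects a non-majority write concern with InvalidOptions or silently upgrades it, a command built this way is consistent with both descriptions above.

```python
from typing import Any, Dict, Optional


def sharded_ddl_command(
    command: Dict[str, Any],
    write_concern: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of `command` with a write concern suited to a sharded DDL.

    Hypothetical helper: it forces w to "majority" and drops wtimeout,
    because the thread states that sharded DDLs ignore the timeout.
    """
    requested = dict(write_concern or {})
    # Keep unrelated fields (for example "j"), but discard w and wtimeout.
    normalized = {k: v for k, v in requested.items() if k not in ("w", "wtimeout")}
    normalized["w"] = "majority"

    # dict() preserves key order, so the command name stays the first key.
    result = dict(command)
    result["writeConcern"] = normalized
    return result


# Assumed example namespaces; the caller asked for w: 1 with a 5 second timeout.
rename_cmd = sharded_ddl_command(
    {"renameCollection": "test.orders", "to": "test.orders_archive"},
    {"w": 1, "wtimeout": 5000},
)
print(rename_cmd)
```

With a driver such as PyMongo, a document like rename_cmd could then be passed to the admin database's command method. Note that an application relying on wtimeout to bound how long the call takes would need a different mechanism, since the timeout has no effect on sharded DDLs.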

Generated at Thu Feb 08 08:12:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.