Type: Bug
Resolution: Unresolved
Priority: Minor - P4
Affects Version/s: None
Component/s: None
Replication
ALL
3
I think this issue likely dates back to 4.4, when cluster-wide read/write concern (RC/WC) support was added; see SERVER-45692 for previous discussion. At the time, we decided against skipping the default RC/WC in this case, because it is difficult to know which namespaces a command will touch. Instead, we required an explicit RC/WC on all internal operations. That leaves external commands that interact with non-replicated collections vulnerable to this bug.
The impact of this:
- For writes, I don't believe there is an impact, other than that we confusingly emit a debug log "applying default writeConcern" for the operation. ServiceEntryPointMongod::Hooks::waitForWriteConcern ultimately checks if the namespace is unreplicated and no-ops if so.
- For reads, I think in most cases there is also little impact besides confusing logs. If a cluster-wide majority RC is set, so long as there is some majority committed snapshot available already, we will successfully "wait" for read concern. However, in certain situations (such as if no majority committed snapshot exists yet and a majority of nodes are down) we could hang indefinitely waiting for one to become available. The only other allowed cluster-wide read concern levels are "available" and "local", and if I understand those correctly in neither case would we actually wait for anything.
The ideal behavior seems like it should be:
- If no read/write concern is supplied for the operation, we do not apply the cluster-wide default.
- If a read/write concern is supplied for the operation, we ignore it (and also do not apply the cluster-wide default). Arguably the most correct behavior would be to return an error. However, drivers support client-wide read and write concerns that are applied to every operation, and drivers do not know or check whether a namespace is replicated before applying them. If we started erroring, I think we could break applications on upgrade that rely on us ignoring (or effectively ignoring) the value.
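The two rules above can be contrasted with the current behavior in a small model. This is only an illustrative sketch in Python, not server code: `is_replicated`, `current_concern`, and `proposed_concern` are hypothetical names, and the replication check is deliberately simplified to "anything in the `local` database is unreplicated".

```python
from typing import Optional


def is_replicated(ns: str) -> bool:
    """Simplified assumption: only the `local` database is unreplicated."""
    db = ns.split(".", 1)[0]
    return db != "local"


def current_concern(ns: str, explicit: Optional[dict],
                    cluster_default: Optional[dict]) -> Optional[dict]:
    """Behavior described in this ticket: the cluster-wide default is
    applied whenever no explicit value is given, whatever the namespace."""
    return explicit if explicit is not None else cluster_default


def proposed_concern(ns: str, explicit: Optional[dict],
                     cluster_default: Optional[dict]) -> Optional[dict]:
    """Proposed behavior: for unreplicated namespaces, ignore both the
    explicit value and the cluster-wide default."""
    if not is_replicated(ns):
        return None
    return explicit if explicit is not None else cluster_default


default_rc = {"level": "majority"}

# Today the default leaks onto an unreplicated namespace.
leaked = current_concern("local.startup_log", None, default_rc)

# Under the proposal nothing is applied there, explicit or not.
skipped = proposed_concern("local.startup_log", None, default_rc)
ignored = proposed_concern("local.startup_log", {"level": "majority"}, default_rc)

# Replicated namespaces keep the existing behavior.
unchanged = proposed_concern("test.orders", None, default_rc)
```

Returning `None` rather than raising for an explicit value mirrors the compatibility argument above: client-wide driver settings keep working unchanged.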
There is a straightforward workaround, which is to explicitly specify the correct read/write concern for any operations that are failing due to the incorrectly applied default. When an explicit read/write concern is specified, we do not apply the default.
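As a rough illustration of the workaround, the command documents below carry an explicit `readConcern` or `writeConcern` so that the cluster-wide default is not applied. The helper names are hypothetical, the `local.startup_log` target is only an example of an unreplicated namespace, and the choice of `"local"` and `{"w": 1}` as the "correct" values is an assumption for this sketch.

```python
from typing import Any, Dict, List


def find_with_explicit_rc(coll: str, level: str = "local") -> Dict[str, Any]:
    """Build a `find` command with an explicit read concern level."""
    return {"find": coll, "filter": {}, "readConcern": {"level": level}}


def insert_with_explicit_wc(coll: str, docs: List[dict], w: Any = 1) -> Dict[str, Any]:
    """Build an `insert` command with an explicit write concern."""
    return {"insert": coll, "documents": docs, "writeConcern": {"w": w}}


# Example commands against an unreplicated collection in the `local` database.
read_cmd = find_with_explicit_rc("startup_log")
write_cmd = insert_with_explicit_wc("scratch", [{"_id": 1}])
```

With a driver, these documents could be passed to its generic run-command helper on the `local` database (for example PyMongo's `Database.command`); most drivers also allow setting the read/write concern per collection handle, which achieves the same thing.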