Priority: Major - P3
Affects Version/s: 4.2.1
Backport Requested: v4.4, v4.2
Sprint: Query 2020-04-20, Query 2020-05-04, Query 2020-05-18, Query 2020-06-01, Query 2020-06-15
The default port of mongocryptd is 27020. The default port of mongod is 27017. There is precedent (for example, mlaunch and the drivers' test suites in Evergreen do this) for launching mongod/mongos processes starting on port 27017 and incrementing the port number until the required number of daemons is provisioned.
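The sequential allocation described above can be sketched as follows. This is a minimal illustration only: the helper names `allocate_ports` and `collides_with_mongocryptd` are hypothetical and do not come from mlaunch or any driver test suite. The two port constants are the defaults stated above.

```python
# Defaults as stated in this ticket.
MONGOD_DEFAULT_PORT = 27017
MONGOCRYPTD_DEFAULT_PORT = 27020


def allocate_ports(daemon_count, base_port=MONGOD_DEFAULT_PORT):
    """Hypothetical helper: assign consecutive ports starting at base_port."""
    return [base_port + i for i in range(daemon_count)]


def collides_with_mongocryptd(ports, cryptd_port=MONGOCRYPTD_DEFAULT_PORT):
    """Return True if any allocated port is the mongocryptd default port."""
    return cryptd_port in ports


if __name__ == "__main__":
    for count in (3, 4):
        ports = allocate_ports(count)
        print(count, ports, collides_with_mongocryptd(ports))
```

With this scheme, any deployment of four or more daemons started from 27017 places a mongod or mongos on 27020.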
Under this port allocation scheme, a mongod or mongos eventually ends up on port 27020. When this happens, operations fail with the following cryptic error:
Note that the above exception references localhost:27019, not the mongocryptd port 27020.
Here is what happened:
- I have a sharded cluster deployment that starts on the default port 27017 for the mongos. This is used for testing srv monitoring by the driver.
- As part of this deployment, there is a 2-node replica set for one of the shards occupying ports 27019 and 27020.
- Currently, 27019 is the primary and 27020 is the secondary.
- When running client-side encryption tests, the driver assumes mongocryptd is listening on port 27020 and tries to connect there.
- The driver performs normal SDAM discovery, detects the topology as a replica set, finds the primary on 27019, and sends the command intended for mongocryptd to the primary on 27019.
- The command fails because it is received by a mongod rather than mongocryptd, but the error does not give this as the reason for the failure.
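As one possible way to surface the root cause in the sequence above, a driver or test harness could inspect the handshake reply from the address it believes is mongocryptd before sending any commands. The sketch below is not how any driver currently behaves. It assumes that mongocryptd identifies itself with an `iscryptd: true` field in its isMaster reply, and relies on mongos reporting `msg: "isdbgrid"` and replica set members reporting `setName`. The function names, the sample replies, and the set name `shard01` are hypothetical.

```python
def classify_handshake_reply(reply):
    """Guess which kind of process produced an isMaster-style reply dict."""
    # Assumption: mongocryptd marks its reply with iscryptd: true.
    if reply.get("iscryptd"):
        return "mongocryptd"
    # mongos reports msg: "isdbgrid" in its reply.
    if reply.get("msg") == "isdbgrid":
        return "mongos"
    # Replica set members report the name of their set.
    if "setName" in reply:
        return "replica set member"
    return "unknown"


def ensure_mongocryptd(reply, address):
    """Raise a descriptive error if the peer is not mongocryptd."""
    kind = classify_handshake_reply(reply)
    if kind != "mongocryptd":
        raise RuntimeError(
            f"expected mongocryptd at {address}, but the process there "
            f"looks like a {kind}; check the mongocryptd port configuration"
        )


if __name__ == "__main__":
    # Hand-written sample reply, not captured server output.
    secondary_reply = {"ismaster": False, "secondary": True, "setName": "shard01"}
    try:
        ensure_mongocryptd(secondary_reply, "localhost:27020")
    except RuntimeError as exc:
        print(exc)
```

A check like this would fail at the point of the wrong connection on 27020, before SDAM discovery redirects the command to the primary on 27019.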
As a user of the driver, when the driver sends a command intended for mongocryptd to a mongod/mongos, I want to be informed that the command was received by the wrong process, so that I can immediately take corrective action (reconfigure the driver and/or my deployments).
The error message produced does not indicate the root cause of the problem (command received by wrong daemon).