[SERVER-58981] Investigate why saslStart/saslContinue commands are sent as OP_QUERY commands Created: 30/Jul/21  Updated: 09/Aug/21  Resolved: 09/Aug/21

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Yoon Soo Kim Assignee: Yoon Soo Kim
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File multiversion_auth_v5.1_kall_v5.0_kall_log.txt     Text File multiversion_auth_v5.1_kall_v5.0_old_log.txt    
Issue Links:
Related
is related to SERVER-58338 Return an error if client attempts an... Closed
Sprint: QE 2021-08-09
Participants:

 Comments   
Comment by David Storch [ 09/Aug/21 ]

Changing the resolution to "Done", since this ticket just tracked an investigation.

Comment by David Storch [ 09/Aug/21 ]

Thanks yoonsoo.kim, this makes sense! I don't fully follow all the intricacies of how SASL works, but it makes sense that when we bypass the isMaster protocol negotiation we would end up performing authentication-related commands using OP_QUERY. It's good to know that no further action is required right now.

Comment by Yoon Soo Kim [ 05/Aug/21 ]

david.storch, please, take a look at the summary.

Comment by Yoon Soo Kim [ 05/Aug/21 ]

In summary, the reason why saslStart/saslContinue are being sent as OP_QUERY commands is because the repl code bypasses the protocol negotiation when it's sending _isSelf command and it need to send saslStart/saslContinue before sending _isSelf when the internal authentication is turned on and the default rpc protocol was kOpQuery before the following changes in SERVER-58338.

https://github.com/mongodb/mongo/commit/b6f5fa2c56195844a543a5561ea336379c5b1f02#diff-8858938c42aff8c113c7e2594c707d8db941ea4b9244ba9d3857cd42e80e1a6fL839-R843

Now the default rpc protocol is kOpMsg for v5.1 and forward. So, _isSelf/saslStart/saslContinue are being sent as OP_MSG commands. There's nothing to fix and no further action is required.

Comment by Yoon Soo Kim [ 05/Aug/21 ]

multiversion_auth_v5.1_kall_v5.0_kall_log.txt log file is attached.
 

Comment by Yoon Soo Kim [ 05/Aug/21 ]

Experimentation #2: v5.1 uses kAll protocol, v5.0 uses kAll protocol by default by DBClientBase. The code is...

https://github.com/10gen/mongo/blob/256a4302f13be977c6bb5421ef16784b5214a68a/src/mongo/client/dbclient_base.h#L840-L848

Here's the log messages from experimentation:

In this log d20030 is v5.0 mongod and b20030 is mongobridge for d20030. 5838802 is the sender-side log message id, 5838801 is the receiver-side log message id.

$ buildscripts/resmoke.py run --suites=multiversion_auth jstests/multiVersion/minor_version_tags_old_new_old.js 2>&1 | tee multiversion_auth_v5.1_kall_v5.0_kall_log.txt
$ grep -n 583880 multiversion_auth_v5.1_kall_v5.0_kall_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | head -n 200
...
211:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T19:51:16.168+00:00"},"s":"I",  "c":"NETWORK",  "id":5838802, "ctx":"conn2","msg":"the internal auth is set. isSelf() is authenticating itself"}
215:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T19:51:16.172+00:00"},"s":"I",  "c":"NETWORK",  "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"saslStart"}}
217:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T19:51:16.323+00:00"},"s":"I",  "c":"NETWORK",  "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"saslContinue"}}
219:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T19:51:16.325+00:00"},"s":"I",  "c":"NETWORK",  "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"_isSelf"}}
...

d20030 isSelf() tries to authenticate itself against itself but it does not send saslStart/saslContinue/_isSelf as OP_QUERY commands. We don't see receiver-side 5838801 logs messages.

$ grep -n 5838801 multiversion_auth_v5.1_kall_v5.0_kall_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | wc -l
0

The reason why they are sent as OP_MSG commands is because the preferred protocol is kOpMsg if both kOpMsg and kOpQuery are allowed.

const Protocol kPreferredProtos[] = {Protocol::kOpMsg, Protocol::kOpQuery};
...
StatusWith<Protocol> negotiate(ProtocolSet fst, ProtocolSet snd) {
    using std::begin;
    using std::end;
 
    ProtocolSet common = fst & snd;
 
    auto it = std::find_if(begin(kPreferredProtos), end(kPreferredProtos), [common](Protocol p) {
        return common & static_cast<ProtocolSet>(p);
    });
 
    if (it == end(kPreferredProtos)) {
        return Status(ErrorCodes::RPCProtocolNegotiationFailed, "No common protocol found.");
    }
    return *it;
}

https://github.com/10gen/mongo/blob/yoonsoo.kim/v5.0-investigate-sasl-op-commands/src/mongo/rpc/protocol.cpp#L51

https://github.com/10gen/mongo/blob/yoonsoo.kim/v5.0-investigate-sasl-op-commands/src/mongo/rpc/protocol.cpp#L83-L97

DBClientBase::runCommandWithTarget() eventually calls negotiate().

Comment by Yoon Soo Kim [ 04/Aug/21 ]

In the entire multiversion_auth_v5.1_kall_v5.0_old_log.txt, we can see following occurrences of _isSelf/saslStart/saslContinue.

$ grep -n 5838801 multiversion_auth_v5.1_kall_v5.0_old_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | wc -l
1596
$ grep -n 5838801 multiversion_auth_v5.1_kall_v5.0_old_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | grep '_isSelf' | wc -l
532
$ grep -n 5838801 multiversion_auth_v5.1_kall_v5.0_old_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | grep 'saslStart' | wc -l
532
$ grep -n 5838801 multiversion_auth_v5.1_kall_v5.0_old_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | grep 'saslContinue' | wc -l
532

This means that there's nothing else other than _isSelf/saslStart/saslContinue except isMaster & ismaster.

Comment by Yoon Soo Kim [ 04/Aug/21 ]

multiversion_auth_v5.1_kall_v5.0_old_log.txt log file is attached.

Comment by Yoon Soo Kim [ 04/Aug/21 ]

Experimentation #1: v5.1 uses kAll protocol, v5.0 uses kOpQuery protocol by default by DBClientBase. The code is...

https://github.com/10gen/mongo/blob/yoonsoo.kim/v5.0-investigate-sasl-op-commands/src/mongo/client/dbclient_base.h#L840-L848

Here's the log messages from experimentation:

In this log d20030 is v5.0 mongod and b20030 is mongobridge for d20030. 5838802 is the sender-side log message id, 5838801 is the receiver-side log message id.

$ buildscripts/resmoke.py run --suites=multiversion_auth jstests/multiVersion/minor_version_tags_old_new_old.js 2>&1 | tee multiversion_auth_v5.1_kall_v5.0_old_log.txt

$ grep -n 583880 multiversion_auth_v5.1_kall_v5.0_old_log.txt | grep -v 'isMaster' | grep -v 'ismaster' | head -n 200

211:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.038+00:00"},"s":"I", "c":"NETWORK", "id":5838802, "ctx":"conn2","msg":"the internal auth is set. isSelf() is authenticating itself"}
220:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.052+00:00"},"s":"I", "c":"NETWORK", "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"saslStart"}}
221:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.052+00:00"},"s":"I", "c":"COMMAND", "id":5838802, "ctx":"conn2","msg":"OP_QUERY commmand body","attr":{"command":{"saslStart":1,"mechanism":"SCRAM-SHA-256","options":{"skipEmptyExchange":true},"payload":{"$binary":{"base64":"biwsbj1fX3N5c3RlbSxyPVo2dm0xZ2xGUzZsY1ZXQlJGcmtUeW83RG55Ti81ZkVV","subType":"0"}}}}}
222:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.052Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"local.$cmd","command":{"saslStart":1,"mechanism":"SCRAM-SHA-256","options":{"skipEmptyExchange":true},"payload":{"$binary":{"base64":"biwsbj1fX3N5c3RlbSxyPVo2dm0xZ2xGUzZsY1ZXQlJGcmtUeW83RG55Ti81ZkVV","subType":"0"}}}}}
223:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.053Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"saslStart","remote":"10.122.0.36:40200","client":"conn5"}}
224:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.053+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"local.$cmd","command":{"saslStart":1,"mechanism":"SCRAM-SHA-256","options":{"skipEmptyExchange":true},"payload":{"$binary":{"base64":"biwsbj1fX3N5c3RlbSxyPVo2dm0xZ2xGUzZsY1ZXQlJGcmtUeW83RG55Ti81ZkVV","subType":"0"}}}}}
225:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.053+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"saslStart","remote":"10.122.0.36:39188","client":"conn5"}}
227:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.203+00:00"},"s":"I", "c":"NETWORK", "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"saslContinue"}}
228:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.203+00:00"},"s":"I", "c":"COMMAND", "id":5838802, "ctx":"conn2","msg":"OP_QUERY commmand body","attr":{"command":{"saslContinue":1,"payload":{"$binary":{"base64":"Yz1iaXdzLHI9WjZ2bTFnbEZTNmxjVldCUkZya1R5bzdEbnlOLzVmRVVxMFB6VHQvdXlGVEhFY0lLVU9DSVNJbVZHMFZWejg0LyxwPVRmb2xHNktDc29wVVgzQUpHOExRT2xCK3d4Rk5lNEthMzVaOWg5cXFudEk9","subType":"0"}},"conversationId":1}}}
229:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.204Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"local.$cmd","command":{"saslContinue":1,"payload":{"$binary":{"base64":"Yz1iaXdzLHI9WjZ2bTFnbEZTNmxjVldCUkZya1R5bzdEbnlOLzVmRVVxMFB6VHQvdXlGVEhFY0lLVU9DSVNJbVZHMFZWejg0LyxwPVRmb2xHNktDc29wVVgzQUpHOExRT2xCK3d4Rk5lNEthMzVaOWg5cXFudEk9","subType":"0"}},"conversationId":1}}}
230:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.204Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"saslContinue","remote":"10.122.0.36:40200","client":"conn5"}}
231:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.204+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"local.$cmd","command":{"saslContinue":1,"payload":{"$binary":{"base64":"Yz1iaXdzLHI9WjZ2bTFnbEZTNmxjVldCUkZya1R5bzdEbnlOLzVmRVVxMFB6VHQvdXlGVEhFY0lLVU9DSVNJbVZHMFZWejg0LyxwPVRmb2xHNktDc29wVVgzQUpHOExRT2xCK3d4Rk5lNEthMzVaOWg5cXFudEk9","subType":"0"}},"conversationId":1}}}
232:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.205+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"saslContinue","remote":"10.122.0.36:39188","client":"conn5"}}
234:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.206+00:00"},"s":"I", "c":"NETWORK", "id":5838802, "ctx":"conn2","msg":"DBClientBase::runCommandWithTarget is about to run","attr":{"command":"_isSelf"}}
235:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.207+00:00"},"s":"I", "c":"COMMAND", "id":5838802, "ctx":"conn2","msg":"OP_QUERY commmand body","attr":{"command":{"_isSelf":1}}}
236:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.207Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"admin.$cmd","command":{"_isSelf":1}}}
237:[js_test:minor_version_tags_old_new_old] b20020| {"t":{"$date":"2021-08-04T21:56:02.207Z"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"_isSelf","remote":"10.122.0.36:40200","client":"conn5"}}
238:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.208+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command","attr":{"ns":"admin.$cmd","command":{"_isSelf":1}}}
239:[js_test:minor_version_tags_old_new_old] d20030| {"t":{"$date":"2021-08-04T21:56:02.208+00:00"},"s":"W", "c":"COMMAND", "id":5838801, "ctx":"conn5","msg":"OP_QUERY command received from a client","attr":{"command":"_isSelf","remote":"10.122.0.36:39188","client":"conn5"}}

This process starts with the message "the internal auth is set. isSelf() is authenticating itself" at line 211.

This log messages is added to here:

https://github.com/10gen/mongo/blob/73ec4775c7cc9427936aeb160d9ea9bfac4bf9dc/src/mongo/db/repl/isself.cpp#L247

If we look at the the following code block, isSelf() authenticates itself against itself when internal auth is set before sending "_isSelf" command.

https://github.com/10gen/mongo/blob/73ec4775c7cc9427936aeb160d9ea9bfac4bf9dc/src/mongo/db/repl/isself.cpp#L246-L263

The line 253 leads to DBClientBase::authenticateInternalUser() call which again leads to auth::authenticateInternalClient(). auth::authenticateInternalClient() is given a RunCommandHook which is created by DBClientBase::_makeAuthRunCommandHook(). DBClientBase::_makeAuthRunCommandHook() calls DBClientBase::runCommand() which in turn, calls DBClientBase::runCommandWithTarget() to run an auth command. A log message is added to DBClientBase::runCommandWithTarget(). We can see OpMsgRequest be converted to a request message depending on negotiated rpc protocol (or default rpc protocol) here. If the rpc protocol is kOpQuery, then legacyRequestFromOpMsgRequest() is called. A log message is added to legacyRequestFromOpMsgRequest(). We can see this log message at 221 line in the above log. This means the rpc protocol is kOpQuery for the authentication process.

A similar analysis can be done for saslContinue OP_QUERY command.

The reason why the rpc protocol is kOpQuery here is that isSelf() bypasses rpc protocol negotiation and instead, uses the default rpc protocol.

https://github.com/10gen/mongo/blob/yoonsoo.kim/v5.0-investigate-sasl-op-commands/src/mongo/db/repl/isself.cpp#L232-L237

So, if we change the default rpc protocol to kOpMsgOnly, then saslStart/saslContinue commands will be sent as OP_MSG ones, which we've already done for _isSelf command in SERVER-58338.

In the above log, we can see mongobridge and mongod receive saslStart OP_QUERY command at line 222-225 and saslContinue OP_QUERY command at 229-232.

Comment by Yoon Soo Kim [ 04/Aug/21 ]

Here are changesets for logging:

 

Comment by Yoon Soo Kim [ 04/Aug/21 ]

In multiversion_auth test task, the last-lts is 5.0.0-rc5.

To differentiate 5.0 sending saslStart/saslContinue OP_QUERY commands from 5.1 more easily, added logging to 5.0 too.

Added logging to mongobridge too since multiversion_auth launches mongobridge.

Comment by Yoon Soo Kim [ 04/Aug/21 ]

There are two C++ clients which are used by server internal components.

Going to add some logging to two clients along the way for command execution and also authentication code with code = 5838802. And also remove uassert on no OP_QUERY commands condition so that every OP_QUERY command goes on and completes. Otherwise, one blocked OP_QUERY command may lead to failures in next steps.

I expect that we should not see OP_QUERY saslStart/saslContinue commands for replica_sets_auth/sharding_auth since both clients are not supposed to send OP_QUERY commands except hello/isMaster.

The server side logs OP_QUERY commands with code = 5838801.

Comment by Yoon Soo Kim [ 04/Aug/21 ]

Realized that saslStart/saslContinue commands show up only in *_auth test task such as replica_sets_auth, sharding_auth, or multiversion_auth.

They must be related to authentication process.

Generated at Thu Feb 08 05:46:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.