[SERVER-26431] Validate the collection name for commands that read documents from a collection Created: 03/Oct/16  Updated: 08/Jan/24  Resolved: 02/Aug/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Backlog - Query Team (Inactive)
Resolution: Duplicate Votes: 0
Labels: query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-12817 Read only commands should validate th... Backlog
Related
is related to JAVA-1644 NPE on createIndex when the database ... Closed
is related to DRIVERS-2092 Drivers Spec : Specify object, colle... Closed
Assigned Teams:
Query
Participants:

 Description   

When the server receives a command sent to a database with an illegal name (e.g., one that contains a '.' character), it returns an OP_REPLY with zero documents.

For example, a createIndexes command sent to the 'dbname-v1.1' database:

query dbname-v1.1.$cmd query: { createIndexes: "whatever", indexes: [ { key: { aField: 1 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] }

returns a reply with zero documents.

Expected results: reply with a document indicating command failure.



 Comments   
Comment by David Storch [ 02/Aug/19 ]

It appears that we have an older improvement request which suggests that read commands should validate the collection name. Therefore, I'm closing this ticket as a duplicate of SERVER-12817.

Comment by David Storch [ 26/Jul/19 ]

The $ character is most certainly not legal in a collection name, which is enforced on code paths that create collections:

https://github.com/mongodb/mongo/blob/5090e4efc24b88d28fa83d30457e1d097f2fc273/src/mongo/db/namespace_string.h#L527-L545

Therefore, I believe this ticket tracks an improvement request to validate the collection name for OP_QUERY reads and read commands (find, aggregate, etc.). This change would cause the server to return an error rather than an empty result set for this example. Updating the title of the ticket and the issue type accordingly.

Comment by Matt Cotter [ 12/Oct/16 ]

So digging into this more, the db for the namespace string is definitely separated by the first period, so the db name here is "dbname-v1" and the collection is "1.$cmd"

inline StringData NamespaceString::db() const {
    return _dotIndex == std::string::npos ? StringData() : StringData(_ns.c_str(), _dotIndex);
}
 
inline StringData NamespaceString::coll() const {
    return _dotIndex == std::string::npos ? StringData() : StringData(_ns.c_str() + _dotIndex + 1,
                                                                      _ns.size() - 1 - _dotIndex);
}

So the db name is totally valid. The collection would fail the validCollectionName check (because it has a '$'), but it turns out we don't validate the collection name in isValid check for namespaces:

    /**
     * @return true if the namespace is valid. Special namespaces for internal use are considered as
     * valid.
     */
    bool isValid() const {
        return validDBName(db(), DollarInDbNameBehavior::Allow) && !coll().empty();
    }

sketchy. Maybe we should validate the collection there... o.o looks like the empty() check was the easy way of allowing "Special namespaces", but "1.$cmd" should fail.

But that might not matter anyway – I did a quick read of instance.cpp's assembleResponse, and it seems not to call isValid() at all on the path that this code is executing. I think we should fix that (uassert isValid?), and then it would be more clear what's going wrong.

On the other hand, if '$' was a valid collection name character though, this is totally "closed works as designed", because this is definitely a query and not a command.

Comment by Matt Cotter [ 12/Oct/16 ]

david.storch here's the reply:
240000002a00000008000000010000000800000000000000000000000000000000000000
36B message, request id is 42, response to message 8, op code 1, the only flag set is ResultFlag_AwaitCapable, cursor id 0, skip 0, number returned 0.

Comment by David Storch [ 12/Oct/16 ]

matt.cotter, gotcha, that makes sense. My bad. It's definitely running as a query rather than as a command based on the log output. Assuming the shell is sending a proper OP_QUERY command but with an invalid database name, perhaps the absence of a valid database name is causing us to fall into the query path? Can you tell what the OP_REPLY received by the shell looks like?

Comment by Matt Cotter [ 12/Oct/16 ]

My point with the original shell code is that, using legacy find, that and my second code example would be the exact same – i.e. I think what's happening is instead of a command being run, it's just a query on the collection "1.$cmd".

Comment by Matt Cotter [ 12/Oct/16 ]

charlie.swanson, david.storch that code I put was just to illustrate, sorry for confusion.
Does this make it more clear:

db.getMongo().find('dbname-v1.1.$cmd', { createIndexes: "whatever", indexes: [ { key: { aField: 1 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] }, false, 1, 0, 100, 0)

2016-10-12T16:39:59.239-0400 D QUERY    [conn3] Running query: query: { createIndexes: "whatever", indexes: [ { key: { aField: 1.0 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] } sort: {} projection: {} ntoreturn=1
2016-10-12T16:39:59.239-0400 D QUERY    [conn3] Collection dbname-v1.1.$cmd does not exist. Using EOF plan: query: { createIndexes: "whatever", indexes: [ { key: { aField: 1.0 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] } sort: {} projection: {} ntoreturn=1

Comment by Jeffrey Yemin [ 12/Oct/16 ]

I'm not sure how to reproduce this in the shell because it doesn't let you 'use' an invalid database name:

MongoDB Enterprise > use dbname-v1.1
2016-10-12T16:41:12.129-0400 E QUERY    [thread1] Error: [dbname-v1.1] is not a valid database name :
Mongo.prototype.getDB@src/mongo/shell/mongo.js:56:12
shellHelper.use@src/mongo/shell/utils.js:631:10
shellHelper@src/mongo/shell/utils.js:618:15
@(shellhelp2):1:1

I can confirm that the Java driver is using OP_QUERY to send the command, not OP_COMMAND, which is what I believe the 3.4 shell would be using.

Comment by David Storch [ 12/Oct/16 ]

charlie.swanson, as far as I understand this problem does not have to do with the legacy query path or the find command. Rather, it is a problem with our generic handling of the OP_QUERY command protocol. Note that the repro uses a createIndexes command, not a query. matt.cotter, your script above is using the find command, as opposed to issuing a createIndexes command.

Comment by Charlie Swanson [ 12/Oct/16 ]

matt.cotter I only vaguely understand this stuff, but it might be that your shell is using the find command instead of the legacy OP_QUERY, which will execute under a different path. I think you can do something like ./mongo --shellReadMode=legacy --shellWriteMode=legacy to get a runCommand to turn into an OP_QUERY on the db.$cmd namespace.

david.storch is that right?

Comment by Matt Cotter [ 12/Oct/16 ]

I'm trying to reproduce this, and I can't see how this is possible.
Is this not just a query on the collection "1.$cmd"? (which is an invalid collection, so maybe it should fail)

jeff.yemin do you have a script that reproduces this? Perhaps in the shell?
Here's what I think is happening:

use dbname-v1
db.getCollection("1.$cmd").find({ createIndexes: "whatever", indexes: [ { key: { aField: 1 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] })
// no results, it's a non-existent collection. See log line:
// 2016-10-12T16:19:01.370-0400 D QUERY    [conn2] Collection dbname-v1.1.$cmd does not exist. Using EOF plan: query: { createIndexes: "whatever", indexes: [ { key: { aField: 1.0 }, name: "aField_1", ns: "dbname-v1.1.whatever" } ] } sort: {} projection: {}

Generated at Thu Feb 08 04:12:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.