[SERVER-3702] printShardingStatus interprets database name as regular expression Created: 28/Aug/11  Updated: 11/Jul/16  Resolved: 21/Feb/12

Status: Closed
Project: Core Server
Component/s: Sharding, Shell
Affects Version/s: None
Fix Version/s: 2.1.1

Type: Bug Priority: Minor - P4
Reporter: James Pharaoh Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

When running db.printShardingStatus(), you can get unexpected behaviour if you have database names with characters which are interpreted by regular expressions. This could cause the output to be incorrect or, worse, for the function to not work at all due to an invalid regular expression.

I think in my case I had a database with an asterisk in it or with square brackets and was getting a regexp error in shell/server.js when I ran the function. I'm a little confused as to exactly what the problematic name was now because I've deleted them.

I'm not sure whether these database names are valid or not. They don't seem to show up in "show dbs" but they were certainly present in the "config.databases" collection.

I would suggest that the printShardingStatus() function should be quoting any string it places into a regular expression, regardless of the range of valid values. JavaScript doesn't seem to provide a function to do this, but an example of a function that could be used is here:

http://80.68.89.23/2006/Jan/20/escape/

The function itself is as follows:

RegExp.escape = function(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

Obviously this doesn't have to be stored in the RegExp function itself but it would probably be a good idea for MongoDB to provide something like this.

If it declared as above then the change in shell/server.js is from:

configDB.collections.find( { _id : new RegExp( "^" + db._id + "\." ) } ).sort( { _id : 1 } ).forEach(

To:

configDB.collections.find( { _id : new RegExp( "^" + RegExp.escape(db._id) + "\." ) } ).sort( { _id : 1 } ).forEach(

Or just declaring this function locally:

var regExpEscape = function(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}
configDB.collections.find( { _id : new RegExp( "^" + regExpEscape(db._id) + "\." ) } ).sort( { _id : 1 } ).forEach(

I have a feeling there is another bug here as well, and that these database names are not valid. If this is the case then they shouldn't be ending up in my config.databases collection. I'm not sure how they were created, however, only that they somehow were.



 Comments   
Comment by auto [ 17/Feb/12 ]

Author:

{u'login': u'renctan', u'name': u'Ren', u'email': u'renctan@gmail.com'}

Message: Changes to address code review comments for SERVER-3702
Branch: master
https://github.com/mongodb/mongo/commit/d27ca868154849476c29d2ef2cd99e2ab38f2b76

Comment by auto [ 17/Feb/12 ]

Author:

{u'login': u'renctan', u'name': u'Ren', u'email': u'renctan@gmail.com'}

Message: SERVER-3702 printShardingStatus interprets database name as regular expression
Merged pull request from: https://github.com/mongodb/mongo/pull/142
Added a simple test for the pull request.
Branch: master
https://github.com/mongodb/mongo/commit/46bc489a886d6d330a518ba8aaef05249b4798e3

Comment by Randolph Tan [ 08/Feb/12 ]

Just found out that the server also needs changes since there are also places where a regex string from client is used as is, without being escaped.

Comment by James Pharaoh [ 07/Nov/11 ]

Ok, I've updated my branch and submitted the pull request:

https://github.com/mongodb/mongo/pull/142

Comment by Spencer Brody (Inactive) [ 04/Nov/11 ]

Great, thanks for writing that up! Can you submit a pull request through github?

Comment by James Pharaoh [ 04/Nov/11 ]

Here's a branch with a fix for all the instances I can find:

https://github.com/jamespharaoh/mongo/commit/a6c82661afc02b5f79b6c0c55c05489fa249e33a

I've added an escape function to the RegExp object directly, in utils.js. I've updated all the instances I can find.

Comment by James Pharaoh [ 04/Nov/11 ]

This fixes the extra bug I noted in my comment above but it won't fix the original bug. Please read my original post. You can't just put a string which is supposed to be matched exactly into a regexp string unaltered because special characters will be interpreted in special ways. Worse, some valid database names may be invalid regular expressions and cause an error. This is what happened to me and it stopped me from being able to run this function at all.

It's a bit of a shame that the JavaScript RegExp object doesn't include a static method to quote strings for inclusion in a regular expression. Other implementations do - Pattern.quote in java, Regexp.escape or Regexp.quote in ruby, etc... It would probably make sense for Mongo to have a utility function somewhere which implements this and is shared by all code. Every time a RegExp is created throughout the code should be checked for this bug.

Comment by Spencer Brody (Inactive) [ 01/Nov/11 ]

Can you try downloading the 2.0 nightly build and confirm if Eliot's commit fixes the issue?

Comment by auto [ 24/Oct/11 ]

Author:

{u'login': u'erh', u'name': u'Eliot Horowitz', u'email': u'eliot@10gen.com'}

Message: fix \. in printShardingStatus when looking up collections, part of SERVER-3702
Branch: v2.0
https://github.com/mongodb/mongo/commit/f6c73c58a5ca914837fbb3a449ccc8d17ec8bd71

Comment by auto [ 24/Oct/11 ]

Author:

{u'login': u'erh', u'name': u'Eliot Horowitz', u'email': u'eliot@10gen.com'}

Message: fix \. in printShardingStatus when looking up collections, part of SERVER-3702
Branch: master
https://github.com/mongodb/mongo/commit/99db06077e305885ebad0f9c6b7704aeadb322b9

Comment by James Pharaoh [ 28/Aug/11 ]

Actually, I think there's a further bug here. The "\." in the regular expression should also be double escaped, because "\." == ".". The correct form would be "\ \." (without the space) which is interpreted as a string containing a single backslash followed by a dot, which is in turn interpreted as a regular expression matching a single dot.

This bug will cause database names which include the name of the one being searched for at the start to also be matched which is obviously not the intention.

Generated at Thu Feb 08 03:03:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.