[CXX-1187] mongocxx::uri::database() segfaults if no database specified in URL Created: 10/Jan/17  Updated: 16/Jun/17  Resolved: 22/Feb/17

Status: Closed
Project: C++ Driver
Component/s: Implementation
Affects Version/s: None
Fix Version/s: 3.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Adam Seering Assignee: J Rassi
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backported by CXX-1191 Backport mongocxx::uri::database() se... Closed
Related
related to CXX-1066 Audit all libmongoc and libbson calls... Closed
is related to CXX-1219 Improve "Testing the mongocxx driver"... Closed
is related to CDRIVER-2051 Documentation for several mongoc_uri_... Closed

 Description   

Reproducer:

mongocxx::uri uri{mongocxx::uri::k_default_uri};
std::cout << uri.database() << std::endl;

Expected:

  • Should print out the empty string. Or, well, there's some room for debase here. It would also be ok if it asserted, if there were some sort of "has_database()" method so I can check whether a database is present.

Actual:

  • On my Mac (OS X Sierra), crashes with a segmentation fault. On my Linux box (Ubuntu 16.04), dumps some scary-looking memory debug information and exits.

Analysis:

The root of the problem appears to be in the deceptively-simple-looking "mongocxx::uri::database()":

std::string uri::database() const {
    return libmongoc::uri_get_database(_impl->uri_t);
}

libmongoc::uri_get_database() returns a char*. Because we are returning a std::string, C++ performs an implicit cast.

If no database was specified in the URI, uri_get_database() returns NULL, not (for example) "". I think this is an entirely reasonable C API. However, the implicit std::string(const char*) constructor requires/assumes that the char* point to a valid string. When passed a NULL pointer, it dereferences that pointer and causes a segfault.

A trivial solution would be to add a helper that handles NULL somehow. For example:

std::string safe_string(const char* ch)
{
    if (!ch) { return ""; }
    return std::string(ch);
}
std::string uri::database() const {
    return safe_string(libmongoc::uri_get_database(_impl->uri_t));
}

But there are a bunch of design decisions here. Also, just glancing at the code in this file, I suspect that some other accessors on this class may be affected by the same issue.



 Comments   
Comment by Githook User [ 16/Jun/17 ]

Author:

{u'username': u'jrassi', u'name': u'J. Rassi', u'email': u'rassi@10gen.com'}

Message: CXX-1187 Minor style/lint fixes to uri.cpp and test/uri.cpp
Branch: bugfix/3.1.x
https://github.com/mongodb/mongo-cxx-driver/commit/3ba0d6ffee2d15574402684e93fea5434fdddbe0

Comment by Githook User [ 16/Jun/17 ]

Author:

{u'name': u'Adam Seering', u'email': u'aseering@nutonian.com'}

Message: CXX-1187 Handle null return values from libmongoc for uri's

Closes #581
Branch: bugfix/3.1.x
https://github.com/mongodb/mongo-cxx-driver/commit/bff8804152a040f9b00de957acf806ae96157641

Comment by Githook User [ 22/Feb/17 ]

Author:

{u'username': u'jrassi', u'name': u'J. Rassi', u'email': u'rassi@10gen.com'}

Message: CXX-1187 Minor style/lint fixes to uri.cpp and test/uri.cpp
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/6814a5b3e445173f2648349dfc05791079a2afd0

Comment by Githook User [ 22/Feb/17 ]

Author:

{u'name': u'Adam Seering', u'email': u'aseering@nutonian.com'}

Message: CXX-1187 Handle null return values from libmongoc for uri's

Closes #581
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/c045cea1e66b3bebcb96c838efdf99dcb40c6086

Comment by J Rassi [ 14/Feb/17 ]

I've filed bug ticket CDRIVER-2051 to address the incorrect libmongoc documentation. From code inspection and my past experience with the C driver I feel fairly confident in my guess of which functions are guaranteed to return non-null, so I'd recommend we proceed here without waiting for CDRIVER-2051 to be resolved. If it turns out that CDRIVER-2051 identifies a different subset than our guess, we can open a follow-up ticket to address the discrepancies.

Comment by David Golden [ 09/Feb/17 ]

PR posted here: https://github.com/mongodb/mongo-cxx-driver/pull/581

Thank you!

Comment by Adam Seering [ 09/Feb/17 ]

Ah – I appear to have libmongoc 1.5.2. I've worked around the IPv6 problem for now, but will try updating at some point; hopefully that will fix things.

I'm now able to run my tests for the ::database() issue. They need a little tweaking but they mostly pass. I'll try to post a PR soon.

Comment by David Golden [ 09/Feb/17 ]

What version of the libmongoc? This sounds like CDRIVER-1988 (fixed in libmongoc 1.5.3).

Comment by J Rassi [ 09/Feb/17 ]

Ah, good catch! I failed to notice the "IP6" in the tcpdump output you posted.

To get the server to listen for IPv6 connections, I believe you have to use the --ipv6 command-line option (or can set this in your config file with {net: {ipv6: true}})? If that doesn't work, removing the "bindIp" directive entirely, in addition.

Comment by Adam Seering [ 09/Feb/17 ]

<looks at the output again> Ah, I see what's going on: "localhost" does not uniquely identify an interface on modern computers. The server is listening exclusively on 127.0.0.1:27017, as per the default MongoDB config file for Mac. The test is exclusively attempting to connect to [::1]:27017. So it's an IPv4 vs IPv6 issue.

I tried updating the mongod.conf with:

net:
  bindIp: [127.0.0.1, ::1]

But this doesn't seem to have caused the server to listen on IPv6. Any thoughts? Is it possible to point the tests at 127.0.0.1 explicitly?

Comment by J Rassi [ 09/Feb/17 ]

The tcpdump output shows that test_driver is indeed sending a SYN to localhost:27017 and getting back an RST, which is unexpected if there is indeed a server listening on localhost:27017. Can you confirm that running tcpdump with the mongo shell instead shows the connection get established (Flags [S] incoming, followed by Flags [S.] outgoing, followed by Flags [.] incoming)?

If so, the problem is almost certainly with some local firewall settings, as your operating system decides at the network layer which packets get through and which don't. Can you try temporarily disabling your Mac OS firewall?

Comment by Adam Seering [ 09/Feb/17 ]

Sure:

$ sudo tcpdump -i lo0 port 27017
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
11:53:43.009944 IP6 localhost.53398 > localhost.27017: Flags [S], seq 964631166, win 65535, options [mss 16324,nop,wscale 5,nop,nop,TS val 1433344897 ecr 0,sackOK,eol], length 0
11:53:43.009964 IP6 localhost.27017 > localhost.53398: Flags [R.], seq 0, ack 964631167, win 0, length 0

$ src/mongocxx/test/test_driver "CRUD functionality"
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_driver is a Catch v1.0 b53 host application.
Run with -? for options
 
-------------------------------------------------------------------------------
CRUD functionality
-------------------------------------------------------------------------------
/path/to/mongo-cxx-driver/src/mongocxx/test/collection.cpp:84
...............................................................................
 
/path/to/mongo-cxx-driver/src/mongocxx/test/collection.cpp:84: FAILED:
due to unexpected exception with message:
  No suitable servers found (`serverSelectionTryOnce` set): [connection refused
  calling ismaster on 'localhost:27017']: generic server error
 
===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

Comment by J Rassi [ 09/Feb/17 ]

Hmm. It's somewhat strange that the "mongo" binary can connect but "test_driver" can't; I wonder if there's a misbehaving OS X firewall involved. In one terminal, can you run tcpdump as follows:

sudo tcpdump -i lo0 port 27017

Then, in another terminal, can you run the "CRUD functionality" test as follows (from your "out" directory):

src/mongocxx/test/test_driver "CRUD functionality"

Please provide the output of running both commands. Thanks.

Comment by Adam Seering [ 09/Feb/17 ]

Ah, yep, the problem was an issue with the library search path. I'm on a Mac; I had to set DYLD_FALLBACK_LIBRARY_PATH to find both libbson and libmongo_c.

I'm now seeing a second problem. Tests are running, but a pile of them are failing with messages that look, for example, like:

2: -------------------------------------------------------------------------------
2: CRUD functionality
2: -------------------------------------------------------------------------------
2: /path/to/mongo-cxx-driver/src/mongocxx/test/collection.cpp:84
2: ...............................................................................
2: 
2: /path/to/mongo-cxx-driver/src/mongocxx/test/collection.cpp:84: FAILED:
2: due to unexpected exception with message:
2:   No suitable servers found (`serverSelectionTryOnce` set): [connection refused
2:   calling ismaster on 'localhost:27017']: generic server error
2: 
2: -------------------------------------------------------------------------------

I do have a database running locally, and am able to connect to it just fine by running the command-line client as "mongo --host localhost --port 27017". When I log in using the command-line client, I see a line in the server log logging the connection; when I run the test suite, I don't see any new log messages. Any thoughts?

Comment by J Rassi [ 08/Feb/17 ]

We do have testing docs here: http://mongodb.github.io/mongo-cxx-driver/contributing/testing-mongocxx/, but they're rather incomplete. I just filed linked ticket CXX-1219 to improve that page (I'll also update the ticket with whatever relevant information comes out of the issue you're running into).

"make test" is indeed the canonical way to run all driver tests, so you're doing the right thing. The only other required step for setup is that you must bring up a MongoDB deployment on localhost:27017 for the integration tests targets to pass (test_driver, test_specs). Any supported server version (2.4+) or server configuration will do; during development we'll often just run the latest stable version (currently 3.4.2) locally in standalone mode.

I've most often all tests fail with OTHER_FAULT when the test binaries can't start due to a missing required shared library file (e.g. if the tests are linking dynamically against libbson, but the user has installed libbson to a directory not in the ld.so search path), though I'd have to see the verbose output before making a conclusion. Can you provide the output of running "ctest -VV", in the same directory you're running "make test"? ctest will also print out the full paths of the test executables (test_bson, etc), which you can also run individually.

Comment by Adam Seering [ 08/Feb/17 ]

@rassi : sorry for the radio silence; I was pulled off to other things for a while and just got back to this.

I have a tentative patch, but I'm wondering if you have any docs on how to run the mongocxx test suite? If I simply run "make test", I get:

Test project /path/to/mongo-cxx-driver/out
    Start 1: bson
1/4 Test #1: bson .............................***Exception: Other  0.00 sec
    Start 2: driver
2/4 Test #2: driver ...........................***Exception: Other  0.00 sec
    Start 3: instance
3/4 Test #3: instance .........................***Exception: Other  0.00 sec
    Start 4: specs
4/4 Test #4: specs ............................***Exception: Other  0.00 sec
 
0% tests passed, 4 tests failed out of 4
 
Total Test time (real) =   0.02 sec
 
The following tests FAILED:
	  1 - bson (OTHER_FAULT)
	  2 - driver (OTHER_FAULT)
	  3 - instance (OTHER_FAULT)
	  4 - specs (OTHER_FAULT)

Comment by J Rassi [ 18/Jan/17 ]

adam.seering: just letting you know we kicked this ticket out to the following sprint (which starts Feb. 10) during our sprint planning meeting today. So, we'd happily accept a pull request for this if you submit one before Feb. 10; if you don't, no worries, we'll just assign this internally on that date.

Comment by J Rassi [ 13/Jan/17 ]

Oh, we'd happily accept a pull request for this issue if you'd like to submit one. Above, I just meant that someone should think about whether any other broken checks on return values from libmongoc::uri_* methods should change the way we handle this one, but that person could be you if you'd like! If you do submit a pull request, please fix the libmongoc error handling for this whole file, and add appropriate test coverage to src/mongocxx/test/uri.cpp.

We'll probably want to merge a fix for this sometime in the next few weeks, so we'll just post a comment here that we're starting work on this if you don't happen to get around to it by the time it makes it to the front of our queue.

Comment by Adam Seering [ 12/Jan/17 ]

Thanks for the quick triage, and for the update on your plan! It sounds like things are scheduled? Let me know if there's anything I can do to help. I'd be glad to submit a PR if you'd like, though it might be faster for you to just do it than for you to describe your design decisions to me and review my code

For what it's worth, I've worked around this for now by parsing my URL using the C API. Clunky, but it works.

Comment by J Rassi [ 12/Jan/17 ]

Per recent changes to our versioning policy, I've updated this ticket to target 3.2.0-rc0, and created CXX-1191 to track the fix that will be applied to 3.1.2.

Comment by J Rassi [ 10/Jan/17 ]

Thanks for the report, adam.seering! I've triaged this into 3.1.2. Your suggestion of returning the empty string in this case matches my intuition, but we'll decide for certain after taking a look at other defective methods in this class.

Note that that we've scheduled a full audit of the libmongoc/libbson integration in CXX-1066, since we've seen similar bugs crop up in the past.

Relatedly, it seems like the documentation for mongoc_uri_t could use improvement. The eventual assignee of this ticket should make sure to file tickets in CDRIVER for any mongoc_uri_t API methods that are incorrectly documented to return non-null.

Comment by Adam Seering [ 10/Jan/17 ]

Hm, I guess I didn't format my code examples properly... I don't know how to edit this issue description? Let me know if the examples are unclear; I can re-post them.

Generated at Wed Feb 07 22:01:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.