[CXX-890] Add an example that shows how to manage a long-lived instance object Created: 08/Apr/16 Updated: 17/Feb/17 Resolved: 09/Sep/16 |
|
| Status: | Closed |
| Project: | C++ Driver |
| Component/s: | Documentation |
| Affects Version/s: | None |
| Fix Version/s: | 3.0.2 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Andrew Morrow (Inactive) | Assignee: | Andrew Morrow (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Description |
|
See the discussion here: |
| Comments |
| Comment by Githook User [ 10/Sep/16 ] | ||||||||||||||||||||||||||||||
|
Author: {u'username': u'acmorrow', u'name': u'Andrew Morrow', u'email': u'acm@mongodb.com'}Message: | ||||||||||||||||||||||||||||||
| Comment by Githook User [ 09/Sep/16 ] | ||||||||||||||||||||||||||||||
|
Author: {u'username': u'acmorrow', u'name': u'Andrew Morrow', u'email': u'acm@mongodb.com'}Message: | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 30/Jul/16 ] | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 06/Jun/16 ] | ||||||||||||||||||||||||||||||
|
acm, I think the value of removing the lock speaks for itself. To me at least; instance::current() isn't of much value; if I'm going to create something, I can always hang on to it. If a library introduces complexity around something that's simple, then I have a hard time deriving value from it. Personally, given the need to interact with the C level here and thus deal with order of destruction, I think your options are somewhat limited. Were it me, I think I'd go with 'remove the undefined behavior, but don't go to heroics'. That is, what's there now is undefined; that's clearly not good. Attempting a fix from a pure library perspective puts things on somewhat of the hairy edge of the interaction between the language and the runtime, and I'm not sure there's a solid solution there, TBH. 'Tricky' is definitely an apt description. In our instance, we worked around the issue by implementing a registry of singletons with a defined destruction order. I think that any reasonably-sized C++ solution gets to that kind of thing eventually; you tend to need at least a few singletons, they tend to have interactions, and having things go sideways at termination time isn't ideal, so solutions tend to gravitate in such a direction. Given that such a registry is a trivial matter, perhaps pointing the consumer in that direction via documentation and examples is a viable option. As an example, the interface for ours is just this here; the implementation is just a priority queue. Lambdas work well for the actual functions.
| ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 06/Jun/16 ] | ||||||||||||||||||||||||||||||
|
alb - I'm back to thinking about this. I'm pretty sure I agree with you. That lock has to go. I think there are a few things to consider as well:
| ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 12/Apr/16 ] | ||||||||||||||||||||||||||||||
|
While I'm typically in the 'singletons are best avoided where possible' camp, I think in this case it makes perfect sense; to do otherwise I would think mean asking the consumer to create a function-scope 'mongocxx::library::initializer' or something like that in main() and forcing dependency injection. Again, while I'm normally a fan of DI, it seems obtuse here for something that's basically got to be a global, no matter how hard you try to dress it up. Were it me I'd do whatever I could to eliminate that lock in current(); the benefits are many. And frankly, while any .0 release is going to encounter some issues as customers start to really use it, we're quite impressed with this one. | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 12/Apr/16 ] | ||||||||||||||||||||||||||||||
|
For the record, this is also why we didn't stamp an ABI for the 3.0.0 release - we were pretty certain there were going to be some things we got wrong. Looks like instance::current is probably a bad idea. | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 12/Apr/16 ] | ||||||||||||||||||||||||||||||
|
Right, the DSO initialization order basically groups them, but DSOs aren't something C++ speaks to, at least not yet. I agree we shouldn't rely on it. I also think magic statics can probably solve a lot of this, but I want to go re-read the rules on how their destruction is sequenced. But that discussion really only applies to instance::current. The bigger question is, assuming you don't use instance::current, how do you manage the lifetime of the instance object. The intention is that it is a singleton. There are initialization and termination functions that we need to call in libmongoc, and we really can't avoid it. I think the ultimate answer of how the instance object should be managed needs to be left up to the application. Only the application knows the scope in which it needs the driver to be useable. One way might be to hold it by shared_ptr, and allow it to be destroyed after the last reference goes away. I'm not usually a big fan of shared_ptr though, as it obscures and confounds reasoning about ownership. A function scoped instance using magic statics may in fact be the best way to go. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 12/Apr/16 ] | ||||||||||||||||||||||||||||||
|
It's actually likely to be problematic when dynamically linked on a non-ELF system, i.e. PE, COFF, XCOFF. Static initialization/destruction order bugs are somewhat papered over by the way ELF implements DSOs, but in reality you're relying on implementation-defined behavior – that is, it works, but not for the reasons you'd expect. Debugging these can be the stuff of story and song. At the moment, the implementation is sort of a quasi-singleton; it might be the global instance created on-demand, it might be something created for the first time elsewhere. I think that instance::current() with respect to the global instance, given that the initialization order of statics within a file scope is well-defined, is likely to be portable; the mutex is declared first, so per the spec you should be ok there. However, when the global instance isn't in play and instead it's an instance created elsewhere, we're now crossing translation units and thus in undefined behavior. FWIW, I'd probably just go with making it a singleton and by doing so strive to eliminate the lock entirely. Given that you're targeting C++11, a function-scope static is thread-safe anyway, i.e., 'magic statics'. | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
Yes, in static linking land, file scoped unique_ptr<instance> is a bad idea for the reasons you point out. That suggests that the current design of instance::current is flawed, at least for static linking. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
In short, the static destruction order fiasco is the same as the static construction order fiasco; we don't know which order they're going to be run in, so while there's a static instance_mutex, I think having an instance referenced by a static unique_ptr is basically rolling the dice – if you're lucky, it works; if you're unlucky, it'll abort on exit due to the already-destructed mutex being used. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
My test case is running on OSX, latest clang, static linking. Make a file-scope std::unique_ptr<mongocxx::instance>. Assign an instance to this. The destructor for this unique_ptr will run when static destructors are called. When that happens, the instance destructor will attempt to lock the file-scope instance_mutex. If you're lucky, there'll be no issues. If you're unlucky, you'll abort on termination with an exception, libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument. What I presume to be the case here is that the order that the static destructors are run in is undefined, and the mutex is going first. | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
Sure, in the case where you want to inject a custom logger you do need to manage the lifetime explicitly. Going back to your comment above, I don't understand why the behavior is undefined when the std::unique_ptr<mongocxx::instance> destructor runs. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
The relative merits of requiring dynamic linking aside, I guess my confusion lies in knowing what value I derive from having access to a default-constructed instance in light of wanting to use a custom logger. | ||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ] | ||||||||||||||||||||||||||||||
|
Hi alb - Have a look at the mongocxx::instance::current() method: I think it does basically what you want. The instance object is owned in a file scoped global in instance.cpp, so as long as you are linking to libmongocxx dynamically you shouldn't have any initialization order fiasco to worry about. You also don't need to track the instance object manually. The examples were all basically written before instance::current was written, so they don't use it. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 08/Apr/16 ] | ||||||||||||||||||||||||||||||
|
More succinctly: It would appear from testing that the only safe way to instantiate a mongocxx::instance is within a function scope that will not outlive main(). To do otherwise appears to run into a static destructor / atexit() fiasco. However, when using a pool, one typically wants the pool to have a global point of reference, and yet be destroyed before the instance. These two requirements appear at present to be somewhat tricky to achieve. | ||||||||||||||||||||||||||||||
| Comment by Allan Bazinet [ 08/Apr/16 ] | ||||||||||||||||||||||||||||||
|
Since I'm the instigator here, some more information. The basic examples are all the same – within a function scope, instantiate an instance. Fine for trivial things, but if all we had were trivial problems, we wouldn't use Mongo. Commonly, the instance is going to be long-lived, there'll probably be a pool involved, and going about this naively results in odd behavior. For example, nothing prohibits an instance from being heap-allocated. We likely want to hand it a logger instance, since we want to know of issues via our preexisting logging interface. We want a long-lived object here, so heap allocation makes sense. We're using C++11, in which use of raw pointers are considered to be just so five minutes ago, so it'd be reasonable to wrap that in a std::unique_ptr. After all, the documentation doesn't say anything about lifetime management, only that there must be one around. Thus, one approach is to have a static std::unique_ptr<mongocxx::instance> present, and at an appropriate time, e.g., once the connection string has been read from whatever's being used for configuration, a reset() is done with a new instance. This is well and good until the program exits, the std::unique_ptr destructor runs, and behavior is undefined; this will typically exhibit itself via an error message indicating that a mutex lock argument is invalid on exit. One can dig for a while and discover why, but it'd be nice to indicate that in general, it's probably best that the instance is at function scope in main(), and that it not be heap allocated unless the lifetime of the heap object is managed such that it is destroyed before the C++ runtime calls static destructors and atexit() routines. |