[CXX-890] Add an example that shows how to manage a long-lived instance object Created: 08/Apr/16  Updated: 17/Feb/17  Resolved: 09/Sep/16

Status: Closed
Project: C++ Driver
Component/s: Documentation
Affects Version/s: None
Fix Version/s: 3.0.2

Type: Improvement Priority: Major - P3
Reporter: Andrew Morrow (Inactive) Assignee: Andrew Morrow (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to CXX-1196 Improve mongocxx::instance documentation Closed

 Description   

See the discussion here:

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/mongodb-user/04Eo8DSgAXg/C54uwTr7DgAJ



 Comments   
Comment by Githook User [ 10/Sep/16 ]

Author:

{u'username': u'acmorrow', u'name': u'Andrew Morrow', u'email': u'acm@mongodb.com'}

Message: CXX-890 Not all C++11 type traits are supported before GCC 5
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/ae4f793b7295038018474451132f651804408325

Comment by Githook User [ 09/Sep/16 ]

Author:

{u'username': u'acmorrow', u'name': u'Andrew Morrow', u'email': u'acm@mongodb.com'}

Message: CXX-890 Don't use objects requiring dynamic init/fini for instance::current
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/db943d54894f4a1bfa2f9f840f5150179e59b487

Comment by Andrew Morrow (Inactive) [ 30/Jul/16 ]

https://github.com/mongodb/mongo-cxx-driver/pull/513

Comment by Allan Bazinet [ 06/Jun/16 ]

acm, I think the value of removing the lock speaks for itself.

To me at least; instance::current() isn't of much value; if I'm going to create something, I can always hang on to it. If a library introduces complexity around something that's simple, then I have a hard time deriving value from it.

Personally, given the need to interact with the C level here and thus deal with order of destruction, I think your options are somewhat limited. Were it me, I think I'd go with 'remove the undefined behavior, but don't go to heroics'. That is, what's there now is undefined; that's clearly not good. Attempting a fix from a pure library perspective puts things on somewhat of the hairy edge of the interaction between the language and the runtime, and I'm not sure there's a solid solution there, TBH. 'Tricky' is definitely an apt description.

In our instance, we worked around the issue by implementing a registry of singletons with a defined destruction order. I think that any reasonably-sized C++ solution gets to that kind of thing eventually; you tend to need at least a few singletons, they tend to have interactions, and having things go sideways at termination time isn't ideal, so solutions tend to gravitate in such a direction.

Given that such a registry is a trivial matter, perhaps pointing the consumer in that direction via documentation and examples is a viable option. As an example, the interface for ours is just this here; the implementation is just a priority queue. Lambdas work well for the actual functions.

#include <functional>
 
namespace Cleanup
{
    // Order in which we want things to be destroyed; the higher
    // the priority, the sooner it'll be cleaned.  PID should be
    // the first entry in this list, with LOG immediately after.
    
    enum class Priority
    {
        PID,
        LOG,
        DB
    };
    
    // A cleanup function is anything that can be called with
    // no arguments.
    
    using Function = std::function<void()>;
    
    // Add the provided function as a managed cleanup function,
    // using the requested priority.
    
    void manage(Priority, Function);
    
    // To be called at the end of main(); calls managed cleanup
    // functions in priority order.
    
    void finish();
}

Comment by Andrew Morrow (Inactive) [ 06/Jun/16 ]

alb - I'm back to thinking about this. I'm pretty sure I agree with you. That lock has to go. I think there are a few things to consider as well:

  • Do you think that instance::current() something of use or value? I think I see a way to implement it without a lock, but it is up against the edge of what I'd consider within bounds w.r.t. the standard. Close enough that I'm slightly worried that it is outside.
  • Is the goal, to your way of thinking, that it should be possible for an instance object to be destroyed anytime after main, once we are in static destruction and/or atexit processing? That may be tricky.
Comment by Allan Bazinet [ 12/Apr/16 ]

While I'm typically in the 'singletons are best avoided where possible' camp, I think in this case it makes perfect sense; to do otherwise I would think mean asking the consumer to create a function-scope 'mongocxx::library::initializer' or something like that in main() and forcing dependency injection. Again, while I'm normally a fan of DI, it seems obtuse here for something that's basically got to be a global, no matter how hard you try to dress it up.

Were it me I'd do whatever I could to eliminate that lock in current(); the benefits are many.

And frankly, while any .0 release is going to encounter some issues as customers start to really use it, we're quite impressed with this one.

Comment by Andrew Morrow (Inactive) [ 12/Apr/16 ]

For the record, this is also why we didn't stamp an ABI for the 3.0.0 release - we were pretty certain there were going to be some things we got wrong. Looks like instance::current is probably a bad idea.

Comment by Andrew Morrow (Inactive) [ 12/Apr/16 ]

Right, the DSO initialization order basically groups them, but DSOs aren't something C++ speaks to, at least not yet. I agree we shouldn't rely on it. I also think magic statics can probably solve a lot of this, but I want to go re-read the rules on how their destruction is sequenced. But that discussion really only applies to instance::current.

The bigger question is, assuming you don't use instance::current, how do you manage the lifetime of the instance object. The intention is that it is a singleton. There are initialization and termination functions that we need to call in libmongoc, and we really can't avoid it. I think the ultimate answer of how the instance object should be managed needs to be left up to the application. Only the application knows the scope in which it needs the driver to be useable.

One way might be to hold it by shared_ptr, and allow it to be destroyed after the last reference goes away. I'm not usually a big fan of shared_ptr though, as it obscures and confounds reasoning about ownership.

A function scoped instance using magic statics may in fact be the best way to go.

Comment by Allan Bazinet [ 12/Apr/16 ]

It's actually likely to be problematic when dynamically linked on a non-ELF system, i.e. PE, COFF, XCOFF. Static initialization/destruction order bugs are somewhat papered over by the way ELF implements DSOs, but in reality you're relying on implementation-defined behavior – that is, it works, but not for the reasons you'd expect. Debugging these can be the stuff of story and song.

At the moment, the implementation is sort of a quasi-singleton; it might be the global instance created on-demand, it might be something created for the first time elsewhere.

I think that instance::current() with respect to the global instance, given that the initialization order of statics within a file scope is well-defined, is likely to be portable; the mutex is declared first, so per the spec you should be ok there. However, when the global instance isn't in play and instead it's an instance created elsewhere, we're now crossing translation units and thus in undefined behavior.

FWIW, I'd probably just go with making it a singleton and by doing so strive to eliminate the lock entirely. Given that you're targeting C++11, a function-scope static is thread-safe anyway, i.e., 'magic statics'.

Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ]

Yes, in static linking land, file scoped unique_ptr<instance> is a bad idea for the reasons you point out. That suggests that the current design of instance::current is flawed, at least for static linking.

Comment by Allan Bazinet [ 11/Apr/16 ]

In short, the static destruction order fiasco is the same as the static construction order fiasco; we don't know which order they're going to be run in, so while there's a static instance_mutex, I think having an instance referenced by a static unique_ptr is basically rolling the dice – if you're lucky, it works; if you're unlucky, it'll abort on exit due to the already-destructed mutex being used.

Comment by Allan Bazinet [ 11/Apr/16 ]

My test case is running on OSX, latest clang, static linking.

Make a file-scope std::unique_ptr<mongocxx::instance>. Assign an instance to this. The destructor for this unique_ptr will run when static destructors are called.

When that happens, the instance destructor will attempt to lock the file-scope instance_mutex. If you're lucky, there'll be no issues. If you're unlucky, you'll abort on termination with an exception, libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument.

What I presume to be the case here is that the order that the static destructors are run in is undefined, and the mutex is going first.

Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ]

Sure, in the case where you want to inject a custom logger you do need to manage the lifetime explicitly.

Going back to your comment above, I don't understand why the behavior is undefined when the std::unique_ptr<mongocxx::instance> destructor runs.

Comment by Allan Bazinet [ 11/Apr/16 ]

The relative merits of requiring dynamic linking aside, I guess my confusion lies in knowing what value I derive from having access to a default-constructed instance in light of wanting to use a custom logger.

Comment by Andrew Morrow (Inactive) [ 11/Apr/16 ]

Hi alb - Have a look at the mongocxx::instance::current() method:

https://github.com/mongodb/mongo-cxx-driver/blob/06d29a00e4e554e7930e3f8ab40ebcecc9ab31c8/src/mongocxx/instance.hpp#L59-L65

I think it does basically what you want. The instance object is owned in a file scoped global in instance.cpp, so as long as you are linking to libmongocxx dynamically you shouldn't have any initialization order fiasco to worry about. You also don't need to track the instance object manually.

The examples were all basically written before instance::current was written, so they don't use it.

Comment by Allan Bazinet [ 08/Apr/16 ]

More succinctly:

It would appear from testing that the only safe way to instantiate a mongocxx::instance is within a function scope that will not outlive main(). To do otherwise appears to run into a static destructor / atexit() fiasco.

However, when using a pool, one typically wants the pool to have a global point of reference, and yet be destroyed before the instance. These two requirements appear at present to be somewhat tricky to achieve.

Comment by Allan Bazinet [ 08/Apr/16 ]

Since I'm the instigator here, some more information.

The basic examples are all the same – within a function scope, instantiate an instance. Fine for trivial things, but if all we had were trivial problems, we wouldn't use Mongo. Commonly, the instance is going to be long-lived, there'll probably be a pool involved, and going about this naively results in odd behavior.

For example, nothing prohibits an instance from being heap-allocated. We likely want to hand it a logger instance, since we want to know of issues via our preexisting logging interface. We want a long-lived object here, so heap allocation makes sense. We're using C++11, in which use of raw pointers are considered to be just so five minutes ago, so it'd be reasonable to wrap that in a std::unique_ptr. After all, the documentation doesn't say anything about lifetime management, only that there must be one around.

Thus, one approach is to have a static std::unique_ptr<mongocxx::instance> present, and at an appropriate time, e.g., once the connection string has been read from whatever's being used for configuration, a reset() is done with a new instance.

This is well and good until the program exits, the std::unique_ptr destructor runs, and behavior is undefined; this will typically exhibit itself via an error message indicating that a mutex lock argument is invalid on exit. One can dig for a while and discover why, but it'd be nice to indicate that in general, it's probably best that the instance is at function scope in main(), and that it not be heap allocated unless the lifetime of the heap object is managed such that it is destroyed before the C++ runtime calls static destructors and atexit() routines.

Generated at Wed Feb 07 22:00:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.