[CDRIVER-2863] Consider API to allow program to optionally provide custom threading intrinsics Created: 26/Oct/18  Updated: 29/Oct/18  Resolved: 29/Oct/18

Status: Closed
Project: C Driver
Component/s: libbson
Affects Version/s: 1.13.0
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Mitchell Blank Assignee: A. Jesse Jiryu Davis
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible

 Description   

src/common/common-thread-private.h includes a bunch of macros that abstract between using the windows or pthreads threading primitives. However, in some environments you need finer control over this.

For example, we have our own code that tracks all of the threads that are running and enables some profiling. We also have our own mutex/cvar wrappers that also have extended functionality available. However now our accounting for the running threads is inaccurate because _mongoc_topology_run_background() is creating pthreads behind our back.

OpenSSL is a good citizen in this regard – it provides APIs like CRYPTO_set_dynlock_*_callback() which let us register callbacks for creating/locking/unlocking/destroying mutex objects, so we can make it use the same extended primitives that all of our other code does.

mongoc is a more complicated case since its internal threading needs go beyond simple mutexes, but it would be really helpful if the common-thread-private.h stuff was better abstracted behind a callback interface (like a struct filled with function pointers, with objects like mutexes being only visible as anonymous struct pointers) and an API was provided so that before the library was used we could provide our own implementation. Of course, if our own wasn't provided it would just use the same win32/pthread implementations.



 Comments   
Comment by A. Jesse Jiryu Davis [ 29/Oct/18 ]

Thanks for the info! I understand your idea but it isn't going to be a priority for us. I suggest you make the changes you need in your copy of the C Driver code, I hope it'll be reasonably isolated and it won't be difficult for you to maintain your patch. Thanks for understanding.

Comment by Mitchell Blank [ 26/Oct/18 ]

> Can you tell me more about your profiling code?

Well, I don't want to go too deep into the details on this because the answer is "various stuff" but off the top of my head:
1. We have introspection tools that allow reporting on the number of threads in use, and what their "names" are, and on some platforms can even force a backtrace if you want
2. We sometimes build against google-profiler which wants us to call ProfilerRegisterThread() when each thread runs, which our threading abstraction takes care of for us
3. In the past we've used custom memory allocators that required us to manually flush per-thread caches when finishing the thread
4. We customize the stack size of each thread on a per-OS basis
5. On windows we use CRT's _beginthreadex instead of the lower-level CreateThread

However, what I'm trying to get across is not that we have specific needs for how threads are set up, but that in general large projects often have similar constraints. We might have others in the future. It's something of an anti-pattern to create background threads from inside a library because it complicates so many things. For example, if someone manually uses your library via dlopen()/dlcose() they need to make sure all of its internal threads are shut down first... and of course fork() gets complicated too. However, if you must create your own background thread it would be best if it were at least possible for the calling program to be able to keep some control over the process.

For just the case of thread spawn/join the plugin interface doesn't need to be particularly complicated. Something like:

struct bson_thread_pointer;   /* opaque type */
 
/* Start a new thread running func(arg), returns NULL if unable to create a new thread */
typedef struct bson_thread_pointer *(*bson_thread_spawn_callback_t)(void (*func)(void *arg), void *arg);
/* Joins a thread created by the above callback, and frees the "thr" pointer */
typedef void (*bson_thread_join_callback_t)(struct bson_thread_pointer *thr);
 
struct bson_thread_spawnjoin_callbacks {
  bson_thread_spawn_callback_t spawn_callback;
  bson_thread_join_callback_t join_callback;
};
 
extern void bson_set_thread_spawn_callbacks(const struct bson_thread_spawnjoin_callbacks *callbacks);

(This is just a sketch; I'm sure you can come up with API names that better match your conventions)

Then the default callbacks just do something like this (untested):

struct bson_thread_pointer {
#if defined(BSON_OS_UNIX)
  pthread_t pt;
#else
  HANDLE h;
#endif
  void (*func)(void *arg);
  void *arg;
};
 
/* Little wrapper functions to deal with the fact that pthreads and winthreads want a different function prototype */
#if defined(BSON_OS_UNIX)
static void *bson_pthread_main(void *thr)
{
    thr->func(thr->arg);
    return NULL;
}
#else
static DWORD __stdcall bson_winthread_main(void *thr)
{
    thr->func(thr->arg);
    return 0;
}
#endif
 
static bool used_default_callbacks_already = false;
 
static struct bson_thread_pointer *default_thread_spawn_callback(void (*func)(void *arg), void *arg)
{
    struct bson_thread_pointer *thr = bson_malloc(sizeof(*thr));
    thr->func = func;
    thr->arg = arg;
#if defined(BSON_OS_UNIX)
    if (pthread_create(&thr->pt, NULL, bson_pthread_main, thr) != 0) {
        bson_free(thr);
        return NULL;
    }
#else
    thr->h = CreateThread(NULL, 0, bson_winthread_main, thr, 0, NULL);
    if (thr->h == NULL) {
        bson_free(thr);
        return NULL;
    }
#endif
    used_default_callbacks_already = true;
    return thr;
}
 
static void default_thread_join_callback(struct bson_thread_pointer *thr)
{
#if defined(BSON_OS_UNIX)
    (void) pthread_join(thr->pt, NULL);
#else
    (void) WaitForSingleObject(thr->h, INFINITE);
#endif
    bson_free(thr);
}
 
static const struct bson_thread_spawnjoin_callbacks default_spawnjoin_callbacks = {
    default_thread_spawn_callback,
    default_thread_join_callback
};
 
static const struct bson_thread_spawnjoin_callbacks *active_spawnjoin_callbacks = &default_spawnjoin_callbacks;
 
void bson_set_thread_spawn_callbacks(const struct bson_thread_spawnjoin_callbacks *callbacks)
{
    assert(!used_default_callbacks_already);  /* ...or it's too late to change! */
    active_spawnjoin_callbacks = callbacks;
}
 
struct bson_thread_pointer *bson_thread_create(void (*func)(void *arg), void *arg)
{
    return (active_spawnjoin_callbacks->spawn_callback)(func, arg);
}
 
void bson_thread_join(struct bson_thread_pointer *thr)
{
    (active_spawnjoin_callbacks->join_callback)(thr);
}

...that would not only reduce the amount of OS-specific stuff you have in common-thread-private.h but would also give users who want control over the thread spawning an easy interface that they can plug into to override the default.

I gave the example of OpenSSL as a "best practice" since it even lets the library user provide their own thread-id and mutex implementations. This isn't as major of an issue for me: we do have our own mutex/cvar classes that have some extra debugging option and it's nice that openssl uses the same ones as our normal code, but I don't foresee major issues with just using pthread/winthread primitives directly here. (We used to have to do much more extensive stuff in the days of WinXP to emulate a condition-variable on top of windows EVENT's but these days the windows primitives are fairly reasonable) Still, it is something you may want to consider if you want to make your library as user-friendly as possible. The same techniques outlined about could be used to allow plug-in replacement of mutex and cvar.

Comment by A. Jesse Jiryu Davis [ 26/Oct/18 ]

Hi, this is an interesting request. I'd like to explore other ways to fulfill your needs that don't expose this much internal state of the C Driver or possibly impose this much maintenance and testing burden. Can you tell me more about your profiling code?

mongoc_client_pool_new only starts one thread which is responsible for "SDAM", our Server Discovery and Monitoring algorithm. You can watch its operations with "SDAM Monitoring", which allows you to register callbacks for each event associated with the thread's responsibilities: http://mongoc.org/libmongoc/current/application-performance-monitoring.html#sdam-monitoring-example

If you use mongoc_client_t without a mongoc_client_pool_t, then the C Driver doesn't start any threads, although this has disadvantages compared with the mongoc_client_pool_t.

Generated at Wed Feb 07 21:16:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.