C++ Driver / CXX-90

Crash due to static initialization order fiasco in BSON implementation

      The current BSON implementation assumes a particular static initialization order: the global (static) variables of the Boost.Thread implementation must be initialized before MongoDB's own global (static) variables. However, the C++ standard does not guarantee any initialization order across different translation units.

      To give an example:
      1. The static_fiasco.7z attachment contains two directories:
      a. mongodb-src-r2.0.1 - the debug variant of the static mongoclient library, built against Boost 1.43 with the following command:
      > scons mongoclient --d
      b. mongotest - a very basic application that uses the static mongo library from the first directory.
      2. Running the debug variant of mongotest.exe crashes most of the time (just run the prebuilt EXE file from the Debug directory).

      What happens? I tracked down the cause of the corruption with OllyDbg:
      1. Because mongotest is linked statically with mongoclient and Boost, on my computer the linker chose to initialize mongoclient before Boost.
      2. This is executed from dbclient.cpp:
      const BSONObj getlasterrorcmdobj = fromjson("{getlasterror:1}");
      It could also be another static initialization, for example this one from jsobj.cpp:
      BSONObj staticNull = fromjson( "{'':null}" );
      It does not matter which one. It depends on how the linker constructs the array of static initialization callbacks.
      3. fromjson() indirectly calls the "set_current_thread_data()" from boost\libs\thread\src\win32\thread.cpp
      Please check the 00_call_once.png screenshot with the complete callstack.
      4. This sets the "current_thread_tls_init_flag" in thread.cpp.
      5. When the static initialization of mongoclient has finished, the boost_thread one is executed, which resets the already set current_thread_tls_init_flag.
      Please check the 01_once_flag.png screenshot with the complete callstack.
      6. The next time a thread is created, for example on connect, "create_current_thread_tls_key()" is called a second time, because the flag was incorrectly reset in the previous step.
      7. A new TLS key (index) is allocated, causing corruption.
      8. It does not matter whether it is a debug or a release build. The release build usually crashes more subtly.
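
      The sequence in steps 3 to 7 can be reproduced in a small toy model. The sketch below is not Boost or MongoDB code: TlsModel, its members and keys_after_connect() are hypothetical stand-ins that only borrow names from this report, and the two static initializers are modeled as plain functions so that both linker orderings can be run deterministically.

```cpp
#include <cassert>

// Toy model only: nothing here is real Boost or MongoDB code.
struct TlsModel {
    bool tls_init_flag = false;   // stands in for current_thread_tls_init_flag
    int  tls_keys_allocated = 0;  // how many TLS keys (indexes) were handed out

    // Stand-in for create_current_thread_tls_key().
    void create_current_thread_tls_key() { ++tls_keys_allocated; }

    // Stand-in for set_current_thread_data(): a call_once-style guard.
    void set_current_thread_data() {
        if (!tls_init_flag) {
            create_current_thread_tls_key();
            tls_init_flag = true;
        }
    }

    // Static initializer of boost_thread: sets the flag to "not yet run".
    void run_boost_thread_static_init() { tls_init_flag = false; }

    // Static initializer of mongoclient: fromjson() ends up in
    // set_current_thread_data() (steps 2 and 3).
    void run_mongoclient_static_init() { set_current_thread_data(); }
};

// Runs both static initializers in the chosen order, then simulates a
// later thread creation (e.g. on connect) and returns the key count.
int keys_after_connect(bool mongoclient_first) {
    TlsModel m;
    if (mongoclient_first) {
        m.run_mongoclient_static_init();
        m.run_boost_thread_static_init();  // step 5: the flag is reset
    } else {
        m.run_boost_thread_static_init();
        m.run_mongoclient_static_init();
    }
    m.set_current_thread_data();           // step 6
    assert(m.tls_keys_allocated >= 1);
    return m.tls_keys_allocated;
}
```

      In this model keys_after_connect(false) returns 1 and keys_after_connect(true) returns 2; the second key is the duplicate allocation described in step 7.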

      Conclusion:

      • fromjson() uses JsonGrammar, which uses Boost.Spirit, which uses thread-specific data. Because fromjson() is called during the static initialization phase, there is no guarantee that "current_thread_tls_init_flag" has been initialized BEFORE "set_current_thread_data()" runs its "call_once()" on it. This depends on the linker, i.e. it is effectively random.

      Possible solutions:
      1. Do not use fromjson() directly in the static/global variable initializations.
      2. Do not use boost::spirit for grammar in the static/global variable initializations.
      3. Do not use static linking of Mongo with Boost.
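
      For reference, option 1 usually takes the form of the construct-on-first-use idiom: the object moves into a function-local static, so it is built on the first call rather than during static initialization. The sketch below is only an illustration under assumptions: parse_json_model() and getlasterror_cmd() are hypothetical names, and a std::string stands in for BSONObj so that the example compiles without the driver. Initialization of function-local statics is only guaranteed to be thread-safe from C++11 on.

```cpp
#include <cassert>
#include <string>

// Counts parser invocations so that the laziness is observable.
int parse_calls = 0;

// Hypothetical stand-in for fromjson(); the real one goes through Boost.Spirit.
std::string parse_json_model(const char* text) {
    assert(text != nullptr);
    ++parse_calls;
    return std::string(text);
}

// Instead of a namespace-scope object such as
//   const BSONObj getlasterrorcmdobj = fromjson("{getlasterror:1}");
// the object is created on the first call. That happens after all static
// initializers have run, provided no other static initializer calls this
// accessor.
const std::string& getlasterror_cmd() {
    static const std::string obj = parse_json_model("{getlasterror:1}");
    return obj;
}
```

      Callers would then write getlasterror_cmd() wherever they used the global, which is the kind of code change that makes this option invasive.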

      I think the first two options can be dropped. It is too complicated to change the code and the logic of the C++ classes.

      Solution: ALWAYS use dynamic linking with Boost on Windows. This ensures that the DllMain of boost_thread, which performs the initialization of "current_thread_tls_init_flag", is called BEFORE the initialization of the mongoclient-specific static data. Add to SConstruct:
      env.Append( CPPDEFINES=[ "BOOST_ALL_DYN_LINK" ] )
      and use /MD instead of /MT in release.

      So linking mongoclient.lib statically with Boost should be dropped/disallowed.
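
      One way to disallow the static combination at build time would be a preprocessor guard. This is only a sketch under assumptions: where the guard lives (for example a header included by every mongoclient source file) and the error texts are hypothetical. BOOST_ALL_DYN_LINK and BOOST_THREAD_DYN_LINK are the Boost configuration macros for dynamic linking, and _DLL is the macro MSVC defines when /MD or /MDd is in effect.

```cpp
// Hypothetical guard: refuse to build mongoclient on Windows unless Boost
// is linked dynamically and the DLL runtime (/MD or /MDd) is selected.
#if defined(_WIN32)
#  if !defined(BOOST_ALL_DYN_LINK) && !defined(BOOST_THREAD_DYN_LINK)
#    error "mongoclient on Windows must link Boost dynamically (define BOOST_ALL_DYN_LINK)"
#  endif
#  if defined(_MSC_VER) && !defined(_DLL)
#    error "mongoclient on Windows must be built with /MD or /MDd, not /MT"
#  endif
#endif
```

      On other platforms the guard compiles to nothing.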

        Attachments:
        1. 00_call_once.png (151 kB, Balint Szente)
        2. 01_once_flag.png (121 kB, Balint Szente)
        3. static_fiasco.7z (20.65 MB, Balint Szente)

            Assignee: Mira Carey (mira.carey@mongodb.com)
            Reporter: Balint Szente (bszente)