Core Server / SERVER-8496

_id values generated by mongod are not thread safe across databases, may cause mongod to generate duplicate ids for a single collection (resulting in insert failure)

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.4.0-rc1
    • Affects Version/s: 2.2.3, 2.3.2
    • Component/s: Concurrency, Storage
    • Labels: None
    • Operating System: ALL

      DataFileMgr::insert() is typically called while holding a database-level write lock. For example, when an insert request is received on the wire, receivedInsert() acquires such a lock and calls checkAndInsert(), which calls DataFileMgr::insertWithObjMod(), which in turn calls DataFileMgr::insert().
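To make the scope of a database-level lock concrete, here is a minimal sketch. `DbLockTable`, `lockFor` and `serialized` are hypothetical names invented for illustration and are not taken from the mongod source. The sketch only shows that writers to different databases hold different locks, so such a lock cannot protect state shared by the whole process.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for per-database write locks (not mongod's real types).
class DbLockTable {
public:
    // Returns the mutex guarding the named database, creating it on first use.
    std::mutex& lockFor(const std::string& db) {
        assert(!db.empty());  // assumption: database names are non-empty
        std::lock_guard<std::mutex> guard(tableMutex_);
        return locks_[db];    // std::map keeps element addresses stable
    }

private:
    std::mutex tableMutex_;                    // protects the table itself
    std::map<std::string, std::mutex> locks_;  // one lock per database name
};

// True when writers to the two databases would exclude each other,
// that is, when both names map to the same lock object.
bool serialized(DbLockTable& table, const std::string& a, const std::string& b) {
    return &table.lockFor(a) == &table.lockFor(b);
}
```

In this model, `serialized(table, "db1", "db1")` is true while `serialized(table, "db1", "db2")` is false: two inserts into the same database exclude each other, but inserts into different databases can run at the same time.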

      The implementation of DataFileMgr::insert() checks for an existing _id in the document to be inserted. If no _id is found, a new ObjectId is generated and added. The implementation for adding an ObjectId uses the static global variables idToInsert_ and idToInsert. First, the OID member of idToInsert_ is initialized with its init() method. Then the idToInsert object is byte-copied into the on-disk representation of the document to be inserted. Because these variables are shared by all threads while the write lock is held per database, inserts into different databases do not exclude each other. If two threads are concurrently inserting into two separate databases, possible race conditions include:

      • the two init() calls precede the two copies into the doc store, resulting in duplicate _ids in different databases
      • the two init() calls overlap. OID::init() is not thread safe for a shared OID object, and this may create an ObjectId value unanticipated by the OID implementation. (In particular, the bytes of the counter portion of the id are set individually, not in one atomic assignment.) The resulting unexpected id value may conflict with a prior or future document's _id in the same collection, resulting in an insert failure when the second, conflicting document is inserted.
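The second race can be replayed deterministically without threads. The following is a minimal sketch that assumes a simplified 3-byte big-endian counter; `Counter`, `counterByte`, `toValue` and `tornWrite` are hypothetical names for illustration and do not reproduce OID::init(). It applies one possible overlapping schedule of two byte-wise writers to a single shared buffer.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Simplified stand-in for the counter bytes at the end of an ObjectId.
using Counter = std::array<unsigned char, 3>;

// Extracts byte i (0 = most significant) of a 24-bit counter value.
unsigned char counterByte(std::uint32_t value, int i) {
    assert(i >= 0 && i < 3);
    return static_cast<unsigned char>(value >> (8 * (2 - i)));
}

// Reassembles the 24-bit value currently stored in the buffer.
std::uint32_t toValue(const Counter& c) {
    return (static_cast<std::uint32_t>(c[0]) << 16) |
           (static_cast<std::uint32_t>(c[1]) << 8) |
            static_cast<std::uint32_t>(c[2]);
}

// Replays one overlapping schedule on a single shared buffer:
// writer A stores bytes 0 and 1, writer B stores all three bytes,
// then writer A stores its last byte.
std::uint32_t tornWrite(std::uint32_t a, std::uint32_t b) {
    Counter shared{};
    shared[0] = counterByte(a, 0);
    shared[1] = counterByte(a, 1);
    for (int i = 0; i < 3; ++i) {
        shared[i] = counterByte(b, i);
    }
    shared[2] = counterByte(a, 2);
    return toValue(shared);
}
```

With writers holding counter values 0x0000FF and 0x000100, this schedule leaves 0x0001FF in the buffer. Neither writer intended that value, and a correctly incrementing counter would produce it again 255 increments later, which is how a later insert could collide with the earlier document.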

      Test

      #include <iostream>
      #include <cstdlib>
      
      #include "mongo/client/dbclient.h"
      
      using namespace std;
      using namespace mongo;
      
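      int main( int argc, const char **argv ) {
      
          // The database name is required; each test instance should use a different one.
          if ( argc < 2 ) {
              cout << "usage: " << argv[ 0 ] << " <db>" << endl;
              return EXIT_FAILURE;
          }
      
          const char *port = "27017";
          const char *db = argv[ 1 ];
      
          mongo::DBClientConnection conn;
          string errmsg;
          if ( ! conn.connect( string( "127.0.0.1:" ) + port , errmsg ) ) {
              cout << "couldn't connect : " << errmsg << endl;
              return EXIT_FAILURE;
          }
      
          string ns = string( db ) + ".coll";
          int i = 0;
          // Insert documents without an _id so that mongod generates one for each.
          while( true ) {
            conn.insert( ns, BSON( "a" << 1 ) );
            // Every 1000 inserts, check for a previous error (such as a duplicate key)
            // and reset the error state.
            if ( i % 1000 == 0 ) {
              BSONObj err = conn.getPrevError();
              if ( !err[ "err" ].isNull() ) {
                log() << "err: " << err << endl;
                conn.resetError();
              }
            }
            ++i;
          }
      
      }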
      
      

      Run two instances of the test concurrently, passing a different database name to each as the first command-line argument. It can take several minutes before the first duplicate key error is observed.

      This bug will only manifest when clients do not insert _id fields and mongod generates _ids itself.

            Assignee: Eliot Horowitz (Inactive)
            Reporter: Aaron Staple
            Votes: 0
            Watchers: 3
