[CDRIVER-4231] Duplicates in ObjectID when inserting in parallel Created: 17/Nov/21  Updated: 25/Jan/24  Resolved: 25/Jan/22

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: None
Fix Version/s: 1.21.0

Type: Task Priority: Unknown
Reporter: Seamus Riley Assignee: Colby Pike
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Cloners
Duplicate
is duplicated by CDRIVER-3913 Reduce likelihood of colliding Object... Closed
Related
is related to CDRIVER-3913 Reduce likelihood of colliding Object... Closed
Case:

 Description   

Prior to driver version 1.14, the ObjectID included 3 bytes identifying the machine and 2 bytes denoting the PID. This made duplicate IDs impossible. Since 1.14 (and CDRIVER-2771), we see users who bulk load in parallel getting the error:

Error from Mongo database on operation 'bulk operation execute': E11000 duplicate key error collection: gfxdb.Rpt_AggregatedExposure_HU index: id dup key: { : ObjectId('60ab881bb7b6e778e76e34a2') }

The new algorithm includes a 5-byte random value generated once per process. This random value is meant to be unique to the machine and process. My testing shows that different PIDs are capable of creating the same 5-byte random value. 

An additional mechanism for avoiding duplicates is the 3-byte counter at the end of the ObjectID, which starts at a random number. In the cases where the 5-byte random value is duplicated, the counter also starts at the same number.
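
To make the layout concrete, here is a minimal sketch that splits a 24-character hex ObjectID into the three fields discussed in this report: a 4-byte timestamp, the 5-byte random value, and the 3-byte counter. The names `oid_parts`, `hex_value`, and `oid_parse` are hypothetical and not part of libbson, and reading every field as unsigned big-endian is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a post-1.14 ObjectID. */
typedef struct {
    uint32_t timestamp; /* bytes 0-3: seconds since the Unix epoch */
    uint64_t random;    /* bytes 4-8: 5-byte per-process random value */
    uint32_t counter;   /* bytes 9-11: 3-byte counter */
} oid_parts;

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse a 24-character hex ObjectID. Returns 0 on success, -1 on bad input.
 * Assumption: all fields are read as unsigned big-endian integers. */
static int oid_parse(const char *hex, oid_parts *out)
{
    uint8_t bytes[12];
    size_t i;

    assert(hex != NULL && out != NULL);
    if (strlen(hex) != 24) return -1;

    for (i = 0; i < 12; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        bytes[i] = (uint8_t)((hi << 4) | lo);
    }

    out->timestamp = 0;
    for (i = 0; i < 4; i++) out->timestamp = (out->timestamp << 8) | bytes[i];
    out->random = 0;
    for (i = 4; i < 9; i++) out->random = (out->random << 8) | bytes[i];
    out->counter = 0;
    for (i = 9; i < 12; i++) out->counter = (out->counter << 8) | bytes[i];
    return 0;
}
```

Applied to the ObjectId from the error message above (60ab881bb7b6e778e76e34a2), this yields timestamp 0x60ab881b, random value 0xb7b6e778e7, and counter 0x6e34a2.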

To demonstrate this behavior, I wrote a program that loads 1 document per process into a Mongo collection, and ran it 5000 times. Each ObjectID is then parsed into its constituent parts. Here I show those parts for a number of 5-byte random number duplicates:

random            duplicates  timestamp   counter  write PID
191,162,307,748   2           1637005599  1943458  20617
                              1637005804  1943458  4837
162,692,425,265   2           1637005688  6330770  9673
                              1637005995  6330770  17759
335,481,554,228   2           1637005711  4833330  15678
                              1637006002  4833330  19618
-146,439,389,216  2           1637005841  6191586  13851
                              1637006026  6191586  25035
-484,404,755,953  2           1637005985  6445778  15646
                              1637006023  6445778  24430
51,631,052,349    2           1637005747  2533378  23997
                              1637005882  2533378  23354
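
One pattern that would produce rows like these, where the counter matches whenever the 5-byte value matches, is both fields being derived from a single seed. The sketch below only illustrates that reasoning and makes no claim about how the driver actually seeds its generator: `oid_state`, `next_u64`, `oid_state_from_seed`, and `demo_same_seed` are hypothetical names, and splitmix64 is used purely as a stand-in PRNG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process state; NOT the driver's actual structure. */
typedef struct {
    uint64_t random5;  /* 5-byte value, fixed for the life of the process */
    uint32_t counter3; /* starting value of the 3-byte counter */
} oid_state;

/* One splitmix64 step, used here only as a stand-in PRNG. */
static uint64_t next_u64(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ull);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ull;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBull;
    return z ^ (z >> 31);
}

/* Derive both fields from consecutive outputs of one seeded stream. */
static oid_state oid_state_from_seed(uint64_t seed)
{
    oid_state s;
    s.random5 = next_u64(&seed) & 0xFFFFFFFFFFull;        /* keep 40 bits */
    s.counter3 = (uint32_t)(next_u64(&seed) & 0xFFFFFFu); /* keep 24 bits */
    return s;
}

/* Two processes that end up with the same seed collide on BOTH fields. */
static void demo_same_seed(uint64_t seed)
{
    oid_state a = oid_state_from_seed(seed);
    oid_state b = oid_state_from_seed(seed);
    assert(a.random5 == b.random5);
    assert(a.counter3 == b.counter3);
}
```

Under this assumption the counter adds no protection once the seed repeats, so the chance of a duplicate is governed by the seed's entropy rather than by the combined 8 bytes.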

Testing against a 1.6 C driver, I see identical 5-byte values (3 bytes for machine and 2 for PID) when processes happen to have the same PID (running at different times), but in those cases the counter bytes are randomly distributed, e.g.:

random            duplicates  timestamp   counter  write PID
206,863,667,287   3           1637004679  461522   1057
                              1637004817  1525762  1057
                              1637005104  6964834  1057

Given the roughly 1-in-1-trillion odds of two 5-byte random numbers colliding, combined with the roughly 1-in-8-million odds of two counters colliding (all within the same second), the modern algorithm would not be a problem if the code matched the spec.
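
As a rough check on these odds, the sketch below estimates how many colliding pairs one would expect among n independent, uniformly random values drawn from a space of 2^bits values. The function name `expected_colliding_pairs` is hypothetical, and the uniformity assumption is exactly what the observations above call into question.

```c
#include <assert.h>
#include <stdint.h>

/* Expected number of colliding pairs among n values drawn uniformly at
 * random from 2^bits possibilities: there are n*(n-1)/2 pairs, and each
 * pair collides with probability 1/2^bits. */
static double expected_colliding_pairs(uint64_t n, unsigned bits)
{
    double pairs, space;

    assert(bits < 64);
    if (n < 2) return 0.0; /* fewer than two values cannot collide */

    pairs = (double)n * (double)(n - 1) / 2.0;
    space = (double)(1ull << bits);
    return pairs / space;
}
```

For the experiment described above, `expected_colliding_pairs(5000, 40)` comes to roughly 1.1e-5, so with a uniform 40-bit generator even one repeated value across 5000 runs would be surprising, whereas the first table lists at least six.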



 Comments   
Comment by Colby Pike [ 25/Jan/22 ]

A new method of generating unique pseudorandomness has been introduced in OID generation.

Comment by Githook User [ 25/Jan/22 ]

Author:

{'name': 'vector-of-bool', 'email': 'vectorofbool@gmail.com', 'username': 'vector-of-bool'}

Message: CDRIVER-4231: New random generation of OID values (#931)

  • New random generation of OID values [Fixes CDRIVER-4231]
Comment by Seamus Riley [ 22/Nov/21 ]

I saw this behavior with the 1.14.0 C driver on RHEL release 6.9, although I have heard reports that the behavior persists in more recent versions (including 1.17). In my testing, the C compiler identification is GNU 4.9.4. Output of cmake (I shortened some of the paths with $BASE_PATH for readability):

/ab/tools/bin/cmake . -G"Unix Makefiles" --no-warn-unused-cli -DCMAKE_MODULE_PATH=$BASE_PATH/build/cmake-modules -DCMAKE_PROGRAM_PATH=$BASE_PATH/bin -DCMAKE_LIBRARY_PATH=$BASE_PATH/lib -DCMAKE_INCLUDE_PATH=$BASE_PATH/include -DCMAKE_INSTALL_PREFIX=$BASE_PATH -DCMAKE_INSTALL_LIBDIR=$BASE_PATH/lib -DCMAKE_INSTALL_INCLUDEDIR=$BASE_PATH/include -DCMAKE_INSTALL_OLDINCLUDEDIR=$BASE_PATH/include -DCMAKE_INSTALL_SYSCONFDIR=$BASE_PATH/etc -DCMAKE_C_FLAGS="-m64 -DXXI_MODELBITS=64 -DXXI_LINUX=1 -DXXI_X86=1 -DXXI_LINUX_VARIANT="REDHAT_7_7" -Werror=return-type -Werror=uninitialized -Werror=implicit-function-declaration -Werror=maybe-uninitialized" -DCMAKE_CXX_FLAGS="-m64 -DXXI_MODELBITS=64 -DXXI_LINUX=1 -DXXI_X86=1 -DXXI_LINUX_VARIANT="REDHAT_7_7" -Werror=return-type -Werror=uninitialized -Werror=maybe-uninitialized" -DCMAKE_C_FLAGS_DEBUG="-O0 -g2" -DCMAKE_C_FLAGS_RELEASE="-O2" -DCMAKE_C_FLAGS_RELWITHDEBINFO="-O2 -g2" -DCMAKE_CXX_FLAGS_DEBUG="-O0 -g2" -DCMAKE_CXX_FLAGS_RELEASE="-O2" -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-O2 -g2" -DCMAKE_BUILD_TYPE=DEBUG -DENABLE_AUTOMATIC_INIT_AND_CLEANUP=OFF -DCMAKE_BUILD_TYPE=Release -DENABLE_SSL=OPENSSL -DENABLE_SASL=OFF -DENABLE_ICU=OFF -DENABLE_SNAPPY=OFF -DENABLE_ZLIB=BUNDLED -DENABLE_SRV=ON -DENABLE_BSON=ON -DENABLE_STATIC=OFF -DENABLE_TESTS=OFF -DENABLE_EXAMPLES=OFF -DOPENSSL_ROOT_DIR=../openssl-1.1.1k -DOPENSSL_LIBRARIES=../lib

Comment by Kevin Albertson [ 18/Nov/21 ]

seamus.riley@gmail.com, thank you for the report! We will look into this soon and attempt to reproduce it. If possible, can you provide the following information? It may help us reproduce the issue in an equivalent environment:

  • The exact version of the driver (e.g. 1.16.0).
  • Host OS, version, and architecture.
  • C Compiler and version.
  • The output of the cmake command. That may help to determine what random primitives are being used.
Generated at Wed Feb 07 21:20:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.