-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
(copied to CRM)
ObjectID random values and the counter sequence is seeded from combining time, PID, and hostname as noted in the ObjectID specification.
The seed is generated in _bson_context_init_random as an XOR of:
- current time in seconds.
- current time in microseconds. (this is incorrectly documented as milliseconds)
- PID
- hostname
jmikola identified two possible problems with the current seed generation:
1. The PID is stored as a uint16_t. This truncates PIDs exceeding the range of an uint16_t.
2. Because time is a sequential, and the PID is likely sequential on most systems, the XOR logic may be susceptible to duplicate seeds in rare cases where the difference in PID equals the difference in microseconds.
PID uint16_t truncation
_bson_getpid currently returns a uint16_t. The PID generated is truncated into a uint16_t, which likely restricts the range of PIDs on many systems where the maximum number of processes exceeds 2^16 = 65536.
Looking at the implementation of _bson_getpid:
- Windows uses the 32 bit DWORD returned by GetCurrentProcessId, XORing each 16 bit half.
- Non-Windows casts the pid_t directly as a uint16_t. Inspecting unistd.h on macOS shows this to be a typedef of int32_t, on 64 bit Ubuntu 18.04 pid_t typedefs an int.
The truncation of a signed wider integer type to a narrower unsigned type is likely undefined, and could run the risk of two different process IDs colliding.
The value of _bson_getpid is only used as part of the random seed in _bson_context_init_random, which is already generated from an int. Let's store this directly as a 32 bit integer type as input to the random seed if possible.
Duplicate seeds from sequential XORs
Consider the following scenario brought up by jmikola:
Process 30596 initializes at time 1614071145.568000 30596 ^ 1614071145 ^ 568000 = 1614551085 Process 30597 initializes at time 1614071145.568001 30597 ^ 1614071145 ^ 568001 = 1614551085
A plausible solution may be to bit shift before XORing.
Testing
If it is possible, let's make the process ID, time, and hostname configurable so we can unit test _bson_context_init_random. Given the difficulty of investigating this behavior in user applications, as part of this ticket, I think we should test scenarios like sequential process IDs with times differing by one microsecond.
- duplicates
-
CDRIVER-4231 Duplicates in ObjectID when inserting in parallel
- Closed
- related to
-
CDRIVER-4231 Duplicates in ObjectID when inserting in parallel
- Closed