[CDRIVER-3288] /bson/oid/init_with_threads failure Created: 08/Aug/19 Updated: 27/Oct/23 Resolved: 27/Sep/23 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libbson, tests |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor - P4 |
| Reporter: | Kevin Albertson | Assignee: | Kevin Albertson |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | flaky-tests | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Epic Link: | Stabilize Evergreen |
| Description |
|
This patch build failed /bson/oid/init_with_threads with the log:
Investigate if this is a transient system issue, a bug in how the test is creating/managing threads, or a possible bug in libbson. |
| Comments |
| Comment by Kevin Albertson [ 27/Sep/23 ] | |||||||
|
No observed failures from 2023-08-01 to 2023-09-27 for sasl-matrix-winssl variant.
Closing as "Gone away". | |||||||
| Comment by Kevin Albertson [ 07/May/20 ] | |||||||
|
My hypothesis is that the test's assumption that ObjectIDs always increase (when compared with memcmp) is wrong. The final three bytes of an ObjectID are initialized to a random sequence and incremented by one on every new ObjectID. If the sequence is randomly initialized at a high value, generating more ObjectIDs may overflow and wrap. If that happens within the same second, there will be an observed decrease when comparing with memcmp. I think that is expected, and the test should just remove the comparison.
Oddly enough, repeatedly running this test for multiple hours on my local Mac did not reproduce the issue. I'm not sure whether that is because the failure is extremely unlikely or because my hypothesis is wrong.
The test generates 500,000 * (N_THREADS=4) = 2,000,000 ObjectIDs. If the sequence is initialized anywhere in the interval [2^24 - 2000000, 2^24], then we'd get an overflow, assuming the test ran in the same second. Assuming the second was fixed and the generation uniformly random, that should happen with probability: 2000000 / 2^24 = .12 (the complement, (2^24 - 2000000) / 2^24 = .88, is the probability of no overflow).
Looking further at our implementation I've noticed two things:
So the sequence doesn't start at a completely random value. Though that comment seems a little off. Not only is the last nibble (4 bits) chopped off, but so is the leading bit of the three-byte sequence (it is 7=0111 instead of F=1111). When a sequence is applied to a new ObjectID, the sequence in the context is converted to big endian to conform to the ObjectID spec. _bson_context_set_oid_seq32_threadsafe currently does this:
Let's investigate further to make sure we're not doing something wrong in our ObjectID generation. | |||||||
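The wraparound hypothesis in the comment above can be illustrated with a small standalone sketch. This is not libbson code: `make_oid`, `next_counter`, `compare_across_wrap`, and `wrap_probability` are hypothetical helpers, the five middle bytes are a fixed stand-in for the per-process random value, and the counter is assumed to start at a uniformly random 24-bit value (which, per the comment above, the real implementation does not quite do).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not libbson): build a 12-byte ObjectID-like buffer
 * from a 4-byte big-endian timestamp, 5 fixed filler bytes, and a 3-byte
 * big-endian counter. */
void
make_oid (uint8_t out[12], uint32_t timestamp, uint32_t counter)
{
   out[0] = (uint8_t) (timestamp >> 24);
   out[1] = (uint8_t) (timestamp >> 16);
   out[2] = (uint8_t) (timestamp >> 8);
   out[3] = (uint8_t) timestamp;
   memset (out + 4, 0xAB, 5); /* stand-in for the random per-process bytes */
   out[9] = (uint8_t) (counter >> 16);
   out[10] = (uint8_t) (counter >> 8);
   out[11] = (uint8_t) counter;
}

/* Advance a 24-bit counter, wrapping from 0xFFFFFF back to 0. */
uint32_t
next_counter (uint32_t counter)
{
   return (counter + 1) & 0xFFFFFFu;
}

/* Compare an ObjectID generated right after the counter wraps against the
 * one generated right before it, both within the same second. A negative
 * result means the newer ObjectID sorts lower under memcmp. */
int
compare_across_wrap (uint32_t timestamp)
{
   uint8_t before[12];
   uint8_t after[12];
   uint32_t counter = 0xFFFFFFu;

   make_oid (before, timestamp, counter);
   make_oid (after, timestamp, next_counter (counter));
   return memcmp (after, before, 12);
}

/* Probability that a uniformly random 24-bit starting value wraps within
 * n_oids increments (assumes a fixed second and n_oids <= 2^24). */
double
wrap_probability (uint32_t n_oids)
{
   return (double) n_oids / 16777216.0;
}
```

Under these assumptions, `compare_across_wrap` returns a negative value for any timestamp, which is the "observed decrease" described above, and `wrap_probability (2000000)` evaluates to roughly 0.12.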
| Comment by Kevin Albertson [ 07/May/20 ] | |||||||