<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Wed Feb 07 21:20:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[CDRIVER-4231] Duplicates in ObjectID when inserting in parallel</title>
                <link>https://jira.mongodb.org/browse/CDRIVER-4231</link>
                <project id="10030" key="CDRIVER">C Driver</project>
                    <description>&lt;p&gt;Prior to driver version 1.14, an ObjectID included 3 bytes identifying the machine and 2 bytes denoting the PID. This made duplicate IDs impossible. Since 1.14 (and &lt;a href=&quot;https://jira.mongodb.org/browse/CDRIVER-2771&quot; title=&quot;Implement ObjectID spec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CDRIVER-2771&quot;&gt;&lt;del&gt;CDRIVER-2771&lt;/del&gt;&lt;/a&gt;), users who bulk load in parallel are getting the following error:&lt;/p&gt;

&lt;p&gt;Error from Mongo database on operation &apos;bulk operation execute&apos;: E11000 duplicate key error collection: gfxdb.Rpt_AggregatedExposure_HU index: &lt;em&gt;id&lt;/em&gt; dup key: { : ObjectId(&apos;60ab881bb7b6e778e76e34a2&apos;) }&lt;/p&gt;

&lt;p&gt;The new algorithm includes a 5-byte random value generated once per process. This random value is meant to be unique to the machine and process. My testing shows that different PIDs can produce the same 5-byte random value.&lt;/p&gt;

&lt;p&gt;An additional mechanism for avoiding duplicates is the 3-byte counter at the end of the ObjectID, which starts at a random number. In the cases where the 5-byte value is duplicated, the counter also starts at the same number.&lt;/p&gt;

&lt;p&gt;To demonstrate this behavior, I wrote a program that inserts 1 document per process into a Mongo collection, and ran it 5000 times. Each ObjectID is then parsed into its constituent parts. Here I show those parts for a number of 5-byte random number duplicates:&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;random&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;duplicates&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;timestamp&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;counter&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;write PID&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;191,162,307,748&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005599&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1943458&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;20617&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005804&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1943458&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4837&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;162,692,425,265&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005688&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6330770&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;9673&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005995&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6330770&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;17759&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;335,481,554,228&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005711&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4833330&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;15678&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637006002&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4833330&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;19618&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;-146,439,389,216&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005841&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6191586&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;13851&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637006026&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6191586&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;25035&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;-484,404,755,953&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005985&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6445778&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;15646&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637006023&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6445778&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;24430&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;51,631,052,349&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005747&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2533378&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;23997&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005882&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2533378&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;23354&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
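
&lt;p&gt;The parsing step can be sketched as follows. This is a minimal Python sketch, not the program used to produce the table: the function name split_object_id is hypothetical, and it assumes the layout from the ObjectID spec (4-byte big-endian timestamp, 5-byte random value, 3-byte big-endian counter). The random value is read here as an unsigned integer, which may differ from the signed values shown in the table.&lt;/p&gt;

```python
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Split a 24-character hex ObjectID into (timestamp, random, counter)."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectID is exactly 12 bytes (24 hex characters)")
    timestamp = int.from_bytes(raw[0:4], "big")     # seconds since the Unix epoch
    random_value = int.from_bytes(raw[4:9], "big")  # 5-byte per-process value
    counter = int.from_bytes(raw[9:12], "big")      # 3-byte incrementing counter
    return timestamp, random_value, counter


# ObjectID quoted in the duplicate key error earlier in this report
ts, rnd, ctr = split_object_id("60ab881bb7b6e778e76e34a2")
print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), rnd, ctr)
```

&lt;p&gt;Two ObjectIDs that share both the random value and the counter differ only in the timestamp, so they collide whenever they are generated in the same second.&lt;/p&gt;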


&lt;p&gt;Testing against a 1.6 C Driver, I see that the 5 bytes (3 for machine and 2 for PID) are identical for processes that happen to have identical PIDs (running at different times), but in those cases the counter bytes are randomly distributed, e.g.:&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;random&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;duplicates&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;timestamp&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;counter&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;write PID&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;206,863,667,287&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637004679&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;461522&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1057&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637004817&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1525762&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1057&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&#160;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1637005104&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6964834&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1057&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
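
&lt;p&gt;For scale, the number of colliding pairs that a uniformly distributed 5-byte value would be expected to produce across the 5000 runs described earlier can be estimated with a short calculation. This is a rough Python sketch that assumes independent, uniform 40-bit values; the constant names are illustrative.&lt;/p&gt;

```python
from math import comb

RANDOM_BITS = 40   # size of the 5-byte random value
PROCESSES = 5000   # number of runs in the experiment described in this report

# Number of distinct process pairs that could share a random value
pairs = comb(PROCESSES, 2)

# Each pair collides with probability 1 / 2**40 under the uniform assumption
expected_collisions = pairs / 2**RANDOM_BITS

print(f"{pairs} pairs, expected colliding pairs: {expected_collisions:.2e}")
```

&lt;p&gt;Under these assumptions the expected count is far below one, so several duplicate pairs in 5000 runs suggests that the values are not uniformly distributed.&lt;/p&gt;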


&lt;p&gt;Given the roughly 1-in-1-trillion odds of two 5-byte random numbers colliding, combined with the 1-in-8-million odds of the counter values colliding (all in the same second), the modern algorithm would not be a problem if the code matched the spec.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1927619">CDRIVER-4231</key>
            <summary>Duplicates in ObjectID when inserting in parallel</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="10300" iconUrl="https://jira.mongodb.org/images/icons/priorities/medium.svg">Unknown</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="colby.pike@mongodb.com">Colby Pike</assignee>
                                    <reporter username="seamus.riley@gmail.com">Seamus Riley</reporter>
                        <labels>
                    </labels>
                <created>Wed, 17 Nov 2021 17:56:17 +0000</created>
                <updated>Thu, 25 Jan 2024 09:34:52 +0000</updated>
                            <resolved>Tue, 25 Jan 2022 22:20:44 +0000</resolved>
                                                    <fixVersion>1.21.0</fixVersion>
                                                        <votes>2</votes>
                                    <watches>11</watches>
                                                                                                                <comments>
                            <comment id="4314273" author="JIRAUSER1260880" created="Tue, 25 Jan 2022 22:20:44 +0000"  >&lt;p&gt;A new method of generating unique pseudorandomness has been introduced in OID generation.&lt;/p&gt;</comment>
                            <comment id="4314267" author="xgen-internal-githook" created="Tue, 25 Jan 2022 22:18:44 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;vector-of-bool&apos;, &apos;email&apos;: &apos;vectorofbool@gmail.com&apos;, &apos;username&apos;: &apos;vector-of-bool&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/CDRIVER-4231&quot; title=&quot;Duplicates in ObjectID when inserting in parallel&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CDRIVER-4231&quot;&gt;&lt;del&gt;CDRIVER-4231&lt;/del&gt;&lt;/a&gt;: New random generation of OID values (#931)&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;New random generation of OID values [Fixes CDRIVER-4231]&lt;/li&gt;
&lt;/ul&gt;


&lt;ul&gt;
	&lt;li&gt;Restore hashing of machine attributes, plus an atomic init-counter,&lt;br/&gt;
  plus the process ID.&lt;/li&gt;
	&lt;li&gt;Instead of MD5 hashing, use SipHash.&lt;/li&gt;
	&lt;li&gt;No more magic numbers, make bson-context -Wconversion clean, comment what&apos;s happening in the random init.&lt;/li&gt;
	&lt;li&gt;Remove old deprecated docs and no-op tests.&lt;/li&gt;
	&lt;li&gt;Host name mocking is gone&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo-c-driver/commit/ee64af14196266081d258fd53dec13cdce7d0707&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo-c-driver/commit/ee64af14196266081d258fd53dec13cdce7d0707&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="4205343" author="JIRAUSER1263383" created="Mon, 22 Nov 2021 17:10:47 +0000"  >&lt;p&gt;I saw this behavior with the 1.14.0 C driver on RHEL release 6.9, although I have heard reports that the behavior is the same in more recent versions (including 1.17). In my testing, the C compiler identification is GNU 4.9.4. Output of cmake (I shortened some of the paths with $BASE_PATH for readability):&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;/ab/tools/bin/cmake . -G&quot;Unix Makefiles&quot; --no-warn-unused-cli -DCMAKE_MODULE_PATH=$BASE_PATH/build/cmake-modules -DCMAKE_PROGRAM_PATH=$BASE_PATH/bin -DCMAKE_LIBRARY_PATH=$BASE_PATH/lib -DCMAKE_INCLUDE_PATH=$BASE_PATH/include -DCMAKE_INSTALL_PREFIX=$BASE_PATH -DCMAKE_INSTALL_LIBDIR=$BASE_PATH/lib -DCMAKE_INSTALL_INCLUDEDIR=$BASE_PATH/include -DCMAKE_INSTALL_OLDINCLUDEDIR=$BASE_PATH/include -DCMAKE_INSTALL_SYSCONFDIR=$BASE_PATH/etc -DCMAKE_C_FLAGS=&quot;-m64 -DXXI_MODELBITS=64 -DXXI_LINUX=1 -DXXI_X86=1 -DXXI_LINUX_VARIANT=&quot;REDHAT_7_7&quot; -Werror=return-type -Werror=uninitialized -Werror=implicit-function-declaration -Werror=maybe-uninitialized&quot; -DCMAKE_CXX_FLAGS=&quot;-m64 -DXXI_MODELBITS=64 -DXXI_LINUX=1 -DXXI_X86=1 -DXXI_LINUX_VARIANT=&quot;REDHAT_7_7&quot; -Werror=return-type -Werror=uninitialized -Werror=maybe-uninitialized&quot; -DCMAKE_C_FLAGS_DEBUG=&quot;-O0 -g2&quot; -DCMAKE_C_FLAGS_RELEASE=&quot;-O2&quot; -DCMAKE_C_FLAGS_RELWITHDEBINFO=&quot;-O2 -g2&quot; -DCMAKE_CXX_FLAGS_DEBUG=&quot;-O0 -g2&quot; -DCMAKE_CXX_FLAGS_RELEASE=&quot;-O2&quot; -DCMAKE_CXX_FLAGS_RELWITHDEBINFO=&quot;-O2 -g2&quot; -DCMAKE_BUILD_TYPE=DEBUG -DENABLE_AUTOMATIC_INIT_AND_CLEANUP=OFF -DCMAKE_BUILD_TYPE=Release -DENABLE_SSL=OPENSSL -DENABLE_SASL=OFF -DENABLE_ICU=OFF -DENABLE_SNAPPY=OFF -DENABLE_ZLIB=BUNDLED -DENABLE_SRV=ON -DENABLE_BSON=ON -DENABLE_STATIC=OFF -DENABLE_TESTS=OFF -DENABLE_EXAMPLES=OFF -DOPENSSL_ROOT_DIR=../openssl-1.1.1k -DOPENSSL_LIBRARIES=../lib&lt;/tt&gt;&lt;/p&gt;</comment>
                            <comment id="4198311" author="kevin.albertson" created="Thu, 18 Nov 2021 16:55:07 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=seamus.riley%40gmail.com&quot; class=&quot;user-hover&quot; rel=&quot;seamus.riley@gmail.com&quot;&gt;seamus.riley@gmail.com&lt;/a&gt;, thank you for the report! We will look into this soon and attempt to reproduce it. If possible, can you provide the following information? It may help us reproduce the issue in an equivalent environment:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;The exact version of the driver (e.g. 1.16.0).&lt;/li&gt;
	&lt;li&gt;Host OS, version, and architecture.&lt;/li&gt;
	&lt;li&gt;C Compiler and version.&lt;/li&gt;
	&lt;li&gt;The output of the cmake command. That may help to determine what random primitives are being used.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10720">
                    <name>Cloners</name>
                                                                <inwardlinks description="is cloned by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="1632109">CDRIVER-3913</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="1632109">CDRIVER-3913</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                                                                                    <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[5002K000010Ei4uQAC, 5006R00001vf973QAA, 5006R00001yJqAfQAK]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hzu4qf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            </customfields>
    </item>
</channel>
</rss>