<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Wed Feb 07 22:05:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[CXX-2148] GridFS corruption</title>
                <link>https://jira.mongodb.org/browse/CXX-2148</link>
                <project id="11980" key="CXX">C++ Driver</project>
                    <description>&lt;p&gt;When the replication lag is nonzero (it was several hours in this specific occurrence), we see what looks like GridFS corruption when reading data from secondary nodes:&lt;/p&gt;

&lt;pre&gt;mongocxx::gridfs_exception: expected file to have 1 chunk(s), but query to chunks collection only returned 0 chunk(s): a GridFS file being operated on was discovered to be corrupted&lt;/pre&gt;

&lt;p&gt;We don&apos;t update documents in GridFS; we only create or delete them.&lt;/p&gt;

&lt;p&gt;Let me know if you need any other information.&lt;/p&gt;</description>
                <environment>Debian buster.&lt;br/&gt;
Mongo 4.2.5&lt;br/&gt;
Mongo cxx driver 3.4&lt;br/&gt;
Mongo cluster is not sharded and has several replicas, some are in Paris (including the current master) and some are in Ohio.</environment>
        <key id="1584144">CXX-2148</key>
            <summary>GridFS corruption</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13202">Works as Designed</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="fechantillac@antidot.net">Francois EE</reporter>
                        <labels>
                    </labels>
                <created>Fri, 8 Jan 2021 09:30:26 +0000</created>
                <updated>Fri, 27 Oct 2023 13:13:34 +0000</updated>
                            <resolved>Fri, 8 Jan 2021 20:55:43 +0000</resolved>
                                                                                        <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="3558008" author="kevin.albertson" created="Mon, 11 Jan 2021 17:14:27 +0000"  >&lt;p&gt;Apologies &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=fechantillac%40antidot.net&quot; class=&quot;user-hover&quot; rel=&quot;fechantillac@antidot.net&quot;&gt;fechantillac@antidot.net&lt;/a&gt;, you are indeed correct! Though not all drivers support it, the &lt;tt&gt;mongocxx::gridfs::bucket&lt;/tt&gt; class does support sessions being passed to operations. I was mistaken: the C driver does not support sessions in GridFS operations, but the C++ implementation of GridFS does not utilize the C driver&apos;s GridFS API (one of the few exceptional cases).&lt;/p&gt;

&lt;p&gt;You should be able to read from secondaries by using a session with causal consistency enabled.&lt;/p&gt;

&lt;p&gt;This Jira project is for bug reports and feature requests. For help with using the C++ driver, please create a post in our community forum &lt;a href=&quot;https://developer.mongodb.com/community/forums/tag/cxx-driver&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="3557291" author="JIRAUSER1258212" created="Mon, 11 Jan 2021 13:00:18 +0000"  >&lt;p&gt;Thanks Kevin for the quick response. You say GridFS is not designed to support passing sessions, but I do see that various methods of mongocxx::gridfs::bucket accept a session. May I ask what is missing in the current implementation?&lt;/p&gt;

&lt;p&gt;Sorry if this is not the right place to ask such a question; if so, please kindly point me to the correct place.&lt;/p&gt;</comment>
                            <comment id="3555744" author="kevin.albertson" created="Fri, 8 Jan 2021 20:55:43 +0000"  >&lt;p&gt;Closing since this is not a bug in the C++ driver.&lt;/p&gt;</comment>
                            <comment id="3555742" author="kevin.albertson" created="Fri, 8 Jan 2021 20:55:20 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=fechantillac%40antidot.net&quot; class=&quot;user-hover&quot; rel=&quot;fechantillac@antidot.net&quot;&gt;fechantillac@antidot.net&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Reading from a secondary in GridFS may get into situations where the file document has replicated but the chunk documents have not. I initially thought using a read and write concern of majority would solve this, but it is not a complete solution, as there is no guarantee that the selected secondary was among the majority for all inserted chunks.&lt;/p&gt;

&lt;p&gt;I checked with the broader drivers team. This is a limitation of how GridFS is currently specified. Drivers implement GridFS based on this &lt;a href=&quot;https://github.com/mongodb/specifications/tree/master/source/gridfs/gridfs-spec.rst&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;common specification&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A robust solution would be to do operations with a session configured with causal consistency. But, as it is designed, GridFS does not support passing sessions. I have created &lt;a href=&quot;https://jira.mongodb.org/browse/CXX-2150&quot; title=&quot;Support sessions in GridFS&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CXX-2150&quot;&gt;&lt;del&gt;CXX-2150&lt;/del&gt;&lt;/a&gt; to track that work.&lt;/p&gt;

&lt;p&gt;In the meantime, I would suggest using a primary read preference and watching &lt;a href=&quot;https://jira.mongodb.org/browse/CXX-2150&quot; title=&quot;Support sessions in GridFS&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CXX-2150&quot;&gt;&lt;del&gt;CXX-2150&lt;/del&gt;&lt;/a&gt; for updates.&lt;/p&gt;</comment>
                            <comment id="3555173" author="JIRAUSER1258212" created="Fri, 8 Jan 2021 17:00:35 +0000"  >&lt;p&gt;We use default options when opening the gridfs::bucket. Read preference in the connection string is NEAREST.&lt;/p&gt;</comment>
                            <comment id="3555138" author="kevin.albertson" created="Fri, 8 Jan 2021 16:45:41 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=fechantillac%40antidot.net&quot; class=&quot;user-hover&quot; rel=&quot;fechantillac@antidot.net&quot;&gt;fechantillac@antidot.net&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for the report! We will look into this soon.&lt;/p&gt;

&lt;p&gt;My hypothesis matches yours: I suspect the file document is replicated before all of its chunks, so a read from the secondary sees the file document before all chunks have been replicated.&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Note also that the write policy used is the default one.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;The default &lt;a href=&quot;https://docs.mongodb.com/manual/reference/write-concern/#write-concern-specification&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;write concern&lt;/a&gt; for a &lt;tt&gt;gridfs::bucket&lt;/tt&gt; is w:1, which only requires acknowledgment from one node, the primary (see &lt;a href=&quot;https://docs.mongodb.com/manual/reference/write-concern/#write-concern-specification&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the manual&lt;/a&gt; for a description).&lt;/p&gt;

&lt;p&gt;Are you configuring the &lt;tt&gt;gridfs::bucket&lt;/tt&gt; with a&#160;default read concern, and a read preference of &lt;tt&gt;SECONDARY_PREFERRED&lt;/tt&gt; or &lt;tt&gt;SECONDARY&lt;/tt&gt;?&lt;/p&gt;</comment>
                            <comment id="3555039" author="JIRAUSER1258212" created="Fri, 8 Jan 2021 16:15:02 +0000"  >&lt;p&gt;Actually it just happened again without any significant replication lag (less than 15 seconds).&lt;/p&gt;</comment>
                            <comment id="3554670" author="JIRAUSER1258212" created="Fri, 8 Jan 2021 13:09:11 +0000"  >&lt;p&gt;I should add that the issue disappears after some time without any writes to that GridFS object. We only had the opportunity to check GridFS consistency after the replication lag had subsided.&lt;/p&gt;

&lt;p&gt;With my limited understanding of how GridFS and replication work, it looks like, for some objects, the layers.files collection documents are successfully written to the secondary before the corresponding chunks. This is purely hypothetical though.&lt;/p&gt;

&lt;p&gt;Note also that the write policy used is the default one.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="1584684">CXX-2150</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hya9jr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            </customfields>
    </item>
</channel>
</rss>