<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:01:31 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
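
A minimal Python sketch of building such a request URL. The helper name with_fields and the example.invalid address are illustrative placeholders, not part of JIRA; substitute the URL you actually used to fetch this document.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fields(url, fields):
    """Return url with one 'field=<name>' query parameter appended per entry in fields."""
    parts = urlsplit(url)
    # Keep any query parameters that are already present on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Repeat the 'field' parameter once per requested field, as described above.
    query.extend(("field", name) for name in fields)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL (assumption): replace with the real request URL.
print(with_fields("https://example.invalid/issue.xml", ["key", "summary"]))
```

The 'field' parameter is repeated rather than comma-joined, which matches the 'field=key&field=summary' form shown above.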
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-22819] WiredTiger collection file read in 4k blocks</title>
                <link>https://jira.mongodb.org/browse/SERVER-22819</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We are running an idle replica set that we have just added a new secondary to. Initial synchronization takes a long time and we have narrowed it down to the primary reading the collection file from the filesystem in 4k chunks. Is this by design or are we doing something wrong? We could surely speed this up by using a larger block size. Obviously the 4k chunk size makes sense on a write-busy primary, but we were wondering whether or not there was any autotuning available to make it read the data set more aggressively while idling.&lt;/p&gt;
</description>
                <environment>FreeBSD 10.2, ZFS</environment>
        <key id="267478">SERVER-22819</key>
            <summary>WiredTiger collection file read in 4k blocks</summary>
                <type id="6" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14720&amp;avatarType=issuetype">Question</type>
                                            <priority id="4" iconUrl="https://jira.mongodb.org/images/icons/priorities/minor.svg">Minor - P4</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="michael.cahill@mongodb.com">Michael Cahill</assignee>
                                    <reporter username="vgalu">Vlad Galu</reporter>
                        <labels>
                    </labels>
                <created>Tue, 23 Feb 2016 16:58:06 +0000</created>
                <updated>Mon, 28 Mar 2016 18:09:02 +0000</updated>
                            <resolved>Mon, 28 Mar 2016 18:09:02 +0000</resolved>
                                    <version>3.2.3</version>
                                                    <component>Replication</component>
                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>11</watches>
                                                                                                                <comments>
                            <comment id="1217001" author="ramon.fernandez" created="Mon, 28 Mar 2016 18:08:53 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=vgalu&quot; class=&quot;user-hover&quot; rel=&quot;vgalu&quot;&gt;vgalu&lt;/a&gt;, have you had a chance to test Michael&apos;s suggestions above? Since there&apos;s no bug in the server, and the SERVER project is to report bugs and feature requests for the MongoDB server, I&apos;m going to close this ticket for the time being.&lt;/p&gt;

&lt;p&gt;If some of your tests show that a different setting in WiredTiger provides performance improvements we can always reopen this ticket and repurpose it as an improvement request.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1201880" author="michael.cahill" created="Mon, 14 Mar 2016 01:25:09 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=vgalu&quot; class=&quot;user-hover&quot; rel=&quot;vgalu&quot;&gt;vgalu&lt;/a&gt;, from your description, it sounds like prefetch / readahead would help in this case.&lt;/p&gt;

&lt;p&gt;In some previous benchmarks with WiredTiger, we found that readahead caused more I/O and slower throughput for some workloads, so we currently use &lt;tt&gt;posix_fadvise&lt;/tt&gt; with the &lt;tt&gt;POSIX_FADV_RANDOM&lt;/tt&gt; flag to hint that readahead should be disabled.&lt;/p&gt;

&lt;p&gt;Are you able to run tests with a local build of MongoDB?  If so, try disabling &lt;tt&gt;HAVE_POSIX_FADVISE&lt;/tt&gt; in &lt;tt&gt;src/third_party/wiredtiger/build_freebsd/wiredtiger_config.h&lt;/tt&gt; and rebuilding to see whether that improves performance.  If it does, we can discuss whether changing the default behavior is reasonable.&lt;/p&gt;</comment>
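<!--
Editorial illustration of the readahead hint discussed in the comment above. This is a Python sketch of the same system call, not WiredTiger code (WiredTiger is written in C). The helper names open_with_advice and read_blocks are hypothetical, and the 4096-byte block size simply mirrors the 4k reads reported in this ticket.

```python
import os


def open_with_advice(path, advice_name="POSIX_FADV_RANDOM"):
    """Open path read-only and, where the platform supports it, apply a posix_fadvise hint."""
    fd = os.open(path, os.O_RDONLY)
    advice = getattr(os, advice_name, None)
    if hasattr(os, "posix_fadvise") and advice is not None:
        try:
            # Offset 0 with length 0 applies the hint to the whole file.
            os.posix_fadvise(fd, 0, 0, advice)
        except OSError:
            # The hint is advisory only; reading still works without it.
            pass
    return fd


def read_blocks(fd, block_size=4096):
    """Read the file front to back with pread() at increasing offsets; return bytes read."""
    offset = 0
    while True:
        chunk = os.pread(fd, block_size, offset)
        if not chunk:
            return offset
        offset += len(chunk)
```

Passing "POSIX_FADV_SEQUENTIAL" instead of the default asks the kernel for more aggressive readahead, which is one way to compare the two behaviours on a test file. Whether either hint has any effect depends on the operating system and filesystem.
-->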
                            <comment id="1185622" author="vgalu" created="Fri, 26 Feb 2016 09:00:50 +0000"  >&lt;p&gt;Hi ~michael.cahill, thanks for looking into this.&lt;/p&gt;

&lt;p&gt;Our application uses its own _id fields, which are VERY high cardinality 16-byte arrays. For all intents and purposes, they can be considered random. The insertion process is indeed aggressive, but does not use the bulk feature, writing one document at a time with &lt;tt&gt;{ w: majority }&lt;/tt&gt; instead.&lt;/p&gt;

&lt;p&gt;The hybrid ZFS pool uses 4k blocks on the spinning drives and uses bog standard settings, except for the ARC which is capped at 8GB (the rest of the RAM up to 32GB is the WiredTiger cache). The L2ARC is sized to 400GB, of which during these tests about 10GB were used. zlib is the compression algorithm for both journal and collection files.&lt;/p&gt;

&lt;p&gt;When we looked at the truss output, we noticed that offsets passed to pread() calls on the collection file were sequential rather than random, hinting at a programmatically imposed constraint.&lt;/p&gt;

&lt;p&gt;Hope this helps&lt;br/&gt;
Vlad&lt;/p&gt;</comment>
                            <comment id="1185367" author="michael.cahill" created="Fri, 26 Feb 2016 00:32:07 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=vgalu&quot; class=&quot;user-hover&quot; rel=&quot;vgalu&quot;&gt;vgalu&lt;/a&gt;, sorry to hear that you are having performance problems with MongoDB and WiredTiger.&lt;/p&gt;

&lt;p&gt;WiredTiger will usually lay out files sequentially and read them in the same order, and uses variable-sized blocks internally (that are multiples of 4KB).  With default settings, we try to create blocks that are close to 32KB in memory, then compress them to whatever size snappy compression results in.&lt;/p&gt;

&lt;p&gt;Can you describe more about how the data on your primary was created?  Do you just insert documents or have an update-heavy workload?  Do you use default &lt;tt&gt;_id&lt;/tt&gt; fields or set your own, and if the latter, how are they generated?  This matters because many operations read documents via the &lt;tt&gt;_id&lt;/tt&gt; index, so random keys can lead to random I/O patterns rather than sequential.&lt;/p&gt;

&lt;p&gt;In terms of things that can improve performance, how have you configured readahead/prefetch for the filesystem?  Are the pools backed by SSDs or spinning disks, and if the latter, do you have an SSD cache?&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 26 Feb 2016 00:32:07 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        7 years, 46 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            7 years, 46 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>michael.cahill@mongodb.com</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>vgalu</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrkfbz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsiosn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsf76f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>