<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:20:13 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-29219] Test and consider exposing different compression engines for MongoDB users</title>
                <link>https://jira.mongodb.org/browse/SERVER-29219</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The WiredTiger storage engine supports several compression engines that are not exposed via MongoDB. It would be interesting to know whether there is value in exposing additional compression engines. It would also be very valuable to create a test that measures the compression ratio versus CPU usage of different compression engines for a few interesting MongoDB workloads.&lt;/p&gt;

&lt;p&gt;The compression engines that might be particularly interesting are LZ4 and Zstandard. The full set of compression libraries supported by the WiredTiger team is listed here:&lt;br/&gt;
&lt;a href=&quot;https://github.com/wiredtiger/wiredtiger/tree/master/ext/compressors&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/wiredtiger/wiredtiger/tree/master/ext/compressors&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="383699">SERVER-29219</key>
            <summary>Test and consider exposing different compression engines for MongoDB users</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="brian.lane@mongodb.com">Brian Lane</assignee>
                                    <reporter username="alexander.gorrod@mongodb.com">Alexander Gorrod</reporter>
                        <labels>
                            <label>nonnyc</label>
                            <label>storage-engines</label>
                    </labels>
                <created>Mon, 15 May 2017 20:05:30 +0000</created>
                <updated>Thu, 22 Nov 2018 00:50:19 +0000</updated>
                            <resolved>Tue, 20 Nov 2018 05:19:41 +0000</resolved>
                                                                    <component>Storage</component>
                                        <votes>0</votes>
                                    <watches>11</watches>
                                                                                                                <comments>
                            <comment id="2006093" author="brian.lane" created="Tue, 18 Sep 2018 06:28:06 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=alexander.gorrod&quot; class=&quot;user-hover&quot; rel=&quot;alexander.gorrod&quot;&gt;alexander.gorrod&lt;/a&gt; I think your proposed set of test suites looks like a good place to start, and we can always expand on this depending on what the results look like. &#160;Thanks.&lt;/p&gt;</comment>
                            <comment id="1998966" author="alexander.gorrod" created="Tue, 11 Sep 2018 05:37:53 +0000"  >&lt;p&gt;I&apos;ve been thinking about which workloads would be suitable for making this decision. I think the data sets used in the &lt;a href=&quot;https://archive.is/Jcs4c#selection-4527.165-4539.187&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;blog post&lt;/a&gt; about compression shortly after WiredTiger integration are probably a good list of data sets to measure compression ratios. The other relevant metric is CPU overhead of compression - we&apos;ve seen in the past that &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-17741&quot; title=&quot;LZ4 compressor for mongod&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-17741&quot;&gt;&lt;del&gt;YCSB&lt;/del&gt;&lt;/a&gt; can be used as a measure of CPU efficiency in compression engines.&lt;/p&gt;

&lt;p&gt;I&apos;m going to suggest that the following suite of tests be used to decide if there is enough incremental benefit to a different compression scheme to warrant adding it as an option for MongoDB users:&lt;/p&gt;

&lt;h3&gt;&lt;a name=&quot;Compressionratiotests%3A&quot;&gt;&lt;/a&gt;Compression ratio tests:&lt;/h3&gt;
&lt;p&gt;Use &lt;a href=&quot;https://docs.mongodb.com/manual/reference/program/mongoimport/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;mongoimport&lt;/a&gt; to load each dataset. These tests should be run with the zlib, snappy, zstd and lz4 compression libraries, and with compression disabled (none).&lt;/p&gt;

&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; Dataset &lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; Link &lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; Enron email corpus &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; &lt;a href=&quot;http://www.cs.cmu.edu/~./enron/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://www.cs.cmu.edu/~./enron/&lt;/a&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; Flight database &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; &lt;a href=&quot;https://www.transtats.bts.gov/OT_Delay/ot_delaycause1.asp?display=data&amp;amp;pn=1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.transtats.bts.gov/OT_Delay/ot_delaycause1.asp?display=data&amp;amp;pn=1&lt;/a&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; TPC-H base data set &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; &lt;a href=&quot;http://www.tpc.org/information/results_spreadsheet.asp&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://www.tpc.org/information/results_spreadsheet.asp&lt;/a&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; Twitter data set &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; MongoDB has an internal test set consisting of about 200k tweets &lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
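
&lt;p&gt;One way to reduce each load to a single comparable number, sketched here as an assumption rather than an agreed method: divide the uncompressed data size by the on-disk size, for example using the size and storageSize fields reported by collStats. The function names below are hypothetical.&lt;/p&gt;

```python
def compression_ratio(coll_stats):
    """Ratio of uncompressed data size to on-disk storage size.

    coll_stats is assumed to be a dict shaped like collStats output,
    with "size" (uncompressed bytes) and "storageSize" (bytes on disk).
    """
    size = coll_stats["size"]
    storage = coll_stats["storageSize"]
    if storage == 0:
        raise ValueError("storageSize is zero; the collection may be empty")
    return size / storage


def rank_compressors(stats_by_compressor):
    """Sort compressor names from highest to lowest compression ratio."""
    return sorted(
        stats_by_compressor,
        key=lambda name: compression_ratio(stats_by_compressor[name]),
        reverse=True,
    )
```

&lt;p&gt;Running rank_compressors over one stats dict per compressor, for the same dataset, gives an ordering that can be set against the CPU results.&lt;/p&gt;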


&lt;h3&gt;&lt;a name=&quot;CPU%2Fperformancetests%3A&quot;&gt;&lt;/a&gt;CPU/performance tests:&lt;/h3&gt;

&lt;p&gt;We should run the same set of compression libraries against these YCSB phases: load, 100% read, 95% read / 5% update, 100% update, and 50% read / 50% update. Each test uses 5 million 1 KB documents, and each workload executes for 20 million operations.&lt;/p&gt;
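
&lt;p&gt;A minimal sketch of the resulting test matrix, for illustration only. It reads the phase list as a load phase plus four read/update mixes, which is an interpretation. recordcount, operationcount, readproportion and updateproportion are YCSB core workload properties; the Python names are hypothetical.&lt;/p&gt;

```python
# Compression libraries named in the proposal ("none" disables compression).
COMPRESSORS = ["zlib", "snappy", "none", "zstd", "lz4"]

# Run phases as (read proportion, update proportion).
# The load phase has no read/update mix, so it is handled separately.
RUN_PHASES = {
    "100% read": (1.0, 0.0),
    "95% read, 5% update": (0.95, 0.05),
    "100% update": (0.0, 1.0),
    "50% read, 50% update": (0.5, 0.5),
}

RECORD_COUNT = 5_000_000       # 5 million documents
OPERATION_COUNT = 20_000_000   # operations per workload


def ycsb_properties(read, update):
    """Return YCSB core workload properties for one run phase."""
    return {
        "recordcount": RECORD_COUNT,
        "operationcount": OPERATION_COUNT,
        "readproportion": read,
        "updateproportion": update,
    }


def build_matrix():
    """List every (compressor, phase, properties) combination to run."""
    rows = []
    for compressor in COMPRESSORS:
        rows.append((compressor, "load", {"recordcount": RECORD_COUNT}))
        for phase, (read, update) in RUN_PHASES.items():
            rows.append((compressor, phase, ycsb_properties(read, update)))
    return rows


if __name__ == "__main__":
    for compressor, phase, props in build_matrix():
        print(compressor, phase, props)
```

&lt;p&gt;Five compressors times five phases gives 25 runs, which is a useful check on how long the full suite will take.&lt;/p&gt;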

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=brian.lane%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;brian.lane@mongodb.com&quot;&gt;brian.lane@mongodb.com&lt;/a&gt; and &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=asya&quot; class=&quot;user-hover&quot; rel=&quot;asya&quot;&gt;asya&lt;/a&gt; Do you think the above set of results would deliver enough information to decide whether to support new compression libraries for MongoDB?&lt;/p&gt;</comment>
                            <comment id="1575357" author="asya" created="Fri, 19 May 2017 14:17:01 +0000"  >&lt;p&gt;We are definitely interested in doing this. The next step will be to scope and schedule it.&lt;/p&gt;
</comment>
                            <comment id="1571748" author="nick@innsenroute.com" created="Mon, 15 May 2017 22:17:55 +0000"  >&lt;p&gt;Yes please. I&apos;ve found LZ4 to be the best for my app.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="579561">SERVER-36352</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 15 May 2017 22:17:55 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 years, 21 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-838</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>brian.lane@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 years, 21 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>alexander.gorrod@mongodb.com</customfieldvalue>
            <customfieldvalue>asya.kamsky@mongodb.com</customfieldvalue>
            <customfieldvalue>brian.lane@mongodb.com</customfieldvalue>
            <customfieldvalue>nick@innsenroute.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht7gdr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr8g47:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht421r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>