<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:00:47 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-2649] Poor remove() performance (tested for pymongo only)</title>
                <link>https://jira.mongodb.org/browse/SERVER-2649</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;1. In my original case I issued the command c.db.coll.remove({&apos;str_key&apos;: &apos;myvalue&apos;}), which removed about 200000 records from a collection containing 2000000 records. Removing these 10% of the records took 1 hour. The field str_key was indexed. Each record was about 10 KB in size. About 13 GB of RAM was not used by any application and was available for MongoDB&apos;s mmap.&lt;/p&gt;

&lt;p&gt;2. I tried to check how remove() works in an oversimplified case (see below). The insertions took 312 sec, whereas the removal took 994 sec. So removals are much slower, even though little serialized data is pushed between the server and the client.&lt;/p&gt;

&lt;p&gt;3. see also: &lt;br/&gt;
&lt;a href=&quot;http://groups.google.com/group/mongodb-user/browse_thread/thread/95f9386cd57003e4&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://groups.google.com/group/mongodb-user/browse_thread/thread/95f9386cd57003e4&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://groups.google.com/group/mongodb-user/browse_thread/thread/5d5dd12e37382b5b&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://groups.google.com/group/mongodb-user/browse_thread/thread/5d5dd12e37382b5b&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://groups.google.com/group/mongodb-user/browse_thread/thread/5a7033248bbe362d&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://groups.google.com/group/mongodb-user/browse_thread/thread/5a7033248bbe362d&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(4. &quot;limit=100000&quot; doesn&apos;t seem to work in the example below, but I hope this is not important. Also, perhaps I should have set safe=True for more transparency. Finally, in my case it was &quot;cold&quot; I/O &amp;#8211; the records had not been prefetched from disk into RAM before remove() was invoked. If all of this is still important, I could try to reproduce my issue again more closely, but IMHO the case below still shows the unexpected slowdown well enough.)&lt;/p&gt;

&lt;p&gt;####################################&lt;br/&gt;
import unittest&lt;br/&gt;
from debug.decorators import timeit&lt;/p&gt;

&lt;p&gt;class Test(unittest.TestCase):&lt;br/&gt;
    def testName(self):&lt;br/&gt;
        from pymongo import Connection&lt;br/&gt;
        c = Connection()&lt;br/&gt;
        dummy_str = &apos;a&apos; * 10000&lt;br/&gt;
        c.drop_database(&apos;test_remove_performance&apos;)&lt;/p&gt;

&lt;p&gt;        @timeit        &lt;br/&gt;
        def insert_many():&lt;br/&gt;
            for i in range(1000000):&lt;br/&gt;
                c.test_remove_performance.coll.insert({&apos;dummy_str&apos;: dummy_str})&lt;br/&gt;
        @timeit&lt;br/&gt;
        def remove_some():&lt;br/&gt;
            c.test_remove_performance.coll.remove({}, limit=100000)&lt;/p&gt;

&lt;p&gt;        insert_many()&lt;br/&gt;
        remove_some()&lt;br/&gt;
        print c.test_remove_performance.coll.count()&lt;/p&gt;

&lt;p&gt;if __name__ == &quot;__main__&quot;:&lt;br/&gt;
    unittest.main()&lt;/p&gt;</description>
                <environment>Ubuntu 10.10 (amd64)</environment>
        <key id="14942">SERVER-2649</key>
            <summary>Poor remove() performance (tested for pymongo only)</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="backlog-server-execution">Backlog - Storage Execution Team</assignee>
                                    <reporter username="vak">Valery Khamenya</reporter>
                        <labels>
                    </labels>
                <created>Tue, 1 Mar 2011 20:07:39 +0000</created>
                <updated>Tue, 6 Dec 2022 05:45:01 +0000</updated>
                            <resolved>Fri, 6 Oct 2017 20:15:24 +0000</resolved>
                                    <version>1.7.6</version>
                                                    <component>Performance</component>
                                        <votes>10</votes>
                                    <watches>12</watches>
                                                                                                                <comments>
                            <comment id="1691932" author="milkie" created="Fri, 6 Oct 2017 20:15:24 +0000"  >&lt;p&gt;Performance profiles have changed quite a bit since version 1.7.6; please open a new ticket for assistance with the current available versions of MongoDB.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25136"><![CDATA[Storage Execution]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 6 Oct 2017 20:15:24 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 18 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>false</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 18 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-execution</customfieldvalue>
            <customfieldvalue>milkie@mongodb.com</customfieldvalue>
            <customfieldvalue>vak</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrp4yn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr9sbb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6168</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hszw7z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>