[SERVER-1383] Poor mongoexport performance when using a query Created: 08/Jul/10  Updated: 12/Jul/16  Resolved: 23/Sep/10

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 1.5.3
Fix Version/s: 1.7.1

Type: New Feature Priority: Major - P3
Reporter: Doug Hudson Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Thu Jul 8 18:58:12 db version v1.5.3, pdfile version 4.5
Thu Jul 8 18:58:12 git version: 068efc1b2356430c21a376f8963dd13979566fd4
Thu Jul 8 18:58:12 sys info: Linux domU-12-31-39-06-79-A1 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41


Participants:

 Description   

mongoexport can be very slow when it is given a query argument.

Below is an example: a shell find() takes 30 ms to return 571 documents, but exporting the same documents with mongoexport takes several minutes.

If it is known for sure that a collection won't change during an export, can mongoexport be told to be more efficient and less strict when creating the output documents? Possibly it's a simple matter of allowing {$snapshot: false} to be specified, or there may be another reason for the large difference in performance.

Either way, a simple indexed query such as the one in the example is currently not a feasible way to export a subset of a collection, so one must fall back on other, manual means.

    1. Simple query in shell
      > db.articles.find( {"authors.name": "cannon cp"} ).explain()
      {
        "cursor" : "BtreeCursor authors.name_1",
        "nscanned" : 571,
        "nscannedObjects" : 571,
        "n" : 571,
        "millis" : 30,
        "indexBounds" : [
          [
            { "authors.name" : "cannon cp" },
            { "authors.name" : "cannon cp" }
          ]
        ]
      }

    2. Same query in mongoexport takes multiple minutes with high nscanned
      > ./mongoexport -vv -h localhost -c articles -d pubmed -q '{"authors.name": "cannon cp"}' -o articles.json
      connected to: localhost
      exported 571 records

    3. Server output
      Thu Jul 8 14:58:51 query pubmed.articles reslen:174620 nscanned:1387320 { query ... } nreturned:101 39276ms
      Thu Jul 8 15:03:23 getmore pubmed.articles cid:7259446948513230975 getMore: { query ... } bytes:716901 nreturned:470 271669ms
      Thu Jul 8 15:03:24 end connection 127.0.0.1:54779
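
To put the numbers above side by side, here is a small Python sketch that totals the documents returned and the time spent across the two server log lines, then compares them with the 30 ms reported by explain() in the shell. The log strings are copied from the server output with the elided query body left out; the helper name summarize is made up for this illustration.

```python
import re

# Server log lines from this ticket; the query body is elided in the
# ticket itself, so it is simply omitted here.
LOG_LINES = [
    "Thu Jul 8 14:58:51 query pubmed.articles reslen:174620 "
    "nscanned:1387320 nreturned:101 39276ms",
    "Thu Jul 8 15:03:23 getmore pubmed.articles cid:7259446948513230975 "
    "bytes:716901 nreturned:470 271669ms",
]

SHELL_MILLIS = 30        # "millis" from the shell explain() output
SHELL_NSCANNED = 571     # "nscanned" from the shell explain() output


def summarize(lines):
    """Total the nreturned, nscanned and elapsed-ms fields over all lines."""
    returned = sum(int(v) for l in lines for v in re.findall(r"nreturned:(\d+)", l))
    scanned = sum(int(v) for l in lines for v in re.findall(r"nscanned:(\d+)", l))
    millis = sum(int(v) for l in lines for v in re.findall(r"\b(\d+)ms\b", l))
    return returned, scanned, millis


returned, scanned, millis = summarize(LOG_LINES)

print("documents returned by export:", returned)
print("export time (ms):", millis)
print("slowdown vs shell query: %.0fx" % (millis / SHELL_MILLIS))
print("index entries scanned vs shell query: %.0fx" % (scanned / SHELL_NSCANNED))
```

The export returns the same 571 documents as the shell query, but the first batch alone reports an nscanned in the millions, which suggests the export is not using the authors.name index.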



 Comments   
Comment by auto [ 23/Sep/10 ]

Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: don't use snapshot with another query SERVER-1383
http://github.com/mongodb/mongo/commit/d3d719c3aa72d63fe838f1e5eae7098e3bffe721
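
The commit message suggests that the export should request snapshot mode only when the user has not supplied a query of their own. The Python sketch below illustrates that rule; it is not the code from the commit. The function name build_export_query and the wrapped {"query": ..., "$snapshot": true} document shape are assumptions made for illustration.

```python
def build_export_query(user_filter):
    """Build the query document a hypothetical export tool would send.

    Assumption: snapshot mode is only worth requesting for a full
    collection dump. When the user supplies a filter, snapshot is left
    off so the server is free to pick an index that matches the filter.
    """
    if user_filter:
        # User-supplied query: do not combine it with snapshot.
        return {"query": dict(user_filter)}
    # No filter: export the whole collection in snapshot mode.
    return {"query": {}, "$snapshot": True}


# The query from this ticket is sent without snapshot.
print(build_export_query({"authors.name": "cannon cp"}))
# A full export still asks for snapshot.
print(build_export_query({}))
```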
