  Core Server / SERVER-9314

cursors that return over 60 million objects are extremely slow

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: 2.2.3
    • Component/s: Performance, Querying
    • Labels: None
    • Environment: Ubuntu 12.04

      It looks like getmores on cursors that return a very large number of objects run significantly slower than getmores on cursors that return fewer objects. We noticed this while trying to run mongodump on one of our collections, which has 84M objects. collection.stats() returns:
      {
        "ns" : "data.app_00da3da8-3ec2-490b-ac3a-1ac5d12d0814:SessionEvent",
        "count" : 84423082,
        "size" : 57713849288,
        "avgObjSize" : 683.6264197035592,
        "storageSize" : 59682012800,
        "numExtents" : 49,
        "nindexes" : 4,
        "lastExtentSize" : 2146426864,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 8947258432,
        "indexSizes" : {
          "_id_" : 3478397440,
          "_acl_1" : 1572457376,
          "_acl.*.r_1" : 1572457376,
          "_created_at_1" : 2323946240
        },
        "ok" : 1
      }

      If we mongodump this collection in one pass, it takes about 7 hours. If we instead dump the collection in parts (i.e. split the _id space into 4 ranges and dump each one individually), the total run time is about 1.5 hours. We have another collection that is larger on disk but has far fewer objects, and it dumps in about 2 hours. Here is collection.stats() for that collection:
      {
        "ns" : "data.app_d237a400-f548-42cb-85e3-1643daa0dd4e:SaveGame",
        "count" : 1636453,
        "size" : 114000989904,
        "avgObjSize" : 69663.46720865188,
        "storageSize" : 114517589216,
        "numExtents" : 72,
        "nindexes" : 7,
        "lastExtentSize" : 2146426864,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 398171200,
        "indexSizes" : {
          "_id_" : 63372176,
          "UserId_1" : 75538064,
          "_acl_1" : 28305312,
          "_acl.*.r_1" : 28305312,
          "_created_at_1" : 41648544,
          "UDID_1" : 130153744,
          "location_1" : 30848048
        },
        "ok" : 1
      }
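
For anyone trying to reproduce the dump-by-parts comparison described above, here is a minimal sketch of one way to split the _id space into 4 ranges. It is not the reporter's actual script. It assumes the _id values are ObjectIds, and the minimum and maximum _id values below are hypothetical placeholders. It also assumes mongodump accepts an extended-JSON `$oid` filter through `--query`; the exact query syntax accepted depends on the mongodump version. The helper names `build_id_range_queries` and `mongodump_commands` are made up for illustration.

```python
import json
import shlex


def build_id_range_queries(min_hex, max_hex, parts):
    """Split the ObjectId interval [min_hex, max_hex] into `parts` contiguous ranges.

    Returns one query document per range. Every range is half-open
    ($gte lower, $lt upper) except the last, which uses $lte so that the
    document holding the maximum _id is included.
    """
    lo, hi = int(min_hex, 16), int(max_hex, 16)
    # Integer interpolation between the two endpoints gives parts + 1 cut points.
    cuts = [lo + (hi - lo) * i // parts for i in range(parts + 1)]
    queries = []
    for i in range(parts):
        lower = format(cuts[i], "024x")      # ObjectIds are 12 bytes = 24 hex digits
        upper = format(cuts[i + 1], "024x")
        upper_op = "$lte" if i == parts - 1 else "$lt"
        queries.append({"_id": {"$gte": {"$oid": lower}, upper_op: {"$oid": upper}}})
    return queries


def mongodump_commands(db, collection, queries):
    """Build one mongodump command line per range, each writing to its own directory."""
    commands = []
    for i, query in enumerate(queries):
        args = [
            "mongodump",
            "--db", db,
            "--collection", collection,
            "--query", json.dumps(query),
            "--out", "dump_part_%d" % i,
        ]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Hypothetical endpoints; in practice, look up the real smallest and largest _id.
    queries = build_id_range_queries(
        "500000000000000000000000", "51ffffffffffffffffffffff", 4
    )
    collection = "app_00da3da8-3ec2-490b-ac3a-1ac5d12d0814:SessionEvent"
    for command in mongodump_commands("data", collection, queries):
        print(command)
```

Splitting the _id value range evenly does not guarantee an even number of documents per range, since ObjectIds cluster by creation time; the ranges only need to be contiguous and non-overlapping for the parts to add up to the full collection.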

      Experimentally, the point at which performance falls off a cliff is about 60M objects in the result set.
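
To put the timings above side by side, the following back-of-envelope sketch converts them into throughput. It uses only the `count` and `size` values from the two collection.stats() outputs and the approximate run times quoted in the description (7 hours, 1.5 hours, 2 hours), so the results are rough estimates. The function name `dump_throughput` is made up for illustration.

```python
def dump_throughput(count, size_bytes, hours):
    """Return (objects per second, megabytes per second) for a dump of the given duration."""
    seconds = hours * 3600
    return count / seconds, size_bytes / seconds / 1e6


runs = [
    # (label, count, size in bytes, approximate run time in hours)
    ("SessionEvent, single dump", 84423082, 57713849288, 7),
    ("SessionEvent, 4 parts", 84423082, 57713849288, 1.5),
    ("SaveGame, single dump", 1636453, 114000989904, 2),
]

for label, count, size_bytes, hours in runs:
    objs_per_s, mb_per_s = dump_throughput(count, size_bytes, hours)
    print("%-28s %10.0f objects/s %8.2f MB/s" % (label, objs_per_s, mb_per_s))
```

On these figures, the single-pass dump of the 84M-object collection moves only a few MB per second, while both the split dump of the same collection and the dump of the larger-but-sparser collection run several times faster. That is consistent with the slowdown tracking the number of objects returned by the cursor rather than the number of bytes.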

            Assignee: Rui Zhang (Inactive)
            Reporter: charity majors (charity@parse.com)
            Votes: 0
            Watchers: 7
