- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Major - P3
- None
- Affects Version/s: 2.2.3
- Component/s: Performance, Querying
- Labels: None
- Environment: Ubuntu 12.04
- ALL
It looks like getmores on cursors that return a large number of objects run significantly slower than getmores on cursors that return fewer objects. We noticed this while trying to run mongodump on one of our collections, which has 84M objects. collection.stats() returns:
{
"ns" : "data.app_00da3da8-3ec2-490b-ac3a-1ac5d12d0814:SessionEvent",
"count" : 84423082,
"size" : 57713849288,
"avgObjSize" : 683.6264197035592,
"storageSize" : 59682012800,
"numExtents" : 49,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 8947258432,
"indexSizes" :
,
"ok" : 1
}
If we mongodump this collection in one pass, it takes about 7 hours. If we instead dump the collection in parts (i.e. split the _id space into 4 parts) and dump each part individually, the total run time is about 1.5 hours. We have another collection whose on-disk size is greater but which has fewer objects, and it dumps in about 2 hours. Here is collection.stats() for that collection:
{
"ns" : "data.app_d237a400-f548-42cb-85e3-1643daa0dd4e:SaveGame",
"count" : 1636453,
"size" : 114000989904,
"avgObjSize" : 69663.46720865188,
"storageSize" : 114517589216,
"numExtents" : 72,
"nindexes" : 7,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 398171200,
"indexSizes" :
,
"ok" : 1
}
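To put the three dump times side by side, here is a rough throughput calculation. It uses only the "count" and "size" values from the two stats() outputs and the approximate run times quoted above; the `throughput` helper and the labels are just names chosen for this illustration, and the results are only as precise as the "about N hours" figures.

```python
# Rough throughput comparison from the figures in this report.
# Each entry: (document count, data size in bytes, approximate run time in hours).
dumps = {
    "SessionEvent, single dump": (84423082, 57713849288, 7.0),
    "SessionEvent, 4-part dump": (84423082, 57713849288, 1.5),
    "SaveGame, single dump": (1636453, 114000989904, 2.0),
}


def throughput(count, size_bytes, hours):
    """Return (documents per second, megabytes per second)."""
    seconds = hours * 3600
    return count / seconds, size_bytes / seconds / 1e6


for name, (count, size_bytes, hours) in dumps.items():
    docs_per_s, mb_per_s = throughput(count, size_bytes, hours)
    print(f"{name}: {docs_per_s:,.0f} docs/s, {mb_per_s:.1f} MB/s")
```

In byte terms the single-pass SessionEvent dump is several times slower than either of the other two runs, which is consistent with the slowdown tracking the number of objects returned and not the amount of data.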
Experimentally, the point at which performance falls off a cliff is about 60M objects in the result set.
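For anyone trying to reproduce the split dump, the following is a minimal sketch of how the _id space could be cut into equal parts and turned into one mongodump invocation per part. It makes several assumptions that are not stated in this report: that _id values are ObjectIds, that they are spread roughly evenly between the smallest and largest _id, and that the mongodump version in use accepts the `{"$oid": ...}` extended JSON form in `--query`. The function names, the output directory prefix, and the example _id bounds are hypothetical.

```python
import json


def split_objectid_range(min_hex, max_hex, parts):
    """Split [min_hex, max_hex] into `parts` contiguous ranges.

    ObjectIds are treated as 96-bit integers written as 24 hex digits.
    Returns parts + 1 boundary values as 24-character hex strings.
    """
    lo, hi = int(min_hex, 16), int(max_hex, 16)
    step = (hi - lo) // parts
    bounds = [lo + i * step for i in range(parts)] + [hi]
    return [format(b, "024x") for b in bounds]


def dump_commands(db, coll, min_hex, max_hex, parts, out_prefix="dump_part"):
    """Build one mongodump argument list per _id range."""
    bounds = split_objectid_range(min_hex, max_hex, parts)
    commands = []
    for i in range(parts):
        # The last range includes its upper bound so the largest _id is not skipped.
        upper_op = "$lte" if i == parts - 1 else "$lt"
        query = {
            "_id": {
                "$gte": {"$oid": bounds[i]},
                upper_op: {"$oid": bounds[i + 1]},
            }
        }
        commands.append([
            "mongodump",
            "--db", db,
            "--collection", coll,
            "--query", json.dumps(query),
            "--out", f"{out_prefix}{i}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical bounds; in practice, look up the smallest and largest _id first.
    for cmd in dump_commands(
        "data",
        "app_00da3da8-3ec2-490b-ac3a-1ac5d12d0814:SessionEvent",
        "500000000000000000000000",
        "510000000000000000000000",
        4,
    ):
        print(cmd)
```

Each argument list can be passed to `subprocess.run` as-is. Keeping the arguments as a list avoids shell quoting problems with the JSON query and the colon in the collection name.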