Details
-
Task
-
Resolution: Done
-
Major - P3
2.8.0 java driver. Mac OS X (development) & CentOS (production)
Description
I had a question about improving the performance of loading data from Mongo.
I'm doing a query as follows:
val prefixString = "^" + Pattern.quote(path);
val prefixPattern: Pattern = Pattern.compile(prefixString);
val query: BasicDBObject = new BasicDBObject(ID_FIELD_NAME, prefixPattern);
val cursor = this.collection.find(query).batchSize(10000);
val arr = cursor.toArray();
I'm using the 2.8.0 Java driver (even though the code is written in Scala).
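The prefix pattern used in the query can be checked in isolation, without a database. The following is a minimal sketch using only `java.util.regex`; the object name `PrefixPatternSketch`, the helper `prefixPattern`, and the sample strings are illustrative assumptions, not part of the original code.

```scala
import java.util.regex.Pattern

// Hypothetical helper object, for illustration only.
object PrefixPatternSketch {
  // Builds an anchored, literal prefix pattern the same way the query does:
  // "^" followed by the quoted path, so regex metacharacters in the path
  // are matched literally.
  def prefixPattern(path: String): Pattern =
    Pattern.compile("^" + Pattern.quote(path))

  def main(args: Array[String]): Unit = {
    val p = prefixPattern("a.b/")
    println(p.pattern())                      // prints ^\Qa.b/\E
    println(p.matcher("a.b/child").find())    // prints true
    println(p.matcher("axb/child").find())    // prints false: "." is literal
    println(prefixPattern("").pattern())      // prints ^\Q\E
  }
}
```

Note that an empty `path` produces the pattern string `^\Q\E`, which matches every string, so it is worth confirming what `path` holds when the query runs.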
When I do an "explain" of this query, I get the following:
{ "cursor" : "BtreeCursor id multi" , "nscanned" : 5020 , "nscannedObjects" : 5020 , "n" : 5020 , "millis" : 23 , "nYields" : 0 , "nChunkSkips" : 0 , "isMultiKey" : false , "indexOnly" : false ,
  "indexBounds" : { "_id" : [ [ "" , { }] , [ { "$regex" : "^\\Q\\E" , "$options" : ""} , { "$regex" : "^\\Q\\E" , "$options" : ""}]]} ,
  "allPlans" : [ { "cursor" : "BtreeCursor id multi" , "indexBounds" : { "_id" : [ [ "" , { }] , [ { "$regex" : "^\\Q\\E" , "$options" : ""} , { "$regex" : "^\\Q\\E" , "$options" : ""}]]}}] ,
  "oldPlan" : { "cursor" : "BtreeCursor id multi" , "indexBounds" : { "_id" : [ [ "" , { }] , [ { "$regex" : "^\\Q\\E" , "$options" : ""} , { "$regex" : "^\\Q\\E" , "$options" : ""}]]}}}
The "explain" output reports 23 milliseconds, but the toArray call actually takes closer to 600 ms. This surprises me because I'm testing on localhost, so I would expect the data transfer to be fast. What can I do to speed this operation up? I want to load all query results into memory as quickly as possible. I looked at the traffic in Wireshark and the total data is only 180k, so I'd be surprised if the data transfer were the only issue.
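One way to narrow down the gap is to time the client-side call several times in a row. The "millis" value in explain covers only the server-side query execution, while a wall-clock measurement around toArray also includes transfer, decoding into objects, and any one-time costs on the first call. The sketch below is a generic timing helper under those assumptions; `TimingSketch`, `timed`, and `timings` are hypothetical names, and nothing here is specific to the driver.

```scala
// Hypothetical helper object, for illustration only.
object TimingSketch {
  // Runs `block` once and returns its result together with the elapsed
  // wall-clock time in milliseconds.
  def timed[A](block: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1e6
    (result, elapsedMs)
  }

  // Runs `block` `runs` times and returns the per-run timings, so that a
  // slow first run can be told apart from steady-state behavior.
  def timings[A](runs: Int)(block: => A): Seq[Double] =
    (1 to runs).map(_ => timed(block)._2)
}
```

For example, wrapping the query as `TimingSketch.timings(5) { collection.find(query).batchSize(10000).toArray() }` would show whether the 600 ms figure holds on every run or only on the first one.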
Thanks!