Type: Question
Resolution: Done
Priority: Blocker - P1
Affects Version/s: 2.2.2
Component/s: Index Maintenance
Hello,
I have an interesting indexing problem and need some additional help.
We use MongoDB for our multisite CMS and store all kinds of content in the same Contents collection.
Our documents look like this:
{
"_id": ObjectId("507eb67f564e66097c7378f7"),
"_t": ["Base","Content","Article"],
"Application": "com.cnnturk",
"Status": 0,
"ContentType": "Article",
"Path": "/dunya/",
"Title" : "...",
"Description": "...",
"Text" : "...",
"StartDate": ISODate("2013-09-20T08:15:10.901Z"),
...
}
_t = the C# driver adds this field to hold the inheritance hierarchy
Application = we separate the different web sites with this field
Status = Active = 0, Passive = 1, etc.
ContentType = the content type: Article, Video, Episode, etc.
Path = our folder. For example, we put all world articles under the /dunya/ path, and each path (/turkiye/, /ekonomi/, etc.) holds approximately 200000 documents.
StartDate = content appears on the web sites after this date
We have the following indexes:
{_t:1,Application:1,Status:1,Path:1}
{StartDate:-1,_t:1,Application:1,Status:1,Path:1}
{Application:1,Status:1,Path:1,ContentType:1}
{StartDate:-1,Application:1,Status:1,Path:1,ContentType:1}
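The index names that show up later in the explain output and in the collection stats can be derived from these key patterns. A minimal sketch, assuming MongoDB's default naming scheme (each field name and its direction joined with underscores); `defaultIndexName` is a hypothetical helper written for illustration, not a driver or shell API:

```javascript
// Hypothetical helper: builds the default index name from a key pattern,
// assuming the "<field>_<direction>" parts are joined with "_".
function defaultIndexName(keyPattern) {
  return Object.entries(keyPattern)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

// The four key patterns listed above.
const indexes = [
  { _t: 1, Application: 1, Status: 1, Path: 1 },
  { StartDate: -1, _t: 1, Application: 1, Status: 1, Path: 1 },
  { Application: 1, Status: 1, Path: 1, ContentType: 1 },
  { StartDate: -1, Application: 1, Status: 1, Path: 1, ContentType: 1 },
];

for (const keyPattern of indexes) {
  console.log(defaultIndexName(keyPattern));
}
```

The printed names can be matched against the "cursor" field of an explain result and the keys of "indexSizes" to see which index a query picked.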
We mostly run the following queries:
find({_t:"Article",Application:"com.cnnturk",Status:0,Path:/^\/spor\//}).sort({StartDate:-1}).limit(5)
or
find({Application:"com.cnnturk",Status:0,Path:/^\/spor\//,ContentType:'Media'}).sort({StartDate:-1}).limit(5)
(By the way, I tried putting the sort field at the end of the index, but I couldn't get good performance.)
When I run explain on those queries, I see the following results:
"cursor" : "BtreeCursor _t_1_Application_1_Status_1_Path_1 multi",
"isMultiKey" : true,
"n" : 5,
"nscannedObjects" : 305,
"nscanned" : 305,
"nscannedObjectsAllPlans" : 1853,
"nscannedAllPlans" : 2453,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 20,
-------------
"cursor" : "BtreeCursor Application_1_Status_1_Path_1_ContentType_1 multi",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 60,
"nscannedAllPlans" : 117,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
But in the log file, the millis and nscanned values for those queries are very different and much higher.
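To make the explain side of that comparison concrete, the figures reported by explain can be boiled down to two numbers per query. This is only a sketch: the values are copied from the explain results in this post, and `summarizeExplain` is a hypothetical helper, not part of the shell.

```javascript
// Values copied from the two explain results in this post.
const explainArticle = { n: 5, nscanned: 305, nscannedObjects: 305, scanAndOrder: true, millis: 20 };
const explainMedia = { n: 0, nscanned: 15, nscannedObjects: 0, scanAndOrder: true, millis: 0 };

// Hypothetical helper: summarizes how much work an explain result reports.
function summarizeExplain(e) {
  return {
    // Index keys scanned per returned document; null when nothing was returned.
    keysPerResult: e.n > 0 ? e.nscanned / e.n : null,
    // scanAndOrder: true means the index order could not be used for the sort,
    // so the matching documents were sorted after being fetched.
    sortedAfterFetch: e.scanAndOrder === true,
  };
}

console.log(summarizeExplain(explainArticle));
console.log(summarizeExplain(explainMedia));
```

Both queries report scanAndOrder: true, so neither takes its StartDate ordering from the chosen index.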
Another observation: although those queries don't use the indexes that start with StartDate, when I drop those indexes the queries slow down and millis rises to 2000 or higher.
Another problem is with count queries. For example, the following query
db.Contents.find({ Application: "com.cnnturk", Status: 0, Path: /^\/spor\//, ContentType: { $in: [ "Article", "PhotoGallery", "NewsVideo" ] } } ).count()
looks very slow in the log file:
Sat Nov 9 11:36:43 [conn40477318] command quark_test_cnn.$cmd command: { count: "Contents", query: { Application: "com.cnnturk", Status: 0, Path: /^/spor//s, ContentType: { $in: [ "Article", "PhotoGallery", "NewsVideo" ] } } } ntoreturn:1 keyUpdates:0 numYields: 4 locks(micros) r:3824466 reslen:48 2432ms
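For comparing many such entries, the timing fields can be pulled out of the log line with a small script. A rough sketch, assuming the log layout shown in the line above (numYields, then locks(micros) r:, then the duration in ms at the end); `parseSlowOp` is a hypothetical helper:

```javascript
// The log line from this post, copied verbatim.
const logLine =
  'Sat Nov 9 11:36:43 [conn40477318] command quark_test_cnn.$cmd command: ' +
  '{ count: "Contents", query: { Application: "com.cnnturk", Status: 0, ' +
  'Path: /^/spor//s, ContentType: { $in: [ "Article", "PhotoGallery", "NewsVideo" ] } } } ' +
  'ntoreturn:1 keyUpdates:0 numYields: 4 locks(micros) r:3824466 reslen:48 2432ms';

// Hypothetical helper: extracts yield count, read-lock time and duration.
// Returns null if the line does not match the assumed layout.
function parseSlowOp(line) {
  const yields = /numYields:\s*(\d+)/.exec(line);
  const readLock = /locks\(micros\) r:(\d+)/.exec(line);
  const duration = /(\d+)ms$/.exec(line);
  if (!yields || !readLock || !duration) return null;
  return {
    numYields: Number(yields[1]),
    readLockMicros: Number(readLock[1]),
    durationMs: Number(duration[1]),
  };
}

console.log(parseSlowOp(logLine));
```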
By the way, here are the db stats, collection stats and index stats:
> db.stats(1024*1024)
{
"db" : "quark_test_cnn",
"collections" : 36,
"objects" : 4503374,
"avgObjSize" : 7789.841278117252,
"dataSize" : 33455,
"storageSize" : 35775,
"numExtents" : 166,
"indexes" : 50,
"indexSize" : 485,
"fileSize" : 40877,
"nsSizeMB" : 16,
"ok" : 1
}
>
> db.Contents.stats(1024*1024)
{
"ns" : "quark_test_cnn.Contents",
"count" : 394327,
"size" : 1052,
"avgObjSize" : 0.0026678365924727447,
"storageSize" : 1248,
"numExtents" : 16,
"nindexes" : 10,
"lastExtentSize" : 329,
"paddingFactor" : 1.2190000000000272,
"systemFlags" : 0,
"userFlags" : 0,
"totalIndexSize" : 327,
"indexSizes" : {
"_id_" : 10,
"Application_1_Status_1_Url_1" : 45,
"Application_1_Status_1_IxName_1" : 28,
"Application_1_Status_1_Tags_1" : 30,
"Template.Regions.Controls.ContentViews.Content._id_1" : 0,
"Template.Regions.Controls.Controls.ContentViews.Content._id_1" : 0,
"Application_1_Status_1_Path_1_ContentType_1" : 23,
"_t_1_Application_1_Status_1_Path_1" : 74,
"StartDate_-1_Application_1_Status_1_Path_1_ContentType_1" : 27,
"StartDate_-1__t_1_Application_1_Status_1_Path_1" : 85
},
"ok" : 1
}
> db.Contents.totalIndexSize()
343334768
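Because the stats above were requested with a scale of 1024*1024, the sizes are in MB and the collection's avgObjSize is scaled the same way (0.0026678... equals size divided by count). A short sketch of the conversion back to bytes, using only numbers from the output above; the variable names are chosen for illustration:

```javascript
const MB = 1024 * 1024;

// Figures copied from db.Contents.stats(1024*1024) and db.Contents.totalIndexSize().
const count = 394327;
const sizeMB = 1052;
const totalIndexSizeBytes = 343334768;

// Average document size in bytes; approximate, since "size" is already truncated to whole MB.
const avgDocBytes = (sizeMB * MB) / count;

// Unscaled index size converted to MB, for comparison with "totalIndexSize" : 327.
const totalIndexMB = totalIndexSizeBytes / MB;

console.log(Math.round(avgDocBytes));
console.log(Math.floor(totalIndexMB));
```

The second value agrees with the scaled "totalIndexSize" in the collection stats, which confirms the two outputs describe the same indexes.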
---------
I have a 3-machine replica set:
64 GB RAM (MongoDB uses 60 GB)
RAID 10 on spinning disks
Red Hat Enterprise Linux
I set all ulimits as recommended in your documentation.
Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
Thanks in advance for your help.