Introduction
I have that contains 30,650,965 elements and each elements have an average size of 572 bytes.
The data made 16.3Gb and index 6.9Gb.
On this documents, i have a position in [ lon, lat ] format like this (I have remove field that isn't part of the test):
{ "_id" : new BinData(3, "6kMGoQ6pfaMmT+4ozRTtpA=="), "creationDate" : new Date("12/3/2013 14:25:46"), "detection" : { "position" : [0.13837920380832269, 45.990779665547429] }, }
Until 2.2 i use 2 indexes like that :
{ "detection.position": "2d", "creationDate": 1 } { "creationDate": 1 }
Now i am in 2.4 and would like use "2dsphere" index because i can put the creationDate first and keep only one index. The data is one year of document and i request generally only one day on a zone.
Test with old index
With a "2d" indexes i have the following request time :
> db.ntr_tracks.find({creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") }}).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "BtreeCursor creationDate_1", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 1000, "nscanned" : 1000, "nscannedObjectsAllPlans" : 1000, "nscannedAllPlans" : 1000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 7, "indexBounds" : { "creationDate" : [ [ ISODate("2012-08-01T00:00:00Z"), ISODate("2012-08-02T00:00:00Z") ] ] }, "server" : "pc-uvh-2:27017" }
> db.ntr_tracks.find({"detection.position": { $within: { $box: [ [ 0.7724536441932998, 50.61956018018893 ], [ 2.483031806726593, 51.42851043057254 ] ] } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain()
First time:
{ "cursor" : "GeoBrowse-box", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 1786, "nscanned" : 1786, "nscannedObjectsAllPlans" : 1786, "nscannedAllPlans" : 1786, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 231, "nChunkSkips" : 0, "millis" : 30551, "indexBounds" : { "detection.position" : [ ] }, "lookedAt" : NumberLong(959532), "matchesPerfd" : NumberLong(628029), "objectsLoaded" : NumberLong(1786), "pointsLoaded" : NumberLong(0), "pointsSavedForYield" : NumberLong(0), "pointsChangedOnYield" : NumberLong(0), "pointsRemovedOnYield" : NumberLong(0), "server" : "pc-uvh-2:27017" }
Other time (even when changing period):
{ "cursor" : "GeoBrowse-box", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 1744, "nscanned" : 1744, "nscannedObjectsAllPlans" : 1744, "nscannedAllPlans" : 1744, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 378, "nChunkSkips" : 0, "millis" : 8657, "indexBounds" : { "detection.position" : [ ] }, "lookedAt" : NumberLong(959532), "matchesPerfd" : NumberLong(628029), "objectsLoaded" : NumberLong(1744), "pointsLoaded" : NumberLong(0), "pointsSavedForYield" : NumberLong(0), "pointsChangedOnYield" : NumberLong(0), "pointsRemovedOnYield" : NumberLong(0), "server" : "pc-uvh-2:27017" }
If i use $geoWithin with a $box i have the same result.
Create a new index that replace two old index.
Now i delete the two index and create one:
{ "creationDate": 1, "detection.position": "2dsphere" }
The first query on date only is very long and don't use the index...
> db.ntr_tracks.find({creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") }}).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 30925306, "nscanned" : 30925306, "nscannedObjectsAllPlans" : 30925306, "nscannedAllPlans" : 30925306, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 414, "nChunkSkips" : 0, "millis" : 451928, "indexBounds" : { }, "server" : "pc-uvh-2:27017" }
If i made a geo query like this, mongo don't use index neither on geo query or date:
> db.ntr_tracks.find({"detection.position": { $within: { $box: [ [ 0.7724536441932998, 50.61956018018893 ], [ 2.483031806726593, 51.42851043057254 ] ] } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() > db.ntr_tracks.find({"detection.position": { $geoWithin: { $box: [ [ 0.7724536441932998, 50.61956018018893 ], [ 2.483031806726593, 51.42851043057254 ] ] } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 30925306, "nscanned" : 30925306, "nscannedObjectsAllPlans" : 30925306, "nscannedAllPlans" : 30925306, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 520, "nChunkSkips" : 0, "millis" : 586488, "indexBounds" : { }, "server" : "pc-uvh-2:27017" }
Using the following query is better, but always more slower than "2d" index.
> db.ntr_tracks.find({"detection.position": { $geoWithin: { $geometry: { type: "Polygon", coordinates: [ [ [ 0.7724536441932998, 50.61956018018893 ], [ 0.7724536441932998, 51.42851043057254 ], [ 2.483031806726593, 51.42851043057254 ], [ 2.483031806726593, 50.61956018018893 ], [ 0.7724536441932998, 50.61956018018893 ] ] ] } } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "S2Cursor", "isMultiKey" : true, "n" : 1000, "nscannedObjects" : 1785, "nscanned" : 1266126, "nscannedObjectsAllPlans" : 1785, "nscannedAllPlans" : 1266126, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 73671, "indexBounds" : { }, "nscanned" : 1266126, "matchTested" : NumberLong(1264341), "geoTested" : NumberLong(5664), "cellsInCover" : NumberLong(9), "server" : "pc-uvh-2:27017" }
Creating a date index
Creating an index on creationDate only (and keeping the geo index) will give the follwing result:
> db.ntr_tracks.find({creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") }}).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "BtreeCursor creationDate_1", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 1000, "nscanned" : 1000, "nscannedObjectsAllPlans" : 1000, "nscannedAllPlans" : 1000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 28, "indexBounds" : { "creationDate" : [ [ ISODate("2012-08-01T00:00:00Z"), ISODate("2012-08-02T00:00:00Z") ] ] }, "server" : "pc-uvh-2:27017" }
> db.ntr_tracks.find({"detection.position": { $within: { $box: [ [ 0.7724536441932998, 50.61956018018893 ], [ 2.483031806726593, 51.42851043057254 ] ] } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() > db.ntr_tracks.find({"detection.position": { $geoWithin: { $box: [ [ 0.7724536441932998, 50.61956018018893 ], [ 2.483031806726593, 51.42851043057254 ] ] } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "BtreeCursor creationDate_1", "isMultiKey" : false, "n" : 1000, "nscannedObjects" : 47625, "nscanned" : 47625, "nscannedObjectsAllPlans" : 47625, "nscannedAllPlans" : 47625, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 240, "indexBounds" : { "creationDate" : [ [ ISODate("2012-08-01T00:00:00Z"), ISODate("2012-08-02T00:00:00Z") ] ] }, "server" : "pc-uvh-2:27017" }
In this two case the query use the creationDate. Very quick for the selected period of time. Lesser when the period grow because without index. But if mongo use the coupound index correctly with
creationDate and position, he shouldn't be slower that creationDate only ....
If i force (with hint) the compound index, the query is slow.
> db.ntr_tracks.find({"detection.position": { $geoWithin: { $geometry: { type: "Polygon", coordinates: [ [ [ 0.7724536441932998, 50.61956018018893 ], [ 0.7724536441932998, 51.42851043057254 ], [ 2.483031806726593, 51.42851043057254 ], [ 2.483031806726593, 50.61956018018893 ], [ 0.7724536441932998, 50.61956018018893 ] ] ] } } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "S2Cursor", "isMultiKey" : true, "n" : 1000, "nscannedObjects" : 1785, "nscanned" : 1266126, "nscannedObjectsAllPlans" : 1785, "nscannedAllPlans" : 1266126, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 50013, "indexBounds" : { }, "nscanned" : 1266126, "matchTested" : NumberLong(1264341), "geoTested" : NumberLong(5664), "cellsInCover" : NumberLong(9), "server" : "pc-uvh-2:27017" }
This last query use the S2Cursor is just slow (comparing the 2d index).
Last test
For the last test i replace the geo index by :
{"detection.position": "2dsphere", creationDate:1}
The difference is only for this query:
> db.ntr_tracks.find({"detection.position": { $geoWithin: { $geometry: { type: "Polygon", coordinates: [ [ [ 0.7724536441932998, 50.61956018018893 ], [ 0.7724536441932998, 51.42851043057254 ], [ 2.483031806726593, 51.42851043057254 ], [ 2.483031806726593, 50.61956018018893 ], [ 0.7724536441932998, 50.61956018018893 ] ] ] } } }, creationDate: { $gt: new ISODate("20120801T000000"), $lt: ISODate("20120802T000000") } }).sort({ creationDate: 1 }).limit(1000).explain() { "cursor" : "S2Cursor", "isMultiKey" : true, "n" : 1000, "nscannedObjects" : 1747, "nscanned" : 614506, "nscannedObjectsAllPlans" : 1747, "nscannedAllPlans" : 614506, "scanAndOrder" : true, "indexOnly" : false, "nYields" : 9, "nChunkSkips" : 0, "millis" : 12019, "indexBounds" : { }, "nscanned" : 614506, "matchTested" : NumberLong(612759), "geoTested" : NumberLong(3220), "cellsInCover" : NumberLong(9), "server" : "pc-uvh-2:27017" }
Question
Removing the sort doen't change the response time.
After some test for the same order of index, S2Cursor is a little slower than old 2d index. Is it normal ?
On 2dsphere index i can't made $box query as before. Is it normal ?
Using a compound index with the creationDate doesn't work as i expect
- search on creation date only deosn't use the index.
- search with a geo query is slower than a seach with an index on creation date.....
Is a way to optimize this ?
Thanks for reading, and for mongo.
- is related to
-
SERVER-31620 Compound index with datetime and geospatial fields doesn't work
- Closed