- Type: Improvement
- Resolution: Won't Fix
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Index Maintenance, MMAPv1
- Labels: None
Storage Execution
Some people create their indexes with the usual query in mind rather than the insert pattern.
For example, with GridFS, people will usually query db.files.find().sort({uploadDate: -1}) to get the most recent files first.
And so the index may get created to match that sort order (which is what the Python driver does).
But the actual inserts arrive with increasing dates, so every new key lands on the left side of the btree, which grows by splitting buckets 50/50 there.
The index ends up roughly twice as large as it would be if it had been created in the reverse order.
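The effect can be reproduced with a small leaf-only model of the btree. This is only a sketch under stated assumptions, not the server's code: buckets hold a fixed number of keys (the hypothetical CAPACITY; real buckets are sized in bytes), a split is 90/10 only when a key is appended past the rightmost key, and every other split is 50/50. The function name count_leaf_buckets is made up for illustration.

```python
import bisect

CAPACITY = 200  # hypothetical keys per bucket; real buckets are sized in bytes


def count_leaf_buckets(keys, capacity=CAPACITY):
    """Insert keys in the given order and return the number of leaf buckets."""
    buckets = [[]]  # leaf buckets, ordered by key range
    seps = []       # seps[i] is the smallest key stored in buckets[i + 1]
    for key in keys:
        i = bisect.bisect_right(seps, key)  # bucket that should hold the key
        bucket = buckets[i]
        bisect.insort(bucket, key)
        if len(bucket) <= capacity:
            continue
        # Assumed policy: 90/10 only when appending past the rightmost key,
        # 50/50 everywhere else (including the far left).
        at_right_edge = i == len(buckets) - 1 and bucket[-1] == key
        cut = capacity * 9 // 10 if at_right_edge else len(bucket) // 2
        left, right = bucket[:cut], bucket[cut:]
        buckets[i] = left
        buckets.insert(i + 1, right)
        seps.insert(i, right[0])
    return len(buckets)


n = 20000
ascending = count_leaf_buckets(range(n))            # keys arrive on the right
descending = count_leaf_buckets(range(n - 1, -1, -1))  # keys arrive on the left
print(ascending, descending, round(descending / ascending, 2))
```

In this model the left-growing case leaves buckets about half full while the right-growing case leaves them about 90% full, so the bucket count differs by a factor close to 0.9 / 0.5.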
Inserted 100000 increasing numbers into an ascending index:
> db.time.$time_1.stats()
{
"ns" : "mydb.time.$time_1",
"sharded" : false,
"primary" : "shard0000",
"ns" : "mydb.time.$time_1",
"count" : 452,
"size" : 3702784,
"avgObjSize" : 8192,
"storageSize" : 13191168,
"numExtents" : 4,
"nindexes" : 0,
"lastExtentSize" : 10438656,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"ok" : 1
}
Inserted 100000 decreasing numbers into an ascending index:
> db.time.$time_1.stats()
{
"ns" : "test.time.$time_1",
"sharded" : false,
"primary" : "shard0001",
"ns" : "test.time.$time_1",
"count" : 813,
"size" : 6660096,
"avgObjSize" : 8192,
"storageSize" : 11141120,
"numExtents" : 4,
"nindexes" : 0,
"lastExtentSize" : 8388608,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"ok" : 1
}
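As a quick check on the two outputs above, the arithmetic can be worked through as follows. It assumes that "count" is the number of index buckets (which fits, since "size" equals count times the 8192-byte avgObjSize in both runs); the variable names are illustrative only.

```python
BUCKET_BYTES = 8192      # avgObjSize reported in both outputs
KEYS_INSERTED = 100000   # keys inserted in each run

runs = {
    "increasing keys": {"count": 452, "size": 3702784},
    "decreasing keys": {"count": 813, "size": 6660096},
}

for name, s in runs.items():
    # size is exactly count * 8192, so count is taken to be a bucket count
    assert s["count"] * BUCKET_BYTES == s["size"]
    # Rough figure only: count may also include non-leaf buckets.
    print(name, "approx. keys per bucket:", round(KEYS_INSERTED / s["count"]))

size_ratio = runs["decreasing keys"]["size"] / runs["increasing keys"]["size"]
print("size ratio:", round(size_ratio, 2))
```

The ratio comes out at about 1.8 rather than exactly 2, which is what one would expect if buckets end up roughly 90% full in one case and roughly 50% full in the other (0.9 / 0.5 = 1.8).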
We should also split 90/10 when a key is inserted to the left of the leftmost key, so that the order of the index really doesn't matter.
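A minimal sketch of what such a symmetric rule could look like, assuming a split is described by how many keys stay in the lower (left) bucket. The function choose_split_index and its parameters are hypothetical and are not taken from the server code.

```python
def choose_split_index(num_keys, insert_pos, is_leftmost_bucket, is_rightmost_bucket):
    """Return how many keys stay in the left bucket when a full bucket splits.

    num_keys   -- keys in the bucket before the new key is added
    insert_pos -- position the new key would take (0 .. num_keys)
    """
    big_side = num_keys * 9 // 10
    if is_rightmost_bucket and insert_pos == num_keys:
        # Appending past the rightmost key: leave the left bucket ~90% full.
        return big_side
    if is_leftmost_bucket and insert_pos == 0:
        # Proposed mirror case: keep ~10% on the left, where new keys
        # will keep arriving, and leave the right bucket ~90% full.
        return num_keys - big_side
    # Anywhere else: split evenly.
    return num_keys // 2


print(choose_split_index(200, 200, False, True))   # rightmost append
print(choose_split_index(200, 0, True, False))     # leftmost prepend
print(choose_split_index(200, 57, False, False))   # middle insert
```

With a rule like this, monotonically decreasing inserts would leave behind buckets that are about as full as those left by monotonically increasing inserts.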