Right now we ignore record headers when computing the collection data size (mongo/db/namespace_details.h):
    DiskLoc deletedList[Buckets];
    // ofs 168 (8 byte aligned)
    struct Stats {
        // datasize and nrecords MUST Be adjacent code assumes!
        long long datasize; // this includes padding, but not record headers
        long long nrecords;
    } stats;
However, the value does include padding, as well as the alignment that our storage engine uses.
When the number of records reaches 500,000,000, the header overhead is 7+ GB, which is perceived as lost disk space. I suggest including the record header size in the coll.stats().size metric, or introducing an additional sizeWithHeaders metric to avoid confusion.
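To make the arithmetic concrete, here is a minimal sketch of how the overhead and the proposed sizeWithHeaders value could be derived from the two existing counters. It assumes a fixed 16-byte header per record; that figure is an assumption for illustration and is not stated in this ticket. StatsSketch, headerOverheadBytes and kAssumedRecordHeaderBytes are hypothetical names, not existing server code.

```cpp
#include <cstdint>

// ASSUMPTION: every record carries a fixed 16-byte header. This value is
// used for illustration only; it is not taken from the ticket.
constexpr std::int64_t kAssumedRecordHeaderBytes = 16;

// Hypothetical stand-in for the two counters kept in the Stats struct.
struct StatsSketch {
    std::int64_t datasize;  // includes padding, but not record headers
    std::int64_t nrecords;
};

// Bytes spent on record headers, which datasize currently leaves out.
constexpr std::int64_t headerOverheadBytes(const StatsSketch& s) {
    return s.nrecords * kAssumedRecordHeaderBytes;
}

// Candidate value for the proposed sizeWithHeaders metric.
constexpr std::int64_t sizeWithHeaders(const StatsSketch& s) {
    return s.datasize + headerOverheadBytes(s);
}
```

Under that assumption, 500,000,000 records give headerOverheadBytes = 8,000,000,000 bytes (about 7.45 GiB), which is in line with the 7+ GB figure above.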