[SERVER-4457] avgObjSize didn't shrink as expected after updating documents and compacting Created: 08/Dec/11  Updated: 29/Feb/12  Resolved: 11/Dec/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dan Spinosa Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

3-node mongo production cluster on Ubuntu 10.10 cloud machines


Operating System: Linux
Participants:

 Description   

I recently transitioned a collection (> 30MM documents) from long, human readable key names to short, 1 character keys to save on storage space (on disk and in memory) & improve performance.

I dropped the old indexes, ran a big db.mycollection.update({...}), compacted the collection (on primary and 2 replicas), then built new indexes.

The storageSize and totalIndexSize were cut in half (about what I expected). But avgObjSize hasn't changed significantly for that collection! I've also noticed that paddingFactor is significantly different on each server (1.01, 1.49 on replicas, 1.36 on primary).

Why wouldn't the avgObjSize drop by about half?

---BEFORE---

PRIMARY> db.broadcasts.stats()
{
"ns" : "redacted.broadcasts",
"count" : 32370008,
"size" : 91792986284,
"avgObjSize" : 2835.7418473297876,
"storageSize" : 91835727232,
"numExtents" : 43,
"nindexes" : 5,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1.5799999997279606,
"flags" : 1,
"totalIndexSize" : 9271265136,
"indexSizes" :

{
"_id_" : 1305780784,
"rebroadcast_of_1" : 903390768,
"shortened_permalink_1" : 1881060496,
"channel_id_1_created_at_-1_video_player_1" : 2400301904,
"created_at_-1_shortened_permalink_1__id_-1" : 2780731184
},
"ok" : 1
}

---AFTER---

PRIMARY> db.broadcasts.stats()
{
"ns" : "redacted.broadcasts",
"count" : 32370008,
"size" : 91792986284,
"avgObjSize" : 2835.7418473297876,
"storageSize" : 49060732592,
"numExtents" : 23,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1.00999999972796,
"flags" : 1,
"totalIndexSize" : 4578404656,
"indexSizes" :

{
"_id_" : 945219184,
"a_1_R_-1_u_1" : 1504187776,
"M_1_R_-1" : 1571999520,
"D_1" : 556998176
},
"ok" : 1
}
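
For reference, the change between the two outputs above can be worked out directly. This is only a rough sketch in plain JavaScript (written with `var` so it should also paste into the mongo shell); the `before` and `after` objects simply copy the posted values, and `ratio` is a helper name made up for illustration:

```javascript
// Values copied from the BEFORE and AFTER stats() outputs above.
var before = {
  size: 91792986284,
  storageSize: 91835727232,
  totalIndexSize: 9271265136,
  avgObjSize: 2835.7418473297876
};
var after = {
  size: 91792986284,
  storageSize: 49060732592,
  totalIndexSize: 4578404656,
  avgObjSize: 2835.7418473297876
};

// Hypothetical helper: after/before as a fraction, rounded to 3 decimals.
function ratio(field) {
  return Math.round((after[field] / before[field]) * 1000) / 1000;
}

// Collect the ratio for each field of interest.
var fields = ["size", "storageSize", "totalIndexSize", "avgObjSize"];
var ratios = {};
for (var i = 0; i < fields.length; i++) {
  ratios[fields[i]] = ratio(fields[i]);
}
```

With these numbers, storageSize and totalIndexSize land close to one half, while size and avgObjSize come out at exactly 1, i.e. unchanged.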



 Comments   
Comment by Eliot Horowitz (Inactive) [ 11/Dec/11 ]

Running compact would do it.

Comment by Dan Spinosa [ 08/Dec/11 ]

Is there any way to force a paddingFactor re-calculation? Seems like documents (particularly new ones) are using significantly more space than necessary under the above conditions...

Comment by Eliot Horowitz (Inactive) [ 08/Dec/11 ]

avgObjSize includes padding - so making documents smaller doesn't change this.
We probably need to add another field that reports the actual size.
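
To make the point above concrete: in the stats posted in the description, the reported avgObjSize matches size divided by count, so it can only move when size does. A minimal sketch in plain JavaScript using those numbers; that size counts allocated record space (padding included) is taken from the comment above and has not been checked against the server source:

```javascript
// Values copied from the AFTER stats() output in the description.
var stats = {
  count: 32370008,
  size: 91792986284,
  avgObjSize: 2835.7418473297876
};

// avgObjSize as derived from the other two fields.
var derivedAvg = stats.size / stats.count;

// Absolute difference from the reported value; effectively zero here.
var diff = Math.abs(derivedAvg - stats.avgObjSize);
```

Since size is identical in the BEFORE and AFTER outputs, the derived average is identical too, which is consistent with the unchanged avgObjSize.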

Generated at Thu Feb 08 03:06:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.