[SERVER-14088] Excessive DB fileSize growth for large documents and powerOf2 Created: 29/May/14  Updated: 02/Aug/18  Resolved: 18/Oct/14

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 2.7.8

Type: Task Priority: Major - P3
Reporter: Steve Briskin (Inactive) Assignee: Mathias Stearn
Resolution: Done Votes: 1
Labels: brs
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File fragmentation.js    
Issue Links:
Duplicate
is duplicated by SERVER-15025 Disk usage increases linearly with co... Closed
Participants:

 Description   

We have a collection with powerOf2 enabled and a TTL index. The collection receives a steady stream of inserts of large documents (4-16MB), which the TTL index deletes after 30 hours. There are no updates.
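
For readers who want to recreate a similar setup, here is a minimal mongo shell sketch. It is not the attached fragmentation.js. The helper name `setupTtlCollection`, the collection name, and the `createdAt` field are hypothetical, and the sketch assumes an MMAPv1-era server (2.6) where the `usePowerOf2Sizes` option of `collMod` is available.

```javascript
// Hypothetical helper, NOT the attached fragmentation.js.
// `db` is a mongo shell database handle; `collName` and the
// `createdAt` field are illustrative names chosen for this sketch.
function setupTtlCollection(db, collName) {
  db.createCollection(collName);

  // Explicitly request powerOf2 record allocation for the collection.
  db.runCommand({ collMod: collName, usePowerOf2Sizes: true });

  // TTL index: documents become eligible for deletion 30 hours
  // after the timestamp stored in `createdAt`.
  db.getCollection(collName).ensureIndex(
    { createdAt: 1 },
    { expireAfterSeconds: 30 * 60 * 60 }
  );
}
```

In the shell this would be called as `setupTtlCollection(db, "large_docs")`, after which each inserted document needs a `createdAt` date for the TTL monitor to act on it.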

The dataSize of the collection stays roughly constant, but over time fileSize and storageSize keep growing, reaching 20x dataSize.

My understanding is that this has to do with how the freelist is managed for documents larger than 4MB when powerOf2 is enabled.

A repro test is attached.
Output:

$ mongo localhost:26010/frag_test fragmentation.js 
MongoDB shell version: 2.6.0
connecting to: localhost:26010/frag_test
Before inserts. Filesize: 0, storageSize: 0
Iteration 1, Filesize: 3102736384, storageSize: 2526687200
Iteration 2, Filesize: 4175953920, storageSize: 3599888320
Iteration 3, Filesize: 5785780224, storageSize: 5209690000
Iteration 4, Filesize: 6858997760, storageSize: 6282891120
Iteration 5, Filesize: 8468824064, storageSize: 7892692800
Iteration 6, Filesize: 9542041600, storageSize: 8965893920
Iteration 7, Filesize: 10615259136, storageSize: 10039095040
Iteration 8, Filesize: 12225085440, storageSize: 11648896720
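
To make the growth pattern in the output above easier to see, the per-iteration increase in Filesize can be computed directly from the reported values. This is only arithmetic on the numbers shown; the names `fileSizes`, `deltas`, and `toMiB` are introduced here for illustration.

```javascript
// Filesize values (bytes) copied from the output above, iterations 1..8.
const fileSizes = [
  3102736384, 4175953920, 5785780224, 6858997760,
  8468824064, 9542041600, 10615259136, 12225085440,
];

// Difference between each value and the one before it.
function deltas(values) {
  return values.slice(1).map((v, i) => v - values[i]);
}

const toMiB = (bytes) => bytes / (1024 * 1024);

const growth = deltas(fileSizes);
growth.forEach((d, i) => {
  console.log(`Iteration ${i + 1} -> ${i + 2}: +${toMiB(d).toFixed(2)} MiB`);
});
```

Every step adds roughly 1.0 to 1.5 GiB to the file, and Filesize never decreases at any point in this run.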

