[SERVER-35843] New Parameter to Limit Data File Sizes Created: 27/Jun/18  Updated: 06/Dec/22  Resolved: 16/Mar/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Alex Leong Assignee: Backlog - Storage Execution Team
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution
Participants:

 Description   

Currently we have collections of 4TB in size; the WiredTiger data file for each is one large 4TB file, which makes it very difficult for us to rsync, copy, etc.

I would like to suggest an improvement to MongoDB whereby the file size could be limited (e.g., a maxFileSize parameter), so that we could have many smaller files instead of one large file. The same requirement probably applies to index files, while you are at it.

I previously spoke to Kirby (Mongo U) and Muthu Chinnasamy regarding this suggestion at MongoDB World 2018.

Please let me know if you have any questions.

 

Thank you

 

Alex Leong



 Comments   
Comment by Connie Chen [ 16/Mar/20 ]

The team reviewed this in backlog grooming and decided it is not in line with our current direction.

Comment by Alex Leong [ 15/Nov/19 ]

Hi again Brian

 

And in case you are wondering, this collection is already sharded and performing well. We are not going to add any more shards just for the sake of making the .wt file smaller; that is not logical at all.

 

Thanks

Alex

Comment by Alex Leong [ 15/Nov/19 ]

Hi Brian

Look, we should not confuse logical design with physical implementation.

My complaint is about the physical implementation, and I am giving Mongo engineers feedback to improve your product, speaking from many years of experience as a DB administrator/architect.

The file sizes should not be lopsided like this; it makes the large file very hard to copy, rsync, etc.

profiodb_prod/collection:
total 274955588
-rwxr--r-- 1 mongo mongo 272947613696 Nov 15 11:39 32-9146723691851087539.wt
-rw------- 1 mongo mongo       233472 Nov 15 03:01 3-404202816176184934.wt
-rw------- 1 mongo mongo     20520960 Nov 15 03:01 5-404202816176184934.wt 
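
To put numbers on the listing above, here is a minimal Python sketch that reports each file's size in GiB and its share of the three files shown. The byte counts are copied from the listing. The 2 GiB cap is a purely hypothetical value used for illustration; maxFileSize is only the name proposed in this ticket, not an existing MongoDB option.

```python
# Sizes in bytes, copied from the directory listing above.
FILE_SIZES = {
    "32-9146723691851087539.wt": 272947613696,
    "3-404202816176184934.wt": 233472,
    "5-404202816176184934.wt": 20520960,
}

GIB = 1024 ** 3


def size_report(sizes):
    """Return (name, size_in_gib, share_of_total) tuples, largest file first."""
    total = sum(sizes.values())
    ordered = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return [(name, size / GIB, size / total) for name, size in ordered]


def files_needed(size_bytes, max_file_size_bytes):
    """Number of files required if each file were capped at max_file_size_bytes."""
    # Ceiling division using integers only.
    return -(-size_bytes // max_file_size_bytes)


if __name__ == "__main__":
    for name, gib, share in size_report(FILE_SIZES):
        print(f"{name}: {gib:.2f} GiB ({share:.4%} of listed files)")

    # Hypothetical cap of 2 GiB per file, for illustration only.
    largest = FILE_SIZES["32-9146723691851087539.wt"]
    print("files at a 2 GiB cap:", files_needed(largest, 2 * GIB))
```

The share is computed only over the three files listed, so it says nothing about other files in the directory.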

 

Sorry, but your response was not what I was expecting.

 

Thank you

Alex Leong

Comment by Brian Lane [ 15/Nov/19 ]

Hi aleong@indeed.com,

This issue remains in our backlog and has not been scoped or scheduled for a future release. I would be interested to learn more about what is leading to such large collection sizes. Perhaps we could adjust your schema in a way that reduces these large files; in addition, we could look at alternative compression techniques. Would you be open to such a discussion? I can reach out to you directly to schedule a meeting.

-Brian

Comment by Alex Leong [ 06/Nov/19 ]

Ramon, any updates on this?

Comment by Ramon Fernandez Marina [ 27/Jun/18 ]

Thanks for your report, aleong@indeed.com. I am sending your suggestion to the Storage team for consideration.

Regards,
Ramón.
