[SERVER-1232] C++ GridFS Client should support larger Chunk Size Created: 14/Jun/10  Updated: 12/Jul/16  Resolved: 14/Jun/10

Status: Closed
Project: Core Server
Component/s: Internal Client
Affects Version/s: None
Fix Version/s: 1.5.3

Type: Improvement
Priority: Major - P3
Reporter: Kazuki Ohta
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongodb-cpp-gridfs-set-chunksize.patch    
Participants:

 Description   

The current C++ GridFS client doesn't support setting the chunk size. It uses the default chunk size (256 KB), which is too small for multi-gigabyte data.

I created a patch that adds GridFS::setChunkSize(unsigned int size) for setting the chunk size. Please review/modify the patch and commit.
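
As a rough illustration, a caller might use the proposed setter like this (a minimal sketch assuming the legacy C++ driver's GridFS class from client/gridfs.h; header paths and the exact storeFile() signature vary between driver versions):

```cpp
#include "client/dbclient.h"
#include "client/gridfs.h"

using namespace mongo;

int main() {
    DBClientConnection conn;
    conn.connect("localhost");          // throws on connection failure

    GridFS gfs(conn, "mydb");           // writes to mydb.fs.files / mydb.fs.chunks
    gfs.setChunkSize(1024 * 1024);      // proposed API: 1 MB chunks instead of 256 KB
    gfs.storeFile("/data/search.idx");  // file is split into 1 MB chunks
    return 0;
}
```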



 Comments   
Comment by Artem [ 28/Apr/12 ]

The massert in the patch rejects any chunk size except 0!
It should assert `size != 0`, not `size == 0`.
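
In sketch form, the corrected check would look like this (the massert error code and member name are illustrative, not necessarily those in the patch):

```cpp
void GridFS::setChunkSize(unsigned int size) {
    // Reject only a zero chunk size; any positive size is valid.
    massert( 13296 , "invalid chunk size is specified" , size != 0 );
    _chunkSize = size;
}
```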

Comment by Adrien Mogenet [ 01/Jul/10 ]

To complete your comments, guys:

I recently benchmarked GridFS with the PHP driver, loading small (< 500 KB) and large (> 1 GB) files with different chunk sizes.
The default chunk size was the fastest way to insert everything, and the difference was significant.

Comment by auto [ 14/Jun/10 ]

Author:

{'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: option to set chunk size on gridfs client SERVER-1232
http://github.com/mongodb/mongo/commit/76635118f786178abb5a1452ecda195ffe5b5e0b

Comment by Eliot Horowitz (Inactive) [ 14/Jun/10 ]

I don't think a larger chunk size will actually make loading any faster - possibly even the opposite, depending on what you do with it.
Though you can certainly experiment.

Did you see my email about the contributor agreement?

Comment by Kazuki Ohta [ 14/Jun/10 ]

> but I don't think the chunk size is too small for multi-gigabyte data.

We're now considering storing tens of ~20 GB files (search indices). At 256 KB, that is roughly 80,000 chunks per file. Is that an acceptable number for the database? And could the transfer cost be reduced if the number of chunks were smaller?
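
For concreteness, here is the arithmetic behind that figure as a quick sketch, using the 20 GB file size mentioned above:

```cpp
// Chunk-count arithmetic: a 20 GB file split into the default 256 KB chunks.
#include <cstdint>
#include <iostream>

int main() {
    const uint64_t fileSize  = 20ULL * 1024 * 1024 * 1024; // 20 GB
    const uint64_t chunkSize = 256 * 1024;                 // 256 KB default
    std::cout << fileSize / chunkSize << " chunks\n";      // prints "81920 chunks"
    return 0;
}
```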

> this keeps resource requirements low on clients

Yes, definitely.

Also, part of the motivation for this patch is that the Ruby client already has an interface to change the chunk size.

Comment by Eliot Horowitz (Inactive) [ 14/Jun/10 ]

The patch is probably a good idea, but I don't think the chunk size is too small for multi-gigabyte data.
More chunks isn't a problem, and this keeps resource requirements low on clients.
