[CDRIVER-872] GridFS files can be written to after saving Created: 22/Sep/15  Updated: 11/Sep/19  Resolved: 01/Oct/15

Status: Closed
Project: C Driver
Component/s: GridFS
Affects Version/s: None
Fix Version/s: 1.3.0-beta0

Type: Task Priority: Major - P3
Reporter: Kyle Suarez Assignee: Unassigned
Resolution: Done Votes: 0
Labels: gridfs
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

All


Issue Links:
Related
is related to CDRIVER-511 Files written to GridFS must be saved... Closed

 Description   

We allow

mongoc_gridfs_file_save (file);
mongoc_gridfs_file_writev (file, ...);

Should this be prohibited?



 Comments   
Comment by Hannes Magnusson [ 01/Oct/15 ]

Not sure why the other commit didn't show up.

https://github.com/mongodb/mongo-c-driver/commit/a7ea97eb3f64535ef4e6e6f15f8a144d388beaa2

Comment by Githook User [ 01/Oct/15 ]

Author: Kyle Suarez (ksuarz) <ksuarz@gmail.com>

Message: CDRIVER-511 tests don't save between reads/writes

This also makes progress towards fulfilling CDRIVER-872.
Branch: 1.3.0-dev
https://github.com/mongodb/mongo-c-driver/commit/e7a7d9c2bce8f8b968f9ee05c5efc9ea33c37d38

Comment by Jose Battig [ 22/Sep/15 ]

Jesse, simply put: the more flexibility provided, the better for us. We prefer the ability to read/write liberally to a GridFS file from a common GridFS file object, as if it were a traditional Stream object. Anything different than that will force us to implement sub-optimal approaches to mimic bi-directional stream operations.

We have a component that uses GridFS as backend storage for files, which can be consumed directly from our apps as Streams and even exposed to the OS using a volume mounter application.
When exposing to the OS, we implemented a locking scheme on top of GridFS to make our mounted drive behave as much as possible like any user-level volume mounted in the OS. In that regard, we do offer liberal combinations of reads and writes at the same time, given that we can lock files on GridFS when using our higher-level software layer on top of GridFS.

I would argue that nothing is gained by completely restricting mixed reads and writes. Even if we assume that files become readable once "saved", that only works as long as the header is written after all chunks are written, which by itself is not a good strategy because you can easily leave orphan chunks in the database if the writer process crashes.
Without a comprehensive locking scheme on GridFS, we should at least have a state field on the header to tell the state of the file. It could be something as simple as "writing" and "ready", or something along those lines. While a file is "writing", you could prevent other GridFS clients from opening it or even "seeing" that it exists, and the field would also give you a clean way to query for partially written files in case the writer process crashed before doing a clean save.

Just adding that state field could make it possible to provide mixed read/write access, with writers having to obtain exclusive access before writing. Mongo supports ways to atomically and concurrently modify an attribute, which could be used to achieve this (a simplistic locking scheme). Now, this might be a departure from the current GridFS spec, but again, it could be a MongoC enhancement to the spec.
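
To make the state-field idea above concrete, here is a minimal sketch of the "writing"/"ready" transition as a compare-and-swap. It is only an illustration under stated assumptions: an in-memory C11 atomic stands in for the header document's field, where a real implementation would presumably use an atomic conditional update on the server. None of these names exist in the C driver or the GridFS spec.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states for a file header; not part of the GridFS spec. */
enum file_state { FILE_READY = 0, FILE_WRITING = 1 };

/* Hypothetical stand-in for the header document's state field. */
typedef struct {
   _Atomic int state;
} file_header_t;

/* Try to take exclusive write access.
 * Succeeds only if the state is currently READY. */
static bool
header_begin_write (file_header_t *h)
{
   int expected = FILE_READY;
   return atomic_compare_exchange_strong (&h->state, &expected, FILE_WRITING);
}

/* Release write access after a clean save. */
static void
header_end_write (file_header_t *h)
{
   atomic_store (&h->state, FILE_READY);
}

/* Readers only "see" files whose state is READY. */
static bool
header_is_visible (file_header_t *h)
{
   return atomic_load (&h->state) == FILE_READY;
}
```

In this sketch a second writer calling header_begin_write while the first still holds the file gets false back, and a header left in FILE_WRITING after a crash is exactly the "partially written" case that could be queried for.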

Comment by A. Jesse Jiryu Davis [ 22/Sep/15 ]

jsbattig@convey.com could you comment on whether you depend on saving a file to GridFS before reading it back, please? If we allow you to mix reads and writes on an unsaved file but not on a saved one, could you work within that restriction? Thanks.

Comment by A. Jesse Jiryu Davis [ 22/Sep/15 ]

The objection to allowing writes to a saved file is that another process can begin reading a file once it's saved. (Details: saving a file creates an entry for it in the db.fs.files collection, so another process, either the C driver or another client, could query it by name, get its files_id, and start reading chunks.)

It is sort of reasonable to allow a mix of reads and writes on an unsaved file, and it's too late to change our minds about that now. I hadn't realized we allowed writes to a saved file; I would really like to prohibit that.
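
As a rough illustration of the restriction proposed above (writes allowed until the file is saved, rejected afterwards), a guard could look like the sketch below. This is not the driver's implementation and does not reflect the linked commits; the struct and both functions are hypothetical stand-ins that only model the rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GridFS file handle; not the driver's struct. */
typedef struct {
   bool saved;    /* true once the file's metadata entry has been saved */
   size_t length; /* number of bytes accepted so far */
} sketch_file_t;

/* Model of saving: after this, other clients could find and read the file. */
static void
sketch_file_save (sketch_file_t *f)
{
   f->saved = true;
}

/* Model of writing n bytes.
 * Returns the number of bytes accepted, or -1 if the file was already saved. */
static long
sketch_file_write (sketch_file_t *f, size_t n)
{
   if (f->saved) {
      return -1; /* prohibited: a reader may already be consuming chunks */
   }
   f->length += n;
   return (long) n;
}
```

With this rule, the sequence from the description (save, then write) fails at the write, while any mix of operations before the save is unaffected.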

Generated at Wed Feb 07 21:10:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.