[NODE-157] Concurrent GridStore file append operations result in file corruption / data loss, uncaught throw Created: 17/Mar/14  Updated: 25/Feb/15  Resolved: 18/Sep/14

Status: Closed
Project: Node Driver
Component/s: None
Affects Version/s: 1.3, 1.4, 3.0
Fix Version/s: 2.0

Type: Bug Priority: Critical - P2
Reporter: Vaughn Iverson Assignee: Christian Amor Kvalheim
Resolution: Fixed Votes: 3
Labels: append, crash, dataloss, gridfs, nodeJS
Environment:

All


Attachments: File concurrent_append_failure.js     File gridstore.js    
# Replies: 25
Participants:
Days since reply: 2 years, 38 weeks, 3 days ago
Date of 1st Reply:
Last commenter: Rathi Gnanasekaran
Last comment by Customer: true

 Description   

Interleaved concurrent append writes to a GridStore file result in an uncaught throw in the driver, caused by file corruption and/or data loss due to duplicate chunk numbering when the writes create or extend one or more chunks.

The attached script synthetically reproduces the issue within a single process. The same issue may occur when multiple processes access the same GridFS collection simultaneously.



 Comments   
Comment by Vaughn Iverson [ 17/Mar/14 ]

A cursory review of the gridstore code reveals no mechanism for locking the .files collection entry for a file while its chunk structure is being written. Since, under the GridFS data model, writes to the .chunks and .files collections are independent, the MongoDB core provides no protection against concurrent access. Some type of atomicity guarantee for file writes, perhaps a findAndModify()-based semaphore on a file's .files document, is necessary to safeguard against data loss under attempted concurrent write access.

Comment by Vaughn Iverson [ 17/Mar/14 ]

Fixed a typo in a callback that returned the wrong gridStore variable. The typo does not affect reproducing the bug.

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

This is a known issue for GridFS in general; the fix will have to come in the context of the spec as a whole.

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

What is the use case that needs to support concurrent appending to a file?

Comment by Vaughn Iverson [ 17/Mar/14 ]

You say this is "known"; is there an outstanding issue filed on the GridFS data model that you are aware of?

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

Yeah, the GridFS model is just a layered document representation on top of MongoDB; it makes no guarantees of thread safety. Fixing this will require a lot of work, as we need to create a new, more comprehensive model for GridFS that allows some sort of semi-transaction/locking mechanism, and it would need to be backwards compatible and work across all our drivers and the shell. The good news is that it will happen; the bad news is that it won't be right now.

The other good news is that since GridFS is just a convention, one could easily implement an alternative on top of MongoDB.

I'm still curious about what the use case is for concurrent appends (logging?)?

Comment by Vaughn Iverson [ 17/Mar/14 ]

I don't mean to be flip, but any useful append operation on a system that supports concurrent access needs to be atomic. My specific test case was logging to a GridFS file in a multi-process environment, but I can think of plenty of other examples. My first impression is that "w+" access should not exist in the driver if it cannot be made safe under concurrent access. Node is an async system, and many real-world applications will also involve multiple Node processes accessing the same database simultaneously. I understand your point that to really "fix" this problem the model needs to change, because otherwise nothing prevents the Python and Ruby drivers from interacting poorly with each other and with the Node driver, etc. I don't know, though: do other language drivers even implement append operations? Bottom line, I think append is a very useful feature, but as currently implemented in the Node driver it is a dangerous one unless you can guarantee synchronous access.

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

It's quite possible I'll pull the w+ access in the driver, as it is obviously causing a problem. But if you are logging data, you are better off using one of the logging patterns, as GridFS was never intended for this kind of usage and is suboptimal for it.

http://docs.mongodb.org/ecosystem/use-cases/storing-log-data/

Comment by Vaughn Iverson [ 17/Mar/14 ]

Understood re: logging. However, I'm beating hard on GridStore right now (can you tell?) because I'm trying to use it for a distributed scientific data-analysis pipeline application. This application will have many concurrent workers running on distributed compute nodes. I've been hoping to use GridFS to manage the underlying data store (lots of large non-JSON documents of various types), but instability and the potential for data loss are unacceptable. So I've been abusing GridStore hard to evaluate whether it's stable and reliable enough for production. Things like this do not build confidence.

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

Yeah, for concurrent writes there really is no way to do an append write and guarantee consistency, as you can easily get into situations like:

1. Process 1 writes at the end of chunk 4.
2. Process 2 does not know about Process 1 and writes to the same chunk.
3. Data is lost.

For logging purposes it's better to write data in discrete documents and sort them by ObjectId. You might end up with varying-sized documents, but they will always be in order thanks to the ObjectId, meaning there is no chance of the corruption that append writes on GridStore objects can produce.
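The document-per-record pattern described above can be sketched in-memory. This is not driver code: the array and the counter below are stand-ins for a MongoDB collection and the roughly time-ordered ObjectId the driver would assign, and the function names are illustrative.

```javascript
// Sketch of the append-free logging pattern: each log record is its own
// document, and readers sort on _id. In MongoDB the _id would be an
// ObjectId (roughly time-ordered); a counter models that here.
var nextId = 0;
var logCollection = [];

function logEvent(message) {
  // One insert per record; no shared chunk is ever rewritten,
  // so concurrent writers cannot corrupt each other's data.
  logCollection.push({ _id: nextId++, message: message });
}

function readLog() {
  // Sort by _id to recover insertion order, like find().sort({_id: 1}).
  return logCollection
    .slice()
    .sort(function (a, b) { return a._id - b._id; })
    .map(function (d) { return d.message; });
}
```

Because every record is an independent insert, ordering comes from the _id index rather than from any locking, which is why this pattern tolerates concurrent writers where GridStore append does not.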

Comment by Vaughn Iverson [ 17/Mar/14 ]

One more addition: my other specific use case that I'm testing against is chunked upload of large files across slow/unreliable connections, using the [resumable.js](http://resumablejs.com/) library. Due to the async nature of Node, I encountered this issue as an intermittent bug where the next chunk arrived and tried to append to the file before the previous close had completed. There are obvious ways around this (I can implement my own locking scheme easily enough), but I was disturbed to find that provision for concurrent access had been made in the driver without even a warning in the documentation about the potential for data loss.

I would say that leaving w+ access in the driver is fine, so long as the documentation prominently states that concurrent append access is dangerous and not supported. If you do this, I will probably write an npm package to provide "safe" concurrent writes using GridStore, although ideally that wouldn't be necessary because it would be built in.
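One of the "obvious ways around this" within a single process is to serialize the appends so the next chunk's write cannot start until the previous one has fully completed. A minimal sketch, where appendChunk is a hypothetical stand-in for the GridStore open/write/close sequence:

```javascript
// Serialize async append operations: only one appendChunk runs at a time,
// and queued chunks run strictly in arrival order.
function makeAppendQueue(appendChunk) {
  var pending = [];
  var busy = false;

  function next() {
    if (busy || pending.length === 0) return;
    busy = true;
    var job = pending.shift();
    appendChunk(job.chunk, function (err) {
      busy = false;
      job.done(err);
      next();  // start the next queued append, if any
    });
  }

  // Callers enqueue instead of calling appendChunk directly.
  return function enqueue(chunk, done) {
    pending.push({ chunk: chunk, done: done });
    next();
  };
}
```

This only guards a single Node process; multiple processes writing to the same GridFS file would still need a shared lock in the database itself.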

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

I agree with making the docs more explicit. Unfortunately, nothing short of a two-phase commit will fix this properly, with its added complexity and potential for slowdown. I would be happy to look at anything you come up with.

Comment by Vaughn Iverson [ 17/Mar/14 ]

My first-cut idea would be to extend the GridStore class with optional lock() and unlock() operations by file _id. They would be implemented via a uniquely named ad hoc field in the metadata subdocument, with distributed mutual exclusion guaranteed via findAndModify(). If I implement that, would you be interested in pulling it into the driver?
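The compare-and-set at the heart of that proposal can be sketched as follows. This is an in-memory model, not driver code: in the real implementation the test-and-set would be a single findAndModify with query { _id: fileId, 'metadata.lockOwner': null } and update { $set: { 'metadata.lockOwner': owner } }, which MongoDB executes atomically on the server; the field and function names here are illustrative.

```javascript
// In-memory model of a findAndModify-based lock on a .files document.
// tryLock models findAndModify({ _id: ..., 'metadata.lockOwner': null },
// { $set: { 'metadata.lockOwner': owner } }): it succeeds only when no
// other writer currently holds the lock.
function tryLock(filesDoc, owner) {
  if (filesDoc.metadata.lockOwner !== null) return false; // already locked
  filesDoc.metadata.lockOwner = owner;
  return true;
}

// unlock models the matching findAndModify that clears the field,
// guarded so only the current holder can release it.
function unlock(filesDoc, owner) {
  if (filesDoc.metadata.lockOwner === owner) {
    filesDoc.metadata.lockOwner = null;
  }
}
```

A production version would also want a lock timestamp and expiry so a crashed writer cannot hold the lock forever, which is roughly the design the gridfs-locks package linked below ended up with.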

Comment by Christian Amor Kvalheim [ 17/Mar/14 ]

I would definitely take a look at it, but I cannot promise to put it in the driver, as we are trying to ensure the same semantics across all the drivers and put more rigor into specs for behaviors. But a working solution would be a serious contender.

Comment by Vaughn Iverson [ 28/Mar/14 ]

gridfs-locks is live on npm. https://www.npmjs.org/package/gridfs-locks

Comment by Cristian Ianto [ 31/Mar/14 ]

Shouldn't the mongo driver at least allow the user to handle write errors gracefully?

Not sure if this is a new issue or the same one, but concurrent GridStore file writes (in either append or truncate mode) cause the md5 command in buildMongoObject to fail with a "chunks out of order" exception, and buildMongoObject does not handle the error in any way. It should be easy enough to make buildMongoObject call back in the standard (err, mongoObject) fashion so that the error can be propagated and handled.
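The shape of the proposed fix can be sketched as below. This is a simplified stand-in, not the driver's actual buildMongoObject: the chunk-numbering check models the server-side md5 failure, and the returned object's fields are illustrative.

```javascript
// Sketch of an error-first buildMongoObject: instead of letting the
// "chunks out of order" condition escape as an uncaught throw, surface
// it through the standard Node (err, result) callback so callers can
// handle corruption gracefully.
function buildMongoObjectSafe(fileDoc, chunks, callback) {
  // A healthy file has chunks numbered 0..n-1 with no gaps or
  // duplicates; duplicate chunk numbers are exactly what the
  // concurrent-append bug produces.
  for (var i = 0; i < chunks.length; i++) {
    if (chunks[i].n !== i) {
      return callback(new Error('chunks out of order'));
    }
  }
  callback(null, {
    _id: fileDoc._id,
    length: fileDoc.length,
    chunkCount: chunks.length
  });
}
```

With this shape, a corrupted file surfaces as an Error the application can log or retry instead of a process-killing throw.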

Comment by Vaughn Iverson [ 31/Mar/14 ]

Sounds like the same issue. This issue often creates multiple chunks with the same chunk number, which corrupts the file and causes the unhandled throw in the md5 code you reference above.

Comment by Cristian Ianto [ 01/Apr/14 ]

Here is an updated gridstore.js with md5 error handling in place, so a failed md5 no longer necessarily kills the application.

Comment by Cristian Ianto [ 01/Apr/14 ]

@Vaughn thanks for the gridfs-locks, good stuff.

Comment by Christian Amor Kvalheim [ 01/Apr/14 ]

Please add the code as a pull request so that you get correct attribution if the commit is pulled in.

https://github.com/mongodb/node-mongodb-native/pulls

Comment by Cristian Ianto [ 01/Apr/14 ]

Sent pull request, thanks.

Comment by Christian Amor Kvalheim [ 18/Sep/14 ]

Disabled w+ mode until there can be a real long-term solution to this.

Comment by Thomas Riccardi [ 24/Feb/15 ]

In the Node MongoDB native driver, the "w+" mode is still allowed (it doesn't raise any error), although it's not documented.
https://github.com/mongodb/node-mongodb-native/blob/94045c0b658edf8c7c87d55cee79083494a9762a/lib/gridfs/grid_store.js#L166

Comment by Christian Amor Kvalheim [ 24/Feb/15 ]

  • GridStore only supports w+ for metadata updates; no appending to the file, as it's not thread safe and can cause corruption of the data
    + seek will fail if attempted with w or w+
    + write will fail if attempted with w+ or r
    + w+ only works for updating metadata on a file
Comment by Thomas Riccardi [ 25/Feb/15 ]

Okay.
This could be explained in the documentation, because today we have conflicting information: at the beginning of the GridStore docs, mode can be "r" or "w", while multiple method docs talk about 'mode "w" or "w+"'.

Generated at Sun Nov 19 12:34:43 UTC 2017 using JIRA 7.2.10#72012-sha1:2651463a07e52d81c0fcf01da710ca333fcb42bc.