Details
Description
Interleaving of concurrent appending writes to a GridStore file results in an uncaught throw in the driver. The underlying cause is file corruption and/or data loss from duplicate chunk numbering, which occurs when the interleaved writes create or extend one or more chunks.
The attached script synthetically reproduces the issue within a single process. The same issue may occur when multiple processes access the same GridFS collection simultaneously.
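The following is a minimal in-process sketch (not the attached script) of the race: each appender derives the next chunk number from the file length it last read, so two interleaved appenders that read the same stale length both target the same chunk number. The `chunkSize` value, document shapes, and helper names are hypothetical simplifications of the GridFS layout.

```javascript
// Simulated .files document and .chunks collection (hypothetical shapes).
const chunkSize = 256 * 1024;
const fileDoc = { length: chunkSize * 3 }; // file currently holds chunks 0..2
const chunks = new Map();                  // keyed by chunk number 'n'

// Each writer plans its next chunk number from the length it just read,
// with no lock coordinating against other writers.
function planNextChunk() {
  return Math.floor(fileDoc.length / chunkSize);
}

// Commit a chunk and bump the file length (two independent writes,
// mirroring the independent .chunks and .files updates in GridFS).
function commitChunk(n, data) {
  if (chunks.has(n)) {
    throw new Error('duplicate chunk number ' + n); // corruption/data loss
  }
  chunks.set(n, data);
  fileDoc.length += data.length;
}

// Interleaving: both writers read length before either commits.
const nA = planNextChunk(); // writer A plans chunk 3
const nB = planNextChunk(); // writer B also plans chunk 3
commitChunk(nA, Buffer.alloc(chunkSize));
let raced = false;
try {
  commitChunk(nB, Buffer.alloc(chunkSize)); // throws: duplicate chunk 3
} catch (e) {
  raced = true;
}
console.log(nA === nB, raced); // true true
```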
A cursory review of the GridStore code reveals no mechanism for locking a file's .files collection entry while its chunk structure is being written. Because the GridFS data model treats writes to the .chunks and .files collections as independent, the MongoDB core provides no protection against concurrent access. Some form of atomicity guarantee for file writes, perhaps a findAndModify()-based semaphore on the file's .files document, is needed to safeguard against data loss under concurrent write access.
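To illustrate the proposed safeguard, here is a sketch of findAndModify()-style semaphore semantics, simulated in-process. In an actual fix, acquire and release would each be a single atomic findAndModify on the file's .files document; the `locked` field name and helper names here are assumptions, not existing driver API.

```javascript
// Simulated .files document; 'locked' is a hypothetical semaphore field.
const filesDoc = { _id: 'file1', locked: false };

// Atomic test-and-set, analogous to a findAndModify whose query matches
// {_id: ..., locked: false} and whose update applies {$set: {locked: true}}.
// Returns true only if the predicate matched (i.e. the lock was free).
function tryAcquire(doc) {
  if (doc.locked === false) { // query predicate
    doc.locked = true;        // $set update
    return true;              // document matched: lock acquired
  }
  return false;               // no match: another writer holds the lock
}

// Release, analogous to {$set: {locked: false}} on the same document.
function release(doc) {
  doc.locked = false;
}

// Writer A acquires the semaphore; writer B must back off until release.
const gotA = tryAcquire(filesDoc); // true
const gotB = tryAcquire(filesDoc); // false: concurrent write is fenced out
release(filesDoc);
const gotC = tryAcquire(filesDoc); // true: lock is free again
console.log(gotA, gotB, gotC);
```

Because findAndModify evaluates its query and applies its update atomically on the server, only one writer can flip `locked` from false to true; a losing writer would need to retry or queue its append rather than interleave chunk writes.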