[CXX-2148] GridFS corruption Created: 08/Jan/21  Updated: 27/Oct/23  Resolved: 08/Jan/21

Status: Closed
Project: C++ Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Francois EE Assignee: Unassigned
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian buster.
Mongo 4.2.5
Mongo cxx driver 3.4
Mongo cluster is not sharded and has several replicas, some are in Paris (including the current master) and some are in Ohio.


Issue Links:
Related
is related to CXX-2150 Support sessions in GridFS Closed

 Description   

 When the replication lag is non-zero (it was several hours in that specific occurrence), we see what looks like GridFS corruption when reading data from secondary nodes:

mongocxx::gridfs_exception: expected file to have 1 chunk(s), but query to chunks collection only returned 0 chunk(s): a GridFS file being operated on was discovered to be corrupted

We don't update documents in GridFS; we only create or delete them.

Let me know if you need any other information.



 Comments   
Comment by Kevin Albertson [ 11/Jan/21 ]

Apologies fechantillac@antidot.net, you are indeed correct! Though not all drivers support it, the mongocxx::gridfs::bucket class does support sessions being passed to operations. I was mistaken: the C driver does not support sessions in GridFS operations, but the C++ implementation of GridFS does not use the C driver's GridFS API (one of the few exceptional cases).

You should be able to read from secondaries by using a session with causal consistency enabled.

This Jira project is for bug reports and feature requests. For help with using the C++ driver, please create a post in our community forum here.

Comment by Francois EE [ 11/Jan/21 ]

Thanks Kevin for the quick response. You say GridFS is not designed to support passing sessions, but I do see that various methods of mongocxx::gridfs::bucket accept a session. May I ask what is missing in the current implementation?

Sorry if this is not the right place to ask such a question; if so, please kindly redirect me to the correct place.

Comment by Kevin Albertson [ 08/Jan/21 ]

Closing since this is not a bug in the C++ driver.

Comment by Kevin Albertson [ 08/Jan/21 ]

Hi fechantillac@antidot.net,

Reading from a secondary in GridFS can run into situations where the file document has replicated but the chunk documents have not. I initially thought using a read and write concern of majority would solve this, but it is not a complete solution: there is no guarantee that the selected secondary was among the majority that acknowledged every inserted chunk.

I checked with the broader drivers team. This is a limitation of how GridFS is currently specified; drivers implement GridFS based on this common specification.

A robust solution would be to do operations with a session configured with causal consistency. But, as it is designed, GridFS does not support passing sessions. I have created CXX-2150 to track that work.

In the meantime, I would suggest using a primary read preference and watching CXX-2150 for updates.
 

Comment by Francois EE [ 08/Jan/21 ]

We use default options when opening the gridfs::bucket. Read preference in the connection string is NEAREST.

Comment by Kevin Albertson [ 08/Jan/21 ]

Hi fechantillac@antidot.net,

Thank you for the report! We will look into this soon.

My hypothesis aligns with yours: I suspect the file document is replicated before all of the chunks, so a read from the secondary sees the file document before the chunks have arrived.

Regarding your note that the write policy used is the default one: the default write concern for a gridfs::bucket is w:1, which only requires acknowledgment from a single node (see the manual for a description).

Are you configuring the gridfs::bucket with a default read concern, and a read preference of SECONDARY_PREFERRED or SECONDARY?

Comment by Francois EE [ 08/Jan/21 ]

Actually, it just happened again without any significant replication lag (less than 15 seconds).

Comment by Francois EE [ 08/Jan/21 ]

I should add that the issue disappears after some time without any writes to that GridFS object. We only had the opportunity to check GridFS consistency after the replication lag had caught up.

With my limited understanding of how GridFS and replication work, it looks like for some objects the layers.files collection documents are successfully written to the secondary before the corresponding chunks. This is purely hypothetical, though.

Note also that the write policy used is the default one.

Generated at Wed Feb 07 22:05:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.