[GODRIVER-1925] GridFS downloads incorrectly swallow server-side errors Created: 19/Mar/21 Updated: 28/Oct/23 Resolved: 04/May/21 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | GridFS |
| Affects Version/s: | 1.4.1 |
| Fix Version/s: | 1.5.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Liam Brown | Assignee: | Benji Rewis (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Debian 10 |
| Documentation Changes: | Not Needed | ||||||||
| Description |
|
New: The original ticket describes a case where the driver may swallow a server-side CursorNotFound error and overwrite it with io.EOF. This is incorrect: io.EOF should only indicate that the download completed without errors, and applications generally ignore it. Unexpected errors during the download should be propagated without being overwritten. Additionally, when a cursor error occurs in download_stream, the cursor is not closed. This should be fixed in the same ticket.

Previous: I've got a ~70MB file that is happily stored in GridFS. I can see all the chunks in there and can download/save the file sensibly with various tools. However, when using the Go driver to read the file I only get the first 16MB of the file: no errors are returned, and the driver treats this as the genuine end of the file. This rings an alarm bell because I know that 16MB is the MongoDB max doc size (but obviously that shouldn't apply/be relevant with chunked GridFS). Is there an issue with the Go driver? I'm using very standard code to read the file. I'm using a basic Bucket (the only interesting thing about it is that it has a custom "name" rather than the default "fs"), and then just calling OpenDownloadStreamByName and calling Read on that. Is there anything obvious that I am doing wrong, or is this a potential issue? Thanks. |
| Comments |
| Comment by Liam Brown [ 05/May/21 ] |
|
Amazing - thanks again |
| Comment by Benji Rewis (Inactive) [ 04/May/21 ] |
|
mongodb@liambrown.co.uk a fix should be up in version 1.5.2 of the Go driver (the next patch release). Thanks again for your report! |
| Comment by Githook User [ 04/May/21 ] |
|
Author: {'name': 'Benjamin Rewis', 'email': '32186188+benjirewis@users.noreply.github.com', 'username': 'benjirewis'} Message: |
| Comment by Benji Rewis (Inactive) [ 27/Apr/21 ] |
|
We've reproduced the error and have a fix in review: https://github.com/mongodb/mongo-go-driver/pull/653 |
| Comment by Benji Rewis (Inactive) [ 27/Apr/21 ] |
|
DRIVERS-1624 should now be public; let me know if you still can't view it. |
| Comment by Liam Brown [ 27/Apr/21 ] |
|
Thanks for the update Benji. Let me know if you have any problems reproducing in the Go driver - hopefully should be straightforward as I could reliably reproduce. I can't access https://jira.mongodb.org/browse/DRIVERS-1624 perhaps due to permissions? If you are able to give me permission to read that then that would be great so I can keep in touch with it. Thanks. |
| Comment by Benji Rewis (Inactive) [ 26/Apr/21 ] |
|
Hello again mongodb@liambrown.co.uk! As an update, we're attempting to reproduce the GridFS download timeout that is hidden by an io.EOF error in the Go driver. DRIVERS-1624 (now linked to this ticket) was made to track potentially preventing GridFS cursors from timing out during downloads across drivers. |
| Comment by Liam Brown [ 24/Mar/21 ] |
|
That's great - thank you for the update |
| Comment by Divjot Arora (Inactive) [ 24/Mar/21 ] |
|
I've updated the ticket as described and moved it into Scheduled. Someone from the team will pick it up and investigate as soon as possible. – Divjot |
| Comment by Divjot Arora (Inactive) [ 24/Mar/21 ] |
|
I'm going to change this ticket to reflect the bug about swallowing errors and returning io.EOF instead of the CursorNotFound error from the server. The driver should only be returning EOF as an error if the file has been fully downloaded. Any unexpected errors during the process should be returned as-is. The request to modify download behavior to keep the cursor alive longer would require changes to the GridFS specification (found here). This isn't something we can do only in the Go Driver, so I've filed a cross-drivers ticket to investigate how this can be done. That will be triaged next Monday and I'll update you as it moves along. – Divjot |
| Comment by Liam Brown [ 24/Mar/21 ] |
|
Thanks for picking this up @Divjot Arora - let me know if you need any more information to recreate/fix. One potential option would be to set `SetNoCursorTimeout(true)` on the chunksCursor, but I expect that you may want to implement a more in-depth fix? If you could give an update on your investigation so far then that would be amazing. Thanks again. |
| Comment by Liam Brown [ 22/Mar/21 ] |
|
From what I can gather from the docs, it will be using the default batch size of 16MB and the default cursor timeout of 10 minutes without activity. Therefore, if you take more than 10 minutes to process the first 16MB of data (I do), then the request for the next 16MB batch fails because the cursor has timed out. Are the batch size and/or timeout of the underlying cursor that gets these chunks (in bucket.go) controllable in any way? |
| Comment by Liam Brown [ 22/Mar/21 ] |
|
Just to add, this is returning an error "(CursorNotFound) cursor id 450842949365 not found" resulting in "errNoMoreChunks" and download_stream.go just returns io.EOF (hiding the error somewhat). Is the chunks cursor timing out after the original 16MB of data? |