[JAVA-2546] GridFSBucket requires key filename Created: 26/Jun/17 Updated: 29/Oct/23 Resolved: 30/Jun/17 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | GridFS |
| Affects Version/s: | 3.4.2 |
| Fix Version/s: | 3.5.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kevin Adistambha | Assignee: | Ross Lawley |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Description |
|
The spec for GridFS mentions that filename is optional. However, the Java driver GridFSBucket seems to require this field. For example, this fs.files collection using GridFS (note the missing filename field):
This Java code snippet will fail to download the file:
The error displayed is:
However the same code in Python will work:
There seems to be a discrepancy in GridFS behaviour between the Java driver and the Python driver. |
| Comments |
| Comment by Jan S. [ 27/Oct/17 ] | ||||||||||
|
Hi Ross. so you expect the people to access GridFS as collection and execute the following update command, right? MongoCollection<Document> gridFSCollection = currentDatabase.getCollection("fs.files"); If you do so then please write a migration guide and add this information. | ||||||||||
| Comment by Ross Lawley [ 20/Oct/17 ] | ||||||||||
|
Hi mango, The GridFS implementation follows the GridFS Specification and is a convention built upon the database. Regarding a migration guide, this was new code and the original GridFS implementation still exists. However, it looks like that there are some data inconsistencies, that haven't previously been reported. You are correct that server will allow differing types for fields, however, here the convention expects the fieldname to be a String. As the underlying data is just stored in collections in the database, it is possible that data maybe manipulated outside the GridFS convention, so that it no longer fits the conventions expectations. Also, the drivers team understands that prior to the specification the different language GridFS implementations, did differ slightly. With the new GridFS implementation the driver reads the data from BSON and into the strongly typed BsonDocument before creating a GridFSFile. The error is because the filename is expected to be a BsonString type and not a BsonNull type. Before opening a new ticket to support BsonNull values as the filename, it would be good to findout how a null value was set for the fieldname? Also, as expectations are made for the types of data for the GridFSFile it would be good to understand if there are other alternative types for that metadata. Thanks for your help, Ross | ||||||||||
| Comment by Jan S. [ 19/Oct/17 ] | ||||||||||
|
Is this topic already covered by a migration guide? I did not find anything regarding that topic in the Mongo server changelog of 3.0, 3.2 and 3.4. BTW: What is allowed or not is IMHO up to the server. If the drivers do disallow certain data to be stored the server also should disallow it. The current situation is totally confusing as it depends on the driver and it's version what is allowed or not. | ||||||||||
| Comment by Ross Lawley [ 19/Oct/17 ] | ||||||||||
|
Hi mango, Yes this was released in 3.5.0 and fixes the issue where the filename is missing. From your report it looks like you have a null value stored which is still invalid. Ross | ||||||||||
| Comment by Jan S. [ 19/Oct/17 ] | ||||||||||
|
@Ross Lawley Does this mean the Java Driver 3.5.0 already does contain the change to "defaulting the filename to an empty string" or is this future work? Because in my tests it does not matter which version I use (3.4.? or 3.5.0) I can't retrieve GridFS objects that have a null filename (org.bson.BsonInvalidOperationException: Value expected to be of type STRING is of unexpected type NULL). | ||||||||||
| Comment by Ross Lawley [ 30/Jun/17 ] | ||||||||||
|
Just to clarify what has been fixed - the 3.5 Java driver will handle legacy GridFS data in the future by defaulting the filename to an empty string. DRIVERS-392 will look into relaxing the filename being required in spec. | ||||||||||
| Comment by Githook User [ 30/Jun/17 ] | ||||||||||
|
Author: {u'username': u'rozza', u'name': u'Ross Lawley', u'email': u'ross.lawley@gmail.com'}Message: Support decoding legacy files that don't contain filenames
| ||||||||||
| Comment by Ross Lawley [ 27/Jun/17 ] | ||||||||||
| Comment by Ross Lawley [ 27/Jun/17 ] | ||||||||||
|
Hi, Thanks for the responses. The filename is required on upload, this is explicit in the specification and defined in the fields list for the files collection. It is unspecified if we support legacy files that were created without a name and there is no test case for that scenario. The reason we see an error in the Java driver is internally it uses a strictly typed collection of GridFSFile (where filename is non optional). The GridFSFile contains the information needed for the download. I've raised DRIVERS-392 to clarify if filename should be optional in the future and asked for a test case to support legacy data. Either way, we should support legacy data going forward - marking to fix in 3.5 Thanks again, Ross | ||||||||||
| Comment by Kevin Adistambha [ 27/Jun/17 ] | ||||||||||
|
Hi ross.lawley, as a sanity check, I tried doing the same using the node driver and the C# driver. Both allowed me to download without the filename field present in the fs.files collection. For reference, this is what I tried using the node driver:
and the same in the C# driver:
As mentioned by andrew.ryder, we're not entirely sure which one is the intended behaviour. Thanks! | ||||||||||
| Comment by Andrew Ryder (Inactive) [ 27/Jun/17 ] | ||||||||||
|
G'day ross.lawley, it isn't clear to me where in the specification it says filename is required. If we take the wording to mean it's required it would imply that metadata is required too, which makes no sense. In any case, even if the filename were required, it still means there is a bug somewhere and we're simply trying to understand where:
Given the specification is not entirely clear – to our reading, please provide a quote if you believe it is clear – and the documentation says the filename is optional, and the Python driver allows it as optional, the Occam's Razor solution says that the Java driver is the odd man out. I think we're fully prepared to accept this may be new and that the Java driver may be the only thing doing it right, we just need some clear indication that is so before raising a series of change requests to fix those others sites. Please let us know. | ||||||||||
| Comment by Ross Lawley [ 26/Jun/17 ] | ||||||||||
|
Hi kevin.adistambha, Thanks for the ticket. The GridFSBucket follows the latest GridFS Specification where filename is required. As a result the GridFSFile class in Java does not allow null values for the filename string. Ross |