[JAVA-3650] GridFSDownloadPublisher can't handle > 2 GB Created: 09/Mar/20  Updated: 28/Oct/23  Resolved: 10/Mar/20

Status: Closed
Project: Java Driver
Component/s: Reactive Streams
Affects Version/s: 3.12.1
Fix Version/s: 4.0.1

Type: Bug Priority: Major - P3
Reporter: Tamara Ockhuijsen Assignee: Ross Lawley
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

After upgrading to Java driver 3.12 and using the Reactive Streams-based asynchronous driver, it is no longer possible to download files larger than 2 GB. The older version using GridFSDownloadStream supported this, but GridFSDownloadPublisherImpl is limited by int. The code below fails at allocate: when remaining exceeds the maximum int value 2147483647, Long.valueOf(remaining).intValue() can become negative, so the computed buffer size is negative too:

                    int byteBufferSize = Math.max(chunkSize, bufferSizeBytes);
                    byteBufferSize = Math.min(Long.valueOf(remaining).intValue(), byteBufferSize);
                    ByteBuffer byteBuffer = ByteBuffer.allocate(byteBufferSize);

Is it possible to use long instead of int for the byte buffer to avoid this?
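The narrowing problem can be reproduced in isolation. The sketch below is illustrative only: the class name BufferSizeSketch and both helper methods are hypothetical, the sizes are arbitrary, and clampedSize is one possible way to avoid the overflow, not necessarily the change that was committed for this ticket.

```java
import java.nio.ByteBuffer;

public class BufferSizeSketch {
    // Mirrors the reported computation: the long is narrowed to int
    // before the minimum is taken, so large values wrap around.
    static int narrowingSize(long remaining, int chunkSize, int bufferSizeBytes) {
        int byteBufferSize = Math.max(chunkSize, bufferSizeBytes);
        return Math.min(Long.valueOf(remaining).intValue(), byteBufferSize);
    }

    // Hypothetical alternative: take the minimum in long arithmetic first.
    // The result is never larger than byteBufferSize, so the cast is safe.
    static int clampedSize(long remaining, int chunkSize, int bufferSizeBytes) {
        long byteBufferSize = Math.max(chunkSize, bufferSizeBytes);
        return (int) Math.min(remaining, byteBufferSize);
    }

    public static void main(String[] args) {
        long remaining = 3L * 1024 * 1024 * 1024; // 3 GiB still to read (assumed value)
        int chunkSize = 255 * 1024;               // assumed chunk size
        int bufferSizeBytes = 4 * 1024 * 1024;    // assumed buffer size

        // Negative, so ByteBuffer.allocate would throw IllegalArgumentException.
        System.out.println(narrowingSize(remaining, chunkSize, bufferSizeBytes));

        // Bounded by the buffer size, so allocation succeeds.
        int size = clampedSize(remaining, chunkSize, bufferSizeBytes);
        ByteBuffer byteBuffer = ByteBuffer.allocate(size);
        System.out.println(byteBuffer.capacity());
    }
}
```

Note that a single ByteBuffer still cannot hold more than Integer.MAX_VALUE bytes, so a file above 2 GB has to be delivered as several buffers either way.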

See also related Jira where it was fixed to handle larger files in the old implementation of GridFSDownloadStreamImpl: https://jira.mongodb.org/browse/JAVA-2548



 Comments   
Comment by Tamara Ockhuijsen [ 11/Mar/20 ]

You are right, thanks Ross! It was indeed not the publisher that blocked the download. I managed to download 4 GB files with this fix.

Comment by Githook User [ 10/Mar/20 ]

Author:

{'name': 'Ross Lawley', 'username': 'rozza', 'email': 'ross.lawley@gmail.com'}

Message: Added Integer size check to the download publisher

JAVA-3650 JAVARS-227
Branch: master
https://github.com/mongodb/mongo-java-driver-reactivestreams/commit/d3dc13bbab582734b7387d109c65d6d75c08f944

Comment by Githook User [ 10/Mar/20 ]

Author:

{'username': 'rozza', 'name': 'Ross Lawley', 'email': 'ross.lawley@gmail.com'}

Message: Added Integer size check to the download publisher

JAVA-3650
Branch: master
https://github.com/mongodb/mongo-java-driver-reactivestreams/commit/7ba3dd32212dfc8dbfcaa5e26a777e0ef5957709

Comment by Githook User [ 10/Mar/20 ]

Author:

{'name': 'Ross Lawley', 'username': 'rozza', 'email': 'ross.lawley@gmail.com'}

Message: Added Integer size check to the download publisher

JAVA-3650
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/488cf07606301cbab784c8077404b6dfdd10841b

Comment by Githook User [ 10/Mar/20 ]

Author:

{'name': 'Ross Lawley', 'username': 'rozza', 'email': 'ross.lawley@gmail.com'}

Message: Added Integer size check to the download publisher

JAVA-3650
Branch: 4.0.x
https://github.com/mongodb/mongo-java-driver/commit/5640ad5aad67d32cf352e051c1346a6b3aac726b

Comment by Ross Lawley [ 10/Mar/20 ]

Hi tamara.ockhuijsen@klm.com,

I believe this should fix the issue. The byteBufferSize is used by the publisher, which can produce multiple ByteBuffers that together represent the file. For large files you can either stream these chunks into the output destination as they arrive, or recreate the whole file by concatenating the streamed chunks.
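Streaming the chunks to a destination can be sketched as follows. This is a minimal illustration using only the JDK: the class ChunkStreamingSketch and its writeAll method are hypothetical, and the Iterable of ByteBuffers stands in for the buffers a subscriber would receive from the publisher one at a time. The running total is a long, so it is not limited to 2 GB.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;
import java.util.List;

public class ChunkStreamingSketch {
    // Writes each buffer to the destination in arrival order and
    // returns the total number of bytes written as a long.
    static long writeAll(Iterable<ByteBuffer> buffers, WritableByteChannel destination) {
        long total = 0L;
        try {
            for (ByteBuffer buffer : buffers) {
                // A channel may accept fewer bytes than offered, so loop until drained.
                while (buffer.hasRemaining()) {
                    total += destination.write(buffer);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two small buffers stand in for the chunks emitted for one file.
        List<ByteBuffer> emitted = Arrays.asList(
                ByteBuffer.wrap(new byte[] {1, 2, 3}),
                ByteBuffer.wrap(new byte[] {4, 5}));

        // An in-memory sink is used here; a file channel would be the usual destination.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long written = writeAll(emitted, Channels.newChannel(sink));
        System.out.println(written + " bytes written");
    }
}
```

Because each buffer is written and released as it arrives, memory use stays bounded by the buffer size rather than by the file size.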

I hope that will meet your needs,

Ross

 

Comment by Tamara Ockhuijsen [ 10/Mar/20 ]

Thanks Ross for picking up this task so quickly. Besides checking the max size, do you also plan to support files that exceed the 2 GB that can be held in an integer?

Comment by Ross Lawley [ 10/Mar/20 ]

PR: https://github.com/rozza/mongo-java-driver/pull/374
