[SERVER-68845] BSONObjectTooLarge when $merge during aggregation Created: 08/Aug/22 Updated: 10/Mar/23 Resolved: 04/Oct/22
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | namhun song | Assignee: | Chris Kelly |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Sprint: | QE 2022-10-17 |
| Participants: | |
| Description |
| Comments |
| Comment by Mihai Andrei [ 04/Oct/22 ] |

As david.storch@mongodb.com pointed out, this is likely a manifestation of the linked issue.

Please feel free to reach out if you have any other questions!
| Comment by David Storch [ 21/Sep/22 ] |

I didn't get a chance to review this carefully, but I'm wondering if this is a manifestation of the linked issue.
| Comment by Chris Kelly [ 20/Sep/22 ] |

Namhun,

Specifically, I suspect an issue in your aggregation pipeline here. When I investigate this in Compass on a standalone MongoDB 5.0.5 instance, I see you are generating a potentially massive array of documents in your $lookup stage. Once enough documents reach this stage, that array can exceed the maximum document size. This is an anti-pattern.

For each record that ends up in your initial $match, an extra document is produced by your $lookup stage, and each document contains a data value that is itself another copy of the document. If enough documents reach this stage, then even though none of them is individually over 16MB, they are all being put into these nested data arrays, each of these documents grows, and you get the BSONObjectTooLarge error.
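In outline, the problematic shape looks something like the following. This is a minimal sketch only: the collection and field names ("events", "devices", "status", "deviceid", "data") are placeholders for illustration, not your actual schema.

```javascript
db.events.aggregate([
  { $match: { status: "active" } },
  // Every document that survives the $match collects a copy of each
  // related document into its "data" array. No input needs to be large
  // on its own: with enough matches, the array alone can push a single
  // result document past the 16MB BSON limit.
  { $lookup: {
      from: "devices",
      localField: "deviceid",
      foreignField: "deviceid",
      as: "data"
  } },
  { $merge: { into: "merged_results" } }
])
```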
Your end stages didn't seem to have anything obviously wrong with them. I'd encourage you to check this part of your pipeline, and to use explain() or MongoDB Compass to troubleshoot your query further. I will need more information from you (such as a code reproduction, and the output of your explain() when running the aggregation) to evaluate other possibilities here. I'm not exactly sure what your use case is, but you may find suggestions for using $lookup with $mergeObjects in our docs here. I also still encourage you to check out our MongoDB Developer Community Forums, as you will likely find further advice for improving your aggregation pipeline there. However, I do see that in our Aggregation Pipeline Limits docs, we assert that:

> Each document in the result set is subject to the 16 megabyte BSON Document Size limit. If any single document exceeds the BSON Document Size limit, the aggregation produces an error. The limit only applies to the returned documents.

And I think that this may be somewhat unclear in this case - I will follow up on that.

Regards,
Christopher
| Comment by namhun song [ 15/Sep/22 ] |

Hi, Chris

These days I am working on another problem, so I couldn't check this. The last thing I did for this issue was to add one more aggregation execution when the exception occurs; this helps somewhat, but not completely - I still get BSONObjectTooLarge even after the retry.

I have found some interesting logs. Our API runs every 10 minutes, and if it fails, it tries again after 10 minutes, up to a maximum of 6 attempts. So I can see the 6 failed logs; these are the error cases from before I added the retry on BSONObjectTooLarge. In the 6 logs, the stack traces are the same and the error messages are the same, but the 'BSONObj size' differs - see below. (Sorry, but I could not copy and paste the logs directly because of the company's security policy, so I typed them out.)

I can't be sure the logs I just posted mean anything, but it feels weird to me that the 'BSONObj size' reported by the error varies across the same queries. Anyway, I hope this is of some help.

By the way, what does your comment below mean? I think the intermediate document size does not have any limitation, because the Mongo doc says "The limit only applies to the returned documents." Am I wrong?

> the aggregation pipeline has a chance of incidentally creating a single document larger than the max size with your workload just due to the amount of objects being merged (not that any of them are individually large).

Take care
| Comment by Chris Kelly [ 07/Sep/22 ] |

Hi Namhun,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide additional information to reproduce what you're experiencing?
| Comment by Chris Kelly [ 16/Aug/22 ] |

Namhun,

Thanks for the response. From what I can tell, the aggregation pipeline has a chance of incidentally creating a single document larger than the max size with your workload, just due to the number of objects being merged (not that any of them are individually large). However, I'm not sure whether your Java workload differs from what you've reported. I am interested in your assertion that it errors out on the first attempt in your Java server but succeeds on a second attempt. To better investigate this, a more detailed reproduction featuring your use case would be helpful. Specifically, can you create a code reproduction, using the Java driver, that we can use to verify your claim? As it stands, I don't currently see a problem unless you are running into the issue I mentioned earlier.

Regards,
| Comment by namhun song [ 16/Aug/22 ] |

Thanks for your reply, chris.kelly@mongodb.com

I understand your answer, and I fully understand that it is possible to end up with a document over the 16MB limit when a field value is large (such as 10MB), as you mentioned. I was also already aware that a single document over 16MB causes this error.

https://www.mongodb.com/docs/manual/core/aggregation-pipeline-limits/

> Each document in the result set is subject to the 16 megabyte BSON Document Size limit. If any single document exceeds the BSON Document Size limit, the aggregation produces an error. The limit only applies to the returned documents.

However, in our database there is no string field over 10MB. And, as I mentioned in the question, the weird thing is that the query which raises BSONObjectTooLarge works fine when executed from a mongo client such as the mongo shell or 'NoSQLBooster for MongoDB' against the same dataset. Moreover, it also works when the query is executed twice: the first execution fails, and the second execution, right after the failure, completes without error.

Anyway, thanks for your reply; I will investigate more, too.

Take care
| Comment by Chris Kelly [ 15/Aug/22 ] |

Namhun,

The BSONObjectTooLarge error you're getting is likely due to the max document size restriction in MongoDB.

EDIT: I misunderstood initially - in your case, you are hitting this BSONObjectTooLarge error not because your documents start out being large, but potentially because of the $mergeObjects step in your pipeline:
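In outline, the pattern looks something like the following minimal sketch. The collection and field names ("events", "devices", "deviceid", "data") are placeholders, and the $replaceRoot/$arrayElemAt combination stands in for whatever merging step your pipeline actually uses:

```javascript
db.events.aggregate([
  { $lookup: {
      from: "devices",
      localField: "deviceid",
      foreignField: "deviceid",
      as: "data"
  } },
  // $mergeObjects folds the first joined document and the original
  // document ($$ROOT) into one result. If each side is ~10MB, the merged
  // document is ~20MB and the server rejects it as BSONObjectTooLarge,
  // even though every input document is under the 16MB limit.
  { $replaceRoot: {
      newRoot: { $mergeObjects: [ { $arrayElemAt: ["$data", 0] }, "$$ROOT" ] }
  } }
])
```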
If the amount of data you end up merging here is over 16MB, you will receive this error. This is because $mergeObjects combines multiple documents into a single document. I was able to reproduce this issue when the documents being merged at this stage were over 16MB combined (I did this by inflating the size of devicdid and appid to ~10MB, then merging). Since this is working as designed, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.

Regards,