[SERVER-31755] Raise intermediate $lookup document size to 100MB, and make it configurable Created: 27/Oct/17  Updated: 30/Oct/23  Resolved: 14/Dec/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 3.6.11, 4.0.6, 4.1.7

Type: Bug Priority: Major - P3
Reporter: Brian Samek Assignee: Brigitte Lamarche (Inactive)
Resolution: Fixed Votes: 1
Labels: storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Documented
is documented by DOCS-12573 Document internalLookupStageIntermedi... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Sprint: Query 2018-12-17, Query 2018-12-31
Participants:
Case:

 Description   
Issue Status as of Jan 24, 2019

FEATURE DESCRIPTION

The maximum size of a document during an intermediate $lookup stage will now be configurable with a setParameter called internalLookupStageIntermediateDocumentMaxSizeBytes. The default value is 100 MB. This parameter has a minimum value of 16 MB, the maximum document size allowed for a document being returned to the user.

RATIONALE

In general, documents are allowed to grow beyond the 16MB limit as they move through the pipeline--as long as the final documents which we serialize and send over the wire are within the limit.

OPERATION

During runtime, this parameter can be set on each mongod using the following command:

db.adminCommand({setParameter: 1, internalLookupStageIntermediateDocumentMaxSizeBytes: <long>})

On startup, each mongod can set this parameter with the following flag:

--setParameter internalLookupStageIntermediateDocumentMaxSizeBytes=<long>

To persist this setting in the configuration file, use the setParameter option:

setParameter:
   internalLookupStageIntermediateDocumentMaxSizeBytes: <long>
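
As a rough illustration of the runtime command above, the following sketch builds the setParameter command document and rejects values below the 16 MB minimum before anything is sent to the server. The helper name buildLookupLimitCommand and the 200 MB example value are hypothetical, and the sketch assumes "MB" means 1024 * 1024 bytes; the server performs its own validation regardless.

```javascript
// Assumption: "MB" in this ticket means 1024 * 1024 bytes.
const MB = 1024 * 1024;
const MIN_LIMIT_BYTES = 16 * MB; // minimum value stated in the feature description

// Hypothetical helper (not part of MongoDB): builds the command document
// for internalLookupStageIntermediateDocumentMaxSizeBytes.
function buildLookupLimitCommand(limitBytes) {
  if (!Number.isInteger(limitBytes) || limitBytes < MIN_LIMIT_BYTES) {
    throw new RangeError(`limit must be an integer >= ${MIN_LIMIT_BYTES} bytes`);
  }
  return {
    setParameter: 1,
    internalLookupStageIntermediateDocumentMaxSizeBytes: limitBytes,
  };
}

// Example: raise the limit to 200 MB (illustrative value only).
const cmd = buildLookupLimitCommand(200 * MB);

// In the mongo shell, the document would then be passed to each mongod:
//   db.adminCommand(cmd)
```

The same byte value can be used with the --setParameter startup flag or in the configuration file.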

FIX VERSIONS

This feature will be included in MongoDB 3.6.11 and MongoDB 4.0.6.

Original description

In 3.4.7, $lookup appears to enforce a 16MB limit on intermediate documents. An aggregation with a $lookup as an intermediate stage returned the error raised here: https://github.com/mongodb/mongo/blob/r3.4.7/src/mongo/db/pipeline/document_source_lookup.cpp#L159-L163.

To address this, a new setParameter, internalLookupStageIntermediateDocumentMaxSizeBytes, has been added to control the maximum size of a document in an intermediate $lookup stage.

 



 Comments   
Comment by Githook User [ 22/Jan/19 ]

Author: Ian Boros (ian.boros@10gen.com)

Message: SERVER-31755 Create intermediate $lookup stage document size limit knob
Branch: v3.6
https://github.com/mongodb/mongo/commit/128c43eb4ab29a1477e51d6ff9ef1517df4376f6

Comment by Githook User [ 22/Jan/19 ]

Author: Ian Boros (ian.boros@10gen.com)

Message: SERVER-31755 Create intermediate $lookup stage document size limit knob
Branch: v4.0
https://github.com/mongodb/mongo/commit/1adb99ed363acc49f957d6106a0ee7824c3e94dd

Comment by Steve Gaylord [ 16/Jan/19 ]

Thank you Asya for that information.  Any timeline on when 3.6.11 might be released?  I think that is the next dot release you are referring to.

Comment by Asya Kamsky [ 16/Jan/19 ]

sgaylord@facilitron.com this fix has been approved for backport to 4.0 and 3.6, and the work is in progress. Unfortunately, the backport won't be in the upcoming 3.6 dot release, as that release is already in code freeze. It should be in the dot release after the next one.

Comment by Steve Gaylord [ 14/Jan/19 ]

Is there a timeline to "backport" this fix to 3.6?  We are on Atlas 3.6.9 and have been working to try to optimize our queries and continue to run into this where the $unwind and then $group basically just kills the work we are doing to optimize.

Comment by Githook User [ 11/Jan/19 ]

Author: Ian Boros (ian.boros@10gen.com)

Message: Revert "SERVER-31755 Create intermediate $lookup stage document size limit knob"

This reverts commit 277e40c1fc64c51c2407aa1371e8b925b6349c40.
Branch: v4.0
https://github.com/mongodb/mongo/commit/a3efc0dc6639c164ad35c57d8ce077ed6461aa1b

Comment by Githook User [ 11/Jan/19 ]

Author: Ian Boros (ian.boros@10gen.com)

Message: SERVER-31755 Create intermediate $lookup stage document size limit knob
Branch: v4.0
https://github.com/mongodb/mongo/commit/277e40c1fc64c51c2407aa1371e8b925b6349c40

Comment by Brigitte Lamarche (Inactive) [ 19/Dec/18 ]

The maximum size of a document during an intermediate $lookup stage will now be configurable with a setParameter called internalLookupStageIntermediateDocumentMaxSizeBytes. The default value is 100 MB. This parameter has a minimum value of 16 MB, the maximum document size allowed for a document being returned to the user.

It can be set on the server thusly:

--setParameter internalLookupStageIntermediateDocumentMaxSizeBytes=<long>

 

Comment by Githook User [ 14/Dec/18 ]

Author: Brigitte Lamarche (bLamarche413, thesisiwbl@gmail.com)

Message: SERVER-31755 Create intermediate $lookup stage document size limit knob
Branch: master
https://github.com/mongodb/mongo/commit/dfc46be8d31620ed662898ee01bf473246ae49d3

Comment by Asya Kamsky [ 03/Dec/18 ]

> are you saying that you should always do the $unwind right after the $lookup even if you need the looked up documents in an array? 

I'm definitely not saying that, sorry if that wasn't clear. If you need the document with the looked-up data as an array, then it's probably smaller than 16MB anyway.

Comment by Steve Gaylord [ 03/Dec/18 ]

Thank you Asya. Is there any documentation that I should know about for these types of best practices?

Also are you saying that you should always do the $unwind right after the $lookup even if you need the looked up documents in an array? So something like $lookup then $unwind then $group?

Also, are there any global things I can do so I don't have to rewrite them all?

Thanks,
Steve


Comment by Asya Kamsky [ 01/Dec/18 ]

sgaylord@facilitron.com yes, the correct order would be to have $unwind immediately after $lookup - it will get incorporated into the $lookup stage and you won't hit the limit.

In general it is not a good idea to use $project to reduce the size of documents, as normally the pipeline analyzes which fields it needs by itself and explicit $project stages can defeat this (and other) optimizations.
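
A minimal sketch of the stage ordering described above, with $unwind placed directly after $lookup. The collection and field names (orders, items, orderId) are hypothetical, and lookupsAreUnwound is an illustrative helper, not a MongoDB API; it only inspects the pipeline array and does not contact a server.

```javascript
// Hypothetical collection and field names, for illustration only.
const pipeline = [
  { $lookup: { from: "items", localField: "_id", foreignField: "orderId", as: "items" } },
  // $unwind on the "as" field directly after $lookup, as recommended above.
  { $unwind: "$items" },
];

// Hypothetical check: returns true when every $lookup stage is directly
// followed by a $unwind on that $lookup's "as" field.
function lookupsAreUnwound(stages) {
  return stages.every((stage, i) => {
    if (!("$lookup" in stage)) return true;
    const next = stages[i + 1];
    if (!next || !("$unwind" in next)) return false;
    // $unwind accepts either a path string or a document with a "path" field.
    const path = typeof next.$unwind === "string" ? next.$unwind : next.$unwind.path;
    return path === "$" + stage.$lookup.as;
  });
}

// In the mongo shell, the pipeline would be run as:
//   db.orders.aggregate(pipeline)
```

A pipeline that places another stage, such as $project, between $lookup and $unwind would fail this check, which matches the pattern reported as hitting the limit.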

 

Comment by Steve Gaylord [ 30/Nov/18 ]

We are attempting to upgrade from 3.2 to 3.6.8 on Atlas and this issue is hitting us to the point where I don't know if we can make the move.  We have many aggregates where we do a $lookup and then a $project to reduce the size of the documents and then do a $unwind, and many of these cause this error.  So, we would have to change all these to move the $unwind to directly after the $lookup and then adjust the rest of the pipeline.

Are there any more "global" type changes that can be done to help with this issue?

Comment by David Storch [ 02/Jan/18 ]

I'm not convinced that this is "Works as Designed". In general, documents are allowed to grow beyond the 16MB limit as they move through the pipeline---as long as the final documents which we serialize and send over the wire are within the limit. I'm re-opening this ticket for triage.

Comment by Brian Samek [ 03/Nov/17 ]

> This happens when you don't have an $unwind immediately following $lookup, is that correct?

Correct.

> Or is it a single $lookup on one source document that matches 16MBs?

The $lookup finds a large number of documents.

Comment by Asya Kamsky [ 03/Nov/17 ]

This happens when you don't have an $unwind immediately following $lookup, is that correct?

Or is it a single $lookup on one source document that matches 16MBs?

Generated at Thu Feb 08 04:28:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.