[DOCS-12262] Do we / should we document $pull performance considerations? Created: 12/Dec/18  Updated: 30/Oct/23

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Improvement Priority: Major - P3
Reporter: James Wahlin Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to DOCS-3219 document functional limitations and b... Closed
Participants:
Time since last reply: 1 year, 14 weeks, 2 days
Epic Link: DOCSP-1769

 Description   

Description

Scope of changes

Impact to Other Docs

MVP (Work and Date)

Resources (Scope or Design Docs, Invision, etc.)



 Comments   
Comment by Education Bot [ 31/Oct/22 ]

Hello! This ticket has been closed due to inactivity. If you believe this ticket is still important, please reopen it and leave a comment to explain why. Thank you!

Comment by Brian Samek [ 13/Dec/18 ]

The simplest solution is a general purpose warning that performing update operations on large arrays (i.e. an array whose size on disk is > `n`) may result in large oplog entries for each document modified, which in turn can create memory pressure as those entries sit in memory. This effect may be pronounced on systems which are already close to capacity on memory usage.

That works for me. A note: I don't think it's just the array's size on disk that can trigger this. What matters is both the array's size and the number of update operations.
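
The point that both dimensions matter can be sketched with a rough back-of-envelope model. This is plain Python for illustration only, not MongoDB code: `approx_oplog_bytes` is a hypothetical helper, and JSON sizing is only an approximation, since real oplog entries are BSON and carry extra metadata.

```python
import json

def approx_oplog_bytes(array, num_updates):
    """Rough total oplog volume if each update re-logs the full array.

    Illustrative model only: assumes a $pull that rewrites the array is
    logged roughly as a $set of the whole resulting array, so every
    update pays the full array size.
    """
    entry = {"op": "u", "o": {"$set": {"tags": array}}}
    return len(json.dumps(entry)) * num_updates

small = ["x"] * 10
large = ["x"] * 100_000

# Both dimensions matter: one update of a huge array, or many updates of
# a modest array, can each produce substantial oplog volume.
print(approx_oplog_bytes(large, 1))      # one jumbo entry
print(approx_oplog_bytes(small, 10_000)) # many small-but-repeated entries
```

The takeaway is that a warning keyed only to on-disk array size misses the second axis: total oplog pressure scales with array size multiplied by the number of update operations.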

Comment by Asya Kamsky [ 13/Dec/18 ]

This used to be something that we explicitly mentioned in some of the training ... 

Don't know if we still do...


Comment by Ravind Kumar (Inactive) [ 13/Dec/18 ]

So based on comments in HELP-8388 and SERVER-9784, it seems like the core issue is that some update operations on arrays result in an oplog entry that includes the full array. These 'jumbo' entries have a few consequences:

  • Memory pressure as these jumbo entries are created, increasing with the number of large arrays updated
  • Since oplog entries can be very large, they could fill the oplog rapidly and push older entries off. This might be an issue if you had previously sized your oplog for 24 hours of data, but large entries now cause it to cover a much shorter window.
  • 4.0+ is supposed to grow the oplog to keep the last majority commit point from falling off. These sorts of operations might trigger that behavior, such that the oplog itself suddenly begins to grow in size (possibly significantly).
  • Not all array update operators exhibit this behavior. Per SERVER-9784, for example, the $push operator only exhibits it if the update results in a change in array order (e.g. $push with $position, $sort, or $slice).
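
As a rough illustration of that last bullet, here is a toy Python model of why a plain $push can be logged cheaply while a reordering $push cannot. `simulated_oplog_entry` is hypothetical and does not reflect MongoDB's actual oplog format; it only captures the incremental-vs-full-array distinction.

```python
import json

# Illustrative assumption (not MongoDB internals): a plain $push appends
# in place and can be logged as a small positional $set, while $push with
# $position/$sort/$slice may reorder or truncate the array, forcing the
# oplog entry to carry the full resulting array.

def simulated_oplog_entry(array, value, position=None):
    if position is None:
        # Plain append: log only the new element at its index.
        new_index = len(array)
        return {"$set": {f"tags.{new_index}": value}}
    # Reordering variant: log the entire rewritten array.
    rewritten = array[:position] + [value] + array[position:]
    return {"$set": {"tags": rewritten}}

tags = ["a"] * 1000
plain = simulated_oplog_entry(tags, "z")
positional = simulated_oplog_entry(tags, "z", position=0)
print(len(json.dumps(plain)) < len(json.dumps(positional)))  # True
```

Under this model the jumbo-entry cost appears only for the reordering variants, which matches the SERVER-9784 observation above.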

We'd need to do some testing to figure out which array operators exhibit this behavior. I'm also concerned about how to define 'large' - it seems like we'd need to be specific that it's not the number of elements in the array, but the size on disk of those elements, that causes an issue. Exactly where that threshold lies might also depend on the host hardware, e.g. one with more memory might better withstand the memory pressure of those kinds of operations.

The simplest solution is a general purpose warning that performing update operations on large arrays (i.e. an array whose size on disk is > `n`) may result in large oplog entries for each document modified, which in turn can create memory pressure as those entries sit in memory. This effect may be pronounced on systems which are already close to capacity on memory usage.

The bigger task would be to add some notes, possibly to the Production Notes section, about large arrays and their consequences, perhaps drawing on AskAsya -> Large Embedded Arrays.

brian@mongodb.com asya I'd appreciate your thoughts on the above.


Generated at Thu Feb 08 08:04:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.