[SERVER-77460] Handle large documents with nested objects in the bucket catalog Created: 24/May/23  Updated: 29/Oct/23  Resolved: 29/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Task Priority: Major - P3
Reporter: Shin Yee Tan Assignee: Shin Yee Tan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-81405 Account for field name map in flat bs... Closed
related to SERVER-81402 Add testing for flat bson memory usage Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Sprint: Execution NAMR Team 2023-07-24, Execution NAMR Team 2023-08-07, Execution NAMR Team 2023-09-18, Execution NAMR Team 2023-10-02, Execution NAMR Team 2023-10-16
Participants:
Case:

 Description   

The bucket catalog can exceed its memory limit. We observed this in a customer case involving large documents with many nested objects.

We should investigate how the bucket catalog's memory tracking underestimates the usage of the minmax and schema structures, so that the 6GB limit is actually respected.



 Comments   
Comment by Githook User [ 29/Sep/23 ]

Author: Shin Yee Tan <shinyee.tan@mongodb.com> (username: shinyeet)

Message: SERVER-77460 Calculate memory usage of FlatBSON
Branch: master
https://github.com/mongodb/mongo/commit/6297630d620bfec5c04b8adc69bb9894fe5a35f2

Comment by Shin Yee Tan [ 14/Aug/23 ]

Things I want to keep in mind when implementing memory usage tracking for FlatBsonStore:

  • Memory usage changes when we insert, update, and delete
  • We store data in an element for each entry in the store
  • There is additional memory usage in _fieldNameToIndex
  • We'll need to report FlatBsonStore memory usage to the bucket catalog
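
The four points above can be illustrated with a small model. This is only a sketch in Python, not the server's C++ code: the class names `TrackedFlatStore` and `CatalogMemoryCounter`, the `report` callback, and the `INDEX_BYTES` constant are all hypothetical, and the byte costs are simplified assumptions. It shows a store that adjusts a running total on insert, update, and delete, counts the field name map separately from the element data, and reports every delta to an outer counter.

```python
from typing import Callable, Dict

INDEX_BYTES = 8  # assumed cost of one index slot in the field name map


class CatalogMemoryCounter:
    """Hypothetical stand-in for the bucket catalog's memory total."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, delta: int) -> None:
        self.total += delta


class TrackedFlatStore:
    """Toy flat store that keeps a running byte count of what it holds."""

    def __init__(self, report: Callable[[int], None]) -> None:
        self._values: Dict[str, bytes] = {}
        self._report = report  # hypothetical hook that receives each delta
        self.memory_usage = 0

    @staticmethod
    def _entry_cost(name: str, value: bytes) -> int:
        name_len = len(name.encode("utf-8"))
        element_cost = name_len + len(value)   # data held in the element
        map_cost = name_len + INDEX_BYTES      # entry in the field name map
        return element_cost + map_cost

    def _apply(self, delta: int) -> None:
        self.memory_usage += delta
        self._report(delta)

    def insert(self, name: str, value: bytes) -> None:
        if name in self._values:
            raise KeyError(f"field already present: {name}")
        self._values[name] = value
        self._apply(self._entry_cost(name, value))

    def update(self, name: str, value: bytes) -> None:
        old = self._values[name]
        self._values[name] = value
        # Only the value size changes; the name and map entry stay the same.
        self._apply(len(value) - len(old))

    def remove(self, name: str) -> None:
        old = self._values.pop(name)
        self._apply(-self._entry_cost(name, old))


catalog = CatalogMemoryCounter()
store = TrackedFlatStore(catalog.add)
store.insert("a.b", b"\x01" * 10)  # (3 + 10) + (3 + 8) = 24 bytes
store.update("a.b", b"\x01" * 50)  # value grows by 40 bytes
print(store.memory_usage, catalog.total)  # 64 64
```

Reporting deltas rather than absolute totals lets the outer counter stay in sync without rescanning the store, which is one plausible way to satisfy the last bullet.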
Comment by Shin Yee Tan [ 18/Jul/23 ]

Had a chat with dan.larkin-york@mongodb.com. Currently, we track memory usage for the minmax and schema structures in the bucket catalog by estimating it upfront. Because the estimate is based on the size of the first measurement inserted into the bucket, we are not tracking actual memory usage, and we cannot account for memory changes when we update.

I'm going to take some time to get familiar with flat bson, but eventually we'll want to use the actual size of the objects in use to track bucket catalog memory usage.
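
To make the gap between an upfront estimate and actual usage concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the server's implementation: the functions `flatten`, `summary_size`, and `compare_estimate_to_actual` are hypothetical, the summary keeps only a per-field maximum as a stand-in for the minmax structure, and size is approximated as field path length plus value length.

```python
from typing import Dict, List, Tuple


def flatten(doc: dict, prefix: str = "") -> Dict[str, bytes]:
    """Flatten nested dicts into dotted field paths mapped to leaf values."""
    out: Dict[str, bytes] = {}
    for name, value in doc.items():
        path = f"{prefix}{name}"
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out


def summary_size(summary: Dict[str, bytes]) -> int:
    """Approximate bytes held: field path plus stored value, per field."""
    return sum(len(path.encode("utf-8")) + len(value)
               for path, value in summary.items())


def compare_estimate_to_actual(measurements: List[dict]) -> Tuple[int, int]:
    """Return (estimate taken after the first measurement, actual final size)."""
    summary: Dict[str, bytes] = {}
    estimate = None
    for doc in measurements:
        for path, value in flatten(doc).items():
            # Keep the largest value seen per field (a stand-in for max).
            if path not in summary or value > summary[path]:
                summary[path] = value
        if estimate is None:
            estimate = summary_size(summary)  # fixed after the first insert
    return (estimate or 0), summary_size(summary)


measurements = [
    {"a": b"x"},
    {"a": b"y", "meta": {"tags": b"zzzz"}},  # later doc adds a nested field
]
print(compare_estimate_to_actual(measurements))  # (2, 15)
```

In this toy case the second measurement introduces a nested field, so the actual size grows well past the figure fixed after the first insert. That is the kind of drift an upfront estimate cannot capture.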

Generated at Thu Feb 08 06:35:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.