[SERVER-80928] KafkaPartitionConsumer should have a byte limit on prefetching. Created: 09/Sep/23  Updated: 04/Dec/23  Resolved: 04/Dec/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sandeep Dhoot Assignee: Kunaal Kumar
Resolution: Fixed Votes: 0
Labels: init-337-m3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-83187 memoryUsageBytes should accurately re... Closed
Assigned Teams:
Atlas Streams
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sprint 35, Sprint 36, Sprint 37
Participants:

 Description   

Currently KafkaPartitionConsumer uses `maxNumDocsToPrefetch` to decide how many docs to prefetch, so the limit is a document count rather than a byte count.

Instead, we should make it prefetch up to 10 batches, each of size `kDataMsgMaxByteSize`. Note that it already emits batches of size `kDataMsgMaxByteSize`.
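
A rough sketch of the proposed byte-based limit follows. It is written in Python for brevity and is not the server's C++ code. `BytePrefetchBuffer`, `try_push`, `pop_batch`, and the 16 MiB value given to `DATA_MSG_MAX_BYTE_SIZE` are all hypothetical stand-ins; the ticket names `kDataMsgMaxByteSize` but does not state its value.

```python
from collections import deque

# Placeholder value for illustration only; the real kDataMsgMaxByteSize may differ.
DATA_MSG_MAX_BYTE_SIZE = 16 * 1024 * 1024
MAX_PREFETCH_BATCHES = 10
MAX_PREFETCH_BYTES = MAX_PREFETCH_BATCHES * DATA_MSG_MAX_BYTE_SIZE


class BytePrefetchBuffer:
    """Prefetch buffer bounded by total bytes instead of document count."""

    def __init__(self, max_bytes=MAX_PREFETCH_BYTES):
        self.max_bytes = max_bytes
        self.buffered_bytes = 0
        self._docs = deque()

    def try_push(self, doc: bytes) -> bool:
        """Buffer doc if it fits; return False so the caller can pause fetching."""
        size = len(doc)
        # An empty buffer always admits one doc, so a single oversized
        # document cannot stall the consumer forever.
        if self._docs and self.buffered_bytes + size > self.max_bytes:
            return False
        self._docs.append(doc)
        self.buffered_bytes += size
        return True

    def pop_batch(self, batch_max_bytes=DATA_MSG_MAX_BYTE_SIZE):
        """Remove and return the next batch, capped at batch_max_bytes."""
        batch = []
        batch_bytes = 0
        while self._docs:
            size = len(self._docs[0])
            # Always emit at least one doc per batch, even if it is oversized.
            if batch and batch_bytes + size > batch_max_bytes:
                break
            batch.append(self._docs.popleft())
            batch_bytes += size
        self.buffered_bytes -= batch_bytes
        return batch
```

In this sketch the fetch loop would stop pulling from the partition whenever `try_push` returns False and resume after `pop_batch` frees space. `buffered_bytes` is also the kind of number a consumer could report if it tracked its memory usage.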



 Comments   
Comment by Aadesh Patel (Inactive) [ 13/Nov/23 ]

Yeah, agreed. We'll want to track Kafka consumer memory usage, as well as change stream memory usage, since the change stream max buffer size is now 200k docs.

Comment by Sandeep Dhoot [ 13/Nov/23 ]

Very good question! The answer is most likely yes. But I think we can err on the side of keeping things simple for now and let the stream processor have higher memory usage in this case. We will hopefully have per-tenant feature flags very soon which we can use to temporarily mitigate any production issues that occur because of this.

But aadesh.patel@mongodb.com this makes me think KafkaConsumer should also report its memory usage to MemoryAggregator. I was previously of the opinion that it does not have to, but probably it does.

Comment by Aadesh Patel (Inactive) [ 13/Nov/23 ]

kunaal.kumar@mongodb.com sandeep.dhoot@mongodb.com Do we want to enforce a global limit across all partitions, for the case where we have a topic with 100s of partitions?

Comment by Sandeep Dhoot [ 10/Nov/23 ]

Thank you! 

Comment by Kunaal Kumar [ 10/Nov/23 ]

Sounds good - I'll look into that as well.

Comment by Sandeep Dhoot [ 10/Nov/23 ]

We want to do this in ChangeStreamOperator as well.
