[JAVA-591] Long-running tailable cursors consume too much memory Created: 29/Jun/12  Updated: 29/Jan/15  Resolved: 27/Oct/14

Status: Closed
Project: Java Driver
Component/s: Query Operations
Affects Version/s: 2.7.2
Fix Version/s: 2.13.0

Type: Bug Priority: Major - P3
Reporter: Adam Warski Assignee: Jeffrey Yemin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related

 Description   

I already posted this on the mailing list; it didn't receive any replies, but I still think it's a bug. Please correct me if I'm wrong.

A tailable cursor on a capped collection can potentially live for a long time and return many batches of data.
However, it seems that the size of each batch read from MongoDB is appended to the list DBApiLayer.Result._sizes in the DBApiLayer.Result.init method, which is called from the _advance method.

The list is available via DBCursor.getSizes().

If the list keeps growing, especially when a lot of data is being added to the capped collection, the application will eventually run out of memory.

Is that right?
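
To make the scenario concrete, here is a minimal sketch of a long-running tail, assuming a 2.x driver version that has the MongoClient class, a mongod on localhost, and a pre-existing capped collection named "events" in the "test" database (all of these names are placeholders). Each server batch appends one entry to the list returned by getSizes(), so on a busy capped collection the list grows without bound.

import com.mongodb.Bytes;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class TailableCursorSketch {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient();  // assumes a mongod on localhost:27017
        // "test" and "events" are placeholder names; "events" is assumed to be a capped collection
        DBCollection events = mongo.getDB("test").getCollection("events");

        DBCursor cursor = events.find()
                .addOption(Bytes.QUERYOPTION_TAILABLE)
                .addOption(Bytes.QUERYOPTION_AWAITDATA);

        while (cursor.hasNext()) {
            DBObject doc = cursor.next();
            // Every batch fetched from the server adds one entry to this list,
            // so for a long-lived tail it grows without bound.
            System.out.println("batches fetched so far: " + cursor.getSizes().size());
        }
        mongo.close();
    }
}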

UPDATE:

Added DBCursor.disableBatchSizeTracking() to work around this problem.
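
Continuing the sketch above (same placeholder collection), usage of the workaround would look roughly like this; the method is available from driver 2.13.0, per this issue.

DBCursor cursor = events.find()
        .addOption(Bytes.QUERYOPTION_TAILABLE)
        .addOption(Bytes.QUERYOPTION_AWAITDATA);
cursor.disableBatchSizeTracking();  // new in 2.13.0; the per-batch sizes list no longer accumulates

while (cursor.hasNext()) {
    DBObject doc = cursor.next();
    // process doc; memory stays bounded even for a very long-running tail
}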



 Comments   
Comment by Jeffrey Yemin [ 29/Jan/15 ]

2.13.0 has been released. Closing issue.

Comment by Githook User [ 27/Oct/14 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: Added a way to disable batch size tracking on DBCursor in order to keep the list that tracks each batch size from growing indefinitely (a problem mostly for high-volume tailable cursors).

JAVA-591
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/c1e298005c0319a92d2e61ec874ae1c2ee8a73c7

Comment by Adam Warski [ 02/Jul/12 ]

No, I'm not seeing this problem in practice yet, though the possibility is disturbing, especially since tailable cursors may live for a very long time, right? (I'm not an expert on MongoDB, so I might be missing something here.)

I think using a list which holds the last N batch sizes would be best.

Comment by Jeffrey Yemin [ 02/Jul/12 ]

Are you seeing this problem in practice? It would take a lot of batches before the array reaches any significant size.

A couple of options I can see:

  • Add a clearSizes method to DBCursor so that you can control the size of the list.
  • Change the contract of getSizes so that it returns only the last N batch sizes instead of growing indefinitely (see the sketch below).
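
As an illustration only, a hypothetical bounded holder for the last N batch sizes (the second option above) might look like the following. This is not what the driver ultimately shipped; the fix instead added DBCursor.disableBatchSizeTracking().

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: keeps only the last N batch sizes.
class BoundedBatchSizes {
    private final int maxEntries;
    private final Deque<Integer> sizes = new ArrayDeque<Integer>();

    BoundedBatchSizes(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    void add(int batchSize) {
        if (sizes.size() == maxEntries) {
            sizes.removeFirst();  // evict the oldest batch size
        }
        sizes.addLast(batchSize);
    }

    List<Integer> getSizes() {
        return new ArrayList<Integer>(sizes);  // snapshot of at most the last N entries
    }
}
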
Comment by Adam Warski [ 29/Jun/12 ]

The forum entry is:
https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/bE6j_u45y2w
