[JAVA-221] Try to remove synchronized in DBApiLayer finalize Created: 28/Nov/10  Updated: 22/Dec/10  Resolved: 13/Dec/10

Status: Closed
Project: Java Driver
Component/s: Performance
Affects Version/s: None
Fix Version/s: 2.4

Type: Improvement Priority: Major - P3
Reporter: Eliot Horowitz (Inactive) Assignee: Antoine Girbal
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by JAVA-207 performance impact with multiple conc... Closed

 Description   

See JAVA-207



 Comments   
Comment by auto [ 10/Dec/10 ]

Author:

{u'login': u'agirbal', u'name': u'agirbal', u'email': u'antoine@10gen.com'}

Message: JAVA-221: remove call to killcursor in finalize, to prevent blocking GC
https://github.com/mongodb/mongo-java-driver/commit/a25e594f5b7012a8c29843000b6e5dc2c3b53e59

Comment by auto [ 02/Dec/10 ]

Author:

{'login': 'agirbal', 'name': 'agirbal', 'email': 'antoine@10gen.com'}

Message: JAVA-221: kill cursor should not be called if id is 0
SUPPORT-76: several bugs found in input reader, which may read too much
/mongodb/mongo-java-driver/commit/677125648a00002b99462e2406ab8e6a0192ae6f

Comment by auto [ 02/Dec/10 ]

Author:

{'login': 'agirbal', 'name': 'agirbal', 'email': 'antoine@10gen.com'}

Message: JAVA-221: made finalize call killcursors right away
/mongodb/mongo-java-driver/commit/5a7f983f5dcd250b96749864575fed3762ba4fd7

Comment by auto [ 02/Dec/10 ]

Author:

{'login': 'agirbal', 'name': 'agirbal', 'email': 'antoine@10gen.com'}

Message: JAVA-221: Try to remove synchronized in DBApiLayer finalize
Addition of a background thread per database, that takes care of cleaning up dead cursors.
/mongodb/mongo-java-driver/commit/4f49990a7d304d9806cb6f80ef92c517fc3a449d

Comment by Eliot Horowitz (Inactive) [ 01/Dec/10 ]

Once a cursor is exhausted, the server closes automatically and we should not call kill cursors on it.

The reason .close() is so important is for people who do this

c = coll.find().sort( ... )

while ( c.next() )
if ( x )
break;

i..e they read some non-pre determined amount of data from the cursor.

some people can do thousands of these per second, which can cause problems on the sever.

Comment by Antoine Girbal [ 01/Dec/10 ]

ok just note that it will generate a lot more individual msg to servers.
Also if the thread interval is 1s, that should be short enough to prevent memory issue on server.
It depends how many cursors we can expect to potentially open in 1s.
Anyway I changed code to send msg per cursor.

On another note: isnt the cursor supposed to close itself on server side once exhausted?
My test does a cursor.toArray() which grabs all objects.
Then when cursor is finalized, driver sends a msg to server to kill cursor, but server prints out line:
Wed Dec 1 13:38:15 [conn61] killcursors: found 0 of 1
I've tested with old/new driver and with both the server prints out thousands of these lines.
Is this expected behavior, or should driver know not to send kill msg if it read all objs?

Comment by Eliot Horowitz (Inactive) [ 01/Dec/10 ]

The reason why close() exists is that its easy for a client to eat a ton of resources on the server in some cases, and its much better to get them closed asap.

Also its problematic if the JVM dies before the background thread cleans them up.

Most people don't call close explicitly, so I think its better for it to kill the cursor immediately.

Can always adjust later.

Comment by Antoine Girbal [ 01/Dec/10 ]

basically it's a tradeoff.
Solution A:

  • close() sends msg to server right away to close single cursor
  • only adds to deadCursorIds in case there was an exception during close, so that thread will take care of it later
  • this means that under normal operation, driver will send msg to DB upon close of every cursor

Solution B:

  • close() adds cursor to deadCursorIds
  • at regular interval (e.g. 1s) a background thread sends a msg to DB to close cursors, with multiple cursors in 1 msg
  • basically it reduces messaging and can be more scalable. Drawback is that there may be a slight delay before actual close (e.g. 1s)

Right now I chose solution B because it should be faster under load.
But if we need A for a reason I'm missing, let me know.

Comment by Eliot Horowitz (Inactive) [ 01/Dec/10 ]

Yes, cursor.close() should close immediately.

Comment by Scott Hernandez (Inactive) [ 01/Dec/10 ]

If I call close on the cursor I expect it to be closed immediately. What is the reason for not doing that?

Comment by Antoine Girbal [ 01/Dec/10 ]

committed patch earlier, commit ebf3c729ff070ba9bc9dd862fccc5a6d42ca0982
only difference with original plan is that cursor.close() or finalize() does not send msg to db by itself, instead it adds entry in dead cursor list which is a concurrent queue.
This is so that the cleaner thread can close dead cursors as a batch.
Concurrent queue as good concurrency speed so shouldnt be an issue.
Patched driver shows 10-20% speed improvement.

Comment by Eliot Horowitz (Inactive) [ 30/Nov/10 ]

Looks good.

Comment by Antoine Girbal [ 29/Nov/10 ]

implementation plan:

  • dead cursors are stored a synced list
  • cleanup() is only called by a cleanup thread. Right now that thread lives in the DBApiLayer class (looks like most specific code is in there)
  • find() does not call cleanup()
  • result.close() will try to close cursor by itself, if there is exception then it will add it to the list of dead cursor

The thread will check every 1s if there are cursors to be closed, should be good enough.
this should result in no syncing for queries under normal operations.
let me know if sounds good.

Comment by Eliot Horowitz (Inactive) [ 29/Nov/10 ]

Moving cleanCursor logic to a background thread would make sense.

Comment by Antoine Girbal [ 29/Nov/10 ]

ok trying to understand the principle of cursor cleanup.
it seems:
1) whenever find() is called, it will try to clean up cursors using cleanCursors()
2) when close() is called on a Result object, it syncs and adds its cursor in dead list. Then calls cleanCursors()
3) cleanCursors() does a sync to swap the dead cursors list to a new one

This means that:

  • 1 sync is done for every find()
  • 2 syncs are done for every close()

we can try to get rid of locks, the cleanest would be to use a background thread in charge of cleaning dead cursors.
In the spirit of this comment:
// check without synchronisation ( double check pattern will avoid having two threads do the cleanup )
// maybe the whole cleanCursor logic should be moved to a background thread anyway

It also has the advantage of not slowing user threads for cursor cleanup.

Generated at Thu Feb 08 08:51:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.