Details
Type: Bug
Resolution: Done
Priority: Minor - P4
Version: 1.4
Environment:
mongo java driver 1.4
mongo server 1.4.2
default settings for max connections, etc.
Description
While investigating a service that was using 100% of available CPU yet accomplishing very little, I saw that most threads had the following at the top of their Java stack traces:
at java.lang.Thread.sleep(Thread.java)
at com.mongodb.util.ThreadUtil.sleep(ThreadUtil.java:37)
at com.mongodb.util.SimplePool._get(SimplePool.java:162)
at com.mongodb.util.SimplePool.get(SimplePool.java:106)
at com.mongodb.util.SimplePool.get(SimplePool.java:95)
at com.mongodb.ByteEncoder.get(ByteEncoder.java:66)
at com.mongodb.DBMessage.<init>(DBMessage.java:52)
at com.mongodb.DBApiLayer$MyCollection.find(DBApiLayer.java:282)
at com.mongodb.DBCursor._check(DBCursor.java:253)
at com.mongodb.DBCursor._hasNext(DBCursor.java:374)
at com.mongodb.DBCursor.hasNext(DBCursor.java:399)
Our software was running with more threads than connections (maybe ~2x as many) and was doing a fair mix of reads and writes in mongo. Our reads typically cursor over many objects. When we write to mongo, we always call resetError() and getLastError() to verify the write.
I read through the 1.4 Java driver code and saw that for every DBMessage created (in our case, many thousands per second), the thread enters what I consider a busy wait (despite a 15 millisecond Thread.sleep()) while it waits for a ByteEncoder. This ended up accounting for a significant percentage of our overall CPU load.
Have you considered using wait() and notify()? Or different ByteEncoder creation/allocation strategies (such as ThreadLocal)?
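To make the two suggestions concrete, here is a rough sketch of what I have in mind. It is not the driver's code: BlockingPool and PerThreadBuffers are made-up names, and java.nio.ByteBuffer is only a stand-in for ByteEncoder. The first class parks waiting threads with wait() and wakes one with notify() instead of sleeping and re-checking; the second avoids the shared pool entirely by keeping one reusable buffer per thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical bounded pool (not the driver's SimplePool). */
class BlockingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxSize;
    private int created = 0;

    BlockingPool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    /** Blocks until an object is free; the thread is parked, not polling. */
    synchronized T get() throws InterruptedException {
        // Loop guards against spurious wakeups and against another
        // thread taking the object before this one reacquires the monitor.
        while (idle.isEmpty() && created >= maxSize) {
            wait(); // releases the monitor until done() calls notify()
        }
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        created++;
        return factory.get();
    }

    /** Returns an object to the pool and wakes one waiting thread. */
    synchronized void done(T obj) {
        idle.push(obj);
        // All waiters wait on the same condition, so waking one suffices.
        notify();
    }
}

/** Hypothetical per-thread alternative: no shared pool, so no contention. */
class PerThreadBuffers {
    // ByteBuffer stands in for ByteEncoder; the 1024-byte size is arbitrary.
    private static final ThreadLocal<ByteBuffer> BUF =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(1024));

    /** Returns this thread's buffer, reset for reuse. */
    static ByteBuffer get() {
        ByteBuffer b = BUF.get();
        b.clear();
        return b;
    }
}
```

The tradeoff with the ThreadLocal approach is memory: each thread that ever encodes a message holds on to its own buffer, which may or may not be acceptable depending on buffer size and thread count.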