[JAVA-2169] ConcurrentPool writes to same connection buffer then keeps waiting Created: 07/Apr/16  Updated: 11/Sep/19  Resolved: 02/Oct/16

Status: Closed
Project: Java Driver
Component/s: Connection Management
Affects Version/s: 3.2.2
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Gaurav Shah Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Ubuntu 12


Attachments: Text File mongo-java-driver trace.txt    

 Description   

This would happen only when most/all writes are unacknowledged.

Each write goes to the last available connection, per this code:
mongo-java-driver-master/driver-core/src/main/com/mongodb/internal/connection/ConcurrentPool.java:130
`T t = available.pollLast();`

Each release appends the connection back to the end of the available deque:
mongo-java-driver-master/driver-core/src/main/com/mongodb/internal/connection/ConcurrentPool.java:84
`available.addLast(t);`
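
Taken together, the two calls make the pool a LIFO stack: the connection released most recently is the next one handed out. A minimal sketch of that pattern (illustrative class and method names, not the driver's actual code):

```java
import java.util.concurrent.ConcurrentLinkedDeque;

// Minimal sketch of the LIFO checkout/release pattern described above.
class LifoPool<T> {
    private final ConcurrentLinkedDeque<T> available = new ConcurrentLinkedDeque<>();

    T get() {
        // Always hands out the most recently released connection, so two
        // busy threads keep cycling through the same two sockets.
        return available.pollLast(); // null if the deque is empty
    }

    void release(T t) {
        available.addLast(t); // back onto the same end we poll from
    }
}
```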

Situation:
We are using the mongo java driver from JRuby and seeing lower performance than with the ruby driver.
`collection.updateOne` takes about 20 seconds on the client, but the mongo server logs show only 100ms as the time taken by the query.
A Java stack dump reveals that the threads spend most of their time in `com.mongodb.connection.SocketStream.write`.

Logical explanation:
Assume Mongo takes 10 seconds to execute a query.
You have 2 threads and a connection pool of size 100.

Thread 1 writes to the 100th connection's buffer - fire & forget.
Thread 2 writes to the 99th connection's buffer - fire & forget.
Thread 1 finishes writing to the buffer (unacknowledged write) and releases its connection back to the end of the pool.
Thread 2 finishes writing to the buffer (unacknowledged write) and releases its connection back to the end, after thread 1's.

Now thread 1 makes another write. `pollLast()` again hands it the connection at the end of the deque (originally the 99th). That connection's TCP send buffer still holds bytes that mongo has not yet read, but we write into the same buffer anyway, since the socket is still writable.
Thread 2 then writes to the 100th connection's buffer, which is likewise non-empty.

Eventually both send buffers fill up, and the mongo client just blocks in java.net.SocketOutputStream.socketWrite0.

Although the buffers of connections 1 to 98 are empty, they are never used: connections 99 and 100 are always the ones "available" at the end of the deque, even though their TCP send buffers are full.

Trace attached.

The ruby driver (we use 1.8.5) doesn't have this problem, since it picks a connection at random:
mongo-ruby-driver-1.8.5/lib/mongo/util/pool.rb
`socket = available[rand(available.length)]`
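
For comparison, a Java analogue of that random pick might look like the sketch below (illustrative only; a real fix would need the driver's growth and blocking logic):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative analogue of the Ruby driver's random selection, which
// spreads writes across all pooled sockets instead of the last two.
class RandomPool<T> {
    private final List<T> available = new ArrayList<>();

    synchronized T get() {
        // A real pool would block or open a new connection when empty.
        int i = ThreadLocalRandom.current().nextInt(available.size());
        return available.remove(i);
    }

    synchronized void release(T t) {
        available.add(t);
    }
}
```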



 Comments   
Comment by Gaurav Shah [ 09/Apr/16 ]

True, I do not see a way to check the state of the buffer from Java either. For now I have set minConnectionsPerHost and applied a patch that releases connections back to the pool at the beginning (instead of the end). This solves the problem for us. I will send a pull request after testing it for a couple of days.
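
(For reference, applied to the `LifoPool` sketch in the description, the patch described here would amount to something like the following; this is a sketch of the idea, not the actual pull request.)

```java
// Release to the *front* of the deque while pollLast() still takes from
// the back, so checkouts rotate through every pooled connection (FIFO)
// instead of repeatedly reusing the most recently released one (LIFO).
void release(T t) {
    available.addFirst(t); // was: available.addLast(t);
}
```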

Comment by Jeffrey Yemin [ 08/Apr/16 ]

Hi Gaurav,

I didn't mean to suggest that the proposal does not have merit, only that there may be a workaround that is acceptable. Given that the default stack size is 1MB for 64-bit JVMs, you could scale up the number of threads for what I would consider a modest amount of extra memory.

In your last response, you mentioned that the driver could check the state of the send buffer. I don't see a way to do that via Java's Socket API. Is there a technique that you're aware of that would allow the driver to do that?

Regards,
Jeff

Comment by Gaurav Shah [ 08/Apr/16 ]

Sorry for the confusion: the high CPU is with the old code, which means we were getting enough throughput.

With the new code, as you said, the CPU is low, but if I increase the application threads I would be wasting memory. It feels wrong to increase threads for this reason. The driver could be smarter by checking the state of the buffer: if it is more than 50% full, move on to the next connection.

Comment by Jeffrey Yemin [ 08/Apr/16 ]

If both threads are blocking on java.net.SocketOutputStream.socketWrite0, then CPU usage should be quite low. How are you measuring CPU usage for the application?

Comment by Gaurav Shah [ 08/Apr/16 ]

We have 20 machines, each running two threads. The machines are already at a CPU spike, so we cannot increase threads. If we add more machines we would be wasting money, since the throughput we are getting is good enough.

Comment by Gaurav Shah [ 08/Apr/16 ]

Checking whether the TCP buffer is full will also not help: the buffer could have a capacity of 4096 bytes, be filled only to 4080, and the next write could be 100 bytes, so the write would still block even though the buffer was not reported as full.

Comment by Gaurav Shah [ 08/Apr/16 ]

The application is CPU- and mongo-intensive with two threads; if I add more threads it might not work well on the CPU end of the application.

Comment by Jeffrey Yemin [ 08/Apr/16 ]

I see. What about using multiple threads to spread the write load across more sockets?

Comment by Gaurav Shah [ 08/Apr/16 ]

Hi Jeff,

The complete application uses only unacknowledged writes

Comment by Jeffrey Yemin [ 08/Apr/16 ]

Hi Gaurav,

Have you considered using a dedicated MongoClient instance for the application's unacknowledged writes?
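
(A sketch of that suggestion with the 3.x API; the database and collection names are hypothetical.)

```java
import com.mongodb.MongoClient;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class DedicatedUnackedWrites {
    public static void main(String[] args) {
        // A separate client whose connection pool serves only the
        // fire-and-forget writes, isolating them from acknowledged traffic.
        MongoClient unackedClient = new MongoClient("localhost");
        MongoCollection<Document> events = unackedClient
                .getDatabase("mydb")        // hypothetical database name
                .getCollection("events")    // hypothetical collection name
                .withWriteConcern(WriteConcern.UNACKNOWLEDGED);

        // Returns without waiting for a server acknowledgment.
        events.insertOne(new Document("k", "v"));
        unackedClient.close();
    }
}
```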

Comment by Gaurav Shah [ 08/Apr/16 ]

Changing the order of release alone will still not help, since the other connections are never even opened. So although the max pool size is 100, the current pool size will only be 2. Not sure how to get around that.
