Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: None
Component/s: None
Labels: None
Environment: Linux infrabox 2.6.28-11-server, i686, Rails 2.1.2, mongo-1.0.1
Sometimes, when we run a very large query from our Rails app, the Ruby driver sends only the beginning of the query and then starts waiting for the answer. The server, of course, is still waiting for the rest of the query, which never comes. This results in a deadlock.
Investigation leads to this method in connection.rb, inside the driver, around line 700:
    def send_message_on_socket(packed_message, socket)
      begin
        socket.send(packed_message, 0)
      rescue => ex
        close
        raise ConnectionFailure, "Operation failed with the following exception: #{ex}"
      end
    end
packed_message is a huge string, and the call to 'send' on 'socket' performs a partial write (it returns the number of bytes actually written).
The code implicitly assumes that the write is complete and never checks the return value, so the rest of the message is never sent. This explains the deadlock.
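To illustrate why the return value matters, here is a minimal sketch that forces a short write and compares the byte count with the payload size. It is not driver code: partial_write_demo is a hypothetical name, and it uses a UNIXSocket pair in non-blocking mode so the short write happens deterministically, whereas the report above concerns a blocking TCP socket. The payload size is an assumption chosen to exceed typical default socket buffers.

```ruby
require 'socket'

# Hypothetical demonstration, not driver code. The reader end is never
# drained, so the kernel buffer fills up and the write is cut short.
def partial_write_demo(size)
  reader, writer = UNIXSocket.pair
  payload = "x" * size
  # write_nonblock returns the number of bytes the kernel actually accepted.
  written = writer.write_nonblock(payload)
  [payload.bytesize, written]
ensure
  reader.close if reader
  writer.close if writer
end

requested, written = partial_write_demo(16 * 1024 * 1024)
puts "requested #{requested} bytes, kernel accepted #{written}"
```

If the second number is smaller than the first, any caller that ignores the return value has silently dropped the tail of the message.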
I modified the code to confirm that the write is indeed partial.
Moreover, if I replace the line
    socket.send(packed_message, 0)
with this:
    while packed_message.size > 0
      byte_sent = socket.send(packed_message, 0)
      packed_message.slice!(0, byte_sent)
    end
the issue is gone.
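One caveat with the loop above is that slice! modifies the caller's string, and size counts characters rather than bytes on Ruby 1.9 and later. A possible variant that leaves the message untouched and counts bytes is sketched below. send_all is a hypothetical helper name, not something the driver provides, and byteslice assumes Ruby 1.9.3 or newer.

```ruby
require 'socket'

# Hypothetical helper, not part of the driver: keep calling send until
# every byte of the message has been handed to the kernel.
def send_all(socket, message)
  total = message.bytesize
  sent = 0
  while sent < total
    # send returns the number of bytes accepted, which may be fewer than requested.
    sent += socket.send(message.byteslice(sent, total - sent), 0)
  end
  sent
end
```

Inside send_message_on_socket, the call socket.send(packed_message, 0) would then become send_all(socket, packed_message), with the existing rescue clause left in place.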
I'm not an expert on sockets. It is ambiguous whether it is legitimate to assume that a write must be complete when the socket is not in non-blocking mode:
http://www.opengroup.org/onlinepubs/009695399/functions/send.html
There is some controversy on the topic; Google turned up this:
http://fixunix.com/unix/529666-posixs-send-non-blocking-sockets-may-partially-write.html