Type: Bug
Resolution: Done
Priority: Critical - P2
Affects Version/s: None
Component/s: None
Using 2.0.2, but I believe this is still an issue in master.
The problem is here:
  def read(length)
    handle_errors do
      data = read_from_socket(length)
      while data.length < length
        data << read_from_socket(length - data.length)
      end
      data
    end
  end
read_from_socket does not check whether the socket is half-closed when it returns 0 bytes, so the loop spins on the read, leading to 100% CPU usage and no progress.
We diagnosed this by noticing the CPU spike and stracing the process, which showed it spinning on read calls that each returned 0. We correlated the fd used in read with its description in procfs, then grepped lsof for that socket id. lsof reported the status "can't identify protocol", which is the smoking gun for a half-closed TCP connection.
I searched for this issue here, but did not find any others.
In the 1.x-stable branch, the driver raises a ConnectionFailure if a read returns zero bytes:
  def receive_data(length, socket)
    message = new_binary_string
    socket.read(length, message)
    raise ConnectionFailure, "connection closed" unless message && message.length > 0
    if message.length < length
      chunk = new_binary_string
      while message.length < length
        socket.read(length - message.length, chunk)
        raise ConnectionFailure, "connection closed" unless chunk.length > 0
        message << chunk
      end
    end
    message
  end
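
For illustration, here is a minimal sketch of a read loop that fails fast when the peer closes, instead of spinning. It is not the driver's code and not a proposed patch: read_exactly and ConnectionClosedError are hypothetical names, and the sketch assumes an IO-like object that supports readpartial, which raises EOFError once the other end has closed.

```ruby
# Hypothetical error class for this sketch; the driver has its own error types.
class ConnectionClosedError < StandardError; end

# Read exactly `length` bytes from `io`, or raise if the peer closes first.
# Assumes `io` responds to readpartial (true for IO, TCPSocket, pipes).
def read_exactly(io, length)
  data = String.new(encoding: Encoding::BINARY)
  while data.bytesize < length
    begin
      # readpartial blocks until some data is available and raises
      # EOFError at end of stream, so the loop cannot spin on empty reads.
      chunk = io.readpartial(length - data.bytesize)
    rescue EOFError
      raise ConnectionClosedError,
            "connection closed after #{data.bytesize} of #{length} bytes"
    end
    data << chunk
  end
  data
end

# Usage with a pipe standing in for a socket whose peer has gone away.
reader, writer = IO.pipe
writer.write("hello")
writer.close
first = read_exactly(reader, 5)   # all five bytes are available
begin
  read_exactly(reader, 1)         # nothing left and the writer is closed
rescue ConnectionClosedError => e
  warn e.message
end
```

The point is only that every short read has to distinguish "no data yet" from "the stream has ended"; the 1.x code above does this by checking the chunk length, while this sketch relies on EOFError.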