- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: None
- None
Using 2.0.2, but I believe this is still an issue in master.
The problem is here:
    def read(length)
      handle_errors do
        data = read_from_socket(length)
        while data.length < length
          data << read_from_socket(length - data.length)
        end
        data
      end
    end
read_from_socket does not check whether the socket has been half-closed when it returns 0 bytes, so the while loop spins on the read, leading to 100% CPU usage and no progress.
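A minimal sketch of why the loop cannot make progress once the peer has closed. Everything here is hypothetical and is not the driver's code: `read_from_socket_sketch` assumes the zero-length result arises because EOF is mapped to an empty string, and `read_with_cap` adds an attempt cap only so the demonstration terminates.

```ruby
require "socket"

# Hypothetical stand-in for read_from_socket. Assumption: EOF (IO#read
# returning nil) ends up as an empty string, giving a zero-length result.
def read_from_socket_sketch(socket, length)
  socket.read(length) || String.new
end

# Same loop shape as the read method above, but capped at max_attempts
# so this demonstration terminates instead of spinning forever.
def read_with_cap(socket, length, max_attempts)
  attempts = 1
  data = read_from_socket_sketch(socket, length)
  while data.length < length && attempts < max_attempts
    data << read_from_socket_sketch(socket, length - data.length)
    attempts += 1
  end
  [data, attempts]
end

def demo_spin
  reader, writer = UNIXSocket.pair
  writer.write("abc")
  writer.close # peer goes away after sending 3 of the 8 expected bytes
  result = read_with_cap(reader, 8, 1_000)
  reader.close
  result
end
```

Calling `demo_spin` uses up every allowed attempt while `data` stays at the three bytes that arrived before the close. Without the cap, the loop would never exit.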
We diagnosed this by noticing the CPU spike and running strace on the process, which showed it spinning on read calls that returned 0. We matched the fd used in those read calls to its description in procfs, then grepped the lsof output for that socket id. lsof reported its status as "can't identify protocol", which is the smoking gun for a half-open TCP connection.
I searched for this issue here, but did not find any others.
In the 1.x-stable branch, the driver raises a ConnectionFailure if a read returns zero bytes:
    def receive_data(length, socket)
      message = new_binary_string
      socket.read(length, message)
      raise ConnectionFailure, "connection closed" unless message && message.length > 0
      if message.length < length
        chunk = new_binary_string
        while message.length < length
          socket.read(length - message.length, chunk)
          raise ConnectionFailure, "connection closed" unless chunk.length > 0
          message << chunk
        end
      end
      message
    end
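The same guard could be applied to a 2.x-style read loop. The following is only a sketch under assumptions, not a proposed patch: `read_exactly` and `ConnectionClosedSketch` are hypothetical names, and the real driver would raise its own error type inside `handle_errors`.

```ruby
require "socket"

# Hypothetical error class for this sketch; the driver has its own error types.
class ConnectionClosedSketch < StandardError; end

# Reads exactly `length` bytes from `io`. Raises as soon as a read yields
# no data (nil or an empty string) instead of retrying forever.
def read_exactly(io, length)
  data = String.new
  while data.bytesize < length
    chunk = io.read(length - data.bytesize)
    if chunk.nil? || chunk.empty?
      raise ConnectionClosedSketch,
            "connection closed after #{data.bytesize} of #{length} bytes"
    end
    data << chunk
  end
  data
end
```

With this shape, a peer that closes mid-message produces an error the caller can handle, for example by reconnecting, rather than a busy loop.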