The original summary for this issue was "Calling mongoc_client_reset() in child may affect SSL sockets in parent". This issue can be addressed by removing the mongoc_cluster_disconnect call in mongoc_client_reset.
While integrating CDRIVER-2857 in PHPC-1274, I observed reproducible test failures when connected to a standalone server using SSL. The tests in question do the following:
- Connect to a standalone using SSL
- Insert three documents into a collection
- Query the collection with batchSize=2. Query execution in PHP will entail creating a cursor with the first batch of results.
- Fork the process, such that the child immediately exits
- The parent will wait on the child to exit, and then proceed to iterate the cursor to completion
This test is implemented in bug1274-001.phpt. It fails due to the parent's connection being closed sometime between forking and iterating the cursor (i.e. executing a getMore).
In this first log, I record the script output (above steps) and also include a relevant portion of the server log that shows the connection terminating when the child exits:
find command specifies batchSize: 2
Child exits
Parent waited for child to exit
getMore command specifies batchSize: 2
Fatal error: Uncaught MongoDB\Driver\Exception\ConnectionTimeoutException: Failed to send "getMore" command with database "phongo": Failed to read 4 bytes: socket error or timeout in /home/jmikola/workspace/mongodb/phpc/tests/cursor/bug1274-001.php:59

2019-05-07T09:56:47.264-0400 I NETWORK [listener] connection accepted from 127.0.0.1:40904 #53 (1 connection now open)
2019-05-07T09:56:47.283-0400 I NETWORK [conn53] received client metadata from 127.0.0.1:40904 conn53: { driver: { name: "mongoc / ext-mongodb:PHP", version: "1.14.0 / 1.6.0alpha2-dev" }, os: { type: "Linux", name: "Ubuntu", version: "18.04", architecture: "x86_64" }, platform: "cfg=0xd15ea8e9 posix=200809 stdc=201112 CC=GCC 7.4.0 CFLAGS="-g -O0" LDFLAGS="" / PHP 7.2.12" }
2019-05-07T09:56:47.284-0400 I STORAGE [conn53] createCollection: phongo.cursor_bug1274_001 with generated UUID: 6a9426ee-d4d3-420a-ba6d-0f1f77867eb0
2019-05-07T09:56:47.287-0400 I STORAGE [conn53] Registering catalog entry phongo.cursor_bug1274_001 with UUID 6a9426ee-d4d3-420a-ba6d-0f1f77867eb0
2019-05-07T09:56:47.287-0400 I STORAGE [conn53] Registering collection object phongo.cursor_bug1274_001 with UUID 6a9426ee-d4d3-420a-ba6d-0f1f77867eb0
2019-05-07T09:56:47.289-0400 I INDEX [conn53] index build: done building index _id_ on ns phongo.cursor_bug1274_001
2019-05-07T09:56:47.293-0400 I NETWORK [conn53] end connection 127.0.0.1:40904 (0 connections now open)
2019-05-07T09:56:47.784-0400 I NETWORK [listener] connection accepted from 127.0.0.1:40906 #54 (1 connection now open)
2019-05-07T09:56:47.803-0400 I NETWORK [conn54] end connection 127.0.0.1:40906 (0 connections now open)
In this second log, I had the child sleep for five seconds before exiting and disabled the waitpid() call in the parent, so that iteration happens before the child exits:
find command specifies batchSize: 2
getMore command specifies batchSize: 2
Parent fully iterated cursor for 3 documents
===DONE===
Child exits

2019-05-07T09:57:49.242-0400 I NETWORK [listener] connection accepted from 127.0.0.1:40910 #56 (1 connection now open)
2019-05-07T09:57:49.262-0400 I NETWORK [conn56] received client metadata from 127.0.0.1:40910 conn56: { driver: { name: "mongoc / ext-mongodb:PHP", version: "1.14.0 / 1.6.0alpha2-dev" }, os: { type: "Linux", name: "Ubuntu", version: "18.04", architecture: "x86_64" }, platform: "cfg=0xd15ea8e9 posix=200809 stdc=201112 CC=GCC 7.4.0 CFLAGS="-g -O0" LDFLAGS="" / PHP 7.2.12" }
2019-05-07T09:57:49.263-0400 I STORAGE [conn56] createCollection: phongo.cursor_bug1274_001 with generated UUID: 055f23dc-1ac4-40e9-8242-7e23c1d400fa
2019-05-07T09:57:49.264-0400 I STORAGE [conn56] Registering catalog entry phongo.cursor_bug1274_001 with UUID 055f23dc-1ac4-40e9-8242-7e23c1d400fa
2019-05-07T09:57:49.264-0400 I STORAGE [conn56] Registering collection object phongo.cursor_bug1274_001 with UUID 055f23dc-1ac4-40e9-8242-7e23c1d400fa
2019-05-07T09:57:49.265-0400 I INDEX [conn56] index build: done building index _id_ on ns phongo.cursor_bug1274_001
2019-05-07T09:57:49.270-0400 I NETWORK [conn56] end connection 127.0.0.1:40910 (0 connections now open)
In both cases, it looks like resetting an SSL connection from one process affects the other. I'm not sure whether this is expected: SSL may interfere with having two client sockets connected to the same server socket (a file descriptor shared through the fork). Perhaps SSL is dropping the entire connection once it observes the child hang up?
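The shared-descriptor question above can be explored without MongoDB or SSL. The following is a minimal sketch, not libmongoc code: it uses a `socketpair()` as a stand-in for the client/server connection, and the names `peer_view_after_child`, `child_action` and `peer_view` are hypothetical. It assumes a POSIX system with `MSG_DONTWAIT` (such as the Linux host in the logs). The "write then close" case is only an analogy for a TLS stream sending a goodbye message on teardown; whether that is what happens inside mongoc_cluster_disconnect is an assumption this report does not confirm.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What the forked child does with its inherited copy of the client socket. */
enum child_action {
   CHILD_CLOSE,           /* close() only: drops the child's own reference */
   CHILD_SHUTDOWN,        /* shutdown(): acts on the connection both processes share */
   CHILD_WRITE_THEN_CLOSE /* sends one byte first (stand-in for a goodbye message) */
};

/* What the "server" end of the connection observes after the child exits. */
enum peer_view { PEER_ERROR = -1, PEER_NOTHING, PEER_EOF, PEER_DATA };

enum peer_view
peer_view_after_child (enum child_action action)
{
   int sv[2]; /* sv[0] plays the driver's socket, sv[1] the server's end */
   char buf[1];
   ssize_t n;
   int saved_errno;
   pid_t pid;

   assert (action == CHILD_CLOSE || action == CHILD_SHUTDOWN ||
           action == CHILD_WRITE_THEN_CLOSE);

   if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
      return PEER_ERROR;
   }

   pid = fork ();
   if (pid < 0) {
      close (sv[0]);
      close (sv[1]);
      return PEER_ERROR;
   }

   if (pid == 0) {
      /* Child: it is not the server, so it drops that end right away. */
      close (sv[1]);
      if (action == CHILD_SHUTDOWN) {
         shutdown (sv[0], SHUT_RDWR);
      } else if (action == CHILD_WRITE_THEN_CLOSE) {
         if (write (sv[0], "x", 1) != 1) {
            _exit (1);
         }
      }
      close (sv[0]);
      _exit (0);
   }

   /* Parent: wait for the child to exit, then peek at the server end. */
   if (waitpid (pid, NULL, 0) < 0) {
      close (sv[0]);
      close (sv[1]);
      return PEER_ERROR;
   }

   n = recv (sv[1], buf, sizeof buf, MSG_DONTWAIT);
   saved_errno = errno;
   close (sv[0]);
   close (sv[1]);

   if (n > 0) {
      return PEER_DATA; /* the child's bytes reached the server */
   }
   if (n == 0) {
      return PEER_EOF; /* the connection was shut down for everyone */
   }
   return (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK) ? PEER_NOTHING
                                                                : PEER_ERROR;
}
```

On Linux, `peer_view_after_child (CHILD_CLOSE)` returns `PEER_NOTHING`, since the parent still holds its own reference to the socket. `CHILD_SHUTDOWN` returns `PEER_EOF` and `CHILD_WRITE_THEN_CLOSE` returns `PEER_DATA`. In other words, a plain close() in the child is invisible to the peer, while anything the child sends or shuts down on the inherited descriptor is seen on the parent's connection too.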
- is depended on by
  - PHPC-1274 Reset libmongoc client after forking to avoid interacting with parent resources in child processes (Closed)
- is related to
  - CDRIVER-2857 Provide a reset method to be called on clients after forking (Closed)
- related to
  - CDRIVER-3438 Destroy exhaust cursor socket in mongoc_cursor_destroy regardless of client generation (Closed)
  - CDRIVER-3391 Document fork safety (Closed)