- Type: Bug
- Resolution: Won't Fix
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Networking & Observability
- ALL
I've seen this test fail in burn-in on Windows in two ways:
1. (more rarely) Socket reuse? Here's an example:
[js_test:predictive_connpool] @jstests\noPassthrough\network\predictive_connpool.js:167:4
[js_test:predictive_connpool] couldn't connect to server EC2AMAZ-TIEA7HP:21545, connection attempt failed: SocketException: Error connecting to EC2AMAZ-TIEA7HP:21545 (10.128.168.40:21545) :: caused by :: syncConnect connect error :: caused by :: Only one usage of each socket address (protocol/network address/port) is normally permitted.
2. (more commonly) One of the checks fails because the expected number of in-use connections doesn't rise as high as expected, as though the mongos has deadlocked. Here's an example:
[js_test:predictive_connpool] uncaught exception: Error: assert.soon failed (timeout 600000ms), msg : Check #4 failed The hang analyzer is automatically called in assert.soon functions. If you are *expecting* assert.soon to possibly fail, call assert.soon with {runHangAnalyzer: false} as the fifth argument (you can fill unused arguments with `undefined`). :
[js_test:predictive_connpool] doassert@src/mongo/shell/assert.js:20:14
[js_test:predictive_connpool] _doassert@src/mongo/shell/assert.js:124:13
[js_test:predictive_connpool] assert.soon@src/mongo/shell/assert.js:431:22
[js_test:predictive_connpool] hasConnPoolStats@jstests\noPassthrough\network\predictive_connpool.js:97:12
[js_test:predictive_connpool] walkThroughBehavior@jstests\noPassthrough\network\predictive_connpool.js:132:21
[js_test:predictive_connpool] @jstests\noPassthrough\network\predictive_connpool.js:153:20
[js_test:predictive_connpool] Error: assert.soon failed (timeout 600000ms), msg : Check #4 failed The hang analyzer is automatically called in assert.soon functions. If you are *expecting* assert.soon to possibly fail, call assert.soon with {runHangAnalyzer: false} as the fifth argument (you can fill unused arguments with `undefined`).
[js_test:predictive_connpool] failed to load: jstests\noPassthrough\network\predictive_connpool.js
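For context on failure mode 2: assert.soon repeatedly polls a predicate until it returns true or the timeout elapses, so a predicate that never comes true (e.g. because the in-use connection count never rises) fails only after the full 600000ms. The snippet below is a simplified, Node-runnable stand-in for the shell helper, not the shell's actual implementation; the shell's real fifth-argument options object is shown only in the comment:

```javascript
// Simplified stand-in for the mongo shell's assert.soon: poll a
// predicate until it returns true or a timeout elapses. In the real
// shell, a fifth options argument controls the hang analyzer, e.g.:
//   assert.soon(() => cond(), "msg", timeout, interval,
//               {runHangAnalyzer: false});
function soon(pred, msg, timeoutMs = 1000) {
    const deadline = Date.now() + timeoutMs;
    do {
        if (pred()) return;  // predicate satisfied: stop polling
    } while (Date.now() < deadline);
    throw new Error(`soon failed (timeout ${timeoutMs}ms): ${msg}`);
}

// Example: a predicate that becomes true on its third poll.
let polls = 0;
soon(() => ++polls >= 3, "counter never reached 3");
console.log(`succeeded after ${polls} polls`);
```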
Note that those links are from a patch where I modified the test to include a bit more logging (logging the full connPoolStats output instead of part of it) and I modified burn-in to run it 10x more than it normally would. Here's another patch where I just added a newline to the end so burn-in would pick it up. I did have to restart burn-in a few times before the failures occurred.
I'm not sure if this is an issue where only some specific builds fail: can you start a patch and get a "good" build that will never fail, or will any build eventually fail? On some patches I could restart burn-in 12 times with no failures, and on others it failed pretty readily. I thought our builds were deterministic, but that may not apply to Windows.