Type: Bug
Resolution: Won't Fix
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Networking & Observability
Operating System: ALL
Steps To Reproduce: Run predictive_connpool.js under burn-in a lot.
Linked BF Score: 200
Confidence Status: None
Work Order: 3
Size Category: TBD
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None
Estimated Weeks: 0

I've seen this test fail in burn-in on Windows in two ways:

1. (more rarely) Socket reuse? Here's an example:

[js_test:predictive_connpool] @jstests\noPassthrough\network\predictive_connpool.js:167:4
[js_test:predictive_connpool] couldn't connect to server EC2AMAZ-TIEA7HP:21545, connection attempt failed: SocketException: Error connecting to EC2AMAZ-TIEA7HP:21545 (10.128.168.40:21545) :: caused by :: syncConnect connect error :: caused by :: Only one usage of each socket address (protocol/network address/port) is normally permitted.

2. (more commonly) One of the checks fails because the expected number of in-use connections doesn't rise as high as expected, as though the mongos has deadlocked. Here's an example:

[js_test:predictive_connpool] uncaught exception: Error: assert.soon failed (timeout 600000ms), msg : Check #4 failed The hang analyzer is automatically called in assert.soon functions. If you are *expecting* assert.soon to possibly fail, call assert.soon with {runHangAnalyzer: false} as the fifth argument (you can fill unused arguments with `undefined`). :
[js_test:predictive_connpool] doassert@src/mongo/shell/assert.js:20:14
[js_test:predictive_connpool] _doassert@src/mongo/shell/assert.js:124:13
[js_test:predictive_connpool] assert.soon@src/mongo/shell/assert.js:431:22
[js_test:predictive_connpool] hasConnPoolStats@jstests\noPassthrough\network\predictive_connpool.js:97:12
[js_test:predictive_connpool] walkThroughBehavior@jstests\noPassthrough\network\predictive_connpool.js:132:21
[js_test:predictive_connpool] @jstests\noPassthrough\network\predictive_connpool.js:153:20
[js_test:predictive_connpool] Error: assert.soon failed (timeout 600000ms), msg : Check #4 failed The hang analyzer is automatically called in assert.soon functions. If you are *expecting* assert.soon to possibly fail, call assert.soon with {runHangAnalyzer: false} as the fifth argument (you can fill unused arguments with `undefined`).
[js_test:predictive_connpool] failed to load: jstests\noPassthrough\network\predictive_connpool.js

Note that those links are from a patch in which I modified the test to log the full connPoolStats output instead of part of it, and modified burn-in to run the test 10x more often than it normally would. Here's another patch where I only added a newline to the end of the file so burn-in would pick it up. I had to restart burn-in a few times before the failures occurred.

I'm not sure whether this is an issue where only some specific builds fail: can you start a patch and get a "good" build that will never fail, or will any build eventually fail? On some patches I could restart burn-in 12 times with no failures, and on others it failed pretty readily. I thought our builds were deterministic, but that may not apply to Windows.
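
For anyone trying to reason about the second failure mode above, here is a minimal sketch of the polling pattern involved. It is not the shell's actual assert.soon, nor the test's hasConnPoolStats: the `soon` and `totalInUse` helpers, the injectable `now`/`sleep` options, the host name, and the stats shape are all assumptions made for illustration. It only shows how a check on the in-use connection count times out when the count stalls below the expected value.

```javascript
// Hypothetical polling helper, loosely modeled on the shell's assert.soon.
// `opts.now` and `opts.sleep` are injectable so the loop can be exercised
// without real waiting; they are not part of the real API.
function soon(check, msg, timeoutMs, intervalMs, opts = {}) {
    const now = opts.now || Date.now;
    const sleep = opts.sleep || ((ms) => {
        const end = Date.now() + ms;
        while (Date.now() < end) { /* busy-wait; acceptable for a sketch */ }
    });
    const start = now();
    for (;;) {
        if (check()) return;
        if (now() - start >= timeoutMs) {
            throw new Error(`assert.soon failed (timeout ${timeoutMs}ms), msg : ${msg}`);
        }
        sleep(intervalMs);
    }
}

// Assumed stats shape: { hosts: { "<host:port>": { inUse: <number> } } }.
function totalInUse(stats) {
    return Object.values(stats.hosts || {}).reduce((sum, h) => sum + (h.inUse || 0), 0);
}

// Fake clock plus a pool whose in-use count stalls at 2 and never reaches 4.
let t = 0;
const fake = { now: () => t, sleep: (ms) => { t += ms; } };
const stalled = () => ({ hosts: { "shard0:20000": { inUse: 2 } } });
try {
    soon(() => totalInUse(stalled()) >= 4, "Check #4 failed", 1000, 100, fake);
} catch (e) {
    console.log(e.message);
}
```

Logging the full stats object on every poll, as the modified patch does, is what lets you tell a count that is rising slowly apart from one that has stopped moving.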

Assignee: Unassigned
Reporter: Ryan Berryhill
Participants: Ryan Berryhill
Votes: 0
Watchers: 2

Created: May 28 2025 01:41:55 PM UTC
Updated: Jun 02 2025 06:42:37 PM UTC
Resolved: Jun 02 2025 06:20:13 PM UTC
