  1. Core Server
  2. SERVER-44243

4.2.1: test build/integration_tests/transport_layer_asio_integration_test can fail

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Testing Infrastructure
    • Labels:
      None
    • ALL

      This test's results are inconsistent. I wanted to report this, especially in case it affects not just the test but also MongoDB itself in some way. I probably can't invest much more time in it, so I'm hoping MongoDB can reproduce it.

      It tends to succeed the first time I run it. It always seems to fail on additional runs, regardless of whether I re-run the same (unmodified) binary that passed before, or start with a fresh copy of the source and recompile from scratch without rebooting.

      I've replicated this on another machine.

      `ps` doesn't seem to show anything is running, and `netstat` doesn't show anything is still bound to the port.
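
A direct connection probe is one more way to confirm whether anything is accepting connections on the port. The sketch below is only illustrative and is not something that was run for this report: `port_is_open` is a hypothetical helper name, and the host and port are assumed from the `--connectionString=localhost:20000` argument in the output further down.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection tries each address that `host` resolves to.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumed target: the connection string passed to the integration test.
    state = "listening" if port_is_open("localhost", 20000) else "not listening"
    print(f"localhost:20000 is {state}")
```

A "not listening" result would match the `Connection refused` errors in the failing output, which suggests the test had no server to connect to at that moment.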

      I don't know whether previous versions (<= 4.2.0) could successfully run this test multiple times. I've certainly never had this test fail on previous versions, but I hadn't re-run the tests before.

      I think I had this test fail a single time when it was the first run on fresh source after a reboot, but I can't say this with absolute certainty.

      All of my compilations use the asio supplied by mongodb.  NONE of them use a system asio library.  (Last time I checked, it wouldn't compile with the latest asio.)

      No firewalls are running on the systems I've tried this on.

      After a failure, I tried running the test 30 times in a row, and they all failed.
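
A repeated run like this can be scripted. The following is a minimal sketch, not the procedure actually used here: `count_failures` is a hypothetical helper, and it assumes the test binary exits with a non-zero status when any test fails.

```python
import subprocess
import sys


def count_failures(cmd: list[str], runs: int) -> int:
    """Run `cmd` `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # Discard the test's own output; only the exit status is inspected.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to repeat is taken from the script's own arguments.
    runs = 30
    failed = count_failures(sys.argv[1:], runs)
    print(f"{failed}/{runs} runs failed")
```

Saved under a hypothetical name such as `rerun.py`, it could be invoked as `python rerun.py transport_layer_asio_integration_test --connectionString=localhost:20000`.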

      Although the failing output below is from a direct run and the passing output is from resmoke, I've had the test both fail and pass regardless of whether I run it directly or through resmoke.

      You can see the exact commands being run here: https://aur.archlinux.org/cgit/aur.git/plain/PKGBUILD?h=mongodb&id=70918cda9a4417bca1b5a6a1319dca464d098e8c

      I've also tried running only `scons core`, `scons unittests` / `dbtest` / `integration_tests`, and then manually running this specific test, to rule out another test leaving changed state behind.

       

      When Test Fails

      $ transport_layer_asio_integration_test --connectionString=localhost:20000
      2019-10-24T22:06:27.832-0400 I - [main] Using test fixture with connection string = localhost:20000
      2019-10-24T22:06:27.867-0400 I - [main] going to run suite: TransportLayerASIO
      2019-10-24T22:06:27.867-0400 I - [main] going to run test: HTTPRequestGetsHTTPError
      2019-10-24T22:06:27.867-0400 I NETWORK [main] Connecting to localhost:20000
      2019-10-24T22:06:27.867-0400 I - [main] FAIL: HTTPRequestGetsHTTPError std::exception: connect: Connection refused in test HTTPRequestGetsHTTPError
      2019-10-24T22:06:27.867-0400 I - [main] going to run test: ShortReadsAndWritesWork
      2019-10-24T22:06:27.868-0400 I - [main] FAIL: ShortReadsAndWritesWork DBException: SocketException: Error connecting to localhost:20000 (127.0.0.1:20000) :: caused by :: Connection refused in test ShortReadsAndWritesWork
      2019-10-24T22:06:27.868-0400 I - [main] going to run test: asyncConnectTimeoutCleansUpSocket
      2019-10-24T22:06:27.868-0400 W COMMAND [main] failpoint: transportLayerASIOasyncConnectTimesOut set to: { mode: 1, data: {} }
      2019-10-24T22:06:27.868-0400 I NETWORK [thread1] asyncConnectTimesOut fail point is active. simulating timeout.
      2019-10-24T22:06:28.368-0400 W COMMAND [main] failpoint: transportLayerASIOasyncConnectTimesOut set to: { mode: 0, data: {} }
      2019-10-24T22:06:28.368-0400 I - [main] DONE running tests
      2019-10-24T22:06:28.368-0400 I - [main] **************************************************
      2019-10-24T22:06:28.368-0400 I - [main] TransportLayerASIO | tests: 3 | fails: 2 | assert calls: 0 | time secs: 0.501
       HTTPRequestGetsHTTPError std::exception: connect: Connection refused in test HTTPRequestGetsHTTPError
       ShortReadsAndWritesWork DBException: SocketException: Error connecting to localhost:20000 (127.0.0.1:20000) :: caused by :: Connection refused in test ShortReadsAndWritesWork
      2019-10-24T22:06:28.368-0400 I - [main] TOTALS | tests: 3 | fails: 2 | assert calls: 0 | time secs: 0.501
      
      
      2019-10-24T22:06:28.368-0400 I - [main] Failing tests:
      2019-10-24T22:06:28.368-0400 I - [main] TransportLayerASIO/HTTPRequestGetsHTTPError Failed
      2019-10-24T22:06:28.368-0400 I - [main] TransportLayerASIO/ShortReadsAndWritesWork Failed
      2019-10-24T22:06:28.368-0400 I - [main] FAILURE - 2 tests in 1 suites failed
      

      When Test Succeeds

      [executor:cpp_integration_test:job0] 2019-10-22T12:15:20.565-0400 task_executor_cursor_integration_test:ValidateCollections ran in 0.23 seconds: no failures detected.
      [executor:cpp_integration_test:job0] 2019-10-22T12:15:20.565-0400 Running transport_layer_asio_integration_test...
      build/integration_tests/transport_layer_asio_integration_test --connectionString=localhost:20000
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.566-0400 Starting C++ integration test build/integration_tests/transport_layer_asio_integration_test...
      build/integration_tests/transport_layer_asio_integration_test --connectionString=localhost:20000
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.569-0400 C++ integration test build/integration_tests/transport_layer_asio_integration_test started with pid 76697.
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.579-0400 2019-10-22T12:15:20.579-0400 I - [main] Using test fixture with connection string = localhost:20000
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.595-0400 2019-10-22T12:15:20.595-0400 I - [main] going to run suite: TransportLayerASIO
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.595-0400 2019-10-22T12:15:20.595-0400 I - [main] going to run test: HTTPRequestGetsHTTPError
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.595-0400 2019-10-22T12:15:20.595-0400 I NETWORK [main] Connecting to localhost:20000
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.597-0400 2019-10-22T12:15:20.597-0400 I NETWORK [main] Sending HTTP request
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.597-0400 2019-10-22T12:15:20.597-0400 I NETWORK [main] Waiting for response
      [MongoDFixture:job0] 2019-10-22T12:15:20.597-0400 I NETWORK [listener] connection accepted from 127.0.0.1:47818 #80 (3 connections now open)
      [cpp_integration_test:transport_layer_asio_integration_test] 2019-10-22T12:15:20.597-0400 2019-10-22T12:15:20.597-0400 I NETWORK [main] Received response: "HTTP/1.0 200 OK
      

            Assignee:
            ben.caimano@mongodb.com Benjamin Caimano (Inactive)
            Reporter:
            jamespharvey20 James P. Harvey