  1. C Driver
  2. CDRIVER-2996

AIX failed unit tests

    • Type: Task
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: 1.14.0
    • Component/s: libmongoc
    • Labels:
    • Environment:
      AIX 7.1
      gcc (GCC) 7.2.0

      Unit tests fail on AIX for 1.14.0. These failures have been consistent since at least 1.12.0, and some of the socket ones since around 1.7.

      cmake -DENABLE_SRV=OFF -DENABLE_AUTOMATIC_INIT_AND_CLEANUP=OFF ..

      Output from failed unit tests is attached.  Tests are failing for 2 reasons:

      #1 AIX poll function does not set POLLHUP in revents
      /Socket/connect_refusal
      /TOPOLOGY/scanner_connection_error
      /TOPOLOGY/dns
      /TOPOLOGY/happy_eyeballs/1
      /TOPOLOGY/happy_eyeballs/4
      /TOPOLOGY/happy_eyeballs/9

      For reference, AIX poll function.
      https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.basetrf1/poll.htm

      Taking the first test as an example, the sequence is:
      1. Create a socket and connect to localhost on a port with nothing listening (localhost:12345).
      2. Ignore the failure from connect.
      3. Call poll() on the socket.
      The test expects POLLHUP in the revents field returned by poll(). AIX returns POLLOUT instead, so the test keeps retrying until it exceeds its time limit.
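The sequence above can be sketched in C roughly as follows. This is an illustrative sketch, not the attached socket_test.c: the names probe_result, unused_loopback_port and probe_refused_connect are hypothetical, and it assumes a non-blocking IPv4 socket on loopback. Instead of hard-coding port 12345, it binds to port 0 to find a port that very likely has no listener.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result record for this sketch; not a libmongoc type. */
struct probe_result {
   int connect_errno; /* errno from connect(), or 0 if it succeeded */
   int poll_ret;      /* return value of poll() */
   short revents;     /* events reported by poll() */
   int so_error;      /* pending socket error read via SO_ERROR */
};

/* Bind to port 0, read back the assigned port, then close the socket.
 * The returned port very likely has no listener afterwards. */
static int
unused_loopback_port (void)
{
   struct sockaddr_in addr;
   socklen_t len = sizeof addr;
   int port = -1;
   int fd = socket (AF_INET, SOCK_STREAM, 0);

   if (fd < 0) {
      return -1;
   }
   memset (&addr, 0, sizeof addr);
   addr.sin_family = AF_INET;
   addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
   addr.sin_port = 0;
   if (bind (fd, (struct sockaddr *) &addr, sizeof addr) == 0 &&
       getsockname (fd, (struct sockaddr *) &addr, &len) == 0) {
      port = ntohs (addr.sin_port);
   }
   close (fd);
   return port;
}

/* Non-blocking connect to 127.0.0.1:port, then poll() once and record
 * what each step reported. Returns 0 if the probe ran, -1 on setup failure. */
static int
probe_refused_connect (int port, struct probe_result *out)
{
   struct sockaddr_in addr;
   struct pollfd pfd;
   socklen_t len = sizeof out->so_error;
   int flags;
   int fd;

   assert (out != NULL);
   memset (out, 0, sizeof *out);
   fd = socket (AF_INET, SOCK_STREAM, 0);
   if (fd < 0) {
      return -1;
   }
   flags = fcntl (fd, F_GETFL, 0);
   if (flags < 0 || fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0) {
      close (fd);
      return -1;
   }
   memset (&addr, 0, sizeof addr);
   addr.sin_family = AF_INET;
   addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
   addr.sin_port = htons ((unsigned short) port);
   if (connect (fd, (struct sockaddr *) &addr, sizeof addr) != 0) {
      out->connect_errno = errno; /* the test ignores this failure */
   }
   pfd.fd = fd;
   pfd.events = POLLOUT;
   pfd.revents = 0;
   out->poll_ret = poll (&pfd, 1, 1000);
   out->revents = pfd.revents;
   if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &out->so_error, &len) != 0) {
      out->so_error = errno;
   }
   close (fd);
   return 0;
}
```

Printing connect_errno, revents and so_error from the result on each platform should make the difference visible: whether the refusal surfaces in errno from connect(), in POLLHUP, or only in SO_ERROR.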

      I created quick reference code (attached) that connects to localhost on a port with nothing listening and then calls poll(), to show the difference between AIX and Red Hat:

      ON AIX – 79 is ECONNREFUSED
      test > ./a.out
      connect failed. 79
      poll_ret: 1
      revents: 2
      POLLHUP: 8192
      poll success!

      ON REDHAT – 115 is EINPROGRESS
      $ ./a.out
      connect failed. 115
      poll_ret: 1
      revents: 28
      POLLHUP: 16
      getsockopt optval: 111
      (111 is connection refused)

      On AIX, changing the target from localhost to a remote host makes connect return EINPROGRESS (55). poll() still does not set POLLHUP, but the socket error is set to ECONNREFUSED (79).
      test > ./a.out
      connect failed. 55
      poll_ret: 1
      revents: 2
      POLLHUP: 8192
      getsockopt optval: 79

      As a workaround, I patched mongoc-socket.c to check the last socket error after calling poll(). If the error is ECONNREFUSED, the patch sets POLLHUP in the revents returned for the stream, so that the code follows the expected path when the initial connection is refused. The diff is attached.

      This fixed all of the test cases listed above.

      Any feedback on this workaround or alternative suggestions?

      #2 getaddrinfo() not returning ipv6 sorted before ipv4
      /TOPOLOGY/happy_eyeballs/10
      /TOPOLOGY/happy_eyeballs/16
      /TOPOLOGY/happy_eyeballs/dns_cache/

      The unit tests assume getaddrinfo() returns results sorted with ipv6 before ipv4, because connections are attempted in that order. However, the ordering is configurable (gai.conf, for example). For some (undetermined) reason, results for localhost come back with ipv4 before ipv6 when calling getaddrinfo().

      To confirm that the ordering from getaddrinfo() is causing these unit tests to fail, I (temporarily) modified mongoc-topology-scanner.c to try ipv6 before ipv4, and the tests passed. These failures appear to be caused by configuration in our environment. No real question here, other than whether the unit tests should check their assumption about the ordering of results from getaddrinfo().
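One way a test could check the ordering assumption is to look at the address family of the first result from getaddrinfo(). The sketch below is only an illustration: first_family is a hypothetical helper, not part of libmongoc or its test suite.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: return the address family (AF_INET or AF_INET6)
 * of the first result getaddrinfo() gives for host, or -1 on failure. */
static int
first_family (const char *host)
{
   struct addrinfo hints;
   struct addrinfo *res = NULL;
   int family;

   assert (host != NULL);
   memset (&hints, 0, sizeof hints);
   hints.ai_family = AF_UNSPEC;     /* accept both ipv4 and ipv6 results */
   hints.ai_socktype = SOCK_STREAM; /* one entry per address, TCP only */
   if (getaddrinfo (host, NULL, &hints, &res) != 0 || res == NULL) {
      return -1;
   }
   family = res->ai_family;
   freeaddrinfo (res);
   return family;
}
```

A test that depends on ipv6 being tried first could call first_family ("localhost") during setup and skip itself, or adjust its expectations, when the result is not AF_INET6.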

        1. mongoc 1.14 failed unit tests.txt
          3 kB
        2. mongoc-socket diff.txt
          0.8 kB
        3. socket_test.c
          1 kB

            Assignee: Unassigned
            Reporter: Amy Giersch (amygiersch)
            Votes: 0
            Watchers: 3
