[CDRIVER-1988] Topology scanner times out while trying IPv6 address Created: 11/Jan/17 Updated: 02/May/17 Resolved: 11/Jan/17 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc, network |
| Affects Version/s: | 1.5.2 |
| Fix Version/s: | 1.5.3 |
| Type: | Bug | Priority: | Blocker - P1 |
| Reporter: | Remi Collet | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Fedora 26 |
||
| Description |
|
Version 1.5.1 builds fine. Trying to update to 1.5.2, I get a segfault during the test suite.
Full build.log: https://kojipkgs.fedoraproject.org//work/tasks/744/17240744/build.log |
| Comments |
| Comment by Githook User [ 11/Jan/17 ] | ||||
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: Revert " 8729c1448782481f392e4b51e513c14bb9736a5b Fixes | ||||
| Comment by Githook User [ 11/Jan/17 ] | ||||
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: | ||||
| Comment by A. Jesse Jiryu Davis [ 11/Jan/17 ] | ||||
|
Thanks for reporting and investigating, Remi. I can reproduce this: if I run mongod locally with IPv6 turned off, I fail at the same spot, where test_command() tries to call ismaster on localhost:27017. We didn't catch this because we habitually pass "--ipv6" to our test mongod instance when we run the tests.
The problem is in mongoc_topology_scanner_node_connect_tcp, where we initially discover which servers from the host list are available. There, we call getaddrinfo with AF_UNSPEC (this is a change; it had been AF_INET). On my Mac and most machines, the IPv6 result for "localhost" is returned first, followed by the IPv4 result. The driver chooses the first result and tries to connect, with a default timeout of 10 seconds. If that fails, it considers the host unavailable. It does *not* attempt to connect using the other results from the getaddrinfo list.
The test passes if mongod is started with --ipv6. However, any tests using mock_server_t then fail, since the mock server doesn't listen on IPv6. One such test is test_cooldown_rs(), so that fails once we have mongod listening on IPv6.
Right now we should revert the change, as you proposed. There are two options for a long-term solution:
1. After a connection times out, we should reset the connect timer and try the next getaddrinfo result.
Number 2 shouldn't be very hard: the topology scanner is already parallel across multiple hosts, so it just needs to become parallel across multiple addresses for each host. | ||||
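The first long-term option from the comment above (reset the connect timer and move on to the next getaddrinfo result) can be sketched roughly as follows. This is a minimal standalone illustration using plain POSIX sockets, not the libmongoc implementation: the names try_connect_one and connect_first_reachable are hypothetical, and the real topology scanner is asynchronous, which this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a libmongoc function): attempt a non-blocking
 * connect to a single address, waiting at most timeout_ms for it to finish.
 * Returns a connected, non-blocking socket, or -1 on failure or timeout. */
static int
try_connect_one (const struct addrinfo *ai, int timeout_ms)
{
   int fd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
   if (fd < 0) {
      return -1;
   }

   int flags = fcntl (fd, F_GETFL, 0);
   if (flags < 0 || fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0) {
      close (fd);
      return -1;
   }

   if (connect (fd, ai->ai_addr, ai->ai_addrlen) == 0) {
      return fd; /* connected immediately */
   }
   if (errno != EINPROGRESS) {
      close (fd);
      return -1;
   }

   /* Wait for the connect to complete, then check how it ended. */
   struct pollfd pfd = {.fd = fd, .events = POLLOUT};
   if (poll (&pfd, 1, timeout_ms) == 1) {
      int err = 0;
      socklen_t len = sizeof err;
      if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0) {
         return fd;
      }
   }

   close (fd);
   return -1;
}

/* Hypothetical helper: resolve host with AF_UNSPEC and try every result in
 * order, giving each address a fresh timeout, instead of giving up after
 * the first one fails. Returns a connected socket or -1. */
int
connect_first_reachable (const char *host, int port, int timeout_ms)
{
   struct addrinfo hints, *res, *ai;
   char portstr[16];
   int fd = -1;

   assert (host != NULL);
   snprintf (portstr, sizeof portstr, "%d", port);

   memset (&hints, 0, sizeof hints);
   hints.ai_family = AF_UNSPEC; /* IPv6 and IPv4 results */
   hints.ai_socktype = SOCK_STREAM;

   if (getaddrinfo (host, portstr, &hints, &res) != 0) {
      return -1;
   }

   for (ai = res; ai != NULL && fd < 0; ai = ai->ai_next) {
      fd = try_connect_one (ai, timeout_ms);
   }

   freeaddrinfo (res);
   return fd;
}
```

With this shape, a call such as connect_first_reachable ("localhost", 27017, 10000) would fall through to the IPv4 result when nothing listens on the IPv6 address. Note that the addresses are tried one after another, so an address that silently drops packets still costs a full timeout before the next one is attempted.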
| Comment by Remi Collet [ 11/Jan/17 ] | ||||
|
Final try: simply revert https://github.com/mongodb/mongo-c-driver/commit/333cbc2cd2f54f3650f51c39a2490c28c355cc0f. That commit alone definitely doesn't seem to be enough for IPv6 support. | ||||
| Comment by Remi Collet [ 11/Jan/17 ] | ||||
|
Another try, skipping slow tests:
Another try, setting test_select_after_try_once and test_select_after_timeout as slow:
Looking at the 1.5.1/1.5.2 diff, I saw very few changes, but they have huge effects. | ||||
| Comment by Remi Collet [ 11/Jan/17 ] | ||||
|
Running the mongod server with --ipv6 allows the tests to get further, but
| ||||
| Comment by Remi Collet [ 11/Jan/17 ] | ||||
|
Trying a local build (on Fedora 25, with bundled libbson) fails later |