[CDRIVER-1219] Bugs in single-threaded selection timeout Created: 02/May/16 Updated: 22/Jul/16 Resolved: 22/Jul/16 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc |
| Affects Version/s: | 1.3.0 |
| Fix Version/s: | 1.4.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
There are two bugs in mongoc_topology_select in 1.3.x when serverSelectionTryOnce is turned off.
Bug #1: instead of sleeping half a second between server checks, mongoc_topology_select sleeps half a second longer after each server check (0.5 seconds, then 1 second, then 1.5 seconds, and so on) until the next sleep would exceed the server selection timeout. Consider the 1.3.5 code in mongoc_topology_select, where "try_once" is false, connectTimeoutMS is 500, and socketTimeoutMS is 60,000:
Bug #2: when scan_ready has advanced far enough to exceed expire_at, the error message should be the standard "serverselectiontimeoutms timed out", not "minheartbeatfrequencyms not reached yet". |
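The timing difference in Bug #1 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the libmongoc code: simulate_selection, selection_result_t, and MIN_HEARTBEAT_MS are hypothetical names, time is a simulated counter rather than a real clock, every scan is assumed to take zero time and to find no suitable server, and the loop gives up as soon as the next scan would start after the deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: pause between rescans while selection fails. */
#define MIN_HEARTBEAT_MS 500

/* Hypothetical result type for the simulation. */
typedef struct {
    int scans;          /* number of topology scans performed */
    int64_t elapsed_ms; /* simulated time at which selection gave up */
    const char *error;  /* message reported on timeout */
} selection_result_t;

/* Simulate a selection loop in which no server is ever suitable.
 * growing_sleep != 0 models Bug #1: the pause grows by half a second
 * after every scan. growing_sleep == 0 models the intended behavior:
 * a constant half-second pause. */
selection_result_t
simulate_selection(int64_t timeout_ms, int growing_sleep)
{
    selection_result_t r = {0, 0, 0};
    int64_t now = 0;                /* simulated clock, in ms */
    int64_t expire_at = timeout_ms; /* selection deadline */
    int64_t sleep_ms = MIN_HEARTBEAT_MS;

    assert(timeout_ms >= 0);

    for (;;) {
        int64_t scan_ready;

        r.scans++; /* scan the topology; assume it finds nothing */

        scan_ready = now + sleep_ms;
        if (scan_ready > expire_at) {
            /* The next scan would start after the deadline, so report
             * the standard timeout message (the point of Bug #2). */
            r.elapsed_ms = now;
            r.error = "serverselectiontimeoutms timed out";
            return r;
        }

        now = scan_ready; /* "sleep" until the next scan is allowed */
        if (growing_sleep) {
            sleep_ms += MIN_HEARTBEAT_MS;
        }
    }
}
```

Under these assumptions, a 30,000 ms timeout yields 61 scans with the constant pause, but only 11 scans with the growing pause, which gives up at 27,500 ms because the next 5,500 ms sleep would overshoot the deadline.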
| Comments |
| Comment by A. Jesse Jiryu Davis [ 22/Jul/16 ] |
|
Recent improvements seem to have fixed this bug; I've reenabled the tests on Windows and run them a dozen times without failure. |
| Comment by Githook User [ 22/Jul/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: |
| Comment by Githook User [ 14/May/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: |
| Comment by A. Jesse Jiryu Davis [ 14/May/16 ] |
|
The tests added for server selection with a down secondary are timing out on Windows. Is it a mock server bug or a bug in our timeout implementation on Windows? |
| Comment by Githook User [ 04/May/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: If serverSelectionTryOnce is false, selection should time out with Fix timing logic to rescan every 500ms while selection fails. |
| Comment by Githook User [ 04/May/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: |
| Comment by Githook User [ 04/May/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com) Message: |
| Comment by A. Jesse Jiryu Davis [ 03/May/16 ] |
|
A new test of the C Driver alone (not involving PHP) does not reproduce the bug. Instead it works as expected: if a secondary is down, single-threaded selection takes connectTimeoutMS, whether tryOnce is on or off, then selects the primary. For the next 5 seconds the secondary is in cooldown, so single-threaded selection is instant; after that, the topology scanner tries the secondary again and blocks until connectTimeoutMS expires. It seems the driver has correctly implemented the spec. I now suspect the PHP driver's stream initiator prevents the topology scanner from properly implementing the spec: instead of applying connectTimeoutMS during the initial scan, it blocks for the whole socketTimeoutMS before giving up on the secondary. |
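The cooldown behavior described in this comment can be sketched as follows. This is an illustrative model only, not the driver's implementation: node_t, node_in_cooldown, scan_cost_ms, and COOLDOWN_MS are hypothetical names, the primary is assumed to answer instantly, and the cooldown is assumed to start when the failed connection attempt ends.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the 5-second cooldown mentioned above. */
#define COOLDOWN_MS 5000

/* Hypothetical per-server state kept by a single-threaded scanner. */
typedef struct {
    bool has_failed;        /* true once a check of this server failed */
    int64_t last_failed_ms; /* simulated time at which that check failed */
} node_t;

/* A server is skipped while less than COOLDOWN_MS has passed since
 * its last failed check. */
bool
node_in_cooldown(const node_t *node, int64_t now_ms)
{
    return node->has_failed &&
           (now_ms - node->last_failed_ms) < COOLDOWN_MS;
}

/* Return how long (in simulated ms) one blocking scan takes when the
 * secondary is down. A scan that skips the secondary costs nothing;
 * a scan that tries it blocks for connect_timeout_ms, then records
 * the failure so that later scans skip it during the cooldown. */
int64_t
scan_cost_ms(node_t *secondary, int64_t now_ms, int64_t connect_timeout_ms)
{
    if (node_in_cooldown(secondary, now_ms)) {
        return 0; /* secondary skipped: selection is instant */
    }

    secondary->has_failed = true;
    secondary->last_failed_ms = now_ms + connect_timeout_ms;
    return connect_timeout_ms;
}
```

With connect_timeout_ms set to 500, a first scan at time 0 costs 500 ms, scans during the following 5 seconds cost nothing, and a scan at 5,500 ms blocks for 500 ms again. Blocking for socketTimeoutMS instead, as suspected of the PHP stream initiator, would correspond to passing the larger value as the timeout.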