[SERVER-26737] Segmentation fault in mongos at shutdown due to unconstructed ClientCursorManager Created: 22/Oct/16 Updated: 20/Nov/16 Resolved: 28/Oct/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.2.10 |
| Fix Version/s: | 3.2.11 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Kirill Vechera | Assignee: | Kaloian Manassiev |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-only | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Steps To Reproduce: | Run mongos --configdb asdf, i.e. start mongos with an unresolvable hostname as the configdb parameter. After 10 attempts to resolve the hostname, it dies with a segfault. |
| Sprint: | Sharding 2016-11-21 |
| Participants: |
| Description |
|
When mongos is started with an invalid hostname as the configdb value, it dies with a segmentation fault: E SHARDING [mongosMain] uncaught DBException in mongos main: 7 unable to resolve DNS for host adsf |
| Comments |
| Comment by Githook User [ 28/Oct/16 ] |
|
Author: Kaloian Manassiev (kaloianm, kaloian.manassiev@mongodb.com) Message: |
| Comment by Andy Schwerin [ 24/Oct/16 ] |
|
Triage notes: I spent a few minutes looking at the reproduction. At a high level, I believe CatalogManagerLegacy::init returns an error status, which sends mongos into shutdown. On the way down, it attempts to access something normally initialized after the catalog manager, or perhaps the catalog manager itself (I didn't have time to check which), and segfaults. I tried reproducing with a CSRS config string and found no repro, which is not surprising, since the CSRS catalog manager's monitoring behavior is very different. This failure is annoying because it may cause core dumps and leaves a nasty stack trace, but mongos is shutting down anyway, and since it never contacted a config server, it cannot have any user operations in flight. |
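
The failure mode described in the triage notes (startup aborts early, then shutdown touches a component that was never constructed) can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the actual mongos code or the committed fix: CursorManager, ProcessState, startUp, and shutDown are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a component that is constructed late in startup.
class CursorManager {
public:
    void shutdown() { shutdownCalled = true; }
    bool shutdownCalled = false;
};

// Hypothetical process-wide state; these are not MongoDB's actual globals.
struct ProcessState {
    std::unique_ptr<CursorManager> cursorManager;  // stays null until startup reaches it
};

// Simulated startup. When the config host cannot be resolved, it returns
// before the cursor manager is constructed, leaving the pointer null.
bool startUp(ProcessState& state, bool configHostResolves) {
    assert(!state.cursorManager && "startUp is expected to run once per state");
    if (!configHostResolves) {
        return false;  // early failure: later components are never built
    }
    state.cursorManager.reset(new CursorManager());
    return true;
}

// Shutdown path that tolerates partial initialization.
// Returns true only if a cursor manager existed and was shut down.
bool shutDown(ProcessState& state) {
    if (!state.cursorManager) {
        // Nothing was constructed. Dereferencing the null pointer here is
        // undefined behavior and typically shows up as a segfault.
        return false;
    }
    state.cursorManager->shutdown();
    state.cursorManager.reset();
    return true;
}
```

Calling startUp(state, false) followed by shutDown(state) exercises the failed-startup path: shutDown returns false without touching the missing component. Without the null check, the same sequence would dereference a null pointer.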