[SERVER-26737] Segmentation fault in mongos at shutdown due to unconstructed ClientCursorManager Created: 22/Oct/16  Updated: 20/Nov/16  Resolved: 28/Oct/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.2.10
Fix Version/s: 3.2.11

Type: Bug Priority: Minor - P4
Reporter: Kirill Vechera Assignee: Kaloian Manassiev
Resolution: Done Votes: 0
Labels: code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongos.log    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

Start mongos with an unresolvable hostname as the configdb parameter, for example:

mongos --configdb asdf

After 10 attempts to resolve the hostname, mongos dies with a segfault.

Sprint: Sharding 2016-11-21
Participants:

 Description   

When mongos is started with an invalid hostname as the configdb value, it dies with a segmentation fault:

E SHARDING [mongosMain] uncaught DBException in mongos main: 7 unable to resolve DNS for host adsf
F - [mongosMain] Invalid access at address: 0x18
F - [mongosMain] Got signal: 11 (Segmentation fault).
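
A fault address as small as 0x18 is typical of reading a member through a null object pointer: the address is then just the member's offset within the object. The sketch below illustrates that arithmetic only. The struct layout and the function name are hypothetical and are not the real ClientCursorManager layout; the log does not say which field was touched.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout, NOT the real ClientCursorManager: three 8-byte
// fields followed by the field that the failing code reads.
struct Example {
    std::uint64_t a;
    std::uint64_t b;
    std::uint64_t c;
    std::uint64_t touched;  // sits 24 bytes (0x18) from the start of the object
};

// Address that a read of `touched` would target if the object pointer
// were null: base address 0 plus the member offset.
constexpr std::size_t faultAddressThroughNull() {
    return offsetof(Example, touched);
}
```

Under this assumed layout, faultAddressThroughNull() evaluates to 0x18, which matches the shape of the address in the log and is consistent with a shutdown path calling into an object that was never constructed.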



 Comments   
Comment by Githook User [ 28/Oct/16 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-26737 Only shutdown sharding objects if constructed
Branch: v3.2
https://github.com/mongodb/mongo/commit/2915804d68927578a917c7c1e5438ba96b03c86a

Comment by Andy Schwerin [ 24/Oct/16 ]

Triage notes:

I spent a few minutes looking at the reproduction. At a high level, I believe CatalogManagerLegacy::init returns an error status, which sends mongos into shutdown. During shutdown, it attempts to access something normally initialized after the catalog manager, or perhaps the catalog manager itself (I didn't have time to check which), and segfaults. I tried reproducing with a CSRS config string and could not, which is not surprising, since the CSRS catalog manager's monitoring behavior is very different.

This failure is annoying because it may cause core dumps and leaves a nasty stack trace. However, mongos is shutting down anyway and, since it never contacted a config server, it cannot have any user operations in flight.
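
The failure mode described in these notes, and the kind of guard the commit message "Only shutdown sharding objects if constructed" suggests, can be sketched roughly as follows. This is a minimal illustration and not the actual mongos code: CursorManager, ShardingState, initSharding, and shutdownSharding are hypothetical names, and the real fix may be structured differently.

```cpp
#include <memory>

// Hypothetical stand-in for an object that is constructed late in startup.
class CursorManager {
public:
    void shutdown() { _open = false; }
    bool isOpen() const { return _open; }

private:
    bool _open = true;
};

// Hypothetical holder for process-wide sharding state.
struct ShardingState {
    std::unique_ptr<CursorManager> cursorManager;  // null until init succeeds
};

// Simulates startup. When the config host cannot be resolved, initialization
// fails before the cursor manager is ever constructed.
bool initSharding(ShardingState& state, bool hostResolves) {
    if (!hostResolves) {
        return false;  // early error: cursorManager is still null
    }
    state.cursorManager = std::make_unique<CursorManager>();
    return true;
}

// Shuts down only what was actually constructed and returns how many
// objects were shut down. Without the null check, the shutdown() call
// would dereference a null pointer on the failed-startup path.
int shutdownSharding(ShardingState& state) {
    int count = 0;
    if (state.cursorManager) {
        state.cursorManager->shutdown();
        ++count;
    }
    return count;
}
```

With this guard, calling shutdownSharding after a failed initSharding is a no-op that returns 0, while a successful startup followed by shutdown returns 1 and leaves the cursor manager closed.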
