[SERVER-25475] Crash in single-node CSRS on Windows with auth Created: 08/Aug/16 Updated: 25/Jan/17 Resolved: 23/Aug/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.3.12 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding 2016-08-29 |
| Participants: |
| Description |
|
Seen here while Mongo Orchestration is trying to start a sharded cluster with two mongos servers, two standalone shards, and a single-node config server replica set: https://evergreen.mongodb.com/version/57a7b8373ff12252cd00010a The config server crashes:
It appears not to happen unless auth is enabled. This is a nightly build from git hash 9cf1165c. Log and minidump attached. I believe that I can repro on demand. |
| Comments |
| Comment by Githook User [ 23/Aug/16 ] |
|
Author: Spencer T Brody (stbrody) <spencer@mongodb.com> Message: |
| Comment by Spencer Brody (Inactive) [ 23/Aug/16 ] |
|
Jesse, this should be fixed now. Can you confirm that your tests pass? |
| Comment by Githook User [ 23/Aug/16 ] |
|
Author: Spencer T Brody (stbrody) <spencer@mongodb.com> Message: This ensures that the sharding system is initialized before the ReplicationCoordinator is started, |
| Comment by Spencer Brody (Inactive) [ 12/Aug/16 ] |
|
Okay, I was able to reproduce this locally using the data files Jesse provided, by adding a sleep in db.cpp:_initAndListen after the replication coordinator is started up and before we call initializeGlobalShardingState. It looks like a race between the initialization of the replication system and the sharding system. |
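The ordering problem described in this comment can be illustrated with a toy model. This is a minimal, single-threaded sketch, not the server's code: `ToyShardingState`, `ToyReplCoordinator`, `StartupOrder`, and `runStartup` are hypothetical names invented for the example, and it models only the order of the two startup steps, not the real threading.

```cpp
#include <cassert>

// Toy stand-in for global sharding state (hypothetical, not the server's class).
struct ToyShardingState {
    bool initialized = false;
};

// Toy stand-in for a replication coordinator whose startup work
// consults the sharding state (hypothetical).
class ToyReplCoordinator {
public:
    explicit ToyReplCoordinator(const ToyShardingState& sharding) : _sharding(sharding) {}

    // Returns false if startup ran before the sharding state was ready.
    bool startup() const {
        return _sharding.initialized;
    }

private:
    const ToyShardingState& _sharding;
};

enum class StartupOrder { kReplicationFirst, kShardingFirst };

// Runs the two startup steps in the requested order and reports whether
// replication startup observed initialized sharding state.
bool runStartup(StartupOrder order) {
    ToyShardingState sharding;
    ToyReplCoordinator repl(sharding);

    if (order == StartupOrder::kShardingFirst) {
        sharding.initialized = true;  // sharding is ready before replication starts
        return repl.startup();
    }

    bool sawInitialized = repl.startup();  // replication starts too early
    sharding.initialized = true;
    return sawInitialized;
}

// Small self-check that can be called from main().
void demoStartupOrdering() {
    assert(!runStartup(StartupOrder::kReplicationFirst));
    assert(runStartup(StartupOrder::kShardingFirst));
}
```

In this model the `kReplicationFirst` path corresponds to the window that the added sleep widens, and `kShardingFirst` corresponds to initializing the sharding system before replication is started.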
| Comment by A. Jesse Jiryu Davis [ 12/Aug/16 ] |
|
Yes, it seems to crash most of the time that I restart mongod with this config file. I've attached a zip of its dbpath. |
| Comment by A. Jesse Jiryu Davis [ 12/Aug/16 ] |
|
Its config file is:
|
| Comment by A. Jesse Jiryu Davis [ 12/Aug/16 ] |
|
Logs:
|
| Comment by A. Jesse Jiryu Davis [ 12/Aug/16 ] |
|
I've just reproduced it on a VS 2015 spawnhost:
|
| Comment by Spencer Brody (Inactive) [ 11/Aug/16 ] |
|
Jesse, I'm having trouble reproducing this. Can you give any more information on the series of events that led to this crash? Also, once this happens, does it crash in the same way every time you start up the binary on the same data files? If so, would it be possible to share those data files? |
| Comment by Eric Milkie [ 08/Aug/16 ] |
|
It looks like we need to check the return statuses from these function calls before dereferencing:
|
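A minimal sketch of the check-before-dereference pattern suggested in the comment above. Everything in it is an assumption made for illustration: `StatusOr`, `lookupConfigHost`, and `describeConfigHost` are hypothetical names, the host string is made up, and none of this is the server's actual code or the calls the comment refers to. It requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical "status or value" return type, written for this example only.
template <typename T>
class StatusOr {
public:
    static StatusOr ok(T value) {
        StatusOr s;
        s._value = std::move(value);
        return s;
    }

    static StatusOr error(std::string reason) {
        StatusOr s;
        s._reason = std::move(reason);
        return s;
    }

    bool isOK() const {
        return _value.has_value();
    }

    const std::string& reason() const {
        return _reason;
    }

    // Reading the value of a failed result is a programming error.
    const T& getValue() const {
        assert(isOK());
        return *_value;
    }

private:
    StatusOr() = default;
    std::optional<T> _value;
    std::string _reason;
};

// Hypothetical lookup that fails until the subsystem it depends on is ready.
StatusOr<std::string> lookupConfigHost(bool shardingInitialized) {
    if (!shardingInitialized) {
        return StatusOr<std::string>::error("sharding state not yet initialized");
    }
    return StatusOr<std::string>::ok("localhost:27019");
}

// Checks the status first and only then reads the value.
std::string describeConfigHost(bool shardingInitialized) {
    auto result = lookupConfigHost(shardingInitialized);
    if (!result.isOK()) {
        return "error: " + result.reason();
    }
    return "host: " + result.getValue();
}
```

With this shape, a failed lookup is reported as an error instead of being dereferenced: `describeConfigHost(false)` returns the error text, while `describeConfigHost(true)` returns the host.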