Details
Bug
Resolution: Cannot Reproduce
Major - P3
None
3.2.0-rc1
None
Fully Compatible
ALL
Repl C (11/20/15), Repl D (12/11/15), Repl E (01/08/16)
Description
I'm not sure which of these steps are significant:
1. Start with a 2-shard CSRS cluster. Each shard is PSA (primary, secondary, arbiter).
2. Add a third shard.
3. Remove one S from the second shard.
4. Shut down the removed S.
5. Eventually give up, "assume" the node is shut down, and attempt to restart it.
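Step 5 depends on assuming the node is down. As a minimal sketch of an alternative, the check below polls a PID until it exits or a timeout passes. The helper name `wait_for_exit` and the timeout value are hypothetical and are not part of mongod or the automation agent. `kill -0` only tests whether the process exists and can be signaled, so this assumes the check runs as the same user as the target process (or as root).

```shell
# Hypothetical helper: wait up to $2 seconds for process $1 to exit.
# Returns 0 if the process is gone, 1 if it is still running at the deadline.
# Assumes the caller is allowed to signal the process (same user or root).
wait_for_exit() {
    pid=$1
    timeout=$2
    elapsed=0
    # kill -0 sends no signal; it only reports whether the PID can be signaled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, `wait_for_exit 87564 60 || echo "process 87564 is still running; not restarting"` would avoid the restart attempt while the old process is alive.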
What I observed was:
1. At step 4, the removed node did not shut down.
Here is the process; you can see it has been running for many hours:
[red_red_1_4]$ ps -ef | grep "/tmp/data" | grep 87564
501 87564 1 0 4:43PM ?? 7:00.96 /var/lib/mongodb-mms-automation/mongodb-osx-x86_64-3.2.0-rc1/bin/mongod -f /tmp/data/red_red_1_4/automation-mongod.conf
[red_red_1_4]$ cat mongod.lock
87564
[red_red_1_4]$ date
Thu Oct 29 01:18:31 UTC 2015
2. At step 5, the error message I got was unusual:
2015-10-29T01:08:39.668+0000 E NETWORK [initandlisten] listen(): bind() failed errno:48 Address already in use for socket: 0.0.0.0:28004
2015-10-29T01:08:39.668+0000 E NETWORK [initandlisten] addr already in use
2015-10-29T01:08:39.668+0000 E STORAGE [initandlisten] Failed to set up sockets during startup.
2015-10-29T01:08:39.668+0000 I CONTROL [initandlisten] dbexit: rc: 48
I was expecting the usual message indicating that mongod had recognized a lock file left by a previously running process.
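The manual check I did with `cat mongod.lock` and `ps` can be sketched as a small script. This is only an illustration: the helper name `lock_status` and its output strings are hypothetical, and it assumes the lock file holds nothing but the PID, as it did in the output above. It is not how mongod itself validates the lock.

```shell
# Hypothetical helper: report whether the PID recorded in a lock file is alive.
# Prints "empty", "running <pid>", or "stale <pid>" (illustrative wording).
# Assumes the lock file contains only a PID and that the caller may signal it.
lock_status() {
    lockfile=$1
    # A missing or zero-length lock file records no process.
    if [ ! -s "$lockfile" ]; then
        echo "empty"
        return 0
    fi
    # Strip the trailing newline and any other whitespace around the PID.
    pid=$(tr -d '[:space:]' < "$lockfile")
    if kill -0 "$pid" 2>/dev/null; then
        echo "running $pid"
    else
        echo "stale $pid"
    fi
}
```

Run from the data directory as `lock_status mongod.lock`; against the state shown above it should report the old process as running. To see which process holds the port, `lsof -nP -iTCP:28004 -sTCP:LISTEN` is one option where `lsof` is available.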
Logs attached:
- mongodb.1.log - steps 1 through 4, including the failed shutdown
- mongodb.2.log - step 5 - the attempt to restart even though the old process is still running