[SERVER-30862] ReplSetTest.remove() should call stop() on the node to be removed Created: 28/Aug/17  Updated: 30/Oct/23  Resolved: 05/May/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.5.13
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Eric Milkie Assignee: Jason Chan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2020-05-04, Repl 2020-05-18
Participants:

 Description   

d20002| 2017-08-28T10:35:36.613-0400 E STORAGE  [thread2] WiredTiger error (2) [1503930936:613065][16730:0x7f3029f2c700], log-server: /data/db/__unknown_name__-2/journal: directory-list: opendir: No such file or directory
d20002| 2017-08-28T10:35:36.613-0400 E STORAGE  [thread2] WiredTiger error (2) [1503930936:613149][16730:0x7f3029f2c700], log-server: log pre-alloc server error: No such file or directory
d20002| 2017-08-28T10:35:36.613-0400 E STORAGE  [thread2] WiredTiger error (2) [1503930936:613161][16730:0x7f3029f2c700], log-server: log server error: No such file or directory
d20002| 2017-08-28T10:35:36.613-0400 E STORAGE  [thread2] WiredTiger error (-31804) [1503930936:613171][16730:0x7f3029f2c700], log-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
d20002| 2017-08-28T10:35:36.613-0400 F -        [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
d20002| 2017-08-28T10:35:36.613-0400 F -        [thread2] 
d20002| 
d20002| ***aborting after fassert() failure

This crash occurs after the following steps:

$  ./mongo --nodb
MongoDB shell version v0.0.0
> var replTest = new ReplSetTest({nodes: 1});
> replTest.startSet();  // start mongod
> replTest.remove(0);   // remove it from ReplSetTest's config without stopping it
> replTest.stopSet();   // crashes mongod



 Comments   
Comment by Githook User [ 05/May/20 ]

Author:

{'name': 'Jason Chan', 'email': 'jason.chan@10gen.com', 'username': 'jasonjhchan'}

Message: SERVER-30862 ReplSetTest.remove() should call stop() on the node to be removed
Branch: master
https://github.com/mongodb/mongo/commit/ec12de9fcd39f4f7eff7c98dde4b3de6ff806778

Comment by Jason Chan [ 30/Apr/20 ]

I was able to reproduce this on v4.5. I believe it makes sense to have remove() call stop() on the node.

Comment by Siyuan Zhou [ 09/Apr/20 ]

In the Safe Reconfig project, we usually restore the original config at the end of a test. I think one reason is to avoid problems like this one. We should make ReplSetTest smarter about handling removed nodes.

Comment by Jack Mulrow [ 15/Sep/17 ]

I accidentally committed SERVER-30682 with this ticket's number, then reverted it when I realized my mistake, so please ignore the two commit comments dated 15/Sep/17.

Comment by Ramon Fernandez Marina [ 15/Sep/17 ]

Author:

{'username': 'jsmulrow', 'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com'}

Message: SERVER-30862 Run the concurrency suite with "secondary" read preference
Branch: master
https://github.com/mongodb/mongo/commit/6dded939c14a072d6b15a47a692ba13c706d8db1

Comment by Ramon Fernandez Marina [ 15/Sep/17 ]

Author:

{'username': 'jsmulrow', 'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com'}

Message: Revert "SERVER-30862 Run the concurrency suite with "secondary" read preference"

This reverts commit 6dded939c14a072d6b15a47a692ba13c706d8db1.
Branch: master
https://github.com/mongodb/mongo/commit/cbfcf83f39a05a6921d68a0b7350e273f86cbd22

Comment by Eric Milkie [ 28/Aug/17 ]

It turns out that ReplSetTest.remove(foo) must not be called without first calling ReplSetTest.stop(foo). tags2.js is one test that follows this rule.
If you don't stop a node before removing it, stopSet() never stops the removed node when it runs, and then proceeds to delete the dbpath out from under the still-running mongod.
We should either remove the remove() function from ReplSetTest, or make a call to stop() the first thing that function does.
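
To make the failure mode and the second proposal concrete, here is a minimal toy model in plain JavaScript. It is not the real ReplSetTest: the class ToyReplSetTest, its fields, and the stopOnRemove flag are all hypothetical names made up for illustration, and the way it tracks dbpaths is an assumption based only on the behavior described above.

```javascript
// Toy model only. This is NOT the real ReplSetTest; every name is hypothetical.
class ToyReplSetTest {
    constructor(numNodes, stopOnRemove) {
        // stopOnRemove models the proposed fix: remove() calls stop() first.
        this.stopOnRemove = stopOnRemove;
        this.nodes = [];
        for (let i = 0; i < numNodes; i++) {
            this.nodes.push({id: i, running: false, dbpathExists: true, crashed: false});
        }
        // Every node ever created, so removed nodes can still be inspected.
        this.allNodes = this.nodes.slice();
    }

    startSet() {
        this.nodes.forEach((n) => { n.running = true; });
    }

    stop(index) {
        this.nodes[index].running = false;
    }

    remove(index) {
        if (this.stopOnRemove) {
            this.stop(index);  // the proposed fix
        }
        this.nodes.splice(index, 1);  // the harness no longer tracks this node
    }

    stopSet() {
        // Only nodes still tracked by the harness are stopped...
        this.nodes.forEach((n) => { n.running = false; });
        // ...but (by assumption) the data directories of all nodes are deleted.
        this.allNodes.forEach((n) => {
            n.dbpathExists = false;
            if (n.running) {
                n.crashed = true;  // stands in for the WiredTiger panic / fassert
            }
        });
    }
}

// Runs the four steps from the description and reports whether the node crashed.
function runScenario(stopOnRemove) {
    const replTest = new ToyReplSetTest(1, stopOnRemove);
    replTest.startSet();
    const node = replTest.allNodes[0];
    replTest.remove(0);
    replTest.stopSet();
    return node.crashed;
}

console.log(runScenario(false));  // true: removed node was still running
console.log(runScenario(true));   // false: remove() stopped the node first
```

In the toy model, the only difference between the two runs is whether remove() stops the node before dropping it from the tracked list, which is the change this ticket asks for.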

Generated at Thu Feb 08 04:25:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.