[SERVER-10601] Removing a node from a replica set can't succeed when one node is syncing Created: 22/Aug/13  Updated: 24/Aug/17  Resolved: 17/Apr/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Christian Sturm
Assignee: Matt Dannenberg
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: FreeBSD
Steps To Reproduce:

1. Start with a working replica set.
2. Add a new node.
3. Before the new node finishes its initial sync (that is, while it is still in STARTUP2), call rs.remove() on any of the existing nodes in the set.

 Description   

I have a replica set with more than 3 nodes, one of which is still syncing (in STARTUP2). As soon as I remove a node, whether a SECONDARY or the PRIMARY, a new PRIMARY election starts. The election appears to wait for the syncing node to finish its startup (that node in fact wants to become PRIMARY). The election never succeeds, so the set is left without a PRIMARY.

Maybe rs.remove() should give an error in such a case.
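
A minimal sketch of the kind of pre-check suggested above, written as plain JavaScript so it can run outside the shell. It is not part of the mongo shell or the server: `majorityNeeded`, `canElectAfterRemoval`, the host names, and the simplified member objects are all hypothetical. It assumes one vote per member and, pessimistically, counts only PRIMARY and SECONDARY members as able to support an election.

```javascript
// Hypothetical helpers for illustration only; not a MongoDB API.

// Strict majority of the voting members in the configuration.
function majorityNeeded(votingCount) {
  return Math.floor(votingCount / 2) + 1;
}

// members: simplified view of the set, [{ host, state, votes }, ...].
// Returns how many votes an election needs after removing hostToRemove
// and how many fully synced voters would be left to provide them.
function canElectAfterRemoval(members, hostToRemove) {
  const remaining = members.filter((m) => m.host !== hostToRemove);
  const voters = remaining.filter((m) => m.votes > 0);
  // Assumption: a member still in STARTUP2 is not counted as available.
  const available = voters.filter(
    (m) => m.state === "PRIMARY" || m.state === "SECONDARY"
  ).length;
  const needed = majorityNeeded(voters.length);
  return { needed, available, ok: available >= needed };
}

// Example: four synced nodes plus one node that is still syncing.
const exampleSet = [
  { host: "a.example:27017", state: "PRIMARY", votes: 1 },
  { host: "b.example:27017", state: "SECONDARY", votes: 1 },
  { host: "c.example:27017", state: "SECONDARY", votes: 1 },
  { host: "d.example:27017", state: "SECONDARY", votes: 1 },
  { host: "e.example:27017", state: "STARTUP2", votes: 1 },
];

console.log(canElectAfterRemoval(exampleSet, "b.example:27017"));
```

Under these assumptions, removing one synced node from this example leaves four voters, three of them synced, which meets the majority of three. A check like this would only reject a removal when the synced voters left behind fall below the majority, for instance when other members are also down.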



 Comments   
Comment by Ramon Fernandez Marina [ 24/Aug/17 ]

Author: Tyler Kaye (tkaye407) <tyler.kaye@mongodb.com>

Message: SERVER-10601 --> remove parseLL() function definition and use in numberlong.cpp
Branch: master
https://github.com/mongodb/mongo/commit/d86fe12469ee5852a4dae7e2ff85894c9e5b91fa

Comment by Matt Dannenberg [ 17/Apr/14 ]

Hi Christian,

I haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, feel free to re-open the ticket and provide additional information.

Regards,
Matt

Comment by Matt Dannenberg [ 19/Mar/14 ]

Hey Christian,

I can't seem to reproduce this failure. I used a 3-node set and inserted 1,000,000 simple documents. Then I added a new node and, while it was performing its initial sync (in STARTUP2), removed one of the secondaries. I did not see the election problem you described. Did I miss a step? Are you still able to repro this?

Matt

Generated at Thu Feb 08 03:23:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.