[SERVER-11059] Elections can be delayed by some locks Created: 07/Oct/13  Updated: 10/Dec/14  Resolved: 19/Jan/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.6, 2.5.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alexander Komyagin Assignee: Matt Dannenberg
Resolution: Duplicate Votes: 0
Labels: elections
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

2-node replica set


Attachments: pri.log (text file), queue_elect.js (file), sec.log (text file), sec_dbCurrentOp.log (text file)
Issue Links:
Duplicate
duplicates SERVER-12170 Do not call relinquish() when not vet... Closed
Related
is related to SERVER-11103 replset mutex should be not be held d... Closed
Operating System: ALL
Participants:

 Description   

When a node is doing an initial sync and is in the index build phase, it does not respond to the replSetElect command.

Steps to reproduce:

  1. set up a one-node replica set
  2. write a lot of data and create a bunch of indexes
  3. add a new node
  4. wait for the new node to enter the index build phase after all records are cloned
  5. issue stepDown(3) on the PRIMARY
  6. observe that the election is delayed until the index build is finished
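
For illustration, steps 2, 5 and 6 could be sketched roughly as below. This is not the attached queue_elect.js; it is a minimal sketch that assumes the 2.4-era mongo shell helpers (`insert`, `ensureIndex`, `count`, `rs.stepDown`, `rs.status`). The collection name `electiontest`, the field names `f0`, `f1`, ... and the function names are hypothetical. The shell's `db`, `rs` and `sleep` are passed in as arguments so the functions do not depend on shell globals.

```javascript
// Step 2: write a lot of data and create a bunch of indexes.
// Run against the PRIMARY before the new node is added.
function loadDataAndIndexes(db, numDocs, numIndexes) {
    var coll = db.getCollection("electiontest");   // hypothetical collection name
    for (var i = 0; i < numDocs; i++) {
        var doc = { _id: i };
        for (var f = 0; f < numIndexes; f++) {
            doc["f" + f] = (i * (f + 7)) % 1000003; // arbitrary filler values
        }
        coll.insert(doc);
    }
    // One single-field index per filler field, so the new node has
    // numIndexes index builds to do during its initial sync.
    for (var k = 0; k < numIndexes; k++) {
        var spec = {};
        spec["f" + k] = 1;
        coll.ensureIndex(spec);
    }
    return coll.count();
}

// Steps 5 and 6: step the PRIMARY down for 3 seconds, then poll the
// replica set status once per second until some member reports state 1
// (PRIMARY). Returns the elapsed seconds, or -1 if no primary was seen
// within maxPolls polls.
function secondsUntilPrimary(rsHelper, sleepFn, maxPolls) {
    var start = Date.now();
    try {
        rsHelper.stepDown(3);
    } catch (e) {
        // The shell may lose its connection when the primary steps down.
    }
    for (var poll = 0; poll < maxPolls; poll++) {
        try {
            var members = rsHelper.status().members || [];
            for (var m = 0; m < members.length; m++) {
                if (members[m].state === 1) {
                    return (Date.now() - start) / 1000;
                }
            }
        } catch (e) {
            // A status call right after the dropped connection may fail; retry.
        }
        sleepFn(1000);
    }
    return -1;
}
```

In the shell this would be called as `loadDataAndIndexes(db, 1000000, 10)` before step 3 and `secondsUntilPrimary(rs, sleep, 600)` on the PRIMARY once the new node has reached the index build phase; the document and index counts here are placeholders. With the bug present, the returned time would be expected to track the remaining index build time rather than the 3-second step-down window.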

Expected result: a primary can be elected while the secondary is doing an initial sync

Rationale: in more complicated scenarios involving multiple servers doing initial sync and network splits, this bug can lead to significant election delays (how long depends on the actual indexes)



 Comments   
Comment by Matt Dannenberg [ 15/Jan/14 ]

The repro script failed on 2.5.2. At HEAD of master, however, the repro script passed, and the repro described in the ticket did not seem to exhibit the bad behavior. One of the commits made since this ticket was opened (between 2.5.4 and 2.5.5) has fixed this behavior. I believe it may have been this one.

Comment by Alexander Komyagin [ 14/Oct/13 ]

Attaching jstest

Generated at Thu Feb 08 03:24:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.