General note: I know the title is too general, but this is the 3rd bug I'm opening this week. We have another one related to sharding on 3.2.1, which we will publish soon. We are thinking of moving away from MongoDB; the reliability of 3.2 is horrible!
2 bugs in this ticket:
1. We removed a member using rs.remove(). After that, the removed member (whose log is attached) started a versioning mess and killed itself.
filename = crash.
2. For the 2nd time we got the following behavior: a member elects itself, although it doesn't need to, and causes a rollback on the other member.
Our setup: primary, secondary and arbiter.
Primary: rs.stepDown() for maintenance.
Secondary takes over.
When the primary is back, it starts syncing. As you can see from the logs, during this time it receives 2 "no" votes since it is still stale, but then it receives only 1 "yes" vote (for some reason, the arbiter is quiet) and is elected before its time. This causes a rollback on the other node.
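For what it's worth, here is how I understand the vote counting. This is only a sketch of majority voting in a Raft-style protocol, not something taken from the server code: the function names are made up for illustration, and it assumes the candidate counts its own vote.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is a strict majority of the set."""
    return voting_members // 2 + 1


def wins_election(voting_members: int, yes_from_others: int) -> bool:
    """True if the candidate's own vote plus the "yes" votes reach a majority.

    Assumption: the candidate always votes for itself.
    """
    return 1 + yes_from_others >= majority(voting_members)


if __name__ == "__main__":
    # Our setup: primary, secondary and arbiter = 3 voting members.
    print(majority(3))          # 2
    print(wins_election(3, 0))  # False: only the candidate's own vote
    print(wins_election(3, 1))  # True: own vote + 1 "yes"
```

If that assumption holds, a single "yes" is enough in our 3-member set even with the arbiter silent, so what puzzles us is that a "yes" was granted while the node was still behind.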
All 3 nodes' logs are attached (primary, secondary, arbiter). Please note the following lines:
and after 9 seconds, suddenly:
All members are on protocol version 1. They were on 0 but were upgraded according to your docs ~a week ago.