|
The trouble here is that a 2.6 primary erroneously reports a "majority" writeConcern as satisfied when only three of seven nodes are up ("majority" should require four).
This is caused by the primary being informed of its own replication progress. In 2.6, a node simply subtracted one from the writeConcern quantity to account for itself. In 2.8, nodes track their own progress alongside that of all other nodes. Additionally, in 2.8, nodes update replication progress via heartbeats as well as via the replSetUpdatePosition command.
So, when a 2.8 node receives a heartbeat from the 2.6 primary, the 2.8 node updates its replication progress map to reflect the primary's progress and then forwards this progress to the primary (or along a chained path where it will eventually reach the primary). When the 2.6 primary receives this progress update, it sees that three nodes (majority minus one, to account for itself) have replicated the op, and it does not know that one of those three is itself, which is now being double counted.
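The double counting can be illustrated with a small model. This is only a sketch of the arithmetic described above, not MongoDB source code: the function names, the node labels (P for the 2.6 primary, A and B for 2.8 secondaries), and the integer optimes are all assumptions made for illustration.

```python
# Illustrative model only; every name here is hypothetical.

def majority(num_nodes):
    """Number of nodes needed for a "majority" writeConcern."""
    return num_nodes // 2 + 1

def satisfied_2_6(progress_reports, op, num_nodes):
    """2.6-style check: the primary assumes it already has the op, so it
    needs (majority - 1) progress reports at or past `op`."""
    needed = majority(num_nodes) - 1
    caught_up = sum(1 for optime in progress_reports.values() if optime >= op)
    return caught_up >= needed

NUM_NODES = 7  # seven-node set, only P, A and B are up
OP = 10        # optime of the write being waited on

# Expected case: the primary hears only about the two secondaries.
reports = {"A": OP, "B": OP}
normal = satisfied_2_6(reports, OP, NUM_NODES)

# Mixed-version case: a 2.8 secondary learned P's progress from a heartbeat
# and forwarded it, so P's own entry shows up among the reports.
reports_with_self = {"A": OP, "B": OP, "P": OP}
buggy = satisfied_2_6(reports_with_self, OP, NUM_NODES)
```

In this model `normal` is false (two reports against a requirement of three), while `buggy` is true because the primary's own entry fills the third slot even though only three of the four required nodes hold the op.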
The solution is to not update the replication progress map when receiving a heartbeat from the primary.
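A minimal sketch of that rule, assuming a simplified progress map keyed by node id with integer optimes. The class and method names are hypothetical and do not correspond to the actual server code.

```python
# Hypothetical sketch of the fix; not the real implementation.

class ProgressMap:
    """Last known optime per member, as tracked by a 2.8-style node."""

    def __init__(self, self_id):
        self.self_id = self_id
        self.optimes = {self_id: 0}

    def on_heartbeat(self, sender_id, sender_optime, sender_is_primary):
        # The fix: never record the primary's progress from a heartbeat,
        # so it can never be forwarded back to the primary.
        if sender_is_primary:
            return
        current = self.optimes.get(sender_id, 0)
        self.optimes[sender_id] = max(current, sender_optime)

    def progress_to_forward(self):
        """Progress that would be sent upstream toward the primary."""
        return dict(self.optimes)

node = ProgressMap("A")
node.on_heartbeat("P", 10, sender_is_primary=True)   # ignored
node.on_heartbeat("B", 7, sender_is_primary=False)   # recorded
forwarded = node.progress_to_forward()
```

With the guard in place, `forwarded` contains entries for A and B only, so a 2.6 primary receiving it has nothing of its own to double count.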
|