[SERVER-18845] w_majority_change.js on v3.0 branch Created: 03/Jun/15 Updated: 29/Sep/15 Resolved: 29/Sep/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Matt Dannenberg |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
First appearance after: |
| Comments |
| Comment by Matt Dannenberg [ 10/Jun/15 ] | |||||||
|
This problem went away when we reverted the original solution to | |||||||
| Comment by Githook User [ 08/Jun/15 ] | |||||||
|
Author: matt dannenberg (dannenberg, matt.dannenberg@10gen.com) Message: Revert " This reverts commit c69158b7fccfc9eb5648a68fcf194fc0cf30ba4d. | |||||||
| Comment by Githook User [ 08/Jun/15 ] | |||||||
|
Author: matt dannenberg (dannenberg, matt.dannenberg@10gen.com) Message: Revert " This reverts commit 27f8803a31119c091e998fe29749dd5f75695ec6. | |||||||
| Comment by Githook User [ 05/Jun/15 ] | |||||||
|
Author: matt dannenberg (dannenberg, matt.dannenberg@10gen.com) Message: | |||||||
| Comment by Githook User [ 05/Jun/15 ] | |||||||
|
Author: matt dannenberg (dannenberg, matt.dannenberg@10gen.com) Message: | |||||||
| Comment by Matt Dannenberg [ 04/Jun/15 ] | |||||||
|
In 2.6 we required a handshake prior to updatePosition in order to accept replication progress. In 3.0 this is no longer necessary, but we kept that functionality for the sake of 2.6 compatibility. Once we started the 3.2 branch, we noticed that requiring this made 3.0 and 3.2 incompatible. As a result, we dropped this requirement from the updatePosition code path in 3.0. In a later commit, in order to fix the reporting of replication progress after initial sync, we also removed this requirement from the heartbeat code path that accepts replication progress. That is the commit that caused this test failure. Though both commits allow for the same problem (reporting progress for a node on whose behalf we have not performed a handshake), this commit made the problem a much more common occurrence. When processing an updatePosition command, 2.6 goes through the array of updates until it finds a problematic one and then returns an error. My solution was to put ourselves first in that array, so that our update gets processed before the 2.6 node stops processing the updates. The trouble is that nodes could still be chaining through us and be listed after a node that has not completed a handshake, so that their progress would not be reported. I don't think that there is a proper solution for this. We can fix the test by disallowing chaining. | |||||||
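To make the ordering problem concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not the server code: the entry fields `member` and `optime`, the node names `B`, `C`, `D`, and both function names are hypothetical and do not reflect the actual updatePosition wire format.

```python
from typing import Dict, List, Set


def process_update_position(updates: List[dict], handshaken: Set[str]) -> Dict[str, int]:
    """Toy model of the 2.6-style receiver: apply entries in order and
    stop at the first entry for a node that has not handshaken."""
    accepted: Dict[str, int] = {}
    for entry in updates:
        if entry["member"] not in handshaken:
            # The real server returns an error here; later entries are never examined.
            break
        accepted[entry["member"]] = entry["optime"]
    return accepted


def build_updates(self_id: str, progress: Dict[str, int]) -> List[dict]:
    """Sender-side workaround: always list our own progress first."""
    ordered = [self_id] + [m for m in progress if m != self_id]
    return [{"member": m, "optime": progress[m]} for m in ordered]


# Hypothetical topology: B reports to a 2.6 node on behalf of itself,
# C (no handshake performed) and D (handshaken, chaining through B).
handshaken = {"B", "D"}
progress = {"C": 5, "B": 7, "D": 6}

# Without the workaround, C comes first and nothing is accepted.
naive = [{"member": m, "optime": t} for m, t in progress.items()]
print(process_update_position(naive, handshaken))

# With the workaround, B's progress is accepted, but D's is still lost
# because D is listed after C.
print(process_update_position(build_updates("B", progress), handshaken))
```

Under these assumptions the reordering rescues only the sender's own entry, which is why the comment proposes disallowing chaining for the test instead. MongoDB's replica set configuration has a `settings.chainingAllowed` option for that purpose, though the ticket does not say whether the test fix used it.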
| Comment by Eric Milkie [ 04/Jun/15 ] | |||||||
|
The issue appears to be that a 3.0 secondary (31001) is failing to sync from a 2.6 primary (31002), so the primary never gets notification that the secondary has the write:
(above taken from https://logkeeper.mongodb.org/build/556f5c68ead33c19c53803ac/test/556f633cfa59d047f638539c ) |