[SERVER-39621] Disabled chaining should enforce sync source change when the primary steps down even if the oplog fetcher isn't killed on sync source
Created: 15/Feb/19 | Updated: 29/Oct/23 | Resolved: 08/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.0.12 |
| Fix Version/s: | 4.4.1, 4.7.0, 4.2.16 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Samyukta Lanka |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v4.4, v4.2, v4.0, v3.6 |
| Sprint: | Repl 2020-05-18 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 6 |
| Description |
|
Since we no longer kill readers and close their connections on stepdown, the nodes syncing from the primary may not have a chance to choose a new sync source even if chaining is disabled.
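
As a rough illustration of the intended behavior (not the actual server code), the check could look like the following minimal Python sketch. The names `SyncSourceView` and `should_change_sync_source` are hypothetical and do not appear in the MongoDB source; the sketch only assumes that a syncing node can tell whether its current sync source is still primary.

```python
from dataclasses import dataclass


@dataclass
class SyncSourceView:
    """Hypothetical model of what a syncing node knows about its sync source."""
    host: str
    is_primary: bool


def should_change_sync_source(chaining_allowed: bool, source: SyncSourceView) -> bool:
    """Return True if the node should drop its current sync source.

    Assumption: with chaining disabled, only the primary is an acceptable
    sync source, so a source that has stepped down must be replaced even
    though its connection and oplog cursor are still alive.
    """
    if chaining_allowed:
        # Chaining enabled: syncing from a secondary is fine; other
        # re-evaluation rules (staleness, ping time) are out of scope here.
        return False
    return not source.is_primary


old_primary = SyncSourceView(host="node1:27017", is_primary=False)  # just stepped down
print(should_change_sync_source(chaining_allowed=False, source=old_primary))  # True
print(should_change_sync_source(chaining_allowed=True, source=old_primary))   # False
```

The point of the sketch is that the decision depends on the sync source's state rather than on the connection being closed, so it still fires when the oplog fetcher is left running after a stepdown.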
|
| Comments |
| Comment by Githook User [ 04/Aug/21 ] |
|
Author: Samy Lanka (samy.lanka@mongodb.com, username: lankas)
Message: (cherry picked from commit 2ffaa9d4efefffc7045b6b47d9380299b28dfd7a) |
| Comment by Githook User [ 21/Aug/20 ] |
|
Author: Samy Lanka (samy.lanka@mongodb.com, username: lankas)
Message: (cherry picked from commit 2ffaa9d4efefffc7045b6b47d9380299b28dfd7a) |
| Comment by Siyuan Zhou [ 20/Aug/20 ] |
|
evin.roesle, it should be similar to the 4.4 backport. I'd say less than a day. |
| Comment by Evin Roesle [ 20/Aug/20 ] |
|
siyuan.zhou, do you think there is any risk in backporting to 4.2, and what extra complexity is involved? How much time do you estimate for this: less than a day, or more? |
| Comment by Evin Roesle [ 14/May/20 ] |
|
Since we are so close to GA, I do not think we should backport this ticket to 4.4 at this time. |
| Comment by Tess Avitabile (Inactive) [ 13/May/20 ] |
|
Sounds good, then I don't think we should backport to earlier versions. evin.roesle, do you think we should backport to 4.4? This close to GA, we would need to get special permission from Kelsey. |
| Comment by Tess Avitabile (Inactive) [ 11/May/20 ] |
|
evin.roesle, do you think this ticket should be backported to earlier branches? samy.lanka, can you weigh in on the complexity of the backport? |
| Comment by Githook User [ 08/May/20 ] |
|
Author: Samy Lanka (samy.lanka@mongodb.com, username: lankas)
Message: |
| Comment by Judah Schvimer [ 26/Feb/19 ] |
|
siyuan.zhou, how would the sync source know that the oplog read was being used for oplog fetching? Would it see that it's an internal connection, or just assume based on the OplogReplay flag? |
| Comment by Kelsey Schubert [ 22/Feb/19 ] |
|
This ticket would also help prior versions of MongoDB in cases where no active getmore was running against the primary when it stepped down. |
| Comment by Tess Avitabile (Inactive) [ 15/Feb/19 ] |
|
I think the effect of the Avoid Closing Connections project on this change was small. We never closed connections between replica set members on stepdown, since these connections used hangUpOnStepDown:false. Additionally, we never killed cursors on stepdown. The only change is that if there was an active getMore on the sync source, it will no longer be killed after the Avoid Closing Connections project. Before this project, it was possible that a node would continue syncing from an old primary if there had not been an active getMore at the time of the stepdown. |
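
The before/after behavior described in this comment can be summarized as a small truth table. This is a minimal sketch, not server code; the function name and its two boolean parameters are hypothetical and only encode the statements above (connections and cursors were never closed or killed on stepdown, and only an active getMore used to be killed).

```python
def keeps_syncing_from_old_primary(active_getmore_at_stepdown: bool,
                                   stepdown_kills_active_getmore: bool) -> bool:
    """Return True if a syncing node can stay on the stepped-down primary.

    Assumption: the only thing that ever interrupted the oplog fetcher on
    stepdown was the sync source killing a getMore that was running at
    that moment.
    """
    interrupted = active_getmore_at_stepdown and stepdown_kills_active_getmore
    return not interrupted


# Before the Avoid Closing Connections project: an active getMore was killed.
print(keeps_syncing_from_old_primary(True, True))    # False: fetcher interrupted
print(keeps_syncing_from_old_primary(False, True))   # True: nothing to kill

# After the project: an active getMore is no longer killed.
print(keeps_syncing_from_old_primary(True, False))   # True
print(keeps_syncing_from_old_primary(False, False))  # True
```

Under these assumptions the project only changes the first case, which matches the view that its effect on this ticket was small.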