[SERVER-25403] DataReplicator initial sync should be resilient to applier failures Created: 02/Aug/16 Updated: 25/Jan/17 Resolved: 21/Sep/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.3.14 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Siyuan Zhou |
| Resolution: | Done | Votes: | 0 |
| Labels: | Idempotency |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2016-09-19, Repl 2016-10-10 |
| Participants: |
| Description |
|
Currently, if the applier encounters an error, it will fassert and terminate the mongod. Some errors should probably be ignored, and some should lead to a restart of initial sync. One example is here where an IndexOptionsConflict error led to an fassert. If this error had been ignored, it probably would have been fine. Alternatively, initial sync could have just restarted. At the very least, we should restart initial sync on errors like this. |
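To make the proposed behavior concrete, here is a minimal sketch in Python of the policy described above: ignore errors that are believed harmless, and restart the whole sync attempt on anything else instead of terminating the process. It is not the server's implementation. The names `ApplierError`, `IGNORABLE`, `apply_batch`, and `initial_sync` are hypothetical, and treating IndexOptionsConflict as ignorable is an assumption taken from the "probably would have been fine" remark.

```python
class ApplierError(Exception):
    """Hypothetical stand-in for an error raised while applying an oplog entry."""

    def __init__(self, code_name):
        super().__init__(code_name)
        self.code_name = code_name


# Assumption: errors listed here are safe to skip during initial sync.
IGNORABLE = {"IndexOptionsConflict"}


def apply_batch(ops, ignorable=IGNORABLE):
    """Apply ops in order; skip ignorable errors and re-raise the rest.

    Returns the number of ops that applied without error.
    """
    applied = 0
    for op in ops:
        try:
            op()
            applied += 1
        except ApplierError as err:
            if err.code_name not in ignorable:
                raise  # not ignorable: let the caller restart the sync
    return applied


def initial_sync(make_ops, max_attempts=3):
    """Run initial sync, restarting from scratch when a batch fails.

    make_ops() is called once per attempt and returns that attempt's ops.
    Returns the attempt number that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            apply_batch(make_ops())
            return attempt
        except ApplierError:
            continue  # restart initial sync rather than crash the server
    raise RuntimeError("initial sync failed after %d attempts" % max_attempts)
```

Bounding the number of attempts keeps a persistent failure from looping forever; what should happen once attempts are exhausted is a separate decision that this sketch leaves to the caller.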
| Comments |
| Comment by Githook User [ 21/Sep/16 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com> Message: |
| Comment by Judah Schvimer [ 02/Aug/16 ] |
|
I think it will solve the race condition leading to this failure. I think either way we should make sure small failures lead to initial sync restarts rather than server crashes. |
| Comment by Eric Milkie [ 02/Aug/16 ] |
|
Will |