[SERVER-23390] Missing collection during replication causes shutdown Created: 29/Mar/16 Updated: 20/Dec/16 Resolved: 18/Nov/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.2.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | RF | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Sprint: | Repl 2016-11-21 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 0 | ||||||||||||||||
| Description |
|
Several users have reported crashes with the message "[repl writer worker 10] writer worker caught exception: :: caused by :: 26 Failed to apply insert due to missing collection" when replicating a write. This is likely fallout related to the changes from |
| Comments |
| Comment by Spencer Brody (Inactive) [ 18/Nov/16 ] |
|
In every case where a user reported this and we were able to find a root cause, it turned out that they had dropped a collection while the node was running in standalone mode, so no oplog entry was recorded for the drop. We never found any evidence of an actual bug in replication leading to these errors. |
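The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNode`, `MissingCollection`, `replay`, and the `test.users` namespace are hypothetical names invented for illustration, and none of this is MongoDB code or its actual oplog format. It shows a drop performed without an oplog entry, followed by a replicated insert that can no longer be applied.

```python
class MissingCollection(Exception):
    """Raised when an operation targets a collection the node does not have."""


class ToyNode:
    """Hypothetical stand-in for a replica set member; not MongoDB code."""

    def __init__(self):
        self.collections = {}  # collection name -> list of documents
        self.oplog = []        # operations recorded for other members to replay

    def run(self, op, record_in_oplog=True):
        """Apply op locally; skip the oplog to mimic standalone mode."""
        self.apply(op)
        if record_in_oplog:
            self.oplog.append(op)

    def apply(self, op):
        kind, coll, doc = op
        if kind == "create":
            self.collections[coll] = []
        elif kind == "drop":
            self.collections.pop(coll, None)
        elif kind == "insert":
            if coll not in self.collections:
                raise MissingCollection(
                    "Failed to apply insert due to missing collection: " + coll
                )
            self.collections[coll].append(doc)


def replay(source, target, start):
    """Apply the source's oplog entries from index start onward to the target."""
    for op in source.oplog[start:]:
        target.apply(op)


primary, secondary = ToyNode(), ToyNode()
primary.run(("create", "test.users", None))
replay(primary, secondary, 0)  # both nodes now have the collection

# The secondary is restarted in standalone mode: the drop is applied locally
# but never recorded in an oplog, so no other node knows about it.
secondary.run(("drop", "test.users", None), record_in_oplog=False)

# Back in the replica set, the primary replicates an ordinary insert.
primary.run(("insert", "test.users", {"_id": 1}))
try:
    replay(primary, secondary, 1)
    outcome = "applied"
except MissingCollection as exc:
    outcome = str(exc)

print(outcome)
```

In this model the secondary has no way to detect the divergence until an operation touches the dropped collection, which matches the observation that no replication bug was needed to produce the error.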
| Comment by Eric Milkie [ 07/Apr/16 ] |
|
The user case that triggered this ticket's creation was caused by running a replica set member in standalone mode (by omitting --replSet) and then performing writes, which caused the nodes' data to get out of sync. |
| Comment by Scott Hernandez (Inactive) [ 29/Mar/16 ] |
|
Please upload the logs and oplog (if possible) from incidents where this occurred. Also, please include any manual actions taken and their effects. For example, were you able to restart the node and everything went back to normal, or was a wipe + resync done? |