[SERVER-20814] 'majority' write stuck waiting for replication after both secondaries fell off the end of the oplog
| Created: | 07/Oct/15 | Updated: | 08/Oct/15 | Resolved: | 08/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Eric Milkie |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
This happened during execution of jstests/sharding/conf_server_write_concern.js. A write using writeConcern: 'majority' got stuck in the awaitReplication call while the secondaries entered the RECOVERING state. The relevant lines from the logs are below, and the call stacks are attached. This is the Evergreen task (it is from a patch build with only test changes).
|
| Comments |
| Comment by Kaloian Manassiev [ 08/Oct/15 ] |
|
After discussion with milkie, this is working as designed. A node in the RECOVERING state still sends heartbeats, so from the primary's point of view it is up. Doing an automatic resync is not a good option: if multiple secondaries start resyncing at the same time, they might overload the primary. Also, initial sync starts by wiping out all the data, which shouldn't be undertaken without administrator intervention. Failing the waitForWriteConcern is not appropriate either, because there is always a possibility that the secondaries will leave the RECOVERING state and satisfy the write concern. The way to avoid waiting indefinitely is for all applications to specify a timeout when they use a write concern. |
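The timeout advice can be illustrated with a toy model. The sketch below is not MongoDB's awaitReplication implementation: `await_replication`, its parameters, and the optime values are hypothetical names chosen for illustration. It assumes each secondary exposes its last applied optime through a callable, and it shows why a majority wait cannot finish while both secondaries are stuck, and how a timeout bounds the wait.

```python
import time


def await_replication(primary_optime, secondary_optimes, majority,
                      wtimeout_ms=None, poll_interval_s=0.01):
    """Toy model of a majority wait (hypothetical, not MongoDB code).

    secondary_optimes is a list of zero-argument callables, each returning
    the last optime that secondary has applied. The primary always counts
    as one acknowledgement. With wtimeout_ms=None the loop never gives up,
    which mirrors the stuck write described in this ticket.
    """
    deadline = None
    if wtimeout_ms is not None:
        deadline = time.monotonic() + wtimeout_ms / 1000.0

    while True:
        # Count the primary plus every secondary that has caught up.
        acked = 1 + sum(
            1 for get_optime in secondary_optimes
            if get_optime() >= primary_optime
        )
        if acked >= majority:
            return {"ok": 1, "acked": acked}
        if deadline is not None and time.monotonic() >= deadline:
            # Give up waiting; the write itself is not undone.
            return {"ok": 0, "acked": acked, "wtimeout": True}
        time.sleep(poll_interval_s)


# Both secondaries are stuck behind the primary (as if in RECOVERING),
# so only the timeout lets the caller regain control.
stuck_secondaries = [lambda: 5, lambda: 5]
result = await_replication(10, stuck_secondaries, majority=2, wtimeout_ms=50)
print(result)
```

In a real deployment the corresponding setting is the `wtimeout` field of the write concern document, for example `{w: 'majority', wtimeout: 5000}`. When the timeout fires, the write is reported as not having satisfied the write concern, but it is not undone on the primary.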
| Comment by Eric Milkie [ 08/Oct/15 ] |
|
We can't automatically start a resync for a few reasons, the most important being that some installations might not be able to do a resync because it would take too long or overload the primary. |
| Comment by Kaloian Manassiev [ 07/Oct/15 ] |
|
If both secondaries of this 3-node replica set have entered the RECOVERING state because they fell off the end of the oplog, shouldn't a full initial sync be attempted automatically or, if not, shouldn't the current PRIMARY step down? |
| Comment by Spencer Brody (Inactive) [ 07/Oct/15 ] |
|
I think my commit https://github.com/mongodb/mongo/commit/60b2e7ffce5f91093d39c6d80701aa3f7c36b5c3 is the culprit here. |