[SERVER-6926] Secondary should take into account error status of node it chooses to sync from - not just ping time. Created: 04/Sep/12 Updated: 15/Feb/13 Resolved: 10/Sep/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.0.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Xuguang zhan | Assignee: | Eric Milkie |
| Resolution: | Done | Votes: | 0 |
| Labels: | replication |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: |
three Mongo config server > cfg = { , priority:2}, , priority:1}, , priority:1}, , priority:1}, , priority:0} |
||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
When I use multiple threads to call insert, two secondary nodes run out of disk space. Checking rs.status(), I am confused about why 10.224.88.160 does not sync the correct oplog from 10.224.88.109 (PRIMARY) or 10.224.88.110 (a healthy secondary). It seems to keep in touch only with the other two nodes, whose disks are full. For details, please see the two attachments.
EDIT: In this replica set, 10.224.88.160 is syncing from either 10.224.88.161 or 10.224.88.163. Both have the same error (out of disk space) and have stopped syncing from the PRIMARY. This means that 10.224.88.160 is now also falling behind the primary even though it is not itself in an error state. The issue to be fixed here is that a node should check the error status of the node it is syncing from and switch to another sync source if that node reports an error.
https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/k0XaKb0vH3s |
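The requested behavior can be illustrated with a minimal sketch. This is not mongod's actual sync-source selection code. The function name `chooseSyncSource` is hypothetical, the member fields (`name`, `health`, `stateStr`, `errmsg`, `pingMs`) are modeled on rs.status() output and are assumptions here, and the ping values and error strings are made up for illustration. Only the IP addresses come from this ticket.

```javascript
// Hypothetical sketch: choose a sync source from rs.status()-style member entries.
// A member is eligible only if it is reachable, is PRIMARY or SECONDARY, and
// reports no error message. Ping time is used only to rank eligible members.
function chooseSyncSource(members, selfName) {
  const candidates = members.filter((m) =>
    m.name !== selfName &&
    m.health === 1 &&
    (m.stateStr === "PRIMARY" || m.stateStr === "SECONDARY") &&
    !m.errmsg
  );
  if (candidates.length === 0) {
    return null; // no healthy member to sync from
  }
  // Among healthy members, prefer the lowest ping time.
  candidates.sort((a, b) => a.pingMs - b.pingMs);
  return candidates[0].name;
}

// Illustrative data only: ping values and errmsg text are invented.
const members = [
  { name: "10.224.88.109", health: 1, stateStr: "PRIMARY", pingMs: 5 },
  { name: "10.224.88.110", health: 1, stateStr: "SECONDARY", pingMs: 3 },
  { name: "10.224.88.161", health: 1, stateStr: "SECONDARY", pingMs: 1, errmsg: "out of disk space" },
  { name: "10.224.88.163", health: 1, stateStr: "SECONDARY", pingMs: 1, errmsg: "out of disk space" },
];

console.log(chooseSyncSource(members, "10.224.88.160"));
```

With ping time alone, 10.224.88.161 or 10.224.88.163 would be chosen in this sample data. Filtering on error status first leaves only the primary and the healthy secondary as candidates.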
| Comments |
| Comment by Xuguang zhan [ 11/Sep/12 ] |
|
Have you tested this case in version 2.2? Is there a wiki page with more detailed information? |
| Comment by Eric Milkie [ 10/Sep/12 ] |
|
In the latest version of MongoDB (2.2), secondaries are more liberal about shutting down when replication problems occur. This behavior should avoid the issues reported here. |
| Comment by Gregor Macadam [ 07/Sep/12 ] |
|
In 2.0.6 I can reproduce this, but in 2.2 mongod asserts when it runs out of disk space,
so the secondary will choose another node to sync from. |