[SERVER-6537] Replicaset stop replication Created: 20/Jul/12 Updated: 16/Nov/21 Resolved: 15/Nov/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.0.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Raymond | Assignee: | Shaun Verch |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | replication | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | OS: CentOS 5 x64 |
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
We get the following error on our secondary. This is the second time we have encountered the problem: a week ago we removed all the data on the secondary and let it resync, which seemed to fix it, but the problem has come back. "errmsg" : "syncTail: 10068 invalid operator: $id, syncing: { ts: Timestamp 1342667002000|51, h: -3142433914917806080, op: \"u\", ns: \"boc.paper\", o2: { _id: { $id: \"4fe0050d217042a83c010000\" }}, o: { $set: { info.difficulty: [ \"1\", \"2\", \"3\" ] } } }" |
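The oplog entry in the error above carries a selector whose `_id` is a sub-document with a literal `$id` key, and that key is what the secondary rejects as an "invalid operator". One way to catch this shape before it reaches the primary is a client-side check for `$`-prefixed field names. The following is only a minimal sketch under stated assumptions: `find_dollar_keys` is a hypothetical helper, not part of any MongoDB driver, and it is meant to be run on documents and `_id` values to be stored, not on update-operator documents such as `{ $set: ... }`, where a leading `$` is legitimate.

```python
def find_dollar_keys(doc, path=""):
    """Return the dotted path of every key that starts with '$'.

    Hypothetical helper for illustration; walks nested dicts and lists.
    """
    hits = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            here = f"{path}.{key}" if path else str(key)
            if str(key).startswith("$"):
                hits.append(here)
            # Keep descending so nested offenders are reported too.
            hits.extend(find_dollar_keys(value, here))
    elif isinstance(doc, (list, tuple)):
        for index, item in enumerate(doc):
            here = f"{path}.{index}" if path else str(index)
            hits.extend(find_dollar_keys(item, here))
    return hits


# Same shape as the o2 selector in the reported oplog entry.
bad_selector = {"_id": {"$id": "4fe0050d217042a83c010000"}}
print(find_dollar_keys(bad_selector))
```

An application could refuse the write (or log a warning) whenever the returned list is non-empty, which would surface the malformed `_id` on the client instead of on the secondaries.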
| Comments |
| Comment by Shaun Verch [ 15/Nov/12 ] | |
|
Hi David, For more updates on this, you can follow I was able to reproduce it using the php driver given your description, so I'm posting that here: <?php $m = new Mongo("localhost:30001",array('replicaSet'=>'testreplset')); ?> Thanks! | |
| Comment by David Gubler [ 22/Oct/12 ] | |
|
Unfortunately no, sorry (logrotate took care of that...). But I had a close look at the logs when it happened, and I don't remember seeing anything out of the ordinary before that assert (it came "out of the blue"). | |
| Comment by Shaun Verch [ 18/Oct/12 ] | |
|
Hi David, Thanks for the update. Do you have any logs from right before the secondary triggered this assert?
| |
| Comment by David Gubler [ 18/Oct/12 ] | |
|
Never mind the segfault. It turns out that the server in question had a faulty disk that produced garbage. I guess that crash was due to corrupted data from the disk. Sorry for that. | |
| Comment by Shaun Verch [ 04/Oct/12 ] | |
|
Thank you for the bug report. We're looking into this issue, and will let you know if we need any additional information. | |
| Comment by David Gubler [ 25/Sep/12 ] | |
|
We have hit the same issue with 2.0.7. One of the affected secondaries says:
syncTail: 10068 invalid operator: $oid, syncing: { ts: Timestamp 1348485921000|30, h: -4162345707935058041, op: "i", ns: "doodle.pollCreatedLinkTracking", o: { _id: { $oid: "5059a6cf44aef65722ff7302" }, adminEmailLink: 0.0, copiedLink: 0.0, originalLink:
The others just stop replicating. My co-worker says: works: , doesn't work (it appears that the driver can execute it on the primary, but it will blow up the secondaries): ,
We hit this issue while fiddling around with Rockmongo (uses the PHP driver). This is especially annoying because I found no way to re-sync a secondary from the primary (instead of from another secondary). Luckily we create LVM snapshots on the primary, so (I hope) I can recover one secondary using our backup and later all other secondaries from that one...
Moreover, when I try to stop MongoDB on an affected secondary, it segfaults (but that could be an unrelated problem):
Tue Sep 25 13:39:14 [conn340605] end connection 188.92.145.82:47449
Tue Sep 25 13:39:24 Got signal: 11 (Segmentation fault).
Tue Sep 25 13:39:24 Backtrace: