[SERVER-9994] Slave hits Fatal Assertion 16361 during initial sync: objects in a capped ns cannot grow Created: 23/Jun/13  Updated: 10/Dec/14  Resolved: 23/Jul/13

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.4
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Nic Cottrell (Personal) Assignee: Eric Milkie
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

centos6


Attachments: Text File mongo-crash.txt     Text File mongo-crash2.log     File mongod.log.gz    
Issue Links:
Duplicate
duplicates SERVER-6984 Initial sync can fail, or break futur... Closed
Operating System: Linux
Steps To Reproduce:

Startup a clean mongo setup for replica set
Wait - see crash.

Participants:

 Description   

Slave crashes during init sync - after 24+ hours

Primary:

sprawk:PRIMARY> db.getReplicationInfo();
{
	"logSizeMB" : 28610.2294921875,
	"usedMB" : 537.86,
	"timeDiff" : 89633,
	"timeDiffHours" : 24.9,
	"tFirst" : "Sat Jun 22 2013 15:22:26 GMT+0200 (CEST)",
	"tLast" : "Sun Jun 23 2013 16:16:19 GMT+0200 (CEST)",
	"now" : "Sun Jun 23 2013 16:16:19 GMT+0200 (CEST

Servers are max 2 seconds out of sync....
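
For reference, the `timeDiffHours` figure in the output above is consistent with `timeDiff` (the seconds between `tFirst` and `tLast`) divided by 3600. A minimal sketch in plain JavaScript, using only the numbers shown above; the helper names `oplogWindowHours` and `oplogUsedPercent` are hypothetical and are not shell built-ins:

```javascript
// Values copied from the db.getReplicationInfo() output above.
const info = { logSizeMB: 28610.2294921875, usedMB: 537.86, timeDiff: 89633 };

// Hypothetical helper: convert the oplog time span from seconds to hours,
// rounded to two decimal places.
function oplogWindowHours(timeDiffSeconds) {
  return Math.round((timeDiffSeconds / 3600) * 100) / 100;
}

// Hypothetical helper: share of the allocated oplog currently in use,
// as a percentage rounded to two decimal places.
function oplogUsedPercent(usedMB, logSizeMB) {
  return Math.round((usedMB / logSizeMB) * 10000) / 100;
}

console.log(oplogWindowHours(info.timeDiff)); // 24.9
console.log(oplogUsedPercent(info.usedMB, info.logSizeMB)); // 1.88
```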

Config on server:
master=true
replSet=sprawk
keyFile = /var/lib/mongo/sprawk.key
oplogSize = 28610

Config on slave:
replSet=sprawk
keyFile = /var/lib/mongo/sprawk.key

I will now try again with the same oplogSize specified on the slave...



 Comments   
Comment by Eric Milkie [ 23/Jul/13 ]

Right, I'll resolve this as a duplicate of SERVER-6984 and we'll track further updates on the bug there.

Comment by Nic Cottrell (Personal) [ 23/Jul/13 ]

No, thanks. Switching from a capped collection to a normal collection with a TTL index has made it possible to add new nodes. Probably still a bug in the server code somewhere though

Comment by Eric Milkie [ 23/Jul/13 ]

Hi Nic. Do you need any further help with this issue?

Comment by Nic Cottrell (Personal) [ 27/Jun/13 ]

Ok - so the init has finished and the new node is in state SECONDARY. I'm trying to work out whether all data has been replicated since the MASTER has 155x2GB data files, whereas the new secondary has only 68!

Comment by Eric Milkie [ 26/Jun/13 ]

The TTL index should peacefully coexist with other indexes such as the one you mentioned.

Comment by Nic Cottrell (Personal) [ 26/Jun/13 ]

Ah - I think my monit setup was restarting Mongo. I have disabled it and will see what happens. I converted the big collection to non-capped, but have 2 other capped collections which may upset the init.

I'll definitely go add that TTL. Just regarding the constraints, if I add a TTL index on field "start" (which is a date) can I also have a compound index

{start:1, g:1, tl:1}

or is that not allowed?

Should have news in a couple of hours - that's about when it's due to crash...

Comment by Eric Milkie [ 26/Jun/13 ]

I'm not sure what's automatically restarting mongod after it exits; that's not part of a standard installation.

For initial sync, you have the steps correct. After step 4, we apply the oplog changes that have accrued during the secondary index builds, and then it transitions to SECONDARY state.

If you move all the current data to a noncapped collection, you can add a TTL index that will automatically delete old data for you: http://docs.mongodb.org/manual/tutorial/expire-data/
While the performance of a noncapped collection might be slower than a capped collection, it can be beneficial to control the expiration of old data yourself rather than let the capped collection do it based on the volume of data.
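
As a rough illustration of what a TTL index on `start` does, the sketch below models the expiry rule in plain JavaScript: a document becomes eligible for deletion once its `start` date is older than `expireAfterSeconds`. The 36-hour retention, the sample documents and the helper `isExpired` are assumptions for illustration only. In the server the deletion is done by a periodic background task, so removal is not instantaneous.

```javascript
// Assumed retention: 36 hours, expressed in seconds as a TTL index expects.
const EXPIRE_AFTER_SECONDS = 36 * 60 * 60;

// Hypothetical helper: true when a document's date field is older than the TTL window.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // non-date values never expire
  return now.getTime() - value.getTime() > expireAfterSeconds * 1000;
}

// Hypothetical sample documents; only the `start` field matters here.
const now = new Date("2013-06-26T12:00:00Z");
const docs = [
  { _id: 1, start: new Date("2013-06-24T12:00:00Z") }, // 48 hours old
  { _id: 2, start: new Date("2013-06-25T12:00:00Z") }, // 24 hours old
];

// Keep only the documents that are still inside the retention window.
const kept = docs.filter((d) => !isExpired(d, "start", EXPIRE_AFTER_SECONDS, now));
console.log(kept.map((d) => d._id)); // [ 2 ]
```

In the 2.4 shell the corresponding index would be created with something like `db.TranslationResult.ensureIndex({ start: 1 }, { expireAfterSeconds: 129600 })`. A TTL index has to be a single-field index, so a compound index that also starts with `start` would be a separate index alongside it.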

Comment by Nic Cottrell (Personal) [ 25/Jun/13 ]

Hi Eric! Thanks for the details.

Unfortunately, once it crashes, it seems to automatically restart itself and begin copying all the data across from scratch. The jerome5.0 file has a create timestamp of the time of the restart. Is there a way of stopping that auto restart so I can manually restart with the replSet value removed from the conf file?

As I said, the capped collection seems to be big enough that it doesn't fully flush before the data is copied.

Can you provide more detail on the init step? I understand it's like this:

1. Copies over all data
2. Builds _id index on all collections
3. Applies oplog changes
4. Rebuilds all other indexes...

I have the whole mongo folder on an LVM on the master, so I can try to do a snapshot, copy all the data files and restart on the slave. I estimate the copy will take 3-4 hours since the slave is in another country.

I'm also wondering whether I should try and convert the collection back to a normal one and manually remove all data with a cron job that just deletes data > 36 hours old... This seems like a safer and also more flexible way for me to handle these logs...

Comment by Eric Milkie [ 25/Jun/13 ]

So in the log, you had an initial sync start at 4:29 Monday morning. Until 16:41, it was cloning all the data (copying all the documents for each collection, sequentially). After that phase, it then started the first initial sync oplog phase at 16:41, where it immediately shut down. The operation that it was trying to apply was from 1371998402000, which is "Sun, 23 Jun 2013 14:40:02". My guess is that this operation is the first operation to be applied from the oplog after all the data has been cloned. Technically, the oplog application phase needs to start with the op one past where it is really starting; however, due to the idempotency of the oplog, this should not be a problem.
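
The idempotency argument can be illustrated with a toy model: if each oplog insert is applied as an upsert keyed on `_id`, replaying an operation that is already reflected in the cloned data leaves the collection unchanged. This is a simplified sketch in plain JavaScript, not the server's replication code; `applyInsert` and the sample documents are hypothetical.

```javascript
// Toy collection: a Map from _id to document.
// Hypothetical helper: apply an oplog insert as an upsert keyed on _id.
function applyInsert(collection, op) {
  collection.set(op.o._id, op.o); // overwrite any existing document with the same _id
  return collection;
}

// Hypothetical oplog with two insert operations.
const oplog = [
  { op: "i", o: { _id: "a", status: 2 } },
  { op: "i", o: { _id: "b", status: -1 } },
];

// State after cloning: document "a" was already copied by the clone phase.
const cloned = new Map([["a", { _id: "a", status: 2 }]]);

// Starting one op too early re-applies the insert of "a" on top of the clone.
const replayed = oplog.reduce(applyInsert, new Map(cloned));

console.log(replayed.size); // 2
console.log(JSON.stringify([...replayed.values()]));
```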

After the secondary shuts down, can you restart it in non-replset mode and then run stats() on TranslationResult there? I am curious to see if the number of extents and their sizes is the same (it really needs to be, for capped collections to work).

There are still issues with replicating capped collections, such that if you have cycled around in your capped collection and you try to replicate it, the cycle on the secondary will begin in a different place. This can result in issues like the one you are experiencing, and also things like SERVER-8972.
One workaround for this is to copy the datafiles to the secondary and start it up that way: http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/#replica-set-resync-by-copying
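
For context, the assertion text in the issue title, "objects in a capped ns cannot grow", reflects a constraint of capped collections in this server version: an update may not increase a document's size. A toy model of that rule in plain JavaScript, using JSON length as a stand-in for BSON size. `sizeOf` and `updateInCapped` are hypothetical, and this is not how the storage layer is implemented:

```javascript
// Stand-in for BSON size: length of the JSON encoding.
function sizeOf(doc) {
  return JSON.stringify(doc).length;
}

// Hypothetical update rule for a capped collection: same size or smaller only.
function updateInCapped(existing, replacement) {
  if (sizeOf(replacement) > sizeOf(existing)) {
    throw new Error("objects in a capped ns cannot grow");
  }
  return replacement;
}

const stored = { _id: 1, note: "short" };

// A same-size replacement is accepted.
console.log(updateInCapped(stored, { _id: 1, note: "other" }).note); // other

// A larger replacement is rejected.
let rejected = null;
try {
  updateInCapped(stored, { _id: 1, note: "a much longer note" });
} catch (e) {
  rejected = e.message;
}
console.log(rejected); // objects in a capped ns cannot grow
```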

Comment by Nic Cottrell (Personal) [ 25/Jun/13 ]

Btw, the oldest entry I see in this collection is:

{ "_id" : ObjectId( "51c63f09e4b060ede2e189a6" ),
  "prep" : "",
  "postp" : "",
  "sourceText" : "<html>\r\n<head>\r\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\r\n<style>\r\n<!--\r\n@import url(\"http://fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,600,700&subset=latin,latin-ext\");\r\n\r\nhtml, body, p, td, div, span {\r\n\tfont-family: 'Source Sans Pro', verdana, arial, sans-serif;\r\n}\r\nbody {\r\n\ttext-align: left;\r\n        line-height: 1.1em;\r\n}\r\nbody, p, td, th {\r\n    font-size: 14px;\r\n    background-color:#ffffff;\r\n}\r\np { \r\n  margin-top: 4px;\r\n  margin-bottom: 8px; \r\n}\r\ntd, th {\r\n  text-align: left;\r\n  vertical-align: top;\r\n}\r\nlabel {\r\n  font-weight: bold;\r\n}\r\ndiv.field {\r\n margin-top:4px;\r\n}\r\na:link { text-decoration: none;\t }\r\na:visited {\ttext-decoration: none; }\r\na:hover {\ttext-decoration: underline; }\r\na:active { text-decoration: none; }\r\np.warning {\r\n  text-transform: uppercase;\r\n  color: #666666;\r\n}\r\nh1, .style1 {\r\n\tfont-size: 16pt;\r\n\tcolor: #003366;\r\n\tmargin-bottom: 2px;\r\n  font-weight: normal;\r\n}\r\nh2, .style2 {\r\n\tfont-size: 14pt;\r\n        color: #000066;\r\n  font-weight: normal;\r\n}\r\nh3,.style3 {\r\n\tfont-size: 12pt;\r\n        color: #000;\r\n  font-weight: normal;\r\n}\r\nh4, .style4 {\r\n\tfont-size: small;\r\n\tcolor: #003399;\r\n  font-weight: normal;\r\n}\r\n.footer {\r\n  font-size: 0.9em;\r\n        color: #333333;\r\n}\r\n#footer {\r\n  margin-top: 16px;\r\n}\r\n\r\n.warning, warning, td.warning, .alert-warning {\r\n  color: #993300;\r\n  border: 1px solid goldenrod;\r\n  padding: 4px 4px 4px 24px;\r\n  background: lemonchiffon url( http://www.sprawk.com/images/warning_icon.gif ) no-repeat 4px 6px;\r\n  margin: 4px 0;\r\n}\r\n\r\n.warning a, .alert-warning a {\r\n  color: darkorange;\r\n  font-weight: bold;\r\n}\r\n\r\n.warning div.extraInfo, \r\n.alert-warning div.extraInfo {\r\n  margin: 2px 1px 4px;\r\n}\r\n\r\n.msg, .info, td.msg, td.info, .alert-info {\r\n  border: none;\r\n  padding: 
4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #BAD0D8 url( http://www.sprawk.com/images/mini/information.png ) no-repeat 4px 6px;\r\n}\r\n\r\n.debug, td.debug, td.debug, .alert-info {\r\n  border: none;\r\n  padding: 4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #BAD0D8 url( http://www.sprawk.com/images/mini/bug.png ) no-repeat 4px 6px;\r\n}\r\n\r\ndiv.error, td.error, span.error, .alert-error {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  background: #ff6666 url( http://www.sprawk.com/images/error_icon.gif ) no-repeat 4px 4px;\r\n  border: 1px dotted red;\r\n  color: white;\r\n  width: auto;\r\n}\r\n\r\n.alert-error a {\r\n  color: #800000;\r\n}\r\n\r\ntr.error, .alert-error {\r\n  background: #ff6666;\r\n}\r\n\r\ntr.error td.error {\r\n  background: transparent;\r\n  border: none;\r\n}\r\n\r\ninput.error {\r\n  border: 2px dotted #ff6666;\r\n  background: none;\r\n}\r\n\r\n.success, td.success, .alert-success {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  border: 1px dotted #32CD32;\r\n  background: #CCFFCC url( http://www.sprawk.com/images/success_icon.gif ) 4px 6px no-repeat;\r\n  width: auto;\r\n}\r\n\r\n.quote, .usermsg {\r\n  margin-left: 20px;\r\n  color: #000;\r\n  border-width:1px;\r\n}\r\n\r\n\r\ndiv.task {\r\n  background: transparent url(http://www.sprawk.com/images/task.gif) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n\r\ndiv.line {\r\n  background: transparent url(http://www.sprawk.com/images/mini/page_red.png) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n.extraInfo {\r\n  color: #545454;\r\n  padding: 0;\r\n  margin: 3px 1px 6px 1px;\r\n  font-size: 90%;\r\n}\r\ntable.data {\r\n    border: 1px gray;\r\n    padding: 1px;\r\n    color: #111111;\r\n    margin: 6px 0 6px 0;\r\n    border-collapse: 
separate;\r\n}\r\ntable.data.floater {\r\n    padding-right: 3px;\r\n    margin-right: 15px;\r\n    width: 45%;\r\n}\r\ntable.data tr {\r\n    padding-top: 1px;\r\n    padding-bottom: 1px;\r\n}\r\ntable.data tr:hover {\r\n    background-color: #F0FFF0;\r\n}\r\ntable.data tr td, table.data tr th {\r\n    border-width: 1px;\r\n    border-color: white;\r\n    border-style: solid;\r\n}\r\ntable.data tr td {\r\n    padding: 4px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #f1f0ff;\r\n    white-space: normal;\r\n    word-wrap: break-word;\r\n}\r\ntable.data tr th {\r\n    text-align: left;\r\n    vertical-align: top;\r\n    font-weight: normal;\r\n    padding: 2px 5px 2px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #e5e4fe;\r\n}\r\ntable.data tr.odd, table.data tr.odd td {\r\n    background-color: #f0f8ff;\r\n}\r\ntable.data tr.even, table.data tr.even td {\r\n    background-color: white;\r\n}\r\n\r\ntable.data tr.odd:hover, table.data tr.even:hover {\r\n    background-color: #ddd;\r\n}\r\n\r\ntable.data tr.number td {\r\n    text-align: right;\r\n}\r\n\r\ntable.data td.number {\r\n    text-align: right\r\n}\r\n\r\ntable.data tr td:hover {\r\n    background-color: #E6E6FA;\r\n}\r\n\r\ntable.data tr td div, table.data tr td div.error, table.data tr td div.success, table.data tr td div.warning, table.data tr td div.info {\r\n    margin: 0;\r\n}\r\n\r\ntable.spaced tr {\r\n    margin: 0 1px;\r\n}\r\n\r\ntable.spaced tr th {\r\n    padding-top: 2px;\r\n    margin-bottom: 2px;\r\n}\r\n\r\ntable.spaced tr td {\r\n    padding-right: 4px;\r\n    margin-right: 2px;\r\n}\r\n\r\ntd.amount {\r\n    text-align: right !important;\r\n}\r\n\r\ntr.total td {\r\n    font-style: italic;\r\n}\r\n\r\n-->\r\n</style>\r\n</head>\r\n<body bgcolor=\"#FFFFFF\" LEFTMARGIN=\"0\" MARGINHEIGHT=\"0\" MARGINWIDTH=\"0\" TOPMARGIN=\"0\">\r\n<br>\r\n<div>\r\n  <TABLE BORDER=\"0\" CELLSPACING=\"0\" CELLPADDING=\"0\" 
ALIGN=\"center\">\r\n   <TR>\r\n    <TD ALIGN=\"left\" VALIGN=\"top\" WIDTH=\"500\">\r\n<p><img style=\"border:0\" src=\"http://www.sprawk.com/images/sprawk-wl-243.png\" width=\"243\" height=\"63\" /></p>\r\n%%contents%%\r\n<div id=\"footer\" class=\"footer\" >\r\n<hr height=\"1\" />\r\n<p><a href=\"http://www.sprawk.com/\">sprawk.com</a> | Stockholm, Sweden | +46 70 885 9690 | <a href=\"mailto:info@sprawk.com\">info@sprawk.com</a></p>\r\n<p class=\"extraInfo\">If this email is redirected to your spam/junk folder, please add sprawk.com to your \"safe sender's list\"</p>\r\n</div>\r\n</TD>\r\n</TR>\r\n</TABLE>\r\n</div>\r\n\r\n<img src=\"http://www.sprawk.com/clickstats/read-%%msgid%%.png\" width=\"20\">\r\n</body>\r\n</html>",
  "sourceTextLc" : "<html>\r\n<head>\r\n<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\">\r\n<style>\r\n<!--\r\n@import url(\"http://fonts.googleapis.com/css?family=source+sans+pro:300,400,600,700&subset=latin,latin-ext\");\r\n\r\nhtml, body, p, td, div, span {\r\n\tfont-family: 'source sans pro', verdana, arial, sans-serif;\r\n}\r\nbody {\r\n\ttext-align: left;\r\n        line-height: 1.1em;\r\n}\r\nbody, p, td, th {\r\n    font-size: 14px;\r\n    background-color:#ffffff;\r\n}\r\np { \r\n  margin-top: 4px;\r\n  margin-bottom: 8px; \r\n}\r\ntd, th {\r\n  text-align: left;\r\n  vertical-align: top;\r\n}\r\nlabel {\r\n  font-weight: bold;\r\n}\r\ndiv.field {\r\n margin-top:4px;\r\n}\r\na:link { text-decoration: none;\t }\r\na:visited {\ttext-decoration: none; }\r\na:hover {\ttext-decoration: underline; }\r\na:active { text-decoration: none; }\r\np.warning {\r\n  text-transform: uppercase;\r\n  color: #666666;\r\n}\r\nh1, .style1 {\r\n\tfont-size: 16pt;\r\n\tcolor: #003366;\r\n\tmargin-bottom: 2px;\r\n  font-weight: normal;\r\n}\r\nh2, .style2 {\r\n\tfont-size: 14pt;\r\n        color: #000066;\r\n  font-weight: normal;\r\n}\r\nh3,.style3 {\r\n\tfont-size: 12pt;\r\n        color: #000;\r\n  font-weight: normal;\r\n}\r\nh4, .style4 {\r\n\tfont-size: small;\r\n\tcolor: #003399;\r\n  font-weight: normal;\r\n}\r\n.footer {\r\n  font-size: 0.9em;\r\n        color: #333333;\r\n}\r\n#footer {\r\n  margin-top: 16px;\r\n}\r\n\r\n.warning, warning, td.warning, .alert-warning {\r\n  color: #993300;\r\n  border: 1px solid goldenrod;\r\n  padding: 4px 4px 4px 24px;\r\n  background: lemonchiffon url( http://www.sprawk.com/images/warning_icon.gif ) no-repeat 4px 6px;\r\n  margin: 4px 0;\r\n}\r\n\r\n.warning a, .alert-warning a {\r\n  color: darkorange;\r\n  font-weight: bold;\r\n}\r\n\r\n.warning div.extrainfo, \r\n.alert-warning div.extrainfo {\r\n  margin: 2px 1px 4px;\r\n}\r\n\r\n.msg, .info, td.msg, td.info, .alert-info {\r\n  border: none;\r\n  padding: 
4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #bad0d8 url( http://www.sprawk.com/images/mini/information.png ) no-repeat 4px 6px;\r\n}\r\n\r\n.debug, td.debug, td.debug, .alert-info {\r\n  border: none;\r\n  padding: 4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #bad0d8 url( http://www.sprawk.com/images/mini/bug.png ) no-repeat 4px 6px;\r\n}\r\n\r\ndiv.error, td.error, span.error, .alert-error {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  background: #ff6666 url( http://www.sprawk.com/images/error_icon.gif ) no-repeat 4px 4px;\r\n  border: 1px dotted red;\r\n  color: white;\r\n  width: auto;\r\n}\r\n\r\n.alert-error a {\r\n  color: #800000;\r\n}\r\n\r\ntr.error, .alert-error {\r\n  background: #ff6666;\r\n}\r\n\r\ntr.error td.error {\r\n  background: transparent;\r\n  border: none;\r\n}\r\n\r\ninput.error {\r\n  border: 2px dotted #ff6666;\r\n  background: none;\r\n}\r\n\r\n.success, td.success, .alert-success {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  border: 1px dotted #32cd32;\r\n  background: #ccffcc url( http://www.sprawk.com/images/success_icon.gif ) 4px 6px no-repeat;\r\n  width: auto;\r\n}\r\n\r\n.quote, .usermsg {\r\n  margin-left: 20px;\r\n  color: #000;\r\n  border-width:1px;\r\n}\r\n\r\n\r\ndiv.task {\r\n  background: transparent url(http://www.sprawk.com/images/task.gif) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n\r\ndiv.line {\r\n  background: transparent url(http://www.sprawk.com/images/mini/page_red.png) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n.extrainfo {\r\n  color: #545454;\r\n  padding: 0;\r\n  margin: 3px 1px 6px 1px;\r\n  font-size: 90%;\r\n}\r\ntable.data {\r\n    border: 1px gray;\r\n    padding: 1px;\r\n    color: #111111;\r\n    margin: 6px 0 6px 0;\r\n    border-collapse: 
separate;\r\n}\r\ntable.data.floater {\r\n    padding-right: 3px;\r\n    margin-right: 15px;\r\n    width: 45%;\r\n}\r\ntable.data tr {\r\n    padding-top: 1px;\r\n    padding-bottom: 1px;\r\n}\r\ntable.data tr:hover {\r\n    background-color: #f0fff0;\r\n}\r\ntable.data tr td, table.data tr th {\r\n    border-width: 1px;\r\n    border-color: white;\r\n    border-style: solid;\r\n}\r\ntable.data tr td {\r\n    padding: 4px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #f1f0ff;\r\n    white-space: normal;\r\n    word-wrap: break-word;\r\n}\r\ntable.data tr th {\r\n    text-align: left;\r\n    vertical-align: top;\r\n    font-weight: normal;\r\n    padding: 2px 5px 2px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #e5e4fe;\r\n}\r\ntable.data tr.odd, table.data tr.odd td {\r\n    background-color: #f0f8ff;\r\n}\r\ntable.data tr.even, table.data tr.even td {\r\n    background-color: white;\r\n}\r\n\r\ntable.data tr.odd:hover, table.data tr.even:hover {\r\n    background-color: #ddd;\r\n}\r\n\r\ntable.data tr.number td {\r\n    text-align: right;\r\n}\r\n\r\ntable.data td.number {\r\n    text-align: right\r\n}\r\n\r\ntable.data tr td:hover {\r\n    background-color: #e6e6fa;\r\n}\r\n\r\ntable.data tr td div, table.data tr td div.error, table.data tr td div.success, table.data tr td div.warning, table.data tr td div.info {\r\n    margin: 0;\r\n}\r\n\r\ntable.spaced tr {\r\n    margin: 0 1px;\r\n}\r\n\r\ntable.spaced tr th {\r\n    padding-top: 2px;\r\n    margin-bottom: 2px;\r\n}\r\n\r\ntable.spaced tr td {\r\n    padding-right: 4px;\r\n    margin-right: 2px;\r\n}\r\n\r\ntd.amount {\r\n    text-align: right !important;\r\n}\r\n\r\ntr.total td {\r\n    font-style: italic;\r\n}\r\n\r\n-->\r\n</style>\r\n</head>\r\n<body bgcolor=\"#ffffff\" leftmargin=\"0\" marginheight=\"0\" marginwidth=\"0\" topmargin=\"0\">\r\n<br>\r\n<div>\r\n  <table border=\"0\" cellspacing=\"0\" cellpadding=\"0\" 
align=\"center\">\r\n   <tr>\r\n    <td align=\"left\" valign=\"top\" width=\"500\">\r\n<p><img style=\"border:0\" src=\"http://www.sprawk.com/images/sprawk-wl-243.png\" width=\"243\" height=\"63\" /></p>\r\n%%contents%%\r\n<div id=\"footer\" class=\"footer\" >\r\n<hr height=\"1\" />\r\n<p><a href=\"http://www.sprawk.com/\">sprawk.com</a> | stockholm, sweden | +46 70 885 9690 | <a href=\"mailto:info@sprawk.com\">info@sprawk.com</a></p>\r\n<p class=\"extrainfo\">if this email is redirected to your spam/junk folder, please add sprawk.com to your \"safe sender's list\"</p>\r\n</div>\r\n</td>\r\n</tr>\r\n</table>\r\n</div>\r\n\r\n<img src=\"http://www.sprawk.com/clickstats/read-%%msgid%%.png\" width=\"20\">\r\n</body>\r\n</html>",
  "sth" : 3139412265070807266,
  "sth2" : 0,
  "stlen" : 6054,
  "sl" : "eng",
  "destText" : "<html>\r\n<head>\r\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\r\n<style>\r\n<!--\r\n@import url(\"http://fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,600,700&subset=latin,latin-ext\");\r\n\r\nhtml, body, p, td, div, span {\r\n\tfont-family: 'Source Sans Pro', verdana, arial, sans-serif;\r\n}\r\nbody {\r\n\ttext-align: left;\r\n        line-height: 1.1em;\r\n}\r\nbody, p, td, th {\r\n    font-size: 14px;\r\n    background-color:#ffffff;\r\n}\r\np { \r\n  margin-top: 4px;\r\n  margin-bottom: 8px; \r\n}\r\ntd, th {\r\n  text-align: left;\r\n  vertical-align: top;\r\n}\r\nlabel {\r\n  font-weight: bold;\r\n}\r\ndiv.field {\r\n margin-top:4px;\r\n}\r\na:link { text-decoration: none;\t }\r\na:visited {\ttext-decoration: none; }\r\na:hover {\ttext-decoration: underline; }\r\na:active { text-decoration: none; }\r\np.warning {\r\n  text-transform: uppercase;\r\n  color: #666666;\r\n}\r\nh1, .style1 {\r\n\tfont-size: 16pt;\r\n\tcolor: #003366;\r\n\tmargin-bottom: 2px;\r\n  font-weight: normal;\r\n}\r\nh2, .style2 {\r\n\tfont-size: 14pt;\r\n        color: #000066;\r\n  font-weight: normal;\r\n}\r\nh3,.style3 {\r\n\tfont-size: 12pt;\r\n        color: #000;\r\n  font-weight: normal;\r\n}\r\nh4, .style4 {\r\n\tfont-size: small;\r\n\tcolor: #003399;\r\n  font-weight: normal;\r\n}\r\n.footer {\r\n  font-size: 0.9em;\r\n        color: #333333;\r\n}\r\n#footer {\r\n  margin-top: 16px;\r\n}\r\n\r\n.warning, warning, td.warning, .alert-warning {\r\n  color: #993300;\r\n  border: 1px solid goldenrod;\r\n  padding: 4px 4px 4px 24px;\r\n  background: lemonchiffon url( http://www.sprawk.com/images/warning_icon.gif ) no-repeat 4px 6px;\r\n  margin: 4px 0;\r\n}\r\n\r\n.warning a, .alert-warning a {\r\n  color: darkorange;\r\n  font-weight: bold;\r\n}\r\n\r\n.warning div.extraInfo, \r\n.alert-warning div.extraInfo {\r\n  margin: 2px 1px 4px;\r\n}\r\n\r\n.msg, .info, td.msg, td.info, .alert-info {\r\n  border: none;\r\n  padding: 
4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #BAD0D8 url( http://www.sprawk.com/images/mini/information.png ) no-repeat 4px 6px;\r\n}\r\n\r\n.debug, td.debug, td.debug, .alert-info {\r\n  border: none;\r\n  padding: 4px;\r\n  padding-left: 24px;\r\n  margin-bottom: 6px;\r\n  background: #BAD0D8 url( http://www.sprawk.com/images/mini/bug.png ) no-repeat 4px 6px;\r\n}\r\n\r\ndiv.error, td.error, span.error, .alert-error {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  background: #ff6666 url( http://www.sprawk.com/images/error_icon.gif ) no-repeat 4px 4px;\r\n  border: 1px dotted red;\r\n  color: white;\r\n  width: auto;\r\n}\r\n\r\n.alert-error a {\r\n  color: #800000;\r\n}\r\n\r\ntr.error, .alert-error {\r\n  background: #ff6666;\r\n}\r\n\r\ntr.error td.error {\r\n  background: transparent;\r\n  border: none;\r\n}\r\n\r\ninput.error {\r\n  border: 2px dotted #ff6666;\r\n  background: none;\r\n}\r\n\r\n.success, td.success, .alert-success {\r\n  margin: 8px 0 8px;\r\n  padding: 4px 4px 4px 24px;\r\n  border: 1px dotted #32CD32;\r\n  background: #CCFFCC url( http://www.sprawk.com/images/success_icon.gif ) 4px 6px no-repeat;\r\n  width: auto;\r\n}\r\n\r\n.quote, .usermsg {\r\n  margin-left: 20px;\r\n  color: #000;\r\n  border-width:1px;\r\n}\r\n\r\n\r\ndiv.task {\r\n  background: transparent url(http://www.sprawk.com/images/task.gif) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n\r\ndiv.line {\r\n  background: transparent url(http://www.sprawk.com/images/mini/page_red.png) 2px 2px no-repeat;\r\n  padding-left: 22px;\r\n  padding-top: 1px;\r\n  padding-bottom: 1px;\r\n  margin-right: 2px;\r\n}\r\n.extraInfo {\r\n  color: #545454;\r\n  padding: 0;\r\n  margin: 3px 1px 6px 1px;\r\n  font-size: 90%;\r\n}\r\ntable.data {\r\n    border: 1px gray;\r\n    padding: 1px;\r\n    color: #111111;\r\n    margin: 6px 0 6px 0;\r\n    border-collapse: 
separate;\r\n}\r\ntable.data.floater {\r\n    padding-right: 3px;\r\n    margin-right: 15px;\r\n    width: 45%;\r\n}\r\ntable.data tr {\r\n    padding-top: 1px;\r\n    padding-bottom: 1px;\r\n}\r\ntable.data tr:hover {\r\n    background-color: #F0FFF0;\r\n}\r\ntable.data tr td, table.data tr th {\r\n    border-width: 1px;\r\n    border-color: white;\r\n    border-style: solid;\r\n}\r\ntable.data tr td {\r\n    padding: 4px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #f1f0ff;\r\n    white-space: normal;\r\n    word-wrap: break-word;\r\n}\r\ntable.data tr th {\r\n    text-align: left;\r\n    vertical-align: top;\r\n    font-weight: normal;\r\n    padding: 2px 5px 2px 8px;\r\n    text-align: left;\r\n    vertical-align: top;\r\n    background-color: #e5e4fe;\r\n}\r\ntable.data tr.odd, table.data tr.odd td {\r\n    background-color: #f0f8ff;\r\n}\r\ntable.data tr.even, table.data tr.even td {\r\n    background-color: white;\r\n}\r\n\r\ntable.data tr.odd:hover, table.data tr.even:hover {\r\n    background-color: #ddd;\r\n}\r\n\r\ntable.data tr.number td {\r\n    text-align: right;\r\n}\r\n\r\ntable.data td.number {\r\n    text-align: right\r\n}\r\n\r\ntable.data tr td:hover {\r\n    background-color: #E6E6FA;\r\n}\r\n\r\ntable.data tr td div, table.data tr td div.error, table.data tr td div.success, table.data tr td div.warning, table.data tr td div.info {\r\n    margin: 0;\r\n}\r\n\r\ntable.spaced tr {\r\n    margin: 0 1px;\r\n}\r\n\r\ntable.spaced tr th {\r\n    padding-top: 2px;\r\n    margin-bottom: 2px;\r\n}\r\n\r\ntable.spaced tr td {\r\n    padding-right: 4px;\r\n    margin-right: 2px;\r\n}\r\n\r\ntd.amount {\r\n    text-align: right !important;\r\n}\r\n\r\ntr.total td {\r\n    font-style: italic;\r\n}\r\n\r\n-->\r\n</style>\r\n</head>\r\n<body bgcolor=\"#FFFFFF\" LEFTMARGIN=\"0\" MARGINHEIGHT=\"0\" MARGINWIDTH=\"0\" TOPMARGIN=\"0\">\r\n<br>\r\n<div>\r\n  <TABLE BORDER=\"0\" CELLSPACING=\"0\" CELLPADDING=\"0\" 
ALIGN=\"center\">\r\n   <TR>\r\n    <TD ALIGN=\"left\" VALIGN=\"top\" WIDTH=\"500\">\r\n<p><img style=\"border:0\" src=\"http://www.sprawk.com/images/sprawk-wl-243.png\" width=\"243\" height=\"63\" /></p>\r\n%%contents%%\r\n<div id=\"footer\" class=\"footer\" >\r\n<hr height=\"1\" />\r\n<p><a href=\"http://www.sprawk.com/\">sprawk.com</a> | Stockholm, Sweden | +46 70 885 9690 | <a href=\"mailto:info@sprawk.com\">info@sprawk.com</a></p>\r\n<p class=\"extraInfo\">If this email is redirected to your spam/junk folder, please add sprawk.com to your \"safe sender's list\"</p>\r\n</div>\r\n</TD>\r\n</TR>\r\n</TABLE>\r\n</div>\r\n\r\n<img src=\"http://www.sprawk.com/clickstats/read-%%msgid%%.png\" width=\"20\">\r\n</body>\r\n</html>",
  "dl" : "eng",
  "sr" : 0.8999999761581421,
  "tt" : 3,
  "b" : false,
  "mime" : "text/html",
  "start" : Date( 1371946761615 ),
  "g" : ObjectId( "4defe4887112ede06261f72f" ),
  "translator" : "transmachina.web.Msg$1",
  "suhosth" : -2147483648,
  "subResultsSize" : 0,
  "fromCache" : false,
  "et" : 0,
  "status" : 2,
  "mb" : "MT",
  "note" : "Same lang",
  "server" : "alpha",
  "same" : true,
  "cacheSuffix" : 5 }

So the capped collection holds about 2.5 days of logs, but the init seems to die after about 24 hours. So it makes sense that it would be inserting objects initially inserted on June 23.

Comment by Nic Cottrell (Personal) [ 25/Jun/13 ]

Btw, the TranslationResult on the master looks like:

> db.TranslationResult.stats()
{
	"ns" : "jerome5.TranslationResult",
	"count" : 549212,
	"size" : 991211364,
	"avgObjSize" : 1804.7882493463362,
	"storageSize" : 1000001536,
	"numExtents" : 1,
	"nindexes" : 11,
	"lastExtentSize" : 1000001536,
	"paddingFactor" : 1,
	"systemFlags" : 1,
	"userFlags" : 0,
	"totalIndexSize" : 476799792,
	"indexSizes" : {
		"_id_" : 22304128,
		"g_1_status_1_subResultsSize_1_b_1_start_1_dl_1" : 62538224,
		"g_1_fromCache_1_same_1_b_1_subResultsSize_1_status_1" : 51476096,
		"start_-1_status_1_g_1" : 50830192,
		"start_1_g_1_sl_1_dl_1" : 37208976,
		"g_1_sl_1_sth_1" : 51042768,
		"g_1_sth_1_suh_1" : 50658496,
		"p_1" : 21028672,
		"suh_1" : 31984512,
		"suhost_1_sup_1_start_-1" : 69577760,
		"r_1_start_-1" : 28149968
	},
	"capped" : true,
	"max" : NumberLong("9223372036854775807"),
	"ok" : 1
}

Comment by Nic Cottrell (Personal) [ 25/Jun/13 ]

This is the mongo log going back to June 16 I believe...

Comment by Eric Milkie [ 25/Jun/13 ]

I don't think a database named "system" should cause any harm. There are system collections within every database that begin with the name "system.", as you mentioned.

I looked at your two logs with the failures. They both have failed when trying to replicate an insert operation that looks exactly like this:

{ ts: Timestamp 1371998402000|9, h: 6117456418327665188, v: 2, op: "i", ns: "jerome5.TranslationResult", o: { _id: ObjectId('51c708c2e4b00a8dfc3e931c'), prep: "", postp: "", sourceText: "Sign up as a Sprawk customer or freelancer at <a href="http://www.sprawk.com/en/signupChoose">http://www.sprawk.com/en/signupChoose</a>", sourceTextLc: "sign up as a sprawk customer or freelancer at <a href="http://www.sprawk.com/en/signupchoose">http://www.sprawk.com/en/signupchoose</a>", sth: -3109906834912301281, sth2: 0, stlen: 135, sl: "eng", destText: "Sign up as a Sprawk customer or freelancer at <a href="http://www.sprawk.com/en/signupChoose">http://www.sprawk.com/en/signupChoose</a>", dl: "nld", sr: 0.0, tt: 1, b: false, mime: "text/html", start: new Date(1371998402694), g: ObjectId('4defe4887112ede06261f72f'), translator: "transmachina.jerome.translate.HtmlTranslator", sourceUrl: "http://www.transmachina.com/node/96", suhost: "www.transmachina.com", suhosth: -680836827, sup: "/node/96", suh: 7017488501028237330, subResultsSize: 0, fromCache: false, et: 0, status: -1, mb: "MT", note: "translateViaBestExample failed
Failed to translate in MemoryTranslate.translateSentence", server: "alpha", same: false, cacheSuffix: 5 } }

This makes no sense to me yet, because your attempts were made on different days, and yet both attempts are reading the same operation in the oplog. The timestamp of the operation is Sun, 23 Jun 2013 14:40:02, so the question becomes why did this operation get applied by the initial sync you ran on Monday June 24? Can you post the full log of the initial sync from Monday?
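
The date quoted for the operation can be checked against the `ts` field of the oplog entry above. As printed in this log format, its first component (1371998402000) reads as milliseconds since the Unix epoch. A quick check in plain JavaScript:

```javascript
// First component of `ts` from the oplog entry quoted above, in milliseconds.
const tsMillis = 1371998402000;

// Render in UTC so the result does not depend on the local time zone.
const when = new Date(tsMillis).toUTCString();

console.log(when); // Sun, 23 Jun 2013 14:40:02 GMT
```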

Comment by Nic Cottrell (Personal) [ 25/Jun/13 ]

Could the database called "system" also cause some weird problems? I know it's a reserved word for collection names, but maybe there is a weird bug too. I've removed those old databases and am running the init again now... I checked the code and added an assert, but so far found nothing which updates any objects in that TranslationResult capped collection... Any other ideas?

Comment by Nic Cottrell (Personal) [ 24/Jun/13 ]

I don't think that I am, but I will check the code again. The * database must be very old - from some early tests. Only 'system' and 'config' are internal databases. I can remove 'upload'?

Comment by Eric Milkie [ 24/Jun/13 ]

I think this is a manifestation of SERVER-6984.
Do you know if you are modifying documents in-place in the capped collection?

Also, I noticed you have a database named "*", was this intentional?

Comment by Nic Cottrell (Personal) [ 24/Jun/13 ]

Exactly the same again - it seems to happen before final indexing is done, since only about half of the 2G data files exist on the new node as compared to the master.

Comment by Nic Cottrell (Personal) [ 24/Jun/13 ]

Unfortunately the same/similar crash happened again overnight. TranslationResult is a capped collection which has about a 24-hour lifetime, whereas the init takes slightly longer. So I guess no objects from the original copy of that collection remain once the oplog is applied... Is there a way to exclude that collection from the replica set somehow? Or some other workaround?

Comment by Nic Cottrell (Personal) [ 23/Jun/13 ]

I've now set up ntpd correctly on the new node - so times should now be in sync. Will let you know if any errors occur this time around.

Generated at Thu Feb 08 03:21:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.