[SERVER-16821] Do not abort server when receiving multiple SIGUSR1 in the same second Created: 13/Jan/15 Updated: 09/Apr/17 Resolved: 25/Jan/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Logging |
| Affects Version/s: | 2.6.4 |
| Fix Version/s: | 3.4.4, 3.5.2 |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Volans | Assignee: | Gabriel Russell (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v3.4 |
| Sprint: | Platforms 2017-01-23, Platforms 2017-02-13 |
| Participants: | |
| Description |
|
Server crashes if the logRotate destination file already exists when it receives a SIGUSR1.
Affected version:
2015-01-13T13:54:10.021+0000 [signalProcessingThread] db version v2.6.4
The reported error is:
2015-01-13T13:54:10.766+0000 [signalProcessingThread] warning: Rotating log file /data/log/mongodb/mongodb.log failed: FileRenameFailed Renaming file /data/log/mongodb/mongodb.log to /data/log/mongodb/mongodb.log.2015-01-13T13-54-10 failed; destination already exists
***aborting after fassert() failure
2015-01-13T13:54:10.772+0000 [signalProcessingThread] SEVERE: Got signal: 6 (Aborted). |
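For illustration only, here is a minimal Python sketch of why two rotations within the same second collide. It is not the server's code: `rotated_name`, `rotate`, and `demo` are hypothetical helpers, and the name format is only assumed from the rotated file name shown in the log above (`mongodb.log.2015-01-13T13-54-10`).

```python
import os
import tempfile
from datetime import datetime, timezone


def rotated_name(log_path, when):
    """Build a rotation target with one-second granularity (assumed format)."""
    return f"{log_path}.{when.strftime('%Y-%m-%dT%H-%M-%S')}"


def rotate(log_path, when):
    """Rename the log to its timestamped target, refusing to overwrite."""
    target = rotated_name(log_path, when)
    if os.path.exists(target):
        # Hypothetical stand-in for the FileRenameFailed condition in the report.
        raise FileExistsError(f"{target}: destination already exists")
    os.rename(log_path, target)
    open(log_path, "w").close()  # start a fresh, empty log file
    return target


def demo():
    """Rotate twice with timestamps that fall in the same second."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "mongodb.log")
        open(log_path, "w").close()
        # Illustrative timestamps: same second, different milliseconds.
        first = datetime(2015, 1, 13, 13, 54, 10, 21000, tzinfo=timezone.utc)
        second = datetime(2015, 1, 13, 13, 54, 10, 766000, tzinfo=timezone.utc)
        rotate(log_path, first)
        try:
            rotate(log_path, second)
        except FileExistsError:
            return "collision"
        return "ok"
```

In this sketch both timestamps format to the same target name, so the second rename finds the destination already present. Whether to treat that as fatal, skip the rotation, or pick a different name is a design choice the sketch does not settle.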
| Comments |
| Comment by Githook User [ 09/Apr/17 ] | |
|
Author: Gabriel Russell (gabrielrussell, gabriel.russell@mongodb.com) Message: (cherry picked from commit 3cef6afea83b252613be458a0e0bf94ecea28f96) | |
| Comment by Githook User [ 25/Jan/17 ] | |
|
Author: Gabriel Russell (gabrielrussell, gabriel.russell@mongodb.com) Message: | |
| Comment by Volans [ 24/Feb/15 ] | |
|
nicolas_ this explains why it happens to you. Your config matches multiple files, and logrotate runs the postrotate block once for each file unless the sharedscripts option is set (see http://linuxcommand.org/man_pages/logrotate8.html ). If you have multiple MongoDB processes, how do you send the SIGUSR1 to the right process? My 2 cents | |
| Comment by Nicolas Fouché [ 24/Feb/15 ] | |
|
volans I don't have the "sharedscripts" option. I did not notice it in https://jira.mongodb.org/browse/SERVER-14053?focusedCommentId=596867&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-596867. Should I add it? As for rotating multiple files, the config does match multiple files, but only because we have several mongod processes running. I admit that we should create one logrotate config per mongod.
| |
| Comment by Volans [ 16/Feb/15 ] | |
|
nicolas_ did you check if your logrotate config could match multiple files? If so, is the "sharedscripts" option set to run the postrotate only once? | |
| Comment by Nicolas Fouché [ 16/Feb/15 ] | |
|
volans I checked that I don't send SIGUSR1 twice; for example I don't have a mongo log rotation configured daily and hourly. I had the exact same error message. | |
| Comment by Volans [ 11/Feb/15 ] | |
|
nicolas_ In addition, are you getting exactly the same stack trace? "failed; destination already exists" | |
| Comment by Volans [ 11/Feb/15 ] | |
|
nicolas_ have you checked whether your logrotate configuration could, by any chance, send the SIGUSR1 twice in the same second? | |
| Comment by Nicolas Fouché [ 11/Feb/15 ] | |
|
I updated from 2.4.9 to 2.6.7. And out of 6 mongod processes, 2 of them had this Fatal Assert on log rotation at midnight. I used SIGUSR1, and have been using it for years. I was extremely lucky that it did not crash my whole cluster. To fix the problem, I also tried all suggestions, and only the one suggested by Volans worked:
Anyway, I understand it will be fixed in 3.0 by | |
| Comment by Volans [ 13/Jan/15 ] | |
|
ramon.fernandez thanks for your multiple replies. To answer your comments:
| |
| Comment by Ramon Fernandez Marina [ 13/Jan/15 ] | |
|
volans, I should add that if you're using logrotate you may want to check this comment by one of our techops engineers. If you search for how to use logrotate and MongoDB you may also run into this stack overflow thread, which was addressed by the same MongoDB engineer. | |
| Comment by Ramon Fernandez Marina [ 13/Jan/15 ] | |
|
volans, as I mentioned before, this behavior introduced One could also argue that the rotated file name could include a more granular timestamp to prevent this issue, but also that rotating a log file twice in a second is probably an error in log rotation procedures. Can you elaborate on how you arrived at this scenario, so we can see whether there are any further recommendations we can incorporate into our documentation? In | |
| Comment by Ramon Fernandez Marina [ 13/Jan/15 ] | |
|
Thanks for your report volans. This behavior was introduced in |