[SERVER-29230] Journal files accumulating on ReplicaSet Secondary Created: 16/May/17 Updated: 26/Feb/18 Resolved: 26/Feb/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.4.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Alberto Ornaghi | Assignee: | Susan LoVerso |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
We have a replica set composed of a primary, a secondary and an arbiter. We deployed it with the automation agent from MongoDB Cloud Manager. Currently there are 3926 journal files in the journal directory (see attachment). |
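A quick way to quantify the buildup on the secondary (a minimal sketch; /var/lib/mongodb is an assumed dbpath, adjust for your deployment):

```
# Count the accumulated WiredTiger journal files and report their total size.
ls /var/lib/mongodb/journal | wc -l
du -sh /var/lib/mongodb/journal
```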
| Comments |
| Comment by Kelsey Schubert [ 26/Feb/18 ] | |
|
Reopening to change the resolution, to clarify that this work was completed in |
| Comment by Alberto Ornaghi [ 21/May/17 ] | |
|
Just for reference (if needed by someone else in the future): running "echo 2dd1ce17-079e-403c-b352-a1921ee207ee > /sys/bus/vmbus/drivers/hv_util/unbind" will resolve the "Time has been changed" problem. |
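A sketch of applying and checking the workaround (the GUID and driver path are taken from the command above; the exact sysfs layout may vary by kernel, and the unbind does not persist across reboots):

```
# Devices currently bound to the Hyper-V utility driver.
ls /sys/bus/vmbus/drivers/hv_util/

# Unbind the time-sync device so the host stops stepping the guest clock.
echo 2dd1ce17-079e-403c-b352-a1921ee207ee | sudo tee /sys/bus/vmbus/drivers/hv_util/unbind

# The "Time has been changed" messages should stop appearing afterwards.
grep -c "Time has been changed" /var/log/syslog
```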
| Comment by Alberto Ornaghi [ 21/May/17 ] | |
|
It took 4.5 hours to recover, but now it seems OK (210 MB of journal). It's rebuilding the index in the foreground so I have to wait some more time... The timesyncd daemon was already stopped, but the "Time has been changed" message appears even in that situation. Unfortunately I can only wait for a kernel patch for this... the perfect storm. |
| Comment by Susan LoVerso [ 21/May/17 ] | |
|
There is a WiredTiger verbose option that will give progress for recovery. You can turn it on by adding this:
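(As a hedged illustration of the kind of setting meant here, assuming the recovery_progress verbose category exists in the WiredTiger build shipped with 3.4; the config-file equivalent is storage.wiredTiger.engineConfig.configString:)

```
# Sketch: start mongod with WiredTiger recovery-progress logging enabled.
mongod --config /etc/mongod.conf \
       --wiredTigerEngineConfigString="verbose=[recovery_progress]"
```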
However, the checkpoint after recovery is where your issue, and the potential hang, lies. Can you turn off the time daemon or change its settings so that it updates much less frequently, particularly during this recovery process? |
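A sketch of both options on Ubuntu 16.04 with systemd-timesyncd (the interval value is only an example):

```
# Option 1: stop the time daemon entirely for the duration of the recovery.
sudo timedatectl set-ntp false            # disables systemd-timesyncd

# Option 2: keep it running but poll far less often: in /etc/systemd/timesyncd.conf
# set PollIntervalMaxSec=2048 (example value), then restart the daemon.
sudo systemctl restart systemd-timesyncd
```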
| Comment by Alberto Ornaghi [ 20/May/17 ] | |
|
Last week I tried it and it seemed stuck on recovering the journal files... 200 GB of journal to recover is a lot... how can I check the progress of the recovery process? systemLog: |
| Comment by Susan LoVerso [ 20/May/17 ] | |
|
For the moment, if you restart the node it will replay the journal and then perform a checkpoint, and that will allow it to remove the old journal files. The situation you have is that the code assumes time goes forward, and the system is stuck in a checkpoint in a very long sleep because it computed a time difference when time went backward. Assuming the checkpoint that happens after the restart and replay of the journal completes, it will then be able to remove the files. I suggest doing this one node at a time. |
| Comment by Alberto Ornaghi [ 20/May/17 ] | |
|
I created an index on the primary and then it replicated to the secondary. Is it safe to kill one of them and let it replay the journal? Is there some sort of debug output in the log when it replays the journal on startup, to check whether it's stuck or not? |
| Comment by Susan LoVerso [ 18/May/17 ] | |
|
Work in WiredTiger relating to time structures and time going backward will be done in | |
| Comment by Alberto Ornaghi [ 17/May/17 ] | |
|
Yes, it's full of those messages: "May 17 14:37:05 db-rs0-2 systemd[4467]: Time has been changed". Unfortunately the comment in the Ask Ubuntu thread you linked (the one about Azure) is not reassuring... thanks again |
| Comment by Susan LoVerso [ 17/May/17 ] | |
|
If you check /var/log/syslog you may see messages that say something like "Time has been changed" (or if you want to upload that file I will look at it here). You might also find this page helpful: https://askubuntu.com/questions/888493/var-log-syslog-systemd1-time-has-been-changed-message-every-5-seconds ETA: If you see those messages, resolving those time change messages should resolve the other problems you're seeing. Please let me know. | |
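For example, a trivial check (path as above):

```
# Count and show the most recent occurrences of the time-change messages.
grep -c "Time has been changed" /var/log/syslog
grep "Time has been changed" /var/log/syslog | tail -n 5
```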
| Comment by Susan LoVerso [ 17/May/17 ] | |
|
If you notice this happening again, please try to gather stacks (one suggested method is pmp, poor man's profiler, from poormansprofiler.org). Something inside WiredTiger using time is causing a checkpoint to get stuck and that needs to get fixed. That is the reason the journal files are accumulating. I will be looking at code to avoid time going backward and also find the hang. Thank you for your cooperation! | |
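A minimal sketch of gathering stacks in the pmp style, assuming gdb is installed and mongod is the process of interest (attaching pauses the process briefly):

```
# Dump a backtrace of every thread in the running mongod.
sudo gdb -p "$(pidof mongod)" -batch -ex "thread apply all bt" > mongod-stacks.txt
```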
| Comment by Alberto Ornaghi [ 17/May/17 ] | |
|
I've restarted the secondary from another snapshot of the primary and deleted the diagnostic.data dir. The resync is done and the journal files are not accumulating right now, but I will monitor the situation and let you know if it happens again. So far it has happened 2 times out of 3 resyncs, which is a pretty bad rate. Do you have any recommendation on the timesyncd configuration? |
| Comment by Alberto Ornaghi [ 17/May/17 ] | |
|
We are running on Azure VMs; they are Ubuntu 16.04, and time is synchronised with the default timesyncd (no change to the configuration). Thank you |
| Comment by Susan LoVerso [ 17/May/17 ] | |
|
Can you describe the system you're running on? Is it a virtual machine, an AWS instance, or some other cloud instance? Are you running any time daemon process like ntpd or similar? There are values in the diagnostic data indicating that the system clock may have been adjusted backward, which skews several values but also causes a checkpoint to get stuck. That stuck checkpoint is the reason that journal files cannot be removed. If you can get it into this situation again, can you attempt to gather stacks so that we can see where the checkpoint thread is stuck? I will be attempting to reproduce it here as well. |
| Comment by Alberto Ornaghi [ 16/May/17 ] | |
|
The secondary server stopped working since it filled all the free space. The last server log (in mongodb.log) was: I was tailing it live, but now it's a zero-byte file, so I cannot send it |
| Comment by Alberto Ornaghi [ 16/May/17 ] | |
|
Uploaded to the same portal. Note that yesterday we took a snapshot of the primary and attached it to the secondary, which resynced from that point. |
| Comment by Susan LoVerso [ 16/May/17 ] | |
|
alberto.ornaghi@gmail.com can you also upload the WiredTiger.wt and WiredTiger.turtle files from your database directory? | |
| Comment by Alberto Ornaghi [ 16/May/17 ] | |
|
I've uploaded the diagnostic data. Thank you | |
| Comment by Kelsey Schubert [ 16/May/17 ] | |
|
Would you please upload an archive of the diagnostic.data directory from the affected node's dbpath? I've created a secure upload portal for you to use. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time. Thank you, |
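For example (a sketch; /var/lib/mongodb is an assumed dbpath, substitute your own):

```
# Create a compressed archive of the diagnostic.data directory for upload.
tar czf diagnostic.data.tar.gz -C /var/lib/mongodb diagnostic.data
```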