[SERVER-22117] WiredTiger journal files not deleted/ Way too many journal files Created: 11/Jan/16  Updated: 06/Dec/22  Resolved: 30/Mar/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: 3.2.5, 3.3.4

Type: Bug Priority: Major - P3
Reporter: Maciej Galkowski Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 0
Labels: WTplaybook, code-and-test
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File diagnostics.data.tar.gz    
Issue Links:
Depends
depends on WT-2264 Checkpoints cannot keep up with inserts Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:

 Description   

Hi,
We have noticed a large buildup of journal files when using WiredTiger.
We are pushing a lot of test data into Mongo to test performance.

We noticed that the WiredTiger journal files are not deleted and are simply left in the /journal dir while we are pushing data.
The files are deleted when we stop pushing new data to Mongo, but that would never happen in a production environment.

From my understanding of the documentation, the journal files should be flushed frequently, not left on disk indefinitely.

We are using MongoDB 3.2 on FreeBSD 10.1. The files are on ZFS, with atime disabled. We are using zlib for both journal files and data files, but we saw the same issue with snappy.

Currently we have ~520 journal files, and growing every second. They are taking 16GB of space at the moment.

Is this normal? I believe it is a bug; there should be a way to limit the number of journal files.



 Comments   
Comment by Rob Offer [ 01/Apr/18 ]

Thank you, I have downgraded for now.

Comment by Kelsey Schubert [ 01/Apr/18 ]

Hi rob.offer@footy.com,

Thank you for the report. Generally it's preferable to open a new ticket – that way we can keep investigations of different issues from getting muddled. In this case, I believe you're encountering WT-3985, which will be fixed in the next release, MongoDB 3.6.4. In the meantime, as workarounds, I would suggest either occasionally restarting the nodes to clear the journal files or downgrading to 3.6.2.

Kind regards,
Kelsey

Comment by Rob Offer [ 01/Apr/18 ]

Apologies if this ticket is not the right place to post, but we are seeing this problem in production. We recently upgraded to 3.6.3.

We are running a replica set. One server has already filled up and the other is starting to run out of space. We are approaching 100 GB.

We are running on Windows in Azure. If there is any info I can provide, please let me know.

Comment by Kelsey Schubert [ 02/May/17 ]

Hi juanroy,

This bug was resolved during the development of MongoDB 3.4 and all versions of MongoDB 3.4.x contain this fix.

The issue you describe is likely related to WT-3264, which is currently in progress. If you encounter this issue again, please feel free to open a new SERVER ticket so we can confirm this diagnosis. We would need the following information to conclusively determine whether the behavior you've observed is caused by WT-3264:

  • All WiredTiger* files
  • _mdb_catalog.wt
  • sizeStorer.wt
  • output of an ls -lR database-directory
  • mongod.logs
  • diagnostic.data

In addition, we may need to inspect some WiredTiger journal files.

Thank you,
Thomas

Comment by Juan Antonio Roy Couto [ 02/May/17 ]

Hello, @michael, @thomas. I have run into this issue.
My MongoDB version is 3.4.2.
Which version of WiredTiger does MongoDB 3.4.2 use?
This issue was fixed in WiredTiger 2.8.0, right?
I have resynced my node, so I no longer have the contents of the $dbpath/diagnostic.data directory.
Thank you

Comment by Michael Cahill (Inactive) [ 30/Mar/16 ]

Fixed by latest merge of WiredTiger, see WT-2264.

Comment by Kelsey Schubert [ 11/Jan/16 ]

Hi mgalkowski,

Thank you for uploading the diagnostic data. We have identified this as a known issue in the WiredTiger storage engine: WT-2264. Please feel free to watch WT-2264 in addition to this ticket for updates.

Kind regards,
Thomas

Comment by Maciej Galkowski [ 11/Jan/16 ]

Attaching diagnostics file from today.

Comment by Kelsey Schubert [ 11/Jan/16 ]

Hi mgalkowski,

Can you please archive the $dbpath/diagnostic.data directory and attach it to this ticket? These files contain periodically collected serverStatus data, which will help us to identify what is happening here.

Thank you,
Thomas
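
For readers who need to produce the archive requested above, here is a minimal sketch using Python's standard tarfile module. The function name archive_diagnostic_data and the output path are hypothetical, chosen for illustration only; any tool that produces a tarball of $dbpath/diagnostic.data works equally well.

```python
import os
import tarfile


def archive_diagnostic_data(dbpath, out_path):
    """Write a gzip-compressed tarball of <dbpath>/diagnostic.data to out_path.

    Hypothetical helper for illustration; not part of MongoDB.
    """
    source = os.path.join(dbpath, "diagnostic.data")
    if not os.path.isdir(source):
        raise FileNotFoundError(f"no diagnostic.data directory under {dbpath}")
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname keeps archive paths relative ("diagnostic.data/...").
        tar.add(source, arcname="diagnostic.data")
    return out_path
```

With the dbpath seen elsewhere in this ticket, a call might look like archive_diagnostic_data("/var/db/mongodb", "/tmp/diagnostic.data.tar.gz").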

Comment by Maciej Galkowski [ 11/Jan/16 ]

UPDATE:
About 2.5 hours after reporting this bug, the journal has grown to 51GB, and there are 2039 journal files in our /journal directory. We are still pushing test data into MongoDB.
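
Figures like the file count and total size above can be tracked over time with a small script. This is a rough sketch, not an official tool: the function name journal_usage is hypothetical, and the file name pattern is inferred from the WiredTigerLog.0000001378 entry in the procstat output in this ticket.

```python
import os
import re

# Journal files in this ticket are named WiredTigerLog.<digits>.
# Anything else in the directory is ignored.
JOURNAL_FILE = re.compile(r"^WiredTigerLog\.\d+$")


def journal_usage(journal_dir):
    """Return (file_count, total_bytes) for journal files in journal_dir."""
    count = 0
    total = 0
    with os.scandir(journal_dir) as entries:
        for entry in entries:
            if entry.is_file() and JOURNAL_FILE.match(entry.name):
                count += 1
                total += entry.stat().st_size
    return count, total
```

Calling journal_usage("/var/db/mongodb/journal") at intervals and logging the result would show whether the journal keeps growing under load or is trimmed after checkpoints.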

Comment by Maciej Galkowski [ 11/Jan/16 ]

I just checked the procstat output:

# procstat -f 34331
 PID COMM               FD T V FLAGS     REF  OFFSET PRO NAME        
34331 mongod            text v r r-------  -       - -   /root/mongodb-src-r3.2.0/mongod
34331 mongod            ctty v c rw------  -       - -   /dev/pts/1        
34331 mongod             cwd v d r-------  -       - -   /root/mongodb-src-r3.2.0
34331 mongod            root v d r-------  -       - -   /                 
34331 mongod               0 v c rw------  7 400291541 -   /dev/pts/1        
34331 mongod               1 v c rw------  7 400291541 -   /dev/pts/1        
34331 mongod               2 v c rw------  7 400291541 -   /dev/pts/1        
34331 mongod               3 v c r-------  1    4096 -   /dev/random       
34331 mongod               4 v c r-------  1    4096 -   /dev/random       
34331 mongod               5 s - rw------  1       0 TCP 0.0.0.0:27017 0.0.0.0:0
34331 mongod               6 s - rw------  1       0 UDS /tmp/mongodb-27017.sock
34331 mongod               7 v r rw-----l  1       6 -   /var/db/mongodb/mongod.lock
34331 mongod               8 v r rw------  1       0 -   /var/db/mongodb/WiredTiger.lock
34331 mongod               9 v r rw------  1       0 -   /var/db/mongodb/WiredTiger.wt
34331 mongod              10 v d r-------  1       0 -   /var/db/mongodb/journal
34331 mongod              11 v r rw------  2       0 -   -                 
34331 mongod              12 v r rw------  1       0 -   /var/db/mongodb/WiredTigerLAS.wt
34331 mongod              13 v r rw------  1       0 -   /var/db/mongodb/sizeStorer.wt
34331 mongod              14 v r rw------  1       0 -   /var/db/mongodb/_mdb_catalog.wt
34331 mongod              15 v r rw------  1       0 -   /var/db/mongodb/collection/0-8714697171096101635.wt
34331 mongod              16 v r rw------  2       0 -   /var/db/mongodb/collection/2-8714697171096101635.wt
34331 mongod              17 v r -wa-----  1 2621440 -   -                 
34331 mongod              18 v r rw------  1       0 -   /var/db/mongodb/index/1-8714697171096101635.wt
34331 mongod              21 v r rw------  1       0 -   /var/db/mongodb/index/3-8714697171096101635.wt
[...]
34331 mongod              31 v r rw------  1       0 -   /var/db/mongodb/index/4-8714697171096101635.wt
34331 mongod              32 v r rw------  1       0 -   /var/db/mongodb/index/5-8714697171096101635.wt
34331 mongod              33 v r rw------  1       0 -   /var/db/mongodb/index/6-8714697171096101635.wt
34331 mongod              34 v r rw------  1       0 -   /var/db/mongodb/index/7-8714697171096101635.wt
34331 mongod              36 v r rw------  2       0 -   /var/db/mongodb/journal/WiredTigerLog.0000001378

# procstat -f 34331
 PID COMM               FD T V FLAGS     REF  OFFSET PRO NAME        
34331 mongod            text v r r-------  -       - -   /root/mongodb-src-r3.2.0/mongod
34331 mongod            ctty v c rw------  -       - -   /dev/pts/1        
34331 mongod             cwd v d r-------  -       - -   /root/mongodb-src-r3.2.0
34331 mongod            root v d r-------  -       - -   /                 
34331 mongod               0 v c rw------  7 400474300 -   /dev/pts/1        
34331 mongod               1 v c rw------  7 400474300 -   /dev/pts/1        
34331 mongod               2 v c rw------  7 400474300 -   /dev/pts/1        
34331 mongod               3 v c r-------  1    4096 -   /dev/random       
34331 mongod               4 v c r-------  1    4096 -   /dev/random       
34331 mongod               5 s - rw------  1       0 TCP 0.0.0.0:27017 0.0.0.0:0
34331 mongod               6 s - rw------  1       0 UDS /tmp/mongodb-27017.sock
34331 mongod               7 v r rw-----l  1       6 -   /var/db/mongodb/mongod.lock
34331 mongod               8 v r rw------  1       0 -   /var/db/mongodb/WiredTiger.lock
34331 mongod               9 v r rw------  1       0 -   /var/db/mongodb/WiredTiger.wt
34331 mongod              10 v d r-------  1       0 -   /var/db/mongodb/journal
34331 mongod              11 v r rw------  1       0 -   /var/db/mongodb/journal/WiredTigerLog.0000001393
34331 mongod              12 v r rw------  1       0 -   /var/db/mongodb/WiredTigerLAS.wt
34331 mongod              13 v r rw------  1       0 -   /var/db/mongodb/sizeStorer.wt
34331 mongod              14 v r rw------  1       0 -   /var/db/mongodb/_mdb_catalog.wt
34331 mongod              15 v r rw------  1       0 -   /var/db/mongodb/collection/0-8714697171096101635.wt
34331 mongod              16 v r rw------  1       0 -   /var/db/mongodb/collection/2-8714697171096101635.wt
34331 mongod              17 v r -wa-----  1 2621440 -   -                 
34331 mongod              18 v r rw------  1       0 -   /var/db/mongodb/index/1-8714697171096101635.wt
34331 mongod              21 v r rw------  1       0 -   /var/db/mongodb/index/3-8714697171096101635.wt
[...]
34331 mongod              31 v r rw------  1       0 -   /var/db/mongodb/index/4-8714697171096101635.wt
34331 mongod              32 v r rw------  1       0 -   /var/db/mongodb/index/5-8714697171096101635.wt
34331 mongod              33 v r rw------  1       0 -   /var/db/mongodb/index/6-8714697171096101635.wt
34331 mongod              34 v r rw------  1       0 -   /var/db/mongodb/index/7-8714697171096101635.wt

Only one journal file is open at a time, apparently.
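
The same check can be done mechanically by filtering captured procstat -f output for journal paths. This is a sketch under assumptions: the function name open_journal_files is hypothetical, and it assumes the file name is the last whitespace-separated field on each line, as in the listings above.

```python
def open_journal_files(procstat_output):
    """Return paths of WiredTiger journal files found in `procstat -f` text."""
    names = []
    for line in procstat_output.splitlines():
        fields = line.split()
        # The NAME column is last; the journal directory entry itself
        # does not match because it lacks the WiredTigerLog. suffix.
        if fields and "/journal/WiredTigerLog." in fields[-1]:
            names.append(fields[-1])
    return names
```

Applied to either listing above, this yields a single path, which matches the observation that only one journal file is held open at a time.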

Generated at Thu Feb 08 03:59:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.