[SERVER-34403] The journal files grow out of disk size Created: 10/Apr/18  Updated: 06/Dec/22  Resolved: 10/Apr/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: jinwuzhao Assignee: Backlog - Storage Execution Team
Resolution: Duplicate Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OS: Windows Server 2008 R2 Enterprise with Service Pack 1 (64 bit)
CPU: Intel Xeon CPU E5-26xx v2 2.6GHz
RAM: 4 GB
disk: 100 GB disk partition for MongoDB.
MongoDB Replica Set: 3 instances, this instance is the Primary.


Attachments: PNG File QQ20180410-113252.png    
Issue Links:
Duplicate
duplicates WT-3985 Pre-allocated log files accumulate on... Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

Mongod crashed because of no space left on disk. This is the log when crash happened:

2018-04-09T16:13:33.568+0800 I COMMAND [conn484] command sessions.session_common command: findAndModify { findAndModify: "session_common", query:

Unknown macro: { sessionid}

, update: { $inc:

Unknown macro: { maxmsgid}

, $set:

Unknown macro: { activetime}

}, new: true, $db: "sessions" } planSummary: IXSCAN

Unknown macro: { sessionid}

keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keysInserted:1 keysDeleted:1 numYields:0 reslen:417 locks:{ Global: { acquireCount:

Unknown macro: { r}

}, Database: { acquireCount:

Unknown macro: { w}

}, Collection: { acquireCount:

Unknown macro: { w}

}, oplog: { acquireCount:

Unknown macro: { w}

} } protocol:op_query 120ms 2018-04-09T16:13:33.876+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:876714][3376:1995256704], log-server: D:\mongodb\data\journal\WiredTigerTmplog.0000001761: handle-set-end: SetEndOfFile: 磁盘空间不足。

: No space left on device 2018-04-09T16:13:33.876+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:876714][3376:1995256704], log-server: log pre-alloc server error: No space left on device 2018-04-09T16:13:33.877+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:877691][3376:1995256704], log-server: log server error: No space left on device 2018-04-09T16:13:33.877+0800 E STORAGE [thread3841] WiredTiger error (-31804) [1523261613:877691][3376:1995256704], log-server: the process must exit and restart: WT_PANIC: WiredTiger library panic 2018-04-09T16:13:33.886+0800 F - [conn486] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn475] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn483] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn478] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [WTOplogJournalThread] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn487] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn3774] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [thread3841] Fatal Assertion 28558 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 361 2018-04-09T16:13:33.886+0800 F - [conn470] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn486]

***aborting after fassert() failure

2018-04-09T16:13:33.886+0800 F - [conn466] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn475]

***aborting after fassert() failure

2018-04-09T16:13:33.886+0800 F - [conn483]

***aborting after fassert() failure

When I restarted the mongod the disk space was free.
It seems that journal files fill in the whole disk space. But my real dataset is not very big.
This is the database status in MongoDB:

Unknown macro: { "db" }
Unknown macro: { "db" }
Unknown macro: { "db" }

I notice that the number of WiredTigerPreplog file in journal directory always grows and doesn't stop. The total size of these files is even the majority of MongoDB data size.
This is the screenshot on the machine. Note number of WiredTigerPreplog file:
screenshot



 Comments   
Comment by Alexander Gorrod [ 10/Apr/18 ]

I believe you are reporting the same thing that was reported in WT-3985 - the issue should be gone if you upgrade to 3.6.4

Comment by jinwuzhao [ 10/Apr/18 ]

God, the format is error.
This is the log:

2018-04-09T16:13:33.568+0800 I COMMAND [conn484] command sessions.session_common command: findAndModify { findAndModify: "session_common", query: { sessionid: "2912686-2912715" }, update: { $inc: { maxmsgid: 1 }, $set: { activetime: 1523261613364 } }, new: true, $db: "sessions" } planSummary: IXSCAN { sessionid: 1 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keysInserted:1 keysDeleted:1 numYields:0 reslen:417 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_query 120ms 2018-04-09T16:13:33.876+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:876714][3376:1995256704], log-server: D:\mongodb\data\journal\WiredTigerTmplog.0000001761: handle-set-end: SetEndOfFile: 磁盘空间不足。
 
: No space left on device 2018-04-09T16:13:33.876+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:876714][3376:1995256704], log-server: log pre-alloc server error: No space left on device 2018-04-09T16:13:33.877+0800 E STORAGE [thread3841] WiredTiger error (28) [1523261613:877691][3376:1995256704], log-server: log server error: No space left on device 2018-04-09T16:13:33.877+0800 E STORAGE [thread3841] WiredTiger error (-31804) [1523261613:877691][3376:1995256704], log-server: the process must exit and restart: WT_PANIC: WiredTiger library panic 2018-04-09T16:13:33.886+0800 F - [conn486] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn475] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn483] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn478] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [WTOplogJournalThread] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn487] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn3774] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [thread3841] Fatal Assertion 28558 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 361 2018-04-09T16:13:33.886+0800 F - [conn470] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn486]
 
***aborting after fassert() failure
 
2018-04-09T16:13:33.886+0800 F - [conn466] Fatal Assertion 28559 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 64 2018-04-09T16:13:33.886+0800 F - [conn475]
 
***aborting after fassert() failure
 
2018-04-09T16:13:33.886+0800 F - [conn483]
 
***aborting after fassert() failure

And the database status:

{
    "db" : "msgs",
    "collections" : 1,
    "views" : 0,
    "objects" : 18937483,
    "avgObjSize" : 502.662245927825,
    "dataSize" : 9519157737.0,
    "storageSize" : 9797206016.0,
    "numExtents" : 0,
    "indexes" : 2,
    "indexSize" : 542953472.0,
    "fsUsedSize" : 20426571776.0,
    "fsTotalSize" : 107237863424.0,
    "ok" : 1.0,
    "operationTime" : Timestamp(1523329048, 279)
}
{
    "db" : "sessions",
    "collections" : 4,
    "views" : 0,
    "objects" : 5873935,
    "avgObjSize" : 210.05505781048,
    "dataSize" : 1233849756.0,
    "storageSize" : 593965056.0,
    "numExtents" : 0,
    "indexes" : 9,
    "indexSize" : 464515072.0,
    "fsUsedSize" : 20533002240.0,
    "fsTotalSize" : 107237863424.0,
    "ok" : 1.0,
    "operationTime" : Timestamp(1523329111, 302)
}
{
    "db" : "local",
    "collections" : 8,
    "views" : 0,
    "objects" : 18568028,
    "avgObjSize" : 285.190052546237,
    "dataSize" : 5295416881.0,
    "storageSize" : 3196956672.0,
    "numExtents" : 0,
    "indexes" : 7,
    "indexSize" : 131072.0,
    "fsUsedSize" : 20886585344.0,
    "fsTotalSize" : 107237863424.0,
    "ok" : 1.0,
    "operationTime" : Timestamp(1523330339, 432)
}

And the screenshot:

Generated at Thu Feb 08 04:36:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.