[SERVER-84145] Mongodb 5.0.20 process is getting crashed due to higher OS cache memory utilization. Created: 05/Dec/23 Updated: 25/Jan/24 Resolved: 25/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sreedhar N | Assignee: | Chris Kelly |
| Resolution: | Done | Votes: | 51 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 20.04 and Mongodb 5.0.20 |
||
| Attachments: |
|
| Assigned Teams: |
Server Triage
|
| Participants: |
| Description |
|
The MongoDB process crashes after upgrading from 4.4.18 to 5.0.20. OS cache memory utilization climbs within a few hours, and then the MongoDB process crashes.

Setup details:

Scenario:

Crash Info:

{"t":{"$date":"2023-11-28T15:30:56.755+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"Checkpointer","msg":"WiredTiger error","attr":{"error":22,"message":"[1701185456:755161][12787:0x7fd437302700], file:collection-45-1301129529321625809.wt, WT_SESSION.checkpoint: __wt_block_checkpoint_resolve, 928: collection-45-1301129529321625809.wt: the checkpoint failed, the system must restart: Invalid argument"}}
{"t":{"$date":"2023-11-28T15:30:56.755+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"Checkpointer","msg":"WiredTiger error","attr":{"error":-31804,"message":"[1701185456:755177][12787:0x7fd437302700], file:collection-45-1301129529321625809.wt, WT_SESSION.checkpoint: __wt_block_checkpoint_resolve, 928: the process must exit and restart: WT_PANIC: WiredTiger library panic"}}
{"t":{"$date":"2023-11-28T15:30:56.755+00:00"},"s":"F", "c":"-", "id":23089, "ctx":"Checkpointer","msg":"Fatal assertion","attr":{"msgid":50853,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp","line":574}}
{"t":{"$date":"2023-11-28T15:30:56.755+00:00"},"s":"F", "c":"-", "id":23090, "ctx":"Checkpointer","msg":"\n\n***aborting after fassert() failure\n\n"}
{"t":{"$date":"2023-11-28T15:30:56.755+00:00"},"s":"F", "c":"CONTROL", "id":6384300, "ctx":"Checkpointer","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}

Please note that on MongoDB 4.4.18 we ran the same scenario without seeing a crash, and OS cache memory utilization stayed constant. |
| Comments |
| Comment by Chris Kelly [ 25/Jan/24 ] |
|
Hi sreedhar.nalgonda@gmail.com, It looks like the server is crashing due to "No space left on device" errors immediately preceding the errors you are pointing out. Specifically:
For this issue we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project. |
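One way to check the "No space left on device" diagnosis before posting to the forums is to scan the log for that message and look at free space on the volume that holds the data files. The following is a minimal sketch, not an official MongoDB tool: the function names `entries_mentioning` and `free_fraction` are hypothetical, and it assumes the log is in the structured JSON format shown in the description, one entry per line.

```python
import json
import shutil


def entries_mentioning(lines, needle="No space left on device"):
    """Return log entries whose text contains `needle`.

    Assumes one log entry per line. Lines that parse as JSON are returned
    as dicts; anything else is kept as {"raw": <line>} so nothing is lost.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or needle not in line:
            continue
        try:
            hits.append(json.loads(line))
        except json.JSONDecodeError:
            # Not a structured JSON log line; keep the raw text instead.
            hits.append({"raw": line})
    return hits


def free_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

To use it, pass an open log file to `entries_mentioning` and the dbpath to `free_fraction`. If matching entries carry timestamps just before the fatal assertion and the free fraction is close to zero, that supports the out-of-space explanation.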
| Comment by Sreedhar N [ 03/Jan/24 ] |
|
Hi Chris Kelly, Thanks for providing access to upload files. The files are uploaded now. Kindly have a look and provide a solution for the crash. Thanks, Sreedhar |
| Comment by Chris Kelly [ 27/Dec/23 ] |
|
Hi sreedhar.nalgonda@gmail.com! Thanks for your report, and your patience here. Anecdotally, this is an error I've seen when we run out of space on a device (WT-11906), but I can't discern enough information here. To look into this further, would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location? Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time. |
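The archiving step requested above can be done with `tar` or `zip` directly. For anyone who prefers to script it, here is a minimal Python sketch that bundles the log files and the `diagnostic.data` directory into one gzip-compressed tarball. The function name `archive_diagnostics` and its parameters are hypothetical, and the log locations and dbpath must be supplied by the caller because they vary between installations.

```python
import tarfile
from pathlib import Path


def archive_diagnostics(log_files, dbpath, out_path):
    """Bundle mongod log files and <dbpath>/diagnostic.data into a .tar.gz.

    log_files: iterable of paths to mongod.log files (caller-supplied).
    dbpath:    the mongod data directory (caller-supplied).
    out_path:  where to write the archive.
    """
    diag_dir = Path(dbpath) / "diagnostic.data"
    with tarfile.open(str(out_path), "w:gz") as tar:
        for log in log_files:
            log = Path(log)
            # Store each log under its bare file name.
            tar.add(str(log), arcname=log.name)
        if diag_dir.is_dir():
            # Adds the directory and everything beneath it.
            tar.add(str(diag_dir), arcname="diagnostic.data")
    return Path(out_path)
```

Logs are stored under their bare file names, so two logs with the same name from different directories would collide inside the archive. Rename them first if that applies.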