[SERVER-16155] After hitting open file limits - WT goes into a loop at shutdown and needs forced kill Created: 14/Nov/14  Updated: 11/Jul/16  Resolved: 14/Nov/14

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 2.8.0-rc1

Type: Bug Priority: Critical - P2
Reporter: Anil Kumar Assignee: Unassigned
Resolution: Done Votes: 0
Labels: wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File wired-tiger-run-max-openfiles.log.gz    
Operating System: ALL
Participants:

 Description   

Once a WT instance hits the open files limit for its files on disk, a normal shutdown goes into a loop with the following errors:

2014-11-14T10:22:23.938-0500 E STORAGE  [conn1] WiredTiger (0) [1415978543:938879][3539:0x10eff7000], file:WiredTiger.wt, cursor.search: the WiredTiger library cannot continue; the process must exit and restart
2014-11-14T10:22:23.939-0500 E STORAGE  [conn1] WiredTiger (0) [1415978543:938998][3539:0x10eff7000], file:WiredTiger.wt, cursor.close: the WiredTiger library cannot continue; the process must exit and restart
2014-11-14T10:22:23.939-0500 E STORAGE  [conn1] WiredTiger (0) [1415978543:939113][3539:0x10eff7000], cursor.set_key: the WiredTiger library cannot continue; the process must exit and restart

Eventually this needs a hard kill of the process.

Steps:
1. Create a scenario that runs out of file descriptors:

for (var i = 0; i < 2430; i++) { db.getCollection("coll-" + i).insert({}); }

2. Run db.shutdownServer() or kill <pid>.
3. WT goes into a loop and needs kill -9.
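
The first step above can be sketched as a small script. This is only an illustration, not the reporter's exact procedure: the helpers collectionNames and estimatedFiles are hypothetical names, and the two-files-per-collection figure is an assumption (one file for the collection data and one for its _id index) used only to give a rough sense of how many files 2430 collections produce.

```javascript
// Hypothetical helper: builds the collection names used in step 1.
function collectionNames(count) {
    var names = [];
    for (var i = 0; i < count; i++) {
        names.push("coll-" + i);
    }
    return names;
}

// Assumption: each collection creates at least two files,
// one for its data and one for its _id index.
function estimatedFiles(count) {
    return count * 2;
}

var names = collectionNames(2430);

// Only touch the server when run inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    names.forEach(function (name) {
        // Inserting an empty document implicitly creates the collection.
        db.getCollection(name).insert({});
    });
}
```

Under that assumption, the loop creates several thousand files, so the count should be compared against the open files limit of the shell that started mongod before attempting the shutdown in step 2.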



 Comments   
Comment by Ramon Fernandez Marina [ 28/Jan/15 ]

soner, can you please open a new ticket and elaborate on what you mean by "the same problem"? Please upload full logs and reproduction details if applicable.

Thanks,
Ramón.

Comment by Soner K [ 28/Jan/15 ]

How is this fixed? I'm on 2.8.0-rc5 and still have this problem. Is there any solution available?

Comment by Anil Kumar [ 14/Nov/14 ]

The latest master branch no longer gets into an endless loop; it stops with an abort / panic instead:

2014-11-14T19:20:31.402+0000 D STORAGE  [conn1] create uri: table:index-3-3856557546900843133 config: type=file,leaf_page_max=16k,,key_format=u,value_format=u,collator=mongo_index,app_metadata={ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "local.coll-1" }
2014-11-14T19:20:31.823+0000 E STORAGE  WiredTiger (24) [1415992831:823601][7021:0x1107d7000], archive-server: /Users/aks/code/os/mongodb/mongo/data/journal: opendir: Too many open files
2014-11-14T19:20:31.823+0000 E STORAGE  WiredTiger (24) [1415992831:823857][7021:0x1107d7000], archive-server: dirlist journal prefix WiredTigerLog: Too many open files
2014-11-14T19:20:31.824+0000 E STORAGE  WiredTiger (24) [1415992831:824038][7021:0x1107d7000], archive-server: log archive server error: Too many open files
2014-11-14T19:20:31.824+0000 E STORAGE  WiredTiger (-31803) [1415992831:824297][7021:0x1107d7000], archive-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2014-11-14T19:20:31.824+0000 I -        Fatal Assertion 28558
2014-11-14T19:20:31.832+0000 I CONTROL
 0x109c0399a 0x109b9f7db 0x109b8d25f 0x1099cac2b 0x10a068950 0x10a068aa9 0x10a069054 0x10a01f4e3 0x7fff9812d2fc 0x7fff9812d279 0x7fff9812b4b1
----- BEGIN BACKTRACE -----

Comment by Eric Milkie [ 14/Nov/14 ]

What version?
(This might already be fixed in master branch)
