Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-8190

Some WT fds are not being closed

    • Type: Icon: Bug Bug
    • Resolution: Community Answered
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      Centos 7/8, WT version 3.2.1
    • 0
    • Storage - Ra 2021-10-04, Storage - Ra 2021-10-18, Storage - Ra 2021-11-01

      Hi,
      We have noticed that in some cases some of the fds to the WT files are not being closed even when the WT configuration is set to "file_manager=(close_handle_minimum=0)".

      __posix_open_file, 669: /var/lib/wiredtiger/3RBWh-mviRc94g5aIVGk61NL_AtTD5uuRrhaxlAznbc_snap.6_base_path.wti: handle-open: open: Too many open files

      We have printed the statistics before closing the connection:

      2021/09/30 06:31:06 - Debug: block-manager: blocks pre-loaded=19037
      2021/09/30 06:31:06 - Debug: block-manager: blocks read=57757
      2021/09/30 06:31:06 - Debug: block-manager: blocks written=136058
      2021/09/30 06:31:06 - Debug: block-manager: bytes read=462M (462360576)
      2021/09/30 06:31:06 - Debug: block-manager: bytes written=1B (1081647104)
      2021/09/30 06:31:06 - Debug: block-manager: bytes written for checkpoint=1B (1029980160)
      2021/09/30 06:31:06 - Debug: cache: application threads page read from disk to cache count=49497
      2021/09/30 06:31:06 - Debug: cache: application threads page read from disk to cache time (usecs)=10M (10335654)
      2021/09/30 06:31:06 - Debug: cache: application threads page write from cache to disk count=116644
      2021/09/30 06:31:06 - Debug: cache: application threads page write from cache to disk time (usecs)=2M (2545413)
      2021/09/30 06:31:06 - Debug: cache: bytes belonging to page images in the cache=19M (19195835)
      2021/09/30 06:31:06 - Debug: cache: bytes belonging to the cache overflow table in the cache=6825
      2021/09/30 06:31:06 - Debug: cache: bytes currently in the cache=27M (27347688)
      2021/09/30 06:31:06 - Debug: cache: bytes dirty in the cache cumulative=7B (7881520762)
      2021/09/30 06:31:06 - Debug: cache: bytes not belonging to page images in the cache=8M (8151852)
      2021/09/30 06:31:06 - Debug: cache: bytes read into cache=1B (1972013794)
      2021/09/30 06:31:06 - Debug: cache: bytes written from cache=4B (4306415623)
      2021/09/30 06:31:06 - Debug: cache: cache overflow table entries=163862
      2021/09/30 06:31:06 - Debug: cache: cache overflow table insert calls=163148
      2021/09/30 06:31:06 - Debug: cache: cache overflow table on-disk size=2M (2547712)
      2021/09/30 06:31:06 - Debug: cache: checkpoint blocked page eviction=1818
      2021/09/30 06:31:06 - Debug: cache: eviction calls to get a page=54217
      2021/09/30 06:31:06 - Debug: cache: eviction calls to get a page found queue empty=2255
      2021/09/30 06:31:06 - Debug: cache: eviction calls to get a page found queue empty after locking=4247
      2021/09/30 06:31:06 - Debug: cache: eviction passes of a file=21858
      2021/09/30 06:31:06 - Debug: cache: eviction server candidate queue empty when topping up=1836
      2021/09/30 06:31:06 - Debug: cache: eviction server candidate queue not empty when topping up=4449
      2021/09/30 06:31:06 - Debug: cache: eviction server evicting pages=39365
      2021/09/30 06:31:06 - Debug: cache: eviction server slept, because we did not make progress with eviction=6568
      2021/09/30 06:31:06 - Debug: cache: eviction server waiting for a leaf page=236659
      2021/09/30 06:31:06 - Debug: cache: eviction state=128
      2021/09/30 06:31:06 - Debug: cache: eviction walk target pages histogram - 0-9=7064
      2021/09/30 06:31:06 - Debug: cache: eviction walk target pages histogram - 10-31=10735
      2021/09/30 06:31:06 - Debug: cache: eviction walk target pages histogram - 32-63=2507
      2021/09/30 06:31:06 - Debug: cache: eviction walk target pages histogram - 64-128=1552
      2021/09/30 06:31:06 - Debug: cache: eviction walks abandoned=258
      2021/09/30 06:31:06 - Debug: cache: eviction walks gave up because they restarted their walk twice=19209
      2021/09/30 06:31:06 - Debug: cache: eviction walks gave up because they saw too many pages and found no candidates=899
      2021/09/30 06:31:06 - Debug: cache: eviction walks gave up because they saw too many pages and found too few candidates=81
      2021/09/30 06:31:06 - Debug: cache: eviction walks reached end of tree=40380
      2021/09/30 06:31:06 - Debug: cache: eviction walks started from root of tree=20278
      2021/09/30 06:31:06 - Debug: cache: eviction walks started from saved location in tree=1580
      2021/09/30 06:31:06 - Debug: cache: eviction worker thread active=1
      2021/09/30 06:31:06 - Debug: cache: eviction worker thread created=9
      2021/09/30 06:31:06 - Debug: cache: eviction worker thread evicting pages=8238
      2021/09/30 06:31:06 - Debug: cache: eviction worker thread removed=23
      2021/09/30 06:31:06 - Debug: cache: eviction worker thread stable number=1
      2021/09/30 06:31:06 - Debug: cache: files with new eviction walks started=21171
      2021/09/30 06:31:06 - Debug: cache: force re-tuning of eviction workers once in a while=21
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages evicted that were dirty count=107
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages evicted that were dirty time (usecs)=7M (7951635)
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages selected because of too many deleted items count=1
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages selected count=323471
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages selected unable to be evicted count=323260
      2021/09/30 06:31:06 - Debug: cache: forced eviction - pages selected unable to be evicted time=647653
      2021/09/30 06:31:06 - Debug: cache: hazard pointer blocked page eviction=400
      2021/09/30 06:31:06 - Debug: cache: hazard pointer check calls=371072
      2021/09/30 06:31:06 - Debug: cache: hazard pointer check entries walked=75M (75078847)
      2021/09/30 06:31:06 - Debug: cache: hazard pointer maximum array length=659
      2021/09/30 06:31:06 - Debug: cache: in-memory page passed criteria to be split=213
      2021/09/30 06:31:06 - Debug: cache: in-memory page splits=104
      2021/09/30 06:31:06 - Debug: cache: internal pages evicted=14
      2021/09/30 06:31:06 - Debug: cache: leaf pages split during eviction=306
      2021/09/30 06:31:06 - Debug: cache: maximum bytes configured=1B (1887436800)
      2021/09/30 06:31:06 - Debug: cache: maximum page size at eviction=144781
      2021/09/30 06:31:06 - Debug: cache: modified pages evicted=45575
      2021/09/30 06:31:06 - Debug: cache: page written requiring cache overflow records=171
      2021/09/30 06:31:06 - Debug: cache: pages currently held in the cache=901
      2021/09/30 06:31:06 - Debug: cache: pages queued for eviction=51826
      2021/09/30 06:31:06 - Debug: cache: pages queued for eviction post lru sorting=104809
      2021/09/30 06:31:06 - Debug: cache: pages queued for urgent eviction=738
      2021/09/30 06:31:06 - Debug: cache: pages queued for urgent eviction during walk=78
      2021/09/30 06:31:06 - Debug: cache: pages read into cache=51590
      2021/09/30 06:31:06 - Debug: cache: pages read into cache after truncate=351
      2021/09/30 06:31:06 - Debug: cache: pages read into cache requiring cache overflow entries=250
      2021/09/30 06:31:06 - Debug: cache: pages requested from the cache=88M (88438866)
      2021/09/30 06:31:06 - Debug: cache: pages seen by eviction walk=5M (5545632)
      2021/09/30 06:31:06 - Debug: cache: pages selected for eviction unable to be evicted=324772
      2021/09/30 06:31:06 - Debug: cache: pages walked for eviction=6M (6837565)
      2021/09/30 06:31:06 - Debug: cache: pages written from cache=124757
      2021/09/30 06:31:06 - Debug: cache: pages written requiring in-memory restoration=35494
      2021/09/30 06:31:06 - Debug: cache: percentage overhead=8
      2021/09/30 06:31:06 - Debug: cache: tracked bytes belonging to internal pages in the cache=142679
      2021/09/30 06:31:06 - Debug: cache: tracked bytes belonging to leaf pages in the cache=27M (27205009)
      2021/09/30 06:31:06 - Debug: cache: tracked dirty bytes in the cache=6825
      2021/09/30 06:31:06 - Debug: cache: tracked dirty pages in the cache=1
      2021/09/30 06:31:06 - Debug: cache: unmodified pages evicted=624
      2021/09/30 06:31:06 - Debug: capacity: bytes read=437M (437100544)
      2021/09/30 06:31:06 - Debug: capacity: bytes written for checkpoint=744M (744259682)
      2021/09/30 06:31:06 - Debug: capacity: bytes written for eviction=40M (40614092)
      2021/09/30 06:31:06 - Debug: capacity: bytes written total=784M (784873774)
      2021/09/30 06:31:06 - Debug: connection: auto adjusting condition resets=5366
      2021/09/30 06:31:06 - Debug: connection: auto adjusting condition wait calls=8341
      2021/09/30 06:31:06 - Debug: connection: files currently open=8
      2021/09/30 06:31:06 - Debug: connection: memory allocations=229M (229224579)
      2021/09/30 06:31:06 - Debug: connection: memory frees=229M (229637465)
      2021/09/30 06:31:06 - Debug: connection: memory re-allocations=27M (27718659)
      2021/09/30 06:31:06 - Debug: connection: pthread mutex condition wait calls=19008
      2021/09/30 06:31:06 - Debug: connection: pthread mutex shared lock read-lock calls=44M (44706620)
      2021/09/30 06:31:06 - Debug: connection: pthread mutex shared lock write-lock calls=14M (14458701)
      2021/09/30 06:31:06 - Debug: connection: total fsync I/Os=7711
      2021/09/30 06:31:06 - Debug: connection: total read I/Os=60683
      2021/09/30 06:31:06 - Debug: connection: total write I/Os=137256
      2021/09/30 06:31:06 - Debug: cursor: cursor close calls that result in cache=172
      2021/09/30 06:31:06 - Debug: cursor: cursor create calls=17M (17459015)
      2021/09/30 06:31:06 - Debug: cursor: cursor insert calls=17M (17131298)
      2021/09/30 06:31:06 - Debug: cursor: cursor insert key and value bytes=1B (1601050341)
      2021/09/30 06:31:06 - Debug: cursor: cursor next calls=12M (12169274)
      2021/09/30 06:31:06 - Debug: cursor: cursor operation restarted=109
      2021/09/30 06:31:06 - Debug: cursor: cursor prev calls=172
      2021/09/30 06:31:06 - Debug: cursor: cursor remove calls=16M (16434101)
      2021/09/30 06:31:06 - Debug: cursor: cursor remove key bytes removed=1B (1175896873)
      2021/09/30 06:31:06 - Debug: cursor: cursor reset calls=30M (30875091)
      2021/09/30 06:31:06 - Debug: cursor: cursor search calls=14M (14061508)
      2021/09/30 06:31:06 - Debug: cursor: cursor search near calls=11M (11633812)
      2021/09/30 06:31:06 - Debug: cursor: cursor update calls=5M (5633198)
      2021/09/30 06:31:06 - Debug: cursor: cursor update key and value bytes=788M (788609166)
      2021/09/30 06:31:06 - Debug: cursor: open cursor count=18
      2021/09/30 06:31:06 - Debug: data-handle: connection data handle size=424
      2021/09/30 06:31:06 - Debug: data-handle: connection data handles currently active=7files 
      2021/09/30 06:31:06 - Debug: data-handle: connection sweep dhandles closed=1199
      2021/09/30 06:31:06 - Debug: data-handle: connection sweep dhandles removed from hash list=5989
      2021/09/30 06:31:06 - Debug: data-handle: connection sweep time-of-death sets=6330
      2021/09/30 06:31:06 - Debug: data-handle: connection sweeps=253
      2021/09/30 06:31:06 - Debug: data-handle: session dhandles swept=2346
      2021/09/30 06:31:06 - Debug: data-handle: session sweep attempts=317
      2021/09/30 06:31:06 - Debug: lock: checkpoint lock acquisitions=516
      2021/09/30 06:31:06 - Debug: lock: checkpoint lock application thread wait time (usecs)=26M (26369969)
      2021/09/30 06:31:06 - Debug: lock: dhandle lock application thread time waiting (usecs)=57
      2021/09/30 06:31:06 - Debug: lock: dhandle lock internal thread time waiting (usecs)=11
      2021/09/30 06:31:06 - Debug: lock: dhandle read lock acquisitions=902669
      2021/09/30 06:31:06 - Debug: lock: dhandle write lock acquisitions=12329
      2021/09/30 06:31:06 - Debug: lock: metadata lock acquisitions=862
      2021/09/30 06:31:06 - Debug: lock: metadata lock application thread wait time (usecs)=3
      2021/09/30 06:31:06 - Debug: lock: metadata lock internal thread wait time (usecs)=5
      2021/09/30 06:31:06 - Debug: lock: schema lock acquisitions=2239
      2021/09/30 06:31:06 - Debug: lock: schema lock application thread wait time (usecs)=284904
      2021/09/30 06:31:06 - Debug: lock: table lock application thread time waiting for the table lock (usecs)=20M (20356789)
      2021/09/30 06:31:06 - Debug: lock: table lock internal thread time waiting for the table lock (usecs)=888252
      2021/09/30 06:31:06 - Debug: lock: table read lock acquisitions=15
      2021/09/30 06:31:06 - Debug: lock: table write lock acquisitions=14M (14386202)
      2021/09/30 06:31:06 - Debug: lock: txn global lock application thread time waiting (usecs)=114
      2021/09/30 06:31:06 - Debug: lock: txn global lock internal thread time waiting (usecs)=1
      2021/09/30 06:31:06 - Debug: lock: txn global read lock acquisitions=56552
      2021/09/30 06:31:06 - Debug: lock: txn global write lock acquisitions=47105
      2021/09/30 06:31:06 - Debug: perf: operation read latency histogram (bucket 1) - 100-249us=10742
      2021/09/30 06:31:06 - Debug: perf: operation read latency histogram (bucket 2) - 250-499us=2077
      2021/09/30 06:31:06 - Debug: perf: operation read latency histogram (bucket 3) - 500-999us=849
      2021/09/30 06:31:06 - Debug: perf: operation read latency histogram (bucket 4) - 1000-9999us=190
      2021/09/30 06:31:06 - Debug: perf: operation read latency histogram (bucket 5) - 10000us+=3
      2021/09/30 06:31:06 - Debug: perf: operation write latency histogram (bucket 1) - 100-249us=14281
      2021/09/30 06:31:06 - Debug: perf: operation write latency histogram (bucket 2) - 250-499us=3601
      2021/09/30 06:31:06 - Debug: perf: operation write latency histogram (bucket 3) - 500-999us=3814
      2021/09/30 06:31:06 - Debug: perf: operation write latency histogram (bucket 4) - 1000-9999us=759
      2021/09/30 06:31:06 - Debug: perf: operation write latency histogram (bucket 5) - 10000us+=380
      2021/09/30 06:31:06 - Debug: reconciliation: page reconciliation calls=152973
      2021/09/30 06:31:06 - Debug: reconciliation: page reconciliation calls for eviction=40380
      2021/09/30 06:31:06 - Debug: reconciliation: pages deleted=6727
      2021/09/30 06:31:06 - Debug: reconciliation: split bytes currently awaiting free=7808
      2021/09/30 06:31:06 - Debug: reconciliation: split objects currently awaiting free=45
      2021/09/30 06:31:06 - Debug: session: open session count=14
      2021/09/30 06:31:06 - Debug: session: table create successful calls=344
      2021/09/30 06:31:06 - Debug: session: table drop successful calls=172
      2021/09/30 06:31:06 - Debug: thread-yield: page acquire busy blocked=1
      2021/09/30 06:31:06 - Debug: thread-yield: page acquire eviction blocked=4
      2021/09/30 06:31:06 - Debug: thread-yield: page acquire locked blocked=892
      2021/09/30 06:31:06 - Debug: thread-yield: page acquire time sleeping (usecs)=730800
      2021/09/30 06:31:06 - Debug: transaction: transaction begins=5M (5470159)
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint generation=345
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint max time (msecs)=2986
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint min time (msecs)=16
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint most recent time (msecs)=46
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint scrub time (msecs)=100
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoint total time (msecs)=179410
      2021/09/30 06:31:06 - Debug: transaction: transaction checkpoints=344
      2021/09/30 06:31:06 - Debug: transaction: transaction fsync calls for checkpoint after allocating the transaction ID=344
      2021/09/30 06:31:06 - Debug: transaction: transaction fsync duration for checkpoint after allocating the transaction ID (usecs)=1824
      2021/09/30 06:31:06 - Debug: transaction: transactions committed=5M (5470243)

      Can you please help us identify why those files stays open? is there any configuration that we have missed?

      Thank you

            Assignee:
            etienne.petrel@mongodb.com Etienne Petrel
            Reporter:
            tom@jetapps.com Tom M
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: