  1. WiredTiger
  2. WT-7671

Excessive number of dhandles open when calling session verify

    • Storage Engines
    • 5
    • Nick - 2024-04-30

      From looking at BF-21361, WiredTiger is opening an excessive number of files, reaching almost 64K open files. The recommended machine file limit is 64K.
      The failure appears to come from:

          /* Create/Open the file. */
          WT_SYSCALL_RETRY(((pfh->fd = open(name, f, mode)) == -1 ? -1 : 0), ret);
          if (ret != 0)
              WT_ERR_MSG(session, ret,
                pfh->direct_io ? "%s: handle-open: open: failed with direct I/O configured, some "
                                 "filesystem types do not support direct I/O" :
                                 "%s: handle-open: open",
                name);
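
      The ticket does not state which errno open() returned. The sketch below assumes the failure is the per-process descriptor limit (EMFILE) and shows how open() behaves once that limit is reached. The function name errno_at_fd_limit and the use of /dev/null are illustrative choices, not WiredTiger code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: lower the soft RLIMIT_NOFILE to `limit`, then keep
 * opening /dev/null until open() fails. Returns the errno of the failing
 * open(), 0 if no open() failed within 1024 attempts, or -1 if the limit
 * could not be read or changed. All descriptors opened here are closed and
 * the original soft limit is restored before returning.
 */
int
errno_at_fd_limit(rlim_t limit)
{
    struct rlimit saved, lowered;
    int fds[1024];
    int n = 0, err = 0;

    if (getrlimit(RLIMIT_NOFILE, &saved) != 0)
        return (-1);
    lowered = saved;
    lowered.rlim_cur = limit;
    if (setrlimit(RLIMIT_NOFILE, &lowered) != 0)
        return (-1);

    while (n < 1024) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd == -1) {
            err = errno; /* Capture errno before any other call changes it. */
            break;
        }
        fds[n++] = fd;
    }

    while (n > 0)
        (void)close(fds[--n]);
    (void)setrlimit(RLIMIT_NOFILE, &saved);
    return (err);
}
```

      Calling errno_at_fd_limit(32) on a POSIX system should return EMFILE. If that assumption holds for BF-21361, the handle-open message above would be a symptom of descriptors accumulating rather than a problem with the file being opened.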
      

      Looking at the FTDC data:

      • The data shows that dhandles are not being discarded while verify calls are occurring, and the count eventually reaches the machine's limit.
      • The bug could be caused by dhandles not being cleaned up because of their interaction with session verify.
      • I have verified, by tracking the statistics, that session verify itself does not open any new dhandles.
      • The "ss wt cache bytes read into cache" metric also shows that verify is reading a large number of bytes into the cache, which is expected.
      • Verify also requires the dhandle, checkpoint and schema locks.

      Investigation:

      • Understand how dhandles interact with verify.
      • Investigate why the dhandle write lock is being acquired.
      • Look at possible reasons why the old dhandles are not being swept.

            Assignee:
            andrew.morton@mongodb.com Andrew Morton
            Reporter:
            jie.chen@mongodb.com Jie Chen
