  1. WiredTiger
  2. WT-14140

Unnecessary schema lock taken for active "file:" dhandles that are not swept

    • Storage Engines, Storage Engines - Foundations
    • 8
    • StorEng - 2025-03-14, StorEng - 2025-03-28, StorEng - 2025-04-25
    • v8.1, v8.0, v7.0, v6.0

      Fix the issue where an unnecessary schema lock is taken for actively used file:-prefixed dhandles because their corresponding table:-prefixed dhandles are expired by the sweep server. This causes schema lock contention, especially during checkpoint prepare, and degrades performance.

      Description

      When opening a file:-prefixed dhandle, the corresponding table:-prefixed dhandle is used to determine whether the file: dhandle belongs to a simple table. However, the sweep server also sweeps table:-prefixed dhandles, so they expire prematurely.

      file:-prefixed dhandles remain active because several code paths reset their "Time of Death". There is no equivalent "Time of Death" reset mechanism for table:-prefixed dhandles during schema operations, so they expire. Even when there are no schema operations, the corresponding table:-prefixed dhandles are still expired by the sweep server.

      This results in:

      • Unnecessary reopening of table:-prefixed dhandles, requiring a schema lock.
      • Schema lock contention when a checkpoint is preparing, since it also needs the schema lock.
      • Performance degradation due to increased blocking between application threads and the checkpoint thread.
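
      The mechanism above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not WiredTiger code: time is counted in abstract ticks, the sweep closes a handle directly once it has been idle for `idle_time`, and the names `Handle`, `simulate`, and `touch_table` are hypothetical. It counts how often an application that opens a cursor on every tick would have to reopen the table: handle, which is the step that needs the schema lock.

```python
from dataclasses import dataclass


@dataclass
class Handle:
    # Simplified stand-in for the "Time of Death" bookkeeping of a dhandle.
    last_used: int = 0
    is_open: bool = True


def simulate(ticks, idle_time, scan_interval, touch_table):
    """Return how many times the table: handle had to be reopened.

    touch_table=False models the behavior described in this ticket: only the
    file: handle has its idle clock reset when the table is used.
    touch_table=True models a hypothetical fix that also resets the table: clock.
    """
    table, file = Handle(), Handle()
    schema_lock_acquisitions = 0

    for now in range(1, ticks + 1):
        # The application opens a cursor: the table: handle is consulted first.
        if not table.is_open:
            # Reopening a swept table: handle is the step that needs the schema lock.
            schema_lock_acquisitions += 1
            table.is_open, table.last_used = True, now
        elif touch_table:
            table.last_used = now

        # The file: handle always has its idle clock reset on use.
        file.last_used = now

        # The sweep runs every scan_interval ticks and closes idle handles.
        if now % scan_interval == 0:
            for handle in (table, file):
                if handle.is_open and now - handle.last_used >= idle_time:
                    handle.is_open = False

    return schema_lock_acquisitions


if __name__ == "__main__":
    print("without table: reset:", simulate(300, 60, 30, touch_table=False))
    print("with table: reset:   ", simulate(300, 60, 30, touch_table=True))
```

      In this model the file: handle is never swept, while the table: handle is closed and reopened periodically unless its clock is also reset on use.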

      Reproducer

      In the Python test below, I create 1,000 dhandles and then spawn 1,000 threads to perform inserts for 100 iterations, ensuring that all dhandles remain active.

      import threading

      def insert(self, i, start, rows):
          # Each thread uses its own session; WiredTiger sessions are not thread-safe.
          session = self.conn.open_session()
          uri = self.uri + str(i)
          cursor = session.open_cursor(uri)
          session.begin_transaction()
          for key in range(start, rows):
              cursor.set_key(key)
              cursor.set_value(str(key))
              cursor.insert()
          session.commit_transaction()
          cursor.close()
          session.close()

      def test_dhandles(self):
          dhandles = 1000
          # `format` is assumed to be a config string defined elsewhere in the test.
          for i in range(1, dhandles + 1):
              uri = self.uri + str(i)
              self.session.create(uri, format)

          # Repeatedly touch every table so that all dhandles remain active.
          for _ in range(100):
              threads = []
              for i in range(1, dhandles + 1):
                  thread = threading.Thread(target=self.insert, args=(i, 0, 100))
                  thread.start()
                  threads.append(thread)

              for thread in threads:
                  thread.join()
      

      To exercise the sweep server, I set the following connection configuration:

      file_manager=(close_handle_minimum=0,close_idle_time=60,close_scan_interval=30)
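
      As a rough sketch of how this setting could be applied outside the test harness, the string can be passed to `wiredtiger.wiredtiger_open` from the WiredTiger Python API. The helper names `file_manager_config` and `open_connection` are hypothetical, and the `create` option is an assumption about how the database home is set up.

```python
def file_manager_config(close_handle_minimum, close_idle_time, close_scan_interval):
    """Build the file_manager portion of a connection configuration string."""
    return (
        "file_manager=("
        f"close_handle_minimum={close_handle_minimum},"
        f"close_idle_time={close_idle_time},"
        f"close_scan_interval={close_scan_interval})"
    )


# Values from this ticket: sweep every 30s, close handles idle for 60s,
# and do not keep a minimum number of handles open.
SWEEP_CONFIG = file_manager_config(0, 60, 30)


def open_connection(home):
    # Imported lazily so the helper above can be used without the module installed.
    import wiredtiger

    # "create" is an assumption: it creates the database if it does not exist.
    return wiredtiger.wiredtiger_open(home, "create," + SWEEP_CONFIG)
```

      The short idle time and scan interval make the sweep server act on handles quickly enough for the expiry to be observed within a test run.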

      Scope:

      • Decide between the potential solutions described in WT-13663.
      • Fix the issue by applying the chosen solution.

        1. screenshot-1.png (72 kB)
        2. Screenshot 2025-03-22 at 9.45.09 AM.png (172 kB)
        3. Screenshot 2025-03-22 at 9.45.23 AM.png (193 kB)

            Assignee:
            ravi.giri@mongodb.com Ravi Giri
            Reporter:
            siddhartha.mahajan@mongodb.com Sid Mahajan
            Votes:
            0
            Watchers:
            11

              Created:
              Updated:
              Resolved: