WiredTiger / WT-5479

Checkpoint schema lock can cause long stalls in command execution on systems with many active data handles

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: WT3.1.0, WT3.2.1
    • Fix Version/s: Backlog
    • Component/s: Checkpoints
    • Labels:
      None

      Description

I'm creating this issue, related to WT-1598, mostly for documentation purposes: it took us weeks to figure out what was happening, and we could not find good information online before debugging the WT code ourselves. The only other mention of this problem we found was a StackExchange post, which also has no solution to date.

Our application is not particularly heavy in either document or collection size, so we have not yet seen the need for horizontal scaling. It does, however, use a large number of collections and indexes.

      "files currently open": 126583
      "connection data handles currently active": 209848

      Our problems began after upgrading to MongoDB 3.6 (WT3.1) from MongoDB 3.4 (WT2.9). One obvious difference we noticed is that the number of data handles was previously similar to the number of open files, but now seems to be about twice that, as can be seen above. A possible reason may be the introduction of client sessions, but this is merely conjecture. The checkpoint code mentioned below was also changed significantly between WT2 and WT3.

What we started noticing were occasional, extremely slow queries that hung solely on schema lock acquisition and were not reproducible.

      planSummary: IXSCAN { ... }
      keysExamined:0
      ..
      nreturned:0
      ...
      storage: { timeWaitingMicros: { schemaLock: 89210873 } }
      protocol:op_msg 89214ms

Eventually, we traced the issue back to the WiredTigerCheckpointThread, which acquires the schema lock on the following line of txn_ckpt.c:

      WT_WITH_SCHEMA_LOCK(session, ret = __checkpoint_prepare(session, &tracking, cfg));

      and then subsequently spends a large amount of time iterating over all active data handles as part of the following line of the same file:

      __checkpoint_apply_all(session, cfg, __wt_checkpoint_get_handles))

Commands are only occasionally affected by this, namely when they require a new data handle (e.g., an index which has not yet been used or is open in the wrong mode). In that case they run into the following code block of session_dhandle.c:

      /*
       * For now, we need the schema lock and handle list locks to
       * open a file for real.
       *
       * Code needing exclusive access (such as drop or verify)
       * assumes that it can close all open handles, then open an
       * exclusive handle on the active tree and no other threads can
       * reopen handles in the meantime.  A combination of the schema
       * and handle list locks are used to enforce this.
       */
      if (!F_ISSET(session, WT_SESSION_LOCKED_SCHEMA)) {
      	dhandle->excl_session = NULL;
      	dhandle->excl_ref = 0;
      	F_CLR(dhandle, WT_DHANDLE_EXCLUSIVE);
      	__wt_writeunlock(session, &dhandle->rwlock);
       
      	WT_WITH_SCHEMA_LOCK(
      	  session, ret = __wt_session_get_dhandle(session, uri, checkpoint, cfg, flags));
       
      	return (ret);
      }

I am not sure whether improvements to this problem can be made without all of WT-1598. At the very least, this current limitation of the WT storage engine could be documented, advising application developers against excessive use of collections and indexes.

      Any recommendations for tweaks to our setup are also welcome. Thank you!

        Attachments

        1. 4.2.9_vs_4.2.10.png
          4.2.9_vs_4.2.10.png
          143 kB
        2. bench-schema-locks.js
          6 kB
        3. image (2).png
          image (2).png
          392 kB
        4. image (3).png
          image (3).png
          195 kB
        5. image (4).png
          image (4).png
          87 kB
        6. image-2020-12-14-11-20-21-140.png
          image-2020-12-14-11-20-21-140.png
          138 kB
        7. image-2020-12-14-16-21-43-885.png
          image-2020-12-14-16-21-43-885.png
          77 kB
        8. MongoDB-v-Percona.png
          MongoDB-v-Percona.png
          148 kB
        9. mongo-upgrade-4.2-4.4.png
          mongo-upgrade-4.2-4.4.png
          46 kB
        10. perf-kernel.svg
          2.42 MB
        11. screenshot1.png
          screenshot1.png
          80 kB
        12. screenshot-1.png
          screenshot-1.png
          114 kB
        13. screenshot2.png
          screenshot2.png
          72 kB
        14. screenshot-2.png
          screenshot-2.png
          38 kB
        15. Screen Shot 2020-03-02 at 2.50.55 PM.png
          Screen Shot 2020-03-02 at 2.50.55 PM.png
          97 kB
        16. screenshot3.png
          screenshot3.png
          13 kB
        17. screenshot-3.png
          screenshot-3.png
          88 kB
        18. screenshot4.png
          screenshot4.png
          24 kB
        19. screenshot-4.png
          screenshot-4.png
          222 kB
        20. WT_open_files_schema_lock.png
          WT_open_files_schema_lock.png
          33 kB
        21. wt-5479.png
          wt-5479.png
          192 kB
        22. wt-5479-2.png
          wt-5479-2.png
          98 kB


              People

              Assignee:
              brian.lane Brian Lane
              Reporter:
              ralf.strobel Ralf Strobel
Votes:
24
Watchers:
49
