  1. WiredTiger
  2. WT-14116

Investigate whether the Windows locking mechanism is strict enough

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines
    • StorEng - Defined Pipeline

      While working on PM-3810, we created a unit test that exposed a locking mechanism whose correctness needs to be investigated. The unit test is as follows:

          SECTION("Test CONFLICT_CHECKPOINT_LOCK")
          {
              /* Attempt to drop the table while the checkpoint lock is taken by another session. */
              WT_WITH_CHECKPOINT_LOCK(((WT_SESSION_IMPL *)session_b),
                REQUIRE(session_a->drop(session_a, URI, "lock_wait=0") == EBUSY));
      
              utils::check_error_info(
                err_info_a, EBUSY, WT_CONFLICT_CHECKPOINT_LOCK, CONFLICT_CHECKPOINT_LOCK_MSG);
              utils::check_error_info(err_info_b, 0, WT_NONE, WT_ERROR_INFO_EMPTY);
          }
      
      

      Explanation
      Consider a scenario where a single thread opens two sessions, session_A and session_B. If session_A holds the checkpoint lock and session_B then tries to take it, the attempt should return EBUSY because they are distinct sessions. The problem is that on Windows the call does not return EBUSY and instead proceeds into the function.

      Further investigation by interns showed that the Windows implementation uses the TryEnterCriticalSection function for its locking. Its documentation is here:
      https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-tryentercriticalsection

      According to that documentation, it is the calling thread (not the session) that takes ownership of the lock:

      Attempts to enter a critical section without blocking. If the call is successful, the calling thread takes ownership of the critical section.
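
      The difference between thread ownership and session ownership can be sketched portably as follows. This is only an illustration, not WiredTiger code: ThreadOwnedLock and SessionOwnedLock are hypothetical names, std::recursive_mutex is used as a stand-in for a critical section because it is likewise owned by a thread and can be re-acquired by its owner, and the integer session ids are an assumption made for the example.

```cpp
#include <cerrno>
#include <mutex>

// Hypothetical stand-in for a thread-owned, re-entrant lock.
// std::recursive_mutex::try_lock succeeds again for the thread that
// already owns it, which mirrors the behaviour described in this ticket.
struct ThreadOwnedLock {
    std::recursive_mutex m;

    int try_lock() { return m.try_lock() ? 0 : EBUSY; }
    void unlock() { m.unlock(); }
};

constexpr int kNoOwner = -1;

// Hypothetical session-aware lock: ownership is recorded per session id,
// so a second session is refused even when it runs on the same thread.
class SessionOwnedLock {
public:
    int try_lock(int session_id) {
        std::lock_guard<std::mutex> guard(state_);
        if (owner_ != kNoOwner)
            return EBUSY; // Held by some session; never re-entrant here.
        owner_ = session_id;
        return 0;
    }

    void unlock(int session_id) {
        std::lock_guard<std::mutex> guard(state_);
        if (owner_ == session_id)
            owner_ = kNoOwner; // Only the owning session may release.
    }

private:
    std::mutex state_; // Protects owner_ for the duration of each call.
    int owner_ = kNoOwner;
};
```

      With the thread-owned lock, two try_lock calls made from one thread both succeed, which matches what the unit test observes on Windows. With the session-aware sketch, the second session gets EBUSY regardless of which thread it runs on. Whether a check like this is the right fix is exactly what this ticket is meant to investigate.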
      

      This problem extends to the schema lock as well. The unit test created for this is in the test_sub_level_error_drop_conflict.cpp file.

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            jie.chen@mongodb.com Jie Chen