Investigate TSAN instrumentation against WT codebase


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Engines, Storage Engines - Foundations
    • SE Foundations - 2025-08-15
    • 8

      While working on WT-13074, we found a case where TSAN does not respect WiredTiger's acquire/release semantics:

      Thread 1:
      static int
      __log_newfile(WT_SESSION_IMPL *session, bool conn_open, bool *created)
      {
          for (yield_cnt = 0; log->log_close_fh != NULL;) {} // ACQUIRE 2
      
          WT_ASSIGN_LSN(&log->log_close_lsn, &log->alloc_lsn);
      
          WT_RELEASE_WRITE_WITH_BARRIER(log->log_close_fh, log->log_fh); // RELEASE 1
      }
      
      
      Thread 2: 
      __log_file_server(void *arg) {
          while (true) {
              WT_ACQUIRE_READ_WITH_BARRIER(close_fh, log->log_close_fh); // ACQUIRE 1
              if (close_fh != NULL) {
                  WT_ASSIGN_LSN(&close_end_lsn, &log->log_close_lsn);
                  WT_FULL_BARRIER();
                  log->log_close_fh = NULL; // RELEASE 2
              }
          }
      }
      

      In theory, TSAN should recognize the acquire and release barriers and report no warnings. After testing with ivan.kochin@mongodb.com, we found that using GCC intrinsics instead worked. This calls into question how valid TSAN results are for our codebase. This ticket should research the limitations of TSAN and how we can work within them. Here are some ideas:

      • TSAN provides annotations that users can add to their code base.
      • Replace all WiredTiger barriers with GCC intrinsics or the C11 memory model.
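As a rough illustration of the second idea, here is a minimal sketch of the same two-thread handoff written with GCC `__atomic` builtins instead of barrier macros. Everything in it is hypothetical: `demo_log`, `demo_producer`, `demo_consumer`, `demo_run`, and `DEMO_ROUNDS` are stand-ins invented for this example and are not WiredTiger code. It only shows the shape of the acquire/release pairing marked in the excerpt above.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the log fields used in the ticket excerpt. */
struct demo_log {
    void *close_fh;     /* handoff slot, plays the role of log->log_close_fh */
    unsigned close_lsn; /* payload, plays the role of log->log_close_lsn */
};

#define DEMO_ROUNDS 1000u

static struct demo_log demo_log;
static int demo_fh;            /* dummy object standing in for a file handle */
static unsigned long demo_sum; /* written by the consumer, read after join */

/* Plays the role of thread 1: publish a payload, then hand off the handle. */
static void *
demo_producer(void *arg)
{
    unsigned i;

    (void)arg;
    for (i = 1; i <= DEMO_ROUNDS; i++) {
        /* ACQUIRE 2: wait until the consumer has cleared the slot. */
        while (__atomic_load_n(&demo_log.close_fh, __ATOMIC_ACQUIRE) != NULL)
            sched_yield();
        demo_log.close_lsn = i; /* plain write, ordered by the release store */
        /* RELEASE 1: publish the handle together with the payload. */
        __atomic_store_n(&demo_log.close_fh, (void *)&demo_fh, __ATOMIC_RELEASE);
    }
    return (NULL);
}

/* Plays the role of thread 2: consume the payload, then clear the slot. */
static void *
demo_consumer(void *arg)
{
    unsigned seen = 0;
    void *fh;

    (void)arg;
    while (seen < DEMO_ROUNDS) {
        /* ACQUIRE 1: pairs with RELEASE 1 in the producer. */
        fh = __atomic_load_n(&demo_log.close_fh, __ATOMIC_ACQUIRE);
        if (fh == NULL) {
            sched_yield();
            continue;
        }
        demo_sum += demo_log.close_lsn; /* plain read, ordered by the acquire */
        seen++;
        /* RELEASE 2: pairs with ACQUIRE 2 in the producer. */
        __atomic_store_n(&demo_log.close_fh, (void *)NULL, __ATOMIC_RELEASE);
    }
    return (NULL);
}

/* Run one producer and one consumer; return the sum of consumed payloads. */
unsigned long
demo_run(void)
{
    pthread_t producer, consumer;
    int rc;

    demo_sum = 0;
    demo_log.close_fh = NULL;
    rc = pthread_create(&consumer, NULL, demo_consumer, NULL);
    assert(rc == 0);
    rc = pthread_create(&producer, NULL, demo_producer, NULL);
    assert(rc == 0);
    rc = pthread_join(producer, NULL);
    assert(rc == 0);
    rc = pthread_join(consumer, NULL);
    assert(rc == 0);
    (void)rc;
    return (demo_sum);
}
```

Calling `demo_run()` from a small `main` and building with `-fsanitize=thread -pthread` is one way to check how TSAN treats this form. The idea behind the sketch is that the ordering sits on the atomic load and store themselves rather than in a separate barrier, which is the kind of operation TSAN instruments. Whether that fully explains the WT-13074 result is part of what this ticket needs to confirm.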

              Assignee:
              [DO NOT USE] Backlog - Storage Engines Team
              Reporter:
              Jie Chen