  1. Core Server
  2. SERVER-71520

Dump all thread stacks on RSTL acquisition timeout

    • Replication
    • Fully Compatible
    • v7.1, v7.0, v6.0, v5.0, v4.4
    • Repl 2023-03-06, Repl 2023-03-20, Repl 2023-05-01, Repl 2023-05-15, Repl 2023-05-29, Repl 2023-06-12, Repl 2023-06-26, Repl 2023-07-24, Repl 2023-08-07, Repl 2023-08-21, Repl 2023-09-04, Repl 2023-09-18

      SERVER-56756 added an fassert that crashes the server when it times out acquiring the RSTL (replication state transition lock) during stepUp/stepDown. We currently dump all locks before the fassert, but the lock manager dump is sometimes insufficient for diagnosing the underlying issue. Most of the time, a core dump is needed to understand which operations are currently running and what they are doing. Ideally, we would just dump the stack traces (printAllThreadStacks), but that is not always feasible, especially on production builds. An alternative is to selectively dump currentOp (and perhaps the session catalog as well).

      SERVER-71521 is an improvement to currentOp that may help with this.

      Update: per the conversation on this ticket, we decided to dump all thread stacks.
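
      For illustration, the behavior described above (try to take the lock with a timeout, dump diagnostics on failure, then hit the fatal assertion) could be sketched roughly as follows. This is a minimal, self-contained sketch and not the server's code: acquireOrDumpAndFail, simulateTimeout, the callable-based lock, and the 100 ms timeout are all hypothetical names and values chosen for the example. The lock acquisition is passed in as a callable so the sketch does not depend on the real lock manager.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <vector>

using Milliseconds = std::chrono::milliseconds;

// Hypothetical helper: tries to acquire a lock within `timeout`.
// On timeout it runs every diagnostic dump in order and then calls `fatal`,
// which stands in for the fassert. Returns true if the lock was acquired.
bool acquireOrDumpAndFail(const std::function<bool(Milliseconds)>& tryAcquire,
                          Milliseconds timeout,
                          const std::vector<std::function<void()>>& diagnostics,
                          const std::function<void()>& fatal) {
    assert(tryAcquire && fatal);  // both callables must be set
    if (tryAcquire(timeout)) {
        return true;
    }
    // Dump diagnostics first: once the fatal handler runs, the process is gone.
    for (const auto& dump : diagnostics) {
        dump();
    }
    fatal();
    return false;
}

// Simulates a timed-out acquisition and records the order of events.
std::vector<std::string> simulateTimeout() {
    std::vector<std::string> events;
    // Assumption for the demo: the lock is never granted.
    auto neverAcquires = [](Milliseconds) { return false; };
    std::vector<std::function<void()>> diagnostics = {
        [&events] { events.push_back("dump lock manager"); },
        [&events] { events.push_back("dump all thread stacks"); },
    };
    acquireOrDumpAndFail(neverAcquires,
                         Milliseconds(100),  // arbitrary value for the demo
                         diagnostics,
                         [&events] { events.push_back("fatal assertion"); });
    return events;
}
```

      Keeping the diagnostics as an ordered list makes it easy to add the thread stack dump after the existing lock dump without touching the timeout path itself.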

            frederic.vitzikam@mongodb.com Frederic Vitzikam
            lingzhi.deng@mongodb.com Lingzhi Deng