Core Server / SERVER-26578

Add startup warning for Intel CPUs which might have TSX bugs


      Certain versions of the Intel CPU microcode have TSX bugs that can lead to unexplained concurrency issues. The server should emit a startup warning, or if possible even refuse to start, when it detects an affected CPU.

      More information on this was provided by user xiaost as part of SERVER-26018:

      • can only be reproduced on servers with the new CPU (E5-2630 v4)
      • can be easily reproduced by modifying the unit tests
      • can only be reproduced under a particular code execution sequence
      • works well if we add some debug code to the lock context

      After debugging, we started to focus on hardware issues, including memory and CPU.

      With the help of Google, we found that the TSX feature, which speeds up execution of multi-threaded software through lock elision, seems to have been the root of all evil since 2014:
      [1] [2] [3]

      In August 2014, Intel announced a bug in the TSX implementation on current steppings of Haswell, Haswell-E, Haswell-EP and early Broadwell CPUs, which resulted in disabling the TSX feature on affected CPUs via a microcode update.

      We checked our microcode changelog. In the latest release:

      + Likely fixes a recently identified, critical but low-hitting TSX erratum on Broadwell, Broadwell-E and related Xeons (Broadwell-DE/WS/EP: Xeon-D 1500, E3-v4 and E5-v4)
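
To make the proposed startup check concrete, here is a minimal sketch of how the detection could work on Linux. It is not the server's actual implementation: the function names `parse_first_cpu` and `tsx_startup_warning` are hypothetical, and it assumes the text comes from `/proc/cpuinfo`, where TSX shows up as the `rtm` and `hle` flags. It only reports that TSX is enabled together with the microcode revision; deciding which revisions are actually affected would need Intel's errata data, which is not encoded here.

```python
# Linux exposes TSX as the "rtm" and "hle" CPU flags in /proc/cpuinfo.
TSX_FLAGS = {"rtm", "hle"}


def parse_first_cpu(cpuinfo_text):
    """Return the key/value fields of the first processor entry."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def tsx_startup_warning(cpuinfo_text):
    """Return a warning string if an Intel CPU advertises TSX, else None."""
    cpu = parse_first_cpu(cpuinfo_text)
    if cpu.get("vendor_id") != "GenuineIntel":
        return None
    enabled = sorted(TSX_FLAGS & set(cpu.get("flags", "").split()))
    if not enabled:
        return None  # TSX absent or disabled, e.g. by a microcode update
    return (
        "WARNING: Intel CPU '%s' has TSX enabled (%s), microcode %s; "
        "some microcode versions have TSX errata that can cause "
        "concurrency issues"
        % (
            cpu.get("model name", "unknown"),
            ", ".join(enabled),
            cpu.get("microcode", "unknown"),
        )
    )
```

At startup, the contents of `/proc/cpuinfo` would be read once and passed to `tsx_startup_warning`; a non-`None` result would be logged with the other startup warnings. Keeping the parser a pure function of the text makes it easy to unit test against captured `cpuinfo` samples from affected machines.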

            Assignee: backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter: kaloian.manassiev@mongodb.com Kaloian Manassiev