  1. Documentation
  2. DOCS-13014

Investigate changes in SERVER-40954: On v4.0 if FCV is set to 3.6 rollback fails with "No stable timestamp available to recover to" after a restart





      Downstream Change Summary

      In case this is not documented: if you recently upgraded a replica set member to the 4.0 binary (with FCV still set to 3.6) and that member needs to roll back, it may be unable to complete the rollback. In this case, you must downgrade the binary version to 3.6 to let the rollback finish, after which you may upgrade again.

      Description of Linked Ticket

      In a replica set with all nodes on the v4.0 binary version and in FCV=3.6, a clean shutdown will cause a node to set its recovery timestamp to 0. If this happens on a node whose oplog has diverged (i.e. one that needs to enter rollback), the node won't be able to complete the rollback, because it has no stable timestamp to roll back to, and recover-to-timestamp requires one. Furthermore, in order to take a new stable checkpoint, it would have to commit a new majority write, which it shouldn't be able to do until it completes the rollback. It also shouldn't be able to upgrade to FCV=4.0 until it can complete the rollback and replicate new log entries from the primary. If FCV=3.6 and we encounter this situation, falling back on the rollbackViaRefetch algorithm may be the appropriate solution. Another alternative may be to always use rollbackViaRefetch whenever FCV=3.6.
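
      The two proposals in the description can be sketched as a small decision function. This is only an illustration of the reasoning, not the server's implementation: the names choose_rollback_method, RollbackMethod, and the always_refetch_on_36 flag are hypothetical, and a recovery timestamp of 0 is assumed to mean "no stable timestamp".

```python
from enum import Enum
from typing import Optional


class RollbackMethod(Enum):
    RECOVER_TO_TIMESTAMP = "recover-to-timestamp"
    ROLLBACK_VIA_REFETCH = "rollbackViaRefetch"


def choose_rollback_method(
    fcv: str,
    stable_timestamp: Optional[int],
    always_refetch_on_36: bool = False,
) -> RollbackMethod:
    """Pick a rollback algorithm (hypothetical sketch, not server code).

    fcv: feature compatibility version, e.g. "3.6" or "4.0".
    stable_timestamp: assumed to be None or 0 when no stable timestamp exists.
    always_refetch_on_36: models the second proposal in the ticket.
    """
    # Second proposal: always use rollbackViaRefetch whenever FCV=3.6.
    if fcv == "3.6" and always_refetch_on_36:
        return RollbackMethod.ROLLBACK_VIA_REFETCH

    # Normal case: a stable timestamp exists, so recover to it.
    if stable_timestamp:
        return RollbackMethod.RECOVER_TO_TIMESTAMP

    # First proposal: no stable timestamp and FCV=3.6, so fall back.
    if fcv == "3.6":
        return RollbackMethod.ROLLBACK_VIA_REFETCH

    # Otherwise the rollback cannot proceed (the failure the ticket reports).
    raise RuntimeError("No stable timestamp available to recover to")
```

      Under the first proposal, choose_rollback_method("3.6", 0) selects rollbackViaRefetch instead of failing; under the second, passing always_refetch_on_36=True selects it regardless of the stable timestamp.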

      Scope of changes

      Impact to Other Docs

      MVP (Work and Date)

      Resources (Scope or Design Docs, Invision, etc.)


          Issue Links



              Unassigned
              backlog-server-pm Backlog - Core Eng Program Management Team
              Last commenter:
              Ravind Kumar (Inactive)

