Core Server / SERVER-96874

Replication recovery can crash due to concurrent shutdown

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Tools and Replicator
    • 3

      In the linked AF ticket, a node running as a standalone was applying oplog entries as part of replication recovery. A SIGTERM triggered a concurrent shutdown, and the node then crashed because shutdown kills all operations, including the one on the main startup thread that was performing recovery.

      We should handle this case more gracefully, for example by exiting the recovery procedure early when shutdown is detected. Doing that correctly with respect to the startup sequence may need some further thought.
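      To make the early-exit idea concrete, here is a minimal sketch of a recovery loop that checks a shutdown flag between batches and returns a distinct result instead of being killed mid-apply. It is not taken from the server code: `recoverOplog`, `RecoveryResult`, the `shutdownRequested` flag, and the `applyEntry` callback are all hypothetical names used only for illustration, and how the startup sequence should react to the interrupted result is left open here, as it is in the ticket.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type; not a MongoDB API.
enum class RecoveryResult { kCompleted, kInterruptedByShutdown };

// Applies entries in batches. The shutdown flag is checked only at batch
// boundaries, so a batch that has started is always applied in full.
// `shutdownRequested` stands in for whatever shutdown state the server
// exposes; `applyEntry` stands in for oplog application.
RecoveryResult recoverOplog(const std::vector<int>& entries,
                            std::size_t batchSize,
                            const std::atomic<bool>& shutdownRequested,
                            const std::function<void(int)>& applyEntry) {
    if (batchSize == 0) {
        batchSize = 1;  // avoid an infinite loop on a zero batch size
    }
    for (std::size_t start = 0; start < entries.size(); start += batchSize) {
        if (shutdownRequested.load()) {
            // Stop cleanly; the caller decides how to unwind startup.
            return RecoveryResult::kInterruptedByShutdown;
        }
        const std::size_t end = std::min(start + batchSize, entries.size());
        for (std::size_t i = start; i < end; ++i) {
            applyEntry(entries[i]);
        }
    }
    return RecoveryResult::kCompleted;
}
```

      In this sketch the caller would treat `kInterruptedByShutdown` as a signal to skip the remaining startup steps and proceed to an orderly shutdown, rather than as an error.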

            Assignee:
            mankawaldeep.singh@mongodb.com Mankawaldeep Singh
            Reporter:
            wenbin.zhu@mongodb.com Wenbin Zhu
            Votes:
            0
            Watchers:
            6

              Created:
              Updated: