  1. Core Server
  2. SERVER-35724

Remote EC2 hosts which are not accessible via ssh should fail with system error

    • Fully Compatible
    • v4.0, v3.6
    • TIG 2018-07-02
    • 31

      When the powercycle test "crashes" a remote EC2 instance, the instance sometimes fails to become available via ssh afterward, even though AWS still reports its status as "running". The work in SERVER-34996 is intended to help analyze why this may occur.

      To distinguish between a test failure (possible data corruption) and an environment failure, the following should be done:

      • powertest.py should exit in a way that indicates it failed due to ssh
      • If the exit is due to ssh, then a system failure should be triggered (which will show the task as purple)
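
      One possible shape for the two points above is a dedicated exit code that the caller maps to a failure type. This is only a sketch: the ticket does not say how powertest.py should signal an ssh failure, so the exit code value, the SshUnreachableError exception, and both function names are hypothetical.

```python
from typing import Optional

# Hypothetical value: the ticket does not specify an exit code for ssh failures.
EXIT_SSH_FAILURE = 66


class SshUnreachableError(Exception):
    """Hypothetical error: the remote EC2 host could not be reached via ssh."""


def exit_code_for(error: Optional[Exception]) -> int:
    """Pick the process exit code for the outcome of a powercycle run."""
    if error is None:
        return 0
    if isinstance(error, SshUnreachableError):
        # Environment problem, not evidence of data corruption.
        return EXIT_SSH_FAILURE
    return 1


def classify_exit(exit_code: int) -> str:
    """Map an exit code to 'success', 'system' (environment), or 'test'."""
    if exit_code == 0:
        return "success"
    if exit_code == EXIT_SSH_FAILURE:
        return "system"
    return "test"
```

      A caller could run powertest.py, pass its return code to classify_exit, and trigger the system failure only when the result is "system".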

      In order to help find out why a particular EC2 instance is failing to permit ssh we should also do the following:

      • Termination of the EC2 instance should not be attempted if a system failure occurred due to an ssh issue from powercycle (for non-Windows variants)
      • Increase the expire_hours to 24 (for non-Windows variants)
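
      The cleanup rules above could be expressed as a small decision function. This is a sketch under assumptions: CleanupPlan, plan_cleanup, and the default_expire_hours parameter are hypothetical names, and the Windows branch (terminate as before, keep the existing expiry) is assumed because the ticket only describes non-Windows variants.

```python
from dataclasses import dataclass

# From the ticket: expire_hours for non-Windows variants.
NON_WINDOWS_EXPIRE_HOURS = 24


@dataclass(frozen=True)
class CleanupPlan:
    """Hypothetical container for the post-task cleanup decision."""
    terminate: bool
    expire_hours: int


def plan_cleanup(ssh_system_failure: bool, is_windows: bool,
                 default_expire_hours: int) -> CleanupPlan:
    """Decide whether to terminate the EC2 instance and how long it may live."""
    if is_windows:
        # Assumption: Windows variants keep their existing behavior.
        return CleanupPlan(terminate=True, expire_hours=default_expire_hours)
    # Non-Windows: leave the instance up for investigation after an ssh
    # system failure, relying on the 24-hour expiry to reclaim it.
    return CleanupPlan(terminate=not ssh_system_failure,
                       expire_hours=NON_WINDOWS_EXPIRE_HOURS)
```

      Leaving the instance running is only safe because the expiry still bounds its lifetime.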

            jonathan.abrahams Jonathan Abrahams