In this blog, we are going to learn how to use EC2Rescue for Linux to fix unreachable Linux instances. With this method, we can rescue even EC2 instances that are not associated with AWS Systems Manager.
Explore more about AWS Systems Manager with this blog: Step by Step Guide: Centralized Multi-Account OS Patching using AWS Systems Manager.
In this method, AWS launches a CloudFormation stack behind the scenes. The stack creates a new VPC and launches an EC2Rescue instance to rescue our unreachable instance. Once the rescue instance is available, the automation stops the unreachable instance and creates a backup of it. When the backup is finished, it detaches the root volume from the unreachable instance and attaches it to the rescue instance. It then locates the rescue device, mounts the rescue volume, and performs the following file operations:
'/mnt/mount/etc/resolv.conf' -> '/mnt/mount/etc/resolv.conf.back'
'/etc/resolv.conf' -> '/mnt/mount/etc/resolv.conf'
'/mnt/mount/usr/bin/ec2rl' -> '/usr/local/ec2rl-1.1.5/ec2rl'
It then starts a chroot and runs EC2Rescue for Linux. Afterwards, it stops the rescue instance and detaches the root volume from it. Once that is done, it reattaches the root volume to the original instance and restores the instance to its initial state. Finally, CloudFormation deletes the stack that was created for the rescue operation.
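
The three file operations listed above can be sketched in Python. This is only an illustration, not the code the automation actually runs. It assumes the first two lines mean "back up the volume's resolv.conf, then copy in the rescue host's copy", and that the third line is a symbolic link whose target is resolved inside the chroot. The function name `prepare_rescue_root` and its parameters are hypothetical.

```python
import os
import shutil
import tempfile


def prepare_rescue_root(mount_root, host_resolv_conf, ec2rl_target):
    """Mirror the three listed file operations on a mounted root volume.

    Hypothetical helper: an interpretation of the listed operations,
    not the automation's real implementation.
    """
    etc_dir = os.path.join(mount_root, "etc")
    os.makedirs(etc_dir, exist_ok=True)
    resolv = os.path.join(etc_dir, "resolv.conf")
    backup = resolv + ".back"

    # 1. Back up the rescued volume's resolv.conf, if it has one.
    if os.path.lexists(resolv):
        shutil.move(resolv, backup)

    # 2. Copy in the rescue host's resolv.conf so DNS works inside the chroot.
    shutil.copy2(host_resolv_conf, resolv)

    # 3. Link usr/bin/ec2rl to the EC2Rescue for Linux install path.
    #    The target is an absolute path as seen from inside the chroot.
    link = os.path.join(mount_root, "usr", "bin", "ec2rl")
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(ec2rl_target, link)
    return backup, resolv, link


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real /mnt/mount.
    with tempfile.TemporaryDirectory() as tmp:
        mount = os.path.join(tmp, "mnt", "mount")
        os.makedirs(os.path.join(mount, "etc"))
        with open(os.path.join(mount, "etc", "resolv.conf"), "w") as f:
            f.write("nameserver 10.0.0.2\n")
        host_conf = os.path.join(tmp, "resolv.conf")
        with open(host_conf, "w") as f:
            f.write("nameserver 172.31.0.2\n")
        for path in prepare_rescue_root(
            mount, host_conf, "/usr/local/ec2rl-1.1.5/ec2rl"
        ):
            print(path)
```

The demo deliberately works on a temporary directory, so it can be run safely on any Linux machine without touching a real mounted volume.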
- For demonstration, I have changed the /home directory permissions on my EC2 instance to 777. (As a best practice, create an AMI of the unreachable instance before running this Automation.)
- When I try to log in to my instance again, SSH fails with a “Permission denied” error.
- Here, we will use the “AWSSupport-ExecuteEC2Rescue” Automation Document to fix this issue:
- Go to AWS Console and Open the Systems Manager
- From the left menu pane, choose Automation and then select Execute Automation
- Select “Self service support workflows” from the Automation section
- Then choose “AWSSupport-ExecuteEC2Rescue” and click “Next”
- Next, collect the Instance ID of our Unreachable Instance and provide it in the parameter section
- Click the Execute button to start Automation
- Once you click the “Execute” button, the Automation will start, and you can see the Status “In Progress”:
- We can expand the executed steps to see more details. (Linux instances will have a “Failed” status for the first step every time.)
- The Automation executes several steps to recover our unreachable instance. Initially, it creates the rescue instance, and then it stops our unreachable instance
- Next, it detaches the root volume from the unreachable instance and attaches it to the rescue instance
- Once the volume is attached, the procedure runs EC2Rescue for Linux on the rescue instance to fix the issue.
- Monitor the overall status of the procedure using the Execution Status tab under Automation Executions, and wait for it to reach “Success”, which marks it as complete
- Now we can try to connect to our original instance.
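
The console steps above can also be scripted. The sketch below assumes an SSM client such as the one returned by boto3's `boto3.client("ssm")` and uses its `start_automation_execution` and `get_automation_execution` calls with the runbook's `UnreachableInstanceId` parameter. The helper names, the polling interval, and the instance ID in the usage comment are placeholders chosen for illustration.

```python
import time

DOCUMENT_NAME = "AWSSupport-ExecuteEC2Rescue"
# Statuses after which this sketch stops polling.
TERMINAL_STATUSES = {"Success", "Failed", "TimedOut", "Cancelled"}


def start_rescue(ssm, instance_id):
    """Start the runbook for one unreachable instance; return the execution ID."""
    response = ssm.start_automation_execution(
        DocumentName=DOCUMENT_NAME,
        Parameters={"UnreachableInstanceId": [instance_id]},
    )
    return response["AutomationExecutionId"]


def wait_for_rescue(ssm, execution_id, poll_seconds=30, max_polls=120):
    """Poll until the execution reaches a terminal status; return its details."""
    for _ in range(max_polls):
        execution = ssm.get_automation_execution(
            AutomationExecutionId=execution_id
        )["AutomationExecution"]
        if execution["AutomationExecutionStatus"] in TERMINAL_STATUSES:
            return execution
        time.sleep(poll_seconds)
    raise TimeoutError(f"Execution {execution_id} did not finish in time")


def summarize_steps(execution):
    """Return (step name, step status) pairs, like the expanded console view."""
    return [
        (step["StepName"], step["StepStatus"])
        for step in execution.get("StepExecutions", [])
    ]


# Usage (requires boto3 and AWS credentials; the instance ID is a placeholder):
#   import boto3
#   ssm = boto3.client("ssm")
#   execution_id = start_rescue(ssm, "i-0123456789abcdef0")
#   result = wait_for_rescue(ssm, execution_id)
#   print(result["AutomationExecutionStatus"], summarize_steps(result))
```

Passing the client in as an argument keeps the helpers easy to try with a stub object before pointing them at a real account. A single failed step in the summary does not by itself mean the rescue failed, since the first step is expected to show “Failed” for Linux instances.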
With this method, you can log in to your EC2 instance successfully again. AWSSupport-ExecuteEC2Rescue is a new method that automates every step required to fix common issues on an unreachable Linux instance by using EC2Rescue for Linux.
4. About CloudThat:
CloudThat is the authorized AWS Well-Architected Partner, helping other businesses build secure, high-performing, resilient, and efficient infrastructures for their applications and workloads.
CloudThat is also the official AWS (Amazon Web Services) Advanced Consulting Partner and Training Partner and a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.