How to deal with EC2 machines that are unreachable by session manager or SSH

rolandHawkins · October 2, 2022, 3:36pm

What’s the recommended way of dealing with issues on EC2 machines that are unreachable by Session Manager or SSH? I’ve repeatedly had this problem when developing new Terraform modules and always had to guess at the cause. That can of course take ages, since I can’t easily access the logs. The only thing I can think of is mounting EFS on /var/log, then starting a different service that also mounts it, but that would take a lot of potentially unnecessary scaffolding.

pOdom · October 2, 2022, 4:56pm

There is the EC2 Serial Console.

pOdom · October 2, 2022, 6:34pm

https://aws.amazon.com/blogs/aws/troubleshoot-boot-and-networking-issues-with-new-ec2-serial-console/

rolandHawkins · October 2, 2022, 7:55pm

Wow, I’m an idiot. I must have completely misunderstood something when I was starting out, because I was under the impression that this is not applicable to “normal” EC2 instances. Don’t know what the reason for that was, but yeah, three clicks… thanks!

And I found out that the reason for my problem was that normal instances attach EBS drives as /dev/sdX, while ASGs attach them as sdX … yay. Don’t know if I’d ever have figured that out.

pOdom · October 2, 2022, 8:06pm

glad you found it

krista · October 2, 2022, 9:36pm

This shouldn’t be a problem for a root volume unless you’re doing something weird with your Terraform. An ASG will auto-attach a root volume from the Launch Template/Config (AMI) if you simply don’t tell it to do anything. If you’re using second/third volumes, they need to attach to different device (/dev) paths based on instance type (some are not /dev/sdX). You may get more help with your Terraform in the channel; if you’re having EC2-specific problems and are sure the issue is not with your Terraform, you may get more help with EC2 in the channel.

rolandHawkins · October 2, 2022, 11:36pm

It’s a solved issue - mentioned here:

> The following example gets the volume ID and NVMe device name for a volume that was attached during instance launch. Note that the NVMe device name does not include the /dev/ prefix. The device name is available through the NVMe controller vendor-specific extension (bytes 384:4095 of the controller identification):
> The following example gets the volume ID and NVMe device name for a volume that was attached after instance launch. Note that the NVMe device name includes the /dev/ prefix.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nvme-ebs-volumes.html

Nothing to do with Terraform.

rolandHawkins · October 3, 2022, 12:41am

And my userdata script was looking at the NVMe data in order to identify devices and was expecting them with /dev/ as that’s how normal instances handle it.
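
For anyone hitting the same thing, a minimal sketch of how a userdata helper could tolerate both forms of the device name. This is not the script from this thread: the function names `normalize_device_name` and `same_device` are hypothetical, and the stripping of padding spaces and NUL bytes is an assumption about how the raw name might arrive from the NVMe controller data.

```python
def normalize_device_name(name: str) -> str:
    """Return the device name with a leading /dev/ prefix.

    Accepts both "sdf" (no prefix) and "/dev/sdf" (with prefix).
    Surrounding whitespace and NUL padding are removed first
    (assumption: the raw field may be padded).
    """
    cleaned = name.strip(" \t\r\n\x00")
    if cleaned.startswith("/dev/"):
        return cleaned
    return "/dev/" + cleaned


def same_device(reported: str, expected: str) -> bool:
    """Compare two device names regardless of the /dev/ prefix."""
    return normalize_device_name(reported) == normalize_device_name(expected)


if __name__ == "__main__":
    # Both forms resolve to the same canonical name.
    print(normalize_device_name("sdf"))
    print(normalize_device_name("/dev/sdf"))
    print(same_device("sdf", "/dev/sdf"))
```

Normalizing on read, rather than assuming one form, means the same script should behave the same whether the volume was attached during launch or afterwards.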