Test and Troubleshoot inside AWS ECS Fargate
The ECS Exec functionality allows users to either run an interactive shell or a single command against a container.
This feature provides "break-glass" access to containers for debugging high-severity issues encountered in production. It is important to note that only tools and utilities already installed inside the container can be used when "exec-ing" into it. In other words, if the netstat or heapdump utilities are not installed in the container's base image, you won't be able to use them.
In such cases, you need to build a container image with the required utilities preinstalled.
% vim Dockerfile_Nginx
FROM nginx
RUN apt-get update -y
RUN apt-get install -y iputils-ping dnsutils telnet
% docker build -t skycone/nginx - < Dockerfile_Nginx
% docker image push skycone/nginx
The push refers to repository [docker.io/skycone/nginx]
08b363107c65: Pushed
91f05189b339: Pushed
b6812e8d56d6: Mounted from library/nginx
7046505147d7: Mounted from library/nginx
c876aa251c80: Mounted from library/nginx
f5ab86d69014: Mounted from library/nginx
4b7fffa0f0a4: Mounted from library/nginx
9c1b6dd6c1e6: Mounted from library/nginx
latest: digest: sha256:9e60f63bf5ac424e90c67e329f7a6af17b9e6d447cab1c4fc59125da56d61ed7 size: 1992
Client-side requirements
If you are using the AWS CLI to initiate the exec command, the only package you need to install is the SSM Session Manager plugin for the AWS CLI. This plugin needs to be installed on the host from which you will "exec" into a container running inside a task deployed on AWS Fargate.
Install and uninstall the Session Manager plugin on macOS
You can install the Session Manager plugin on macOS using the bundled installer.
To install the Session Manager plugin using the bundled installer (macOS)
Download the bundled installer.
% curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "sessionmanager-bundle.zip"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3499k  100 3499k    0     0   108k      0  0:00:32  0:00:32 --:--:--  87890
Unzip the package.
% unzip sessionmanager-bundle.zip
Archive:  sessionmanager-bundle.zip
   creating: sessionmanager-bundle/
  inflating: sessionmanager-bundle/install
  inflating: sessionmanager-bundle/THIRD-PARTY
  inflating: sessionmanager-bundle/seelog.xml.template
  inflating: sessionmanager-bundle/LICENSE
   creating: sessionmanager-bundle/bin/
  inflating: sessionmanager-bundle/bin/session-manager-plugin
  inflating: sessionmanager-bundle/NOTICE
  inflating: sessionmanager-bundle/README.md
  inflating: sessionmanager-bundle/RELEASENOTES.md
 extracting: sessionmanager-bundle/VERSION
% sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
Creating install directories: /usr/local/sessionmanagerplugin/bin
Creating Symlink from /usr/local/sessionmanagerplugin/bin/session-manager-plugin to /usr/local/bin/session-manager-plugin
Installation successful!
Verify the Session Manager plugin installation
Run the following command to verify that the Session Manager plugin installed successfully.
% session-manager-plugin
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.
Create VPC
resource "aws_vpc" "vpc" {
  cidr_block           = var.vpc_cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - VPC"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_subnet" "public_subnet_az1" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block1
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - Public Subnet 1"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Public"
  }
}

resource "aws_subnet" "public_subnet_az2" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block2
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - Public Subnet 2"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Public"
  }
}

resource "aws_subnet" "private_subnet_az1" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block_private1
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private subnet 1"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Private"
  }
}

resource "aws_subnet" "private_subnet_az2" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block_private2
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private subnet 2"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Private"
  }
}

resource "aws_internet_gateway" "internet_gateway" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - InternetGateway"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_nat_gateway" "natgw" {
  allocation_id = aws_eip.natgw.id
  subnet_id     = aws_subnet.public_subnet_az1.id
  tags = {
    Name = "NAT Gateway"
  }
  depends_on = [aws_internet_gateway.internet_gateway]
}

resource "aws_route_table" "public_route_table" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - RouteTable"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_route_table" "private_route_table" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private route table"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_route_table_association" "a1" {
  subnet_id      = aws_subnet.public_subnet_az1.id
  route_table_id = aws_route_table.public_route_table.id
}

resource "aws_route_table_association" "a2" {
  subnet_id      = aws_subnet.public_subnet_az2.id
  route_table_id = aws_route_table.public_route_table.id
}

resource "aws_route_table_association" "private_association1" {
  subnet_id      = aws_subnet.private_subnet_az1.id
  route_table_id = aws_route_table.private_route_table.id
}

resource "aws_route_table_association" "private_association2" {
  subnet_id      = aws_subnet.private_subnet_az2.id
  route_table_id = aws_route_table.private_route_table.id
}

resource "aws_route" "route_public_subnets_to_internet" {
  route_table_id         = aws_route_table.public_route_table.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.internet_gateway.id
}

resource "aws_route" "route_private_subnets_to_internet" {
  route_table_id         = aws_route_table.private_route_table.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.natgw.id
}

resource "aws_security_group" "ecs_sg" {
  name        = "ECS-allowed-ports"
  description = "ECS allowed ports"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ECS SecurityGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group" "alb_sg" {
  name        = "ELB-allowed-ports"
  description = "ELB allowed ports"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ALB SecurityGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group" "sg_system_manager" {
  name        = "SG-SystemManager"
  description = "Security group for EC2 instance access"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - security group for System Manager"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group_rule" "ecs_sg_rule_inbound1" {
  type              = "ingress"
  from_port         = var.ecs_port
  to_port           = var.ecs_port
  protocol          = "tcp"
  cidr_blocks       = [var.vpc_cidr_block]
  security_group_id = aws_security_group.ecs_sg.id
}

resource "aws_security_group_rule" "ecs_sg_rule_egress1" {
  type                     = "egress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.sg_system_manager.id
  security_group_id        = aws_security_group.ecs_sg.id
}

resource "aws_security_group_rule" "ecs_sg_rule_egress2" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.ecs_sg.id
  description       = "Docker Hub"
}

resource "aws_security_group_rule" "alb_sg_rule_ingress1" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = [var.source_cidr]
  security_group_id = aws_security_group.alb_sg.id
}

resource "aws_security_group_rule" "alb_sg_rule_egress1" {
  type              = "egress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = [var.vpc_cidr_block]
  security_group_id = aws_security_group.alb_sg.id
}

resource "aws_security_group_rule" "sg_rule_sg_system_manager_ingress1" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.ecs_sg.id
  security_group_id        = aws_security_group.sg_system_manager.id
}

resource "aws_security_group_rule" "sg_rule_sg_system_manager_egress1" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.sg_system_manager.id
}

resource "aws_vpc_endpoint" "ssmmessages" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ssmmessages"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}

resource "aws_vpc_endpoint" "ssm" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ssm"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}

resource "aws_vpc_endpoint" "ec2messages" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ec2messages"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}
Create S3
resource "aws_s3_bucket" "ecs_exec_s3_bucket" {
  bucket        = "sc-ecs-exec"
  force_destroy = true
}
Create IAM
data "aws_iam_policy_document" "assume_role_policy" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

data "aws_iam_policy_document" "ecs_exec_demo_task_role_policy" {
  statement {
    effect = "Allow"
    actions = [
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel"
    ]
    resources = ["*"]
  }
  statement {
    effect = "Allow"
    actions = [
      "logs:DescribeLogGroups",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ]
    resources = ["*"]
  }
  statement {
    effect    = "Allow"
    actions   = ["s3:PutObject"]
    resources = ["arn:aws:s3:::${aws_s3_bucket.ecs_exec_s3_bucket.id}/*"]
  }
  statement {
    effect    = "Allow"
    actions   = ["s3:GetEncryptionConfiguration"]
    resources = ["arn:aws:s3:::${aws_s3_bucket.ecs_exec_s3_bucket.id}/*"]
  }
  statement {
    effect    = "Allow"
    actions   = ["kms:Decrypt"]
    resources = ["*"]
  }
}

resource "aws_iam_role" "iam_role_ecs_task_execution_role" {
  name               = var.ecs_task_execution_role
  assume_role_policy = data.aws_iam_policy_document.assume_role_policy.json
}

resource "aws_iam_role" "ecs_exec_demo_task_role" {
  name               = "ecs-exec-demo-task-role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_policy.json
}

resource "aws_iam_role_policy_attachment" "ecs_exec_demo_task_execution" {
  role       = aws_iam_role.iam_role_ecs_task_execution_role.name
  policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_iam_policy" "ecs_exec_demo_task" {
  name        = "inlinepolicy_ecs"
  description = "Policy for ECS"
  policy      = data.aws_iam_policy_document.ecs_exec_demo_task_role_policy.json
}

resource "aws_iam_role_policy_attachment" "ecs_exec_demo_task_role" {
  role       = aws_iam_role.ecs_exec_demo_task_role.name
  policy_arn = aws_iam_policy.ecs_exec_demo_task.arn
}
Create CloudWatch Logs Group
resource "aws_cloudwatch_log_group" "ecs-exec-demo" {
  name = "/aws/ecs/ecs-exec-demo"
}
Create KMS
resource "aws_kms_key" "ecs" {
  tags = {
    Keyname = "aws-ecs"
  }
}

resource "aws_kms_alias" "ecs" {
  name          = "alias/ecs-exec-demo-kms-key"
  target_key_id = join("", aws_kms_key.ecs.*.id)
}
Create ALB
resource "aws_lb_target_group" "target_group_alb_ecsfargate" {
  name        = "tg-ecs-fargate"
  port        = var.ecs_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.vpc.id
  target_type = "ip"
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - TargetGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
  lifecycle {
    create_before_destroy = true
  }
  health_check {
    enabled             = true
    healthy_threshold   = 5
    interval            = 30
    matcher             = "200-399"
    path                = "/"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }
}

resource "aws_lb" "ecs_alb" {
  name               = "alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [aws_subnet.public_subnet_az1.id, aws_subnet.public_subnet_az2.id]
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ALB"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_lb_listener" "lb_listener_alb" {
  load_balancer_arn = aws_lb.ecs_alb.arn
  port              = var.ecs_port
  protocol          = "HTTP"
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.target_group_alb_ecsfargate.arn
  }
  lifecycle {
    create_before_destroy = true
  }
}
Let’s launch the Fargate task now!
resource "aws_ecs_cluster" "ecs-exec-demo-cluster" {
  name = "ecs-exec-demo-cluster"
  configuration {
    execute_command_configuration {
      kms_key_id = aws_kms_key.ecs.arn
      logging    = "OVERRIDE"
      log_configuration {
        cloud_watch_log_group_name = aws_cloudwatch_log_group.ecs-exec-demo.name
        s3_bucket_name             = aws_s3_bucket.ecs_exec_s3_bucket.id
        s3_key_prefix              = "exec-output"
      }
    }
  }
}

resource "aws_ecs_task_definition" "ecs-exec-demo" {
  family                   = "ecs-exec-demo"
  cpu                      = 256
  memory                   = 512
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  execution_role_arn       = aws_iam_role.iam_role_ecs_task_execution_role.arn
  task_role_arn            = aws_iam_role.ecs_exec_demo_task_role.arn
  container_definitions = jsonencode([
    {
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = aws_cloudwatch_log_group.ecs-exec-demo.name
          awslogs-region        = var.region
          awslogs-stream-prefix = "container-stdout"
        }
      }
      linuxParameters = {
        initProcessEnabled = true
      }
      image = "skycone/nginx"
      name  = "nginx"
      portMappings = [
        {
          hostPort      = 80,
          protocol      = "tcp",
          containerPort = 80
        }
      ]
    }
  ])
}

resource "aws_ecs_service" "ecs-exec-demo" {
  name                              = "ecs-exec-demo"
  cluster                           = aws_ecs_cluster.ecs-exec-demo-cluster.id
  task_definition                   = aws_ecs_task_definition.ecs-exec-demo.arn
  desired_count                     = 2
  health_check_grace_period_seconds = 0
  launch_type                       = "FARGATE"
  scheduling_strategy               = "REPLICA"
  enable_execute_command            = true
  platform_version                  = "1.4.0"
  load_balancer {
    target_group_arn = aws_lb_target_group.target_group_alb_ecsfargate.arn
    container_name   = "nginx"
    container_port   = 80
  }
  network_configuration {
    subnets          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
    security_groups  = [aws_security_group.ecs_sg.id]
    assign_public_ip = false
  }
}
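With the cluster, task definition, and service defined, provisioning follows the standard Terraform workflow. This is a sketch; it assumes the variables referenced above (such as var.region, var.ecs_port, and var.ecs_cluster_name) and the aws_eip.natgw resource are defined elsewhere in the configuration.

```shell
# Initialize providers, preview the changes, then apply the saved plan.
terraform init
terraform plan -out=tfplan
terraform apply tfplan
```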
You enable the feature at the ECS service level with the enable-execute-command flag (enable_execute_command in the Terraform above). This option instructs the ECS and Fargate agents to bind-mount the SSM agent binaries and launch them alongside the application. With this opt-in setting, you are now able to exec into the container.
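For a service that was created without the flag, ECS Exec can also be switched on after the fact with the AWS CLI. A sketch, using the cluster and service names from the Terraform above:

```shell
# Turn on ECS Exec for the existing service; a new deployment is forced so
# that replacement tasks start with the SSM agent bind-mounted.
aws ecs update-service \
    --cluster ecs-exec-demo-cluster \
    --service ecs-exec-demo \
    --enable-execute-command \
    --force-new-deployment
```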
aws ecs describe-tasks \
  --cluster ecs-exec-demo-cluster \
  --tasks ef6260ed8aab49cf926667ab0c52c313

Returns:
{
  "tasks": [
    {
      ...
      "containers": [
        {
          ...
          "managedAgents": [
            {
              ...
              "name": "ExecuteCommandAgent",
              "lastStatus": "RUNNING"
            }
          ],
          ...
        }
      ],
      ...
      "enableExecuteCommand": true,
      ...
    }
  ],
  ...
}
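If you are scripting this check, the two fields above can be validated programmatically. This is a minimal sketch, not part of the original walkthrough: exec_ready is a hypothetical helper, and the response dict only mirrors the shape of the describe-tasks output shown above (e.g. as returned by boto3's describe_tasks or parsed from the CLI's JSON output).

```python
def exec_ready(describe_tasks_response):
    """Return True only if every task has ECS Exec enabled and every
    container reports its ExecuteCommandAgent as RUNNING."""
    for task in describe_tasks_response.get("tasks", []):
        if not task.get("enableExecuteCommand"):
            return False
        for container in task.get("containers", []):
            # Map managed-agent names to their last reported status.
            agents = {a["name"]: a.get("lastStatus")
                      for a in container.get("managedAgents", [])}
            if agents.get("ExecuteCommandAgent") != "RUNNING":
                return False
    return True

# A response shaped like the describe-tasks output above:
response = {
    "tasks": [{
        "enableExecuteCommand": True,
        "containers": [{
            "managedAgents": [
                {"name": "ExecuteCommandAgent", "lastStatus": "RUNNING"}
            ]
        }]
    }]
}
print(exec_ready(response))  # True
```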
Confirm that the "ExecuteCommandAgent" in the task status is RUNNING and that "enableExecuteCommand" is set to true.

With the feature enabled and the appropriate permissions in place, we are ready to exec into one of the containers. For the purpose of this walkthrough, we will continue to use the IAM role with the Administrator policy we have used so far. However, remember that "exec-ing" into a container is governed by the new ecs:ExecuteCommand IAM action, and that this action is compatible with conditions on tags.

Execute a command to invoke a shell:

aws ecs execute-command \
  --cluster ecs-exec-demo-cluster \
  --task ef6260ed8aab49cf926667ab0c52c313 \
  --container nginx \
  --interactive \
  --command "/bin/bash"

Returns:
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.

Starting session with SessionId: ecs-execute-command-01bc3dbc6179c8abb
root@ip-10-1-1-5:/# ping 8.8.8.8 -c 2
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=108 time=0.613 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=108 time=0.638 ms

--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1011ms
rtt min/avg/max/mdev = 0.613/0.625/0.638/0.012 ms
root@ip-10-1-1-5:/# nslookup amazon.com
Server:         10.1.0.2
Address:        10.1.0.2#53

Non-authoritative answer:
Name:   amazon.com
Address: 176.32.103.205
Name:   amazon.com
Address: 54.239.28.85
Name:   amazon.com
Address: 205.251.242.103
root@ip-10-1-1-5:/# telnet amazon.com 443
Trying 205.251.242.103...
Connected to amazon.com.
Escape character is '^]'.
root@ip-10-1-1-5:/# curl https://amazon.com
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>Server</center>
</body>
</html>
root@ip-10-1-1-5:/# exit
exit
Exiting session with sessionId: ecs-execute-command-01bc3dbc6179c8abb.
The ls command is part of the payload of the ExecuteCommand API call as logged in AWS CloudTrail. Note the sessionId and the command in this extract of the CloudTrail log content. The sessionId and the various timestamps will help correlate the events.
{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AR**CI:ecs-execute-command",
    "arn": "arn:aws:sts::123456789012:assumed-role/AWSServiceRoleForECS/ecs-execute-command",
    "accountId": "123456789012",
    "accessKeyId": "AS**US",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "principalId": "AR**CI",
        "arn": "arn:aws:iam::123456789012:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS",
        "accountId": "123456789012",
        "userName": "AWSServiceRoleForECS"
      },
      "webIdFederationData": {},
      "attributes": {
        "creationDate": "2022-04-25T15:26:42Z",
        "mfaAuthenticated": "false"
      }
    },
    "invokedBy": "ecs.amazonaws.com"
  },
  "eventTime": "2022-04-25T15:26:42Z",
  "eventSource": "ssm.amazonaws.com",
  "eventName": "StartSession",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "ecs.amazonaws.com",
  "userAgent": "ecs.amazonaws.com",
  "requestParameters": {
    "target": "ecs:ecs-exec-demo-cluster_00****79",
    "documentName": "AmazonECS-ExecuteInteractiveCommand",
    "parameters": {
      "cloudWatchEncryptionEnabled": ["false"],
      "s3EncryptionEnabled": ["false"],
      "s3BucketName": ["sc-ecs-exec"],
      "kmsKeyId": ["arn:aws:kms:us-east-1:123456789012:key/4a****1e"],
      "s3KeyPrefix": ["exec-output"],
      "cloudWatchLogGroupName": ["/aws/ecs/ecs-exec-demo"],
      "command": ["ls"]
    }
  },
  "responseElements": {
    "sessionId": "ecs-execute-command-0b3e9f9145aebe654",
    "streamUrl": "wss://ssmmessages.us-east-1.amazonaws.com/v1/data-channel/ecs-execute-command-0b3e9f9145aebe654?role=publish_subscribe&cell-number=AA***==",
    "tokenValue": "Value hidden due to security reasons."
  },
  ...
  "eventType": "AwsApiCall",
  "managementEvent": true,
  "recipientAccountId": "123456789012",
  "eventCategory": "Management"
}
This is the output logged to the S3 bucket.
root@ip-10-1-2-148:/# ls
bin   dev  docker-entrypoint.sh  home  lib64           media  opt   root  sbin  sys  usr
boot  docker-entrypoint.d  etc   lib   managed-agents  mnt    proc  run   srv   tmp  var
root@ip-10-1-2-148:/# hostname
ip-10-1-2-148.ec2.internal
root@ip-10-1-2-148:/# exit
This is the output logged to the CloudWatch log stream for the same ls command.
Script started on 2022-04-25 15:26:49+00:00 [<not executed on terminal>]
bin   docker-entrypoint.d   home   managed-agents  opt   run   sys  var
boot  docker-entrypoint.sh  lib    media           proc  sbin  tmp
dev   etc                   lib64  mnt             root  srv   usr
Script done on 2022-04-25 15:26:49+00:00 [COMMAND_EXIT_CODE="0"]
Conclusions
In this post, we discussed ECS Exec, a feature that lets ECS users interact with and debug containers deployed on either Amazon EC2 or AWS Fargate. We encourage you to try it out and tell us what you think, and how it makes it easier for you to debug containers on AWS, and on Amazon ECS in particular.
References
NEW – Using Amazon ECS Exec to access your containers on AWS Fargate and Amazon EC2