Test and Troubleshoot inside AWS ECS Fargate


ECS Exec lets you run either an interactive shell or a single command in a running container.

This feature helps get "break-glass" access to containers to debug high-severity issues encountered in production. To this point, it’s important to note that only tools and utilities that are installed inside the container can be used when "exec-ing" into it. In other words, if the netstat or heapdump utilities are not installed in the base image of the container, you won’t be able to use them.

In such cases, build a container image that already has the utilities you need installed.

% vim Dockerfile_Nginx

FROM nginx
RUN apt-get update -y
RUN apt-get install -y iputils-ping dnsutils telnet

% docker build -t skycone/nginx - < Dockerfile_Nginx

% docker image push skycone/nginx
The push refers to repository [docker.io/skycone/nginx]
08b363107c65: Pushed
91f05189b339: Pushed
b6812e8d56d6: Mounted from library/nginx
7046505147d7: Mounted from library/nginx
c876aa251c80: Mounted from library/nginx
f5ab86d69014: Mounted from library/nginx
4b7fffa0f0a4: Mounted from library/nginx
9c1b6dd6c1e6: Mounted from library/nginx
latest: digest: sha256:9e60f63bf5ac424e90c67e329f7a6af17b9e6d447cab1c4fc59125da56d61ed7 size: 1992
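Before relying on the image during an incident, it may be worth confirming that the utilities actually resolve on the PATH. The following is a minimal sketch, not part of the original walkthrough: the function name check_tools is hypothetical, and it is meant to be pasted into a shell inside the container (for example in a local docker run session or later through ECS Exec).

```shell
# check_tools: report which of the given commands are not on PATH.
# Hypothetical helper; not part of Docker or the AWS CLI.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status when at least one tool is absent.
  return "$missing"
}

# Example, using the commands provided by the packages in the Dockerfile above:
# check_tools ping nslookup telnet
```

The function prints nothing when every tool is found, which makes it easy to use in a build check.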

Client-side requirements
If you use the AWS CLI to initiate the exec command, the only additional package you need is the SSM Session Manager plugin for the AWS CLI. The plugin needs to be installed on the machine from which you will "exec" into a container running inside a task deployed on AWS Fargate.

Install and uninstall the Session Manager plugin on macOS
You can install the Session Manager plugin on macOS using the bundled installer.
To install the Session Manager plugin using the bundled installer (macOS)
Download the bundled installer.
% curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "sessionmanager-bundle.zip"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3499k  100 3499k    0     0   108k      0  0:00:32  0:00:32 --:--:-- 87890

Unzip the package.
% unzip sessionmanager-bundle.zip
Archive:  sessionmanager-bundle.zip
   creating: sessionmanager-bundle/
  inflating: sessionmanager-bundle/install
  inflating: sessionmanager-bundle/THIRD-PARTY
  inflating: sessionmanager-bundle/seelog.xml.template
  inflating: sessionmanager-bundle/LICENSE
   creating: sessionmanager-bundle/bin/
  inflating: sessionmanager-bundle/bin/session-manager-plugin
  inflating: sessionmanager-bundle/NOTICE
  inflating: sessionmanager-bundle/README.md
  inflating: sessionmanager-bundle/RELEASENOTES.md
 extracting: sessionmanager-bundle/VERSION

Run the install command.
% sudo ./sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
Creating install directories: /usr/local/sessionmanagerplugin/bin
Creating Symlink from /usr/local/sessionmanagerplugin/bin/session-manager-plugin to /usr/local/bin/session-manager-plugin
Installation successful!

Verify the Session Manager plugin installation
Run the following command to verify that the Session Manager plugin installed successfully.
% session-manager-plugin
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.

Create VPC
resource "aws_vpc" "vpc" {
  cidr_block           = var.vpc_cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - VPC"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_subnet" "public_subnet_az1" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block1
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - Public Subnet 1"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Public"
  }
}

resource "aws_subnet" "public_subnet_az2" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block2
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - Public Subnet 2"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Public"
  }
}

resource "aws_subnet" "private_subnet_az1" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block_private1
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private subnet 1"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Private"
  }
}

resource "aws_subnet" "private_subnet_az2" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.subnet_cidr_block_private2
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = false
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private subnet 2"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
    Tier        = "Private"
  }
}

resource "aws_internet_gateway" "internet_gateway" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - InternetGateway"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_nat_gateway" "natgw" {
  allocation_id = aws_eip.natgw.id
  subnet_id     = aws_subnet.public_subnet_az1.id
  tags = {
    Name = "NAT Gateway"
  }
  depends_on = [aws_internet_gateway.internet_gateway]
}

resource "aws_route_table" "public_route_table" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - RouteTable"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_route_table" "private_route_table" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - private route table"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_route_table_association" "a1" {
  subnet_id      = aws_subnet.public_subnet_az1.id
  route_table_id = aws_route_table.public_route_table.id
}

resource "aws_route_table_association" "a2" {
  subnet_id      = aws_subnet.public_subnet_az2.id
  route_table_id = aws_route_table.public_route_table.id
}

resource "aws_route_table_association" "private_association1" {
  subnet_id      = aws_subnet.private_subnet_az1.id
  route_table_id = aws_route_table.private_route_table.id
}

resource "aws_route_table_association" "private_association2" {
  subnet_id      = aws_subnet.private_subnet_az2.id
  route_table_id = aws_route_table.private_route_table.id
}

resource "aws_route" "route_public_subnets_to_internet" {
  route_table_id         = aws_route_table.public_route_table.id
  # Default route for all IPv4 traffic (assumed value)
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.internet_gateway.id
}

resource "aws_route" "route_private_subnets_to_internet" {
  route_table_id         = aws_route_table.private_route_table.id
  # Default route for all IPv4 traffic (assumed value)
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.natgw.id
}

resource "aws_security_group" "ecs_sg" {
  name        = "ECS-allowed-ports"
  description = "ECS allowed ports"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ECS SecurityGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group" "alb_sg" {
  name        = "ELB-allowed-ports"
  description = "ELB allowed ports"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ALB SecurityGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group" "sg_system_manager" {
  name        = "SG-SystemManager"
  description = "Security group for EC2 instance access"
  vpc_id      = aws_vpc.vpc.id
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - security group for System Manager"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
}

resource "aws_security_group_rule" "ecs_sg_rule_inbound1" {
  type              = "ingress"
  from_port         = var.ecs_port
  to_port           = var.ecs_port
  protocol          = "tcp"
  cidr_blocks       = [var.vpc_cidr_block]
  security_group_id = aws_security_group.ecs_sg.id
}

resource "aws_security_group_rule" "ecs_sg_rule_egress1" {
  type                     = "egress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.sg_system_manager.id
  security_group_id        = aws_security_group.ecs_sg.id
}

resource "aws_security_group_rule" "ecs_sg_rule_egress2" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  # All IPv4 destinations (assumed value), needed to pull the image from Docker Hub
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.ecs_sg.id
  description       = "Docker Hub"
}

resource "aws_security_group_rule" "alb_sg_rule_ingress1" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = [var.source_cidr]
  security_group_id = aws_security_group.alb_sg.id
}

resource "aws_security_group_rule" "alb_sg_rule_egress1" {
  type              = "egress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = [var.vpc_cidr_block]
  security_group_id = aws_security_group.alb_sg.id
}

resource "aws_security_group_rule" "sg_rule_sg_system_manager_ingress1" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.ecs_sg.id
  security_group_id        = aws_security_group.sg_system_manager.id
}

resource "aws_security_group_rule" "sg_rule_sg_system_manager_egress1" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  # All IPv4 destinations (assumed value)
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.sg_system_manager.id
}

resource "aws_vpc_endpoint" "ssmmessages" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ssmmessages"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}

resource "aws_vpc_endpoint" "ssm" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ssm"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}

resource "aws_vpc_endpoint" "ec2messages" {
  vpc_id              = aws_vpc.vpc.id
  service_name        = "com.amazonaws.${var.region}.ec2messages"
  vpc_endpoint_type   = "Interface"
  security_group_ids  = [aws_security_group.sg_system_manager.id]
  subnet_ids          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
  private_dns_enabled = true
}

Create S3
resource "aws_s3_bucket" "ecs_exec_s3_bucket" {
  bucket        = "sc-ecs-exec"
  force_destroy = true
}

Create IAM
data "aws_iam_policy_document" "assume_role_policy" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

data "aws_iam_policy_document" "ecs_exec_demo_task_role_policy" {
  statement {
    effect = "Allow"
    actions = [
    resources = ["*"]
  statement {
    effect = "Allow"
    actions = [
    resources = ["*"]
  statement {
    effect = "Allow"
    actions = [
    resources = ["arn:aws:s3:::${aws_s3_bucket.ecs_exec_s3_bucket.id}/*"]
  statement {
    effect = "Allow"
    actions = [
    resources = ["arn:aws:s3:::${aws_s3_bucket.ecs_exec_s3_bucket.id}/*"]
  statement {
    effect = "Allow"
    actions = [
    resources = ["*"]

resource "aws_iam_role" "iam_role_ecs_task_execution_role" {
  name               = var.ecs_task_execution_role
  assume_role_policy = data.aws_iam_policy_document.assume_role_policy.json
}

resource "aws_iam_role" "ecs_exec_demo_task_role" {
  name               = "ecs-exec-demo-task-role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_policy.json
}

resource "aws_iam_role_policy_attachment" "ecs_exec_demo_task_execution" {
  role       = aws_iam_role.iam_role_ecs_task_execution_role.name
  policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_iam_policy" "ecs_exec_demo_task" {
  name        = "inlinepolicy_ecs"
  description = "Policy for ECS"
  policy      = data.aws_iam_policy_document.ecs_exec_demo_task_role_policy.json
}

resource "aws_iam_role_policy_attachment" "ecs_exec_demo_task_role" {
  role       = aws_iam_role.ecs_exec_demo_task_role.name
  policy_arn = aws_iam_policy.ecs_exec_demo_task.arn
}

Create CloudWatch Logs Group
resource "aws_cloudwatch_log_group" "ecs-exec-demo" {
  name = "/aws/ecs/ecs-exec-demo"
}

Create KMS
resource "aws_kms_key" "ecs" {
  tags = {
    Keyname = "aws-ecs"
  }
}

resource "aws_kms_alias" "ecs" {
  name          = "alias/ecs-exec-demo-kms-key"
  target_key_id = join("", aws_kms_key.ecs.*.id)
}

Create ALB
resource "aws_lb_target_group" "target_group_alb_ecsfargate" {
  name        = "tg-ecs-fargate"
  port        = var.ecs_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.vpc.id
  target_type = "ip"
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - TargetGroup"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
  lifecycle {
    create_before_destroy = true
  }
  health_check {
    enabled             = true
    healthy_threshold   = 5
    interval            = 30
    matcher             = "200-399"
    path                = "/"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }
}

resource "aws_lb" "ecs_alb" {
  name               = "alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [aws_subnet.public_subnet_az1.id, aws_subnet.public_subnet_az2.id]
  tags = {
    Name        = "ECS ${var.ecs_cluster_name} - ALB"
    Description = "Created for ECS cluster ${var.ecs_cluster_name}"
  }
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_lb_listener" "lb_listener_alb" {
  load_balancer_arn = aws_lb.ecs_alb.arn
  port              = var.ecs_port
  protocol          = "HTTP"
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.target_group_alb_ecsfargate.arn
  }
  lifecycle {
    create_before_destroy = true
  }
}

Let’s launch the Fargate task now!
resource "aws_ecs_cluster" "ecs-exec-demo-cluster" {
  name = "ecs-exec-demo-cluster"
  configuration {
    execute_command_configuration {
      kms_key_id = aws_kms_key.ecs.arn
      logging    = "OVERRIDE"
      log_configuration {
        cloud_watch_log_group_name = aws_cloudwatch_log_group.ecs-exec-demo.name
        s3_bucket_name             = aws_s3_bucket.ecs_exec_s3_bucket.id
        s3_key_prefix              = "exec-output"
      }
    }
  }
}

resource "aws_ecs_task_definition" "ecs-exec-demo" {
  family                   = "ecs-exec-demo"
  cpu                      = 256
  memory                   = 512
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  execution_role_arn       = aws_iam_role.iam_role_ecs_task_execution_role.arn
  task_role_arn            = aws_iam_role.ecs_exec_demo_task_role.arn
  container_definitions = jsonencode([
    {
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = aws_cloudwatch_log_group.ecs-exec-demo.name
          awslogs-region        = var.region
          awslogs-stream-prefix = "container-stdout"
        }
      }
      linuxParameters = {
        initProcessEnabled = true
      }
      image = "skycone/nginx"
      name  = "nginx"
      portMappings = [
        {
          hostPort      = 80,
          protocol      = "tcp",
          containerPort = 80
        }
      ]
    }
  ])
}

resource "aws_ecs_service" "ecs-exec-demo" {
  name                              = "ecs-exec-demo"
  cluster                           = aws_ecs_cluster.ecs-exec-demo-cluster.id
  task_definition                   = aws_ecs_task_definition.ecs-exec-demo.arn
  desired_count                     = 2
  health_check_grace_period_seconds = 0
  launch_type                       = "FARGATE"
  scheduling_strategy               = "REPLICA"
  enable_execute_command            = true
  platform_version                  = "1.4.0"
  load_balancer {
    target_group_arn = aws_lb_target_group.target_group_alb_ecsfargate.arn
    container_name   = "nginx"
    container_port   = 80
  }
  network_configuration {
    subnets          = [aws_subnet.private_subnet_az1.id, aws_subnet.private_subnet_az2.id]
    security_groups  = [aws_security_group.ecs_sg.id]
    assign_public_ip = false
  }
}

In the Terraform above, the feature is enabled at the ECS service level with enable_execute_command = true, the equivalent of the --enable-execute-command flag in the AWS CLI. This option instructs the ECS and Fargate agents to bind mount the SSM binaries into the container and launch them alongside the application. With this opt-in setting, you are now able to exec into the container.
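For a service that is not managed by Terraform, the same opt-in can be applied from the AWS CLI. This is a minimal sketch rather than a step from the walkthrough: the function name enable_exec_on_service is hypothetical, while the cluster and service names in the example are the ones used in this post. Tasks that were started before the setting was turned on do not pick it up, which is why the sketch forces a new deployment.

```shell
# enable_exec_on_service CLUSTER SERVICE
# Hypothetical wrapper around "aws ecs update-service" that turns on ECS Exec
# and replaces the running tasks so the new setting takes effect.
enable_exec_on_service() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --enable-execute-command \
    --force-new-deployment
}

# Example:
# enable_exec_on_service ecs-exec-demo-cluster ecs-exec-demo
```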

aws ecs describe-tasks \
    --cluster ecs-exec-demo-cluster \
    --tasks ef6260ed8aab49cf926667ab0c52c313
{
    "tasks": [
        {
            "containers": [
                {
                    "managedAgents": [
                        {
                            "name": "ExecuteCommandAgent",
                            "lastStatus": "RUNNING"
                        }
                    ]
                }
            ],
            "enableExecuteCommand": true
        }
    ]
}

Confirm that the "ExecuteCommandAgent" in the task status is also RUNNING and that "enableExecuteCommand" is set to true.
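Rather than reading through the full JSON, the check can be narrowed with the AWS CLI's built-in --query option. The sketch below is one possible way to do that; the function name exec_agent_status and the JMESPath expression are illustrative assumptions, not something taken from the original walkthrough.

```shell
# exec_agent_status CLUSTER TASK_ID
# Hypothetical helper: prints the lastStatus of the ExecuteCommandAgent
# for every container in the given task.
exec_agent_status() {
  aws ecs describe-tasks \
    --cluster "$1" \
    --tasks "$2" \
    --query "tasks[].containers[].managedAgents[?name=='ExecuteCommandAgent'][].lastStatus" \
    --output text
}

# Example:
# exec_agent_status ecs-exec-demo-cluster ef6260ed8aab49cf926667ab0c52c313
```

If the agent is reported in any state other than RUNNING, the exec command is unlikely to succeed, so this is a reasonable first thing to look at when a session fails to start.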

With the feature enabled and appropriate permissions in place, we are ready to exec into one of its containers.

For the purpose of this walkthrough, we will continue to use the IAM role with the Administration policy we have used so far. However, remember that "exec-ing" into a container is governed by the ecs:ExecuteCommand IAM action, and that this action supports conditions on tags.
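The exact command used in this walkthrough is not shown, so the following is only a sketch of what an interactive session can look like with the AWS CLI. The function name exec_shell and the /bin/bash default are assumptions; the cluster name, task ID and container name in the example are the ones that appear elsewhere in this post.

```shell
# exec_shell CLUSTER TASK_ID CONTAINER [COMMAND]
# Hypothetical wrapper around "aws ecs execute-command".
# COMMAND defaults to /bin/bash (an assumption; the image must provide it).
exec_shell() {
  aws ecs execute-command \
    --cluster "$1" \
    --task "$2" \
    --container "$3" \
    --interactive \
    --command "${4:-/bin/bash}"
}

# Example:
# exec_shell ecs-exec-demo-cluster ef6260ed8aab49cf926667ab0c52c313 nginx
```

Passing a different fourth argument, such as "ls", runs a single command instead of opening a shell.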

Execute a command to invoke a shell. It returns:
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.

Starting session with SessionId: ecs-execute-command-01bc3dbc6179c8abb
root@ip-10-1-1-5:/# ping -c 2
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=108 time=0.613 ms
64 bytes from icmp_seq=2 ttl=108 time=0.638 ms

--- ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1011ms
rtt min/avg/max/mdev = 0.613/0.625/0.638/0.012 ms
root@ip-10-1-1-5:/# nslookup amazon.com

Non-authoritative answer:
Name:	amazon.com
Name:	amazon.com
Name:	amazon.com
root@ip-10-1-1-5:/# telnet amazon.com 443
Connected to amazon.com.
Escape character is '^]'.
root@ip-10-1-1-5:/# curl https://amazon.com
<head><title>301 Moved Permanently</title></head>
<center><h1>301 Moved Permanently</h1></center>
root@ip-10-1-1-5:/# exit

Exiting session with sessionId: ecs-execute-command-01bc3dbc6179c8abb.

The ls command is part of the payload of the ExecuteCommand API call as logged in AWS CloudTrail. Note the sessionId and the command in this extract of the CloudTrail log content. The sessionId and the various timestamps will help correlate the events.
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AR**CI:ecs-execute-command",
        "arn": "arn:aws:sts::123456789012:assumed-role/AWSServiceRoleForECS/ecs-execute-command",
        "accountId": "123456789012",
        "accessKeyId": "AS**US",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "AR**CI",
                "arn": "arn:aws:iam::123456789012:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS",
                "accountId": "123456789012",
                "userName": "AWSServiceRoleForECS"
            "webIdFederationData": {},
            "attributes": {
                "creationDate": "2022-04-25T15:26:42Z",
                "mfaAuthenticated": "false"
        "invokedBy": "ecs.amazonaws.com"
    "eventTime": "2022-04-25T15:26:42Z",
    "eventSource": "ssm.amazonaws.com",
    "eventName": "StartSession",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "ecs.amazonaws.com",
    "userAgent": "ecs.amazonaws.com",
    "requestParameters": {
        "target": "ecs:ecs-exec-demo-cluster_00****79",
        "documentName": "AmazonECS-ExecuteInteractiveCommand",
        "parameters": {
            "cloudWatchEncryptionEnabled": [
            "s3EncryptionEnabled": [
            "s3BucketName": [
            "kmsKeyId": [
            "s3KeyPrefix": [
            "cloudWatchLogGroupName": [
            "command": [
    "responseElements": {
        "sessionId": "ecs-execute-command-0b3e9f9145aebe654",
        "streamUrl": "wss://ssmmessages.us-east-1.amazonaws.com/v1/data-channel/ecs-execute-command-0b3e9f9145aebe654?role=publish_subscribe&cell-number=AA***==",
        "tokenValue": "Value hidden due to security reasons."
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "123456789012",
    "eventCategory": "Management"

This is the output logged to the S3 bucket.
root@ip-10-1-2-148:/# ls
bin   dev		   docker-entrypoint.sh  home  lib64	       media  opt   root  sbin	sys  usr
boot  docker-entrypoint.d  etc			 lib   managed-agents  mnt    proc  run   srv	tmp  var
root@ip-10-1-2-148:/# hostname
root@ip-10-1-2-148:/# exit

This is the output logged to the CloudWatch log stream for the same ls command.
Script started on 2022-04-25 15:26:49+00:00 [<not executed on terminal>]
bin   docker-entrypoint.d   home   managed-agents  opt	 run   sys  var
boot  docker-entrypoint.sh  lib    media	   proc  sbin  tmp
dev   etc		    lib64  mnt		   root  srv   usr

Script done on 2022-04-25 15:26:49+00:00 [COMMAND_EXIT_CODE="0"]
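To correlate a session across the three places described above, the same information can be pulled from the command line. This is a rough sketch under a few assumptions: the function names are hypothetical, "aws logs tail" is assumed to be available (it ships with AWS CLI v2), and the bucket, key prefix and log group in the example are the ones defined in the Terraform earlier in this post.

```shell
# find_exec_sessions [MAX_ITEMS]
# Hypothetical helper: lists recent StartSession events, the event name
# shown in the CloudTrail extract. Defaults to 5 items.
find_exec_sessions() {
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=StartSession \
    --max-items "${1:-5}"
}

# show_exec_logs BUCKET PREFIX LOG_GROUP
# Hypothetical helper: lists the session logs written to S3, then prints
# the last hour of the CloudWatch log group.
show_exec_logs() {
  aws s3 ls "s3://$1/$2/"
  aws logs tail "$3" --since 1h
}

# Example:
# find_exec_sessions
# show_exec_logs sc-ecs-exec exec-output /aws/ecs/ecs-exec-demo
```

The sessionId returned by CloudTrail is the value to look for in the S3 object names and the CloudWatch log stream names.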

In this post, we have discussed the release of ECS Exec, a feature that allows ECS users to more easily interact with and debug containers deployed on either Amazon EC2 or AWS Fargate. We are eager for you to try it out and tell us what you think about it, and how this is making it easier for you to debug containers on AWS and specifically on Amazon ECS.


