GitHub Actions to ECR to ECS: a Deployment Checklist

This post is for teams already running workloads on Amazon ECS who want a reliable GitHub Actions pipeline without reinventing the wheel each release. It is not a greenfield Kubernetes guide, and it assumes you have a working cluster, service, and task definition—not a blank AWS account.

If you have shipped Docker apps to production before, most of this will feel familiar. The goal here is a checklist you can hand to a teammate (or your future self) before the next 2 a.m. deploy.

Prerequisites

Before wiring CI, confirm these exist in AWS:

| Item | Why it matters |
| --- | --- |
| ECR repository | Stores immutable image tags per build |
| ECS cluster + service | Target for deployments |
| Task definition | CPU, memory, ports, env vars, secrets, log config |
| Execution role | Pulls images from ECR, writes logs, reads secrets |
| Task role | Runtime permissions for your app (S3, SQS, etc.) |
| ALB target group + health check | Rolling deploys need a passing health path |
| Secrets in SSM or Secrets Manager | Never bake credentials into the image |

On GitHub, use OIDC federation to assume an IAM role instead of long-lived AWS_ACCESS_KEY_ID secrets when possible. Fewer secrets to rotate, clearer audit trail.
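
To make the OIDC setup concrete, here is a minimal sketch of the trust policy such a role might carry, built as a Python dict so it can be dumped to JSON. The account ID, repository name, and branch are placeholders, and the function name github_oidc_trust_policy is my own; it assumes the GitHub OIDC provider is already registered in the account.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, branch: str = "main") -> dict:
    """Build a trust policy that lets one branch of one GitHub repo assume the role."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumes an IAM OIDC provider for GitHub already exists in this account.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Restrict the role to a single repository and branch.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


def to_json(policy: dict) -> str:
    """Serialize the policy for use with the IAM console, CLI, or IaC tooling."""
    return json.dumps(policy, indent=2)
```

The workflow job also needs permissions: id-token: write so the runner can request the token that this policy validates.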

Pipeline checklist

Walk through these steps in order every time you change the deploy workflow.

1. Build and tag the image

  • Tag with the git commit SHA (${{ github.sha }}) on every branch build.
  • Push :latest only from the default branch (usually main).
  • Pass build args for version metadata if your app exposes a /health or /version endpoint.
  • Use layer caching (cache-from / cache-to) if build times are painful—but verify cache invalidation when dependencies change.
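
The tagging rule above fits in a few lines. This is a sketch, not a prescribed implementation: image_tags is a hypothetical helper, and it assumes the ref is passed in the refs/heads/<branch> form that GitHub exposes as github.ref.

```python
def image_tags(sha: str, ref: str, default_branch: str = "main") -> list[str]:
    """Return the tags to apply to a single build."""
    tags = [sha]  # immutable tag, applied on every branch build
    if ref == f"refs/heads/{default_branch}":
        tags.append("latest")  # moving tag, default branch only
    return tags
```

A feature branch build gets only the SHA tag, so nothing outside the default branch can move latest.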

2. Push to ECR

  • Log in with aws ecr get-login-password (or the official amazon-ecr-login action).
  • Push all tags in one job so SHA and latest always point at the same digest on main.
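
One way to keep every tag on the same digest is to build once and push the same local image under each tag. The sketch below only assembles the docker commands; ecr_registry and push_plan are illustrative names, and the account ID, region, and repository are whatever your setup uses.

```python
def ecr_registry(account_id: str, region: str) -> str:
    """Hostname of the private ECR registry for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def push_plan(registry: str, repository: str, local_image: str, tags: list[str]) -> list[list[str]]:
    """Return docker commands that tag one local image and push every tag.

    All tags point at the same local image, so they resolve to the same
    digest once pushed.
    """
    commands = []
    for tag in tags:
        remote = f"{registry}/{repository}:{tag}"
        commands.append(["docker", "tag", local_image, remote])
        commands.append(["docker", "push", remote])
    return commands
```

After logging in, each command can be run with subprocess.run(cmd, check=True) so a failed push fails the job.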

3. Render a new task definition revision

  • Start from the current task definition registered in ECS—do not maintain a static JSON file in git unless you have a strong reason.
  • Update only the image URI (and optionally env vars) for routine releases.
  • Register the new revision and capture the ARN or revision number for the deploy step.
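
The rendering step can be sketched as a pure function over the output of describe-task-definition. It assumes the input is the taskDefinition object from that call; render_task_definition and READ_ONLY_FIELDS are my own names, and the field list covers the read-only keys I know the register call rejects.

```python
import copy

# Keys returned by describe-task-definition that cannot be sent back
# when registering a new revision.
READ_ONLY_FIELDS = (
    "taskDefinitionArn",
    "revision",
    "status",
    "requiresAttributes",
    "compatibilities",
    "registeredAt",
    "registeredBy",
    "deregisteredAt",
)


def render_task_definition(current: dict, container_name: str, image: str) -> dict:
    """Return a copy of the task definition with one container's image replaced."""
    rendered = copy.deepcopy(current)  # never mutate the caller's copy
    for field in READ_ONLY_FIELDS:
        rendered.pop(field, None)
    matched = False
    for container in rendered["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image
            matched = True
    if not matched:
        raise ValueError(f"no container named {container_name!r} in task definition")
    return rendered
```

Failing loudly on an unknown container name matters: a silent no-op here is exactly how a deploy "succeeds" with the old image.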

Tools like aws-actions/amazon-ecs-render-task-definition work well here; Terraform-managed task defs need a clear owner (pipeline vs. IaC) to avoid drift.

4. Deploy the ECS service

  • Use aws ecs update-service with the new task definition revision.
  • Enable the deployment circuit breaker with rollback (available for rolling-update deployments): failed deploys should revert, not limp along at 50% capacity.
  • Set minimumHealthyPercent / maximumPercent intentionally; aggressive settings speed deploys but reduce headroom during rollouts.
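
As a rough illustration of these settings, the sketch below builds the arguments for a boto3 update_service call and shows how the two percentages translate into task counts. The helper names are hypothetical, and the rounding (floor rounded up, ceiling rounded down) reflects my reading of the ECS documentation, so verify it against your own service.

```python
def update_service_kwargs(
    cluster: str,
    service: str,
    task_definition_arn: str,
    min_healthy_percent: int = 100,
    max_percent: int = 200,
) -> dict:
    """Arguments for the ECS update_service call with rollback enabled."""
    return {
        "cluster": cluster,
        "service": service,
        "taskDefinition": task_definition_arn,
        "deploymentConfiguration": {
            # Roll back automatically when the new tasks never become healthy.
            "deploymentCircuitBreaker": {"enable": True, "rollback": True},
            "minimumHealthyPercent": min_healthy_percent,
            "maximumPercent": max_percent,
        },
    }


def rollout_bounds(desired_count: int, min_healthy_percent: int, max_percent: int) -> tuple[int, int]:
    """Fewest and most tasks that may run during a rolling deploy.

    Assumption: the lower bound rounds up and the upper bound rounds down.
    """
    floor_tasks = -(-desired_count * min_healthy_percent // 100)  # integer ceiling
    ceiling_tasks = desired_count * max_percent // 100  # integer floor
    return floor_tasks, ceiling_tasks
```

With boto3 installed, the first helper would be used as boto3.client("ecs").update_service(**update_service_kwargs(...)). The second makes the trade-off visible: a service of 4 tasks at 50/200 may drop to 2 tasks mid-rollout.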

5. Smoke test after stability

  • Wait for the service to stabilize: aws ecs wait services-stable.
  • Hit the health check URL through the load balancer—not localhost on the runner.
  • For web apps, verify one authenticated path if auth is easy to automate.
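
A smoke check can be as small as a retried GET against the load balancer. This sketch uses only the standard library; smoke_test and http_status are illustrative names, the URL is whatever your ALB exposes, and the fetch and sleep parameters exist only so the loop can be exercised without a network.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET request, or 0 if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def smoke_test(
    url: str,
    attempts: int = 5,
    delay: float = 3.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as the URL answers 200, False after all attempts fail."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the target group a moment before retrying
    return False
```

In a workflow step, exit non-zero when smoke_test returns False so the job fails visibly.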

Skip the smoke step and you will eventually ship an image whose task reaches RUNNING in ECS while the application inside it never starts serving requests.

Common failures (and fixes)

Wrong IAM on the execution role — Symptom: task stays PENDING or exits immediately with ECR pull errors. Fix: ecr:GetAuthorizationToken on * plus ecr:BatchGetImage / ecr:GetDownloadUrlForLayer on the repo ARN.
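
Spelled out as a policy document, that fix might look like the sketch below. ecr_pull_policy is a hypothetical helper and the repository ARN is a placeholder; it covers only the ECR pull permissions named above, not the log or secret permissions the execution role also needs.

```python
def ecr_pull_policy(repository_arn: str) -> dict:
    """Minimal ECR pull permissions for an ECS execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The auth token is account-wide, so it cannot be scoped to a repo.
                "Effect": "Allow",
                "Action": "ecr:GetAuthorizationToken",
                "Resource": "*",
            },
            {
                # Image pulls are scoped to the one repository this service uses.
                "Effect": "Allow",
                "Action": ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer"],
                "Resource": repository_arn,
            },
        ],
    }
```

The AWS managed policy AmazonECSTaskExecutionRolePolicy grants a broader set of ECR and CloudWatch Logs actions, which is a reasonable starting point if you do not need per-repository scoping.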

Stale task definition in the pipeline — Symptom: deploy “succeeds” but old code is live. Fix: always render from the live family revision, not a checked-in template from last quarter.

Health check grace period too short — Symptom: new tasks killed before the app boots (common with JVM, Magento, Laravel on cold start). Fix: raise healthCheckGracePeriodSeconds on the service and align ALB interval/threshold.

Environment variables only in the GitHub workflow — Symptom: local/staging differ from production mysteriously. Fix: env belongs in the task definition (or SSM/Secrets Manager references), not only in CI YAML.
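
To show where that configuration lives, here is a sketch that writes plain values and secret references into a container definition. with_runtime_config is my own name, and the variable names and ARNs you pass in are yours; the environment and secrets shapes follow the ECS container definition format.

```python
import copy


def with_runtime_config(container: dict, env: dict, secret_arns: dict) -> dict:
    """Return a copy of a container definition carrying env vars and secret references."""
    updated = copy.deepcopy(container)
    # Plain, non-sensitive values are stored inline in the task definition.
    updated["environment"] = [
        {"name": name, "value": value} for name, value in sorted(env.items())
    ]
    # Sensitive values are stored as references to SSM or Secrets Manager ARNs;
    # ECS resolves them at task start using the execution role.
    updated["secrets"] = [
        {"name": name, "valueFrom": arn} for name, arn in sorted(secret_arns.items())
    ]
    return updated
```

Because the values travel with the task definition revision, every environment that runs that revision sees the same configuration regardless of which workflow deployed it.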

latest tag in production task defs — Symptom: non-reproducible deploys. Fix: pin the image digest or SHA tag in the task definition revision you register.
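
A small guard can enforce this before a revision is registered. The sketch below treats an image reference as pinned when it uses a sha256 digest or a tag that looks like a git SHA; pin_by_digest and is_pinned are hypothetical helpers, and the 7 to 40 hex character rule is my assumption about what your SHA tags look like.

```python
import re

DIGEST_PATTERN = re.compile(r"sha256:[0-9a-f]{64}")
SHA_TAG_PATTERN = re.compile(r"[0-9a-f]{7,40}")


def pin_by_digest(repository_uri: str, digest: str) -> str:
    """Build an image reference that can only ever resolve to one image."""
    if not DIGEST_PATTERN.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{repository_uri}@{digest}"


def is_pinned(image: str) -> bool:
    """True when the reference uses a digest or a git-SHA-style tag."""
    if "@" in image:
        return bool(DIGEST_PATTERN.fullmatch(image.split("@", 1)[1]))
    last_segment = image.rsplit("/", 1)[-1]  # ignore any port in the registry host
    if ":" not in last_segment:
        return False  # no tag means an implicit latest
    tag = last_segment.split(":", 1)[1]
    return bool(SHA_TAG_PATTERN.fullmatch(tag))
```

With boto3, the digest for a pushed SHA tag can be read from the imageDigest field returned by the ECR describe_images call.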

Minimal workflow skeleton

Abbreviated example—adjust names, regions, and roles for your account. See the amazon-ecs-deploy-task-definition README for a complete sample.

Key steps in your workflow:

  1. actions/checkout
  2. aws-actions/configure-aws-credentials (prefer OIDC role-to-assume)
  3. aws-actions/amazon-ecr-login
  4. docker build / docker push with ${{ github.sha }} tag
  5. aws ecs describe-task-definition to fetch the latest registered revision of the family
  6. aws-actions/amazon-ecs-render-task-definition to swap the image URI
  7. aws-actions/amazon-ecs-deploy-task-definition with wait-for-service-stability: true
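
For readers who want to see what steps 5 to 7 amount to, here is a rough equivalent in Python. It is a sketch of the call order, not how the actions are implemented: deploy is a hypothetical function, and the ecs argument is assumed to be a boto3 ECS client (or anything with the same four methods).

```python
# Keys from describe_task_definition that cannot be sent back on registration.
READ_ONLY_FIELDS = {
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
}


def deploy(ecs, cluster: str, service: str, family: str, container_name: str, image: str) -> str:
    """Register a new revision with the given image, deploy it, and wait for stability."""
    # Step 5: start from the live revision, not a checked-in template.
    current = ecs.describe_task_definition(taskDefinition=family)["taskDefinition"]

    # Step 6: swap the image URI for the named container only.
    new_definition = {k: v for k, v in current.items() if k not in READ_ONLY_FIELDS}
    new_definition["containerDefinitions"] = [
        {**c, "image": image} if c["name"] == container_name else c
        for c in current["containerDefinitions"]
    ]

    # Step 7: register, point the service at the new revision, then wait.
    registered = ecs.register_task_definition(**new_definition)
    arn = registered["taskDefinition"]["taskDefinitionArn"]
    ecs.update_service(cluster=cluster, service=service, taskDefinition=arn)
    ecs.get_waiter("services_stable").wait(cluster=cluster, services=[service])
    return arn
```

Passing the client in keeps the function easy to exercise against a stub before it ever touches a real cluster.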

Add a final step that curls your ALB health endpoint when you are ready for stricter gates.

What’s next

Infrastructure should match the pipeline: next week I will cover how I structure Terraform modules for multi-environment AWS—the layout that keeps dev, staging, and production from becoming three unrelated copies of the same mistakes.


If you only do one thing: pin every production deploy to an immutable image tag (git SHA), and make services-stable plus one real HTTP health check non-optional steps in the workflow.
