GitHub Actions to ECR to ECS: a Deployment Checklist
- Shameem Abdul Salam
- Dev ops
- June 8, 2026
This post is for teams already running workloads on Amazon ECS who want a reliable GitHub Actions pipeline without reinventing the wheel each release. It is not a greenfield Kubernetes guide, and it assumes you have a working cluster, service, and task definition—not a blank AWS account.
If you have shipped Docker apps to production before, most of this will feel familiar. The goal here is a checklist you can hand to a teammate (or your future self) before the next 2 a.m. deploy.
Prerequisites
Before wiring CI, confirm these exist in AWS:
| Item | Why it matters |
|---|---|
| ECR repository | Stores immutable image tags per build |
| ECS cluster + service | Target for deployments |
| Task definition | CPU, memory, ports, env vars, secrets, log config |
| Execution role | Pulls images from ECR, writes logs, reads secrets |
| Task role | Runtime permissions for your app (S3, SQS, etc.) |
| ALB target group + health check | Rolling deploys need a passing health path |
| Secrets in SSM or Secrets Manager | Never bake credentials into the image |
On GitHub, use OIDC federation to assume an IAM role instead of long-lived AWS_ACCESS_KEY_ID secrets when possible. Fewer secrets to rotate, clearer audit trail.
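Once the role is assumed, a small guard can confirm the runner holds the identity you expect before anything is pushed. This is a minimal sketch, assuming the AWS CLI is available on the runner; the function name and the role name in the usage line are hypothetical.

```shell
# Fail fast if the current AWS identity is not a session of the expected IAM role.
# Assumed-role ARNs look like: arn:aws:sts::<account>:assumed-role/<role>/<session>
assert_assumed_role() {
  local expected_role="$1"
  local arn
  arn="$(aws sts get-caller-identity --query Arn --output text)" || return 1
  case "$arn" in
    *":assumed-role/$expected_role/"*) return 0 ;;
    *) echo "unexpected identity: $arn" >&2; return 1 ;;
  esac
}
```

Called as `assert_assumed_role github-deploy` right after the credentials step, it stops the job when a leftover static key or the wrong role is in effect.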
Pipeline checklist
Walk through these steps in order every time you change the deploy workflow.
1. Build and tag the image
- Tag with the git commit SHA (`${{ github.sha }}`) on every branch build.
- Push `:latest` only from the default branch (usually `main`).
- Pass build args for version metadata if your app exposes a `/health` or `/version` endpoint.
- Use layer caching (`cache-from` / `cache-to`) if build times are painful, but verify cache invalidation when dependencies change.
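The tagging rules above can be sketched as two small shell functions. This is one possible shape, not the only one: `image_tags`, `build_image`, and the `APP_VERSION` build arg are hypothetical names, and the default branch is assumed to be `main` unless a third argument says otherwise.

```shell
# Print the tags for a build, one per line: always the commit SHA,
# plus "latest" when building the default branch.
image_tags() {
  local sha="$1" branch="$2" default_branch="${3:-main}"
  printf '%s\n' "$sha"
  if [ "$branch" = "$default_branch" ]; then
    printf '%s\n' "latest"
  fi
}

# Build once and apply every tag to the same image.
build_image() {
  local repo_uri="$1" sha="$2" branch="$3"
  local tag
  set --                      # reuse "$@" as the list of -t arguments
  for tag in $(image_tags "$sha" "$branch"); do
    set -- "$@" -t "$repo_uri:$tag"
  done
  # APP_VERSION is an assumed build arg; your Dockerfile must declare it with ARG.
  docker build --build-arg "APP_VERSION=$sha" "$@" .
}
```

In a workflow step this might be invoked as `build_image "$REPO_URI" "$GITHUB_SHA" "$GITHUB_REF_NAME"`, where `REPO_URI` is a variable you set yourself and the other two are default GitHub Actions environment variables.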
2. Push to ECR
- Log in with `aws ecr get-login-password` (or the official `amazon-ecr-login` action).
- Push all tags in one job so SHA and `latest` always point at the same digest on `main`.
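If you prefer the CLI over the login action, the two steps above might look like the sketch below. The function names are hypothetical, and the registry host is assumed to be the usual `<account>.dkr.ecr.<region>.amazonaws.com` form.

```shell
# Authenticate Docker against the ECR registry using a short-lived token.
ecr_login() {
  local region="$1" registry="$2"
  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin "$registry"
}

# Push every tag of one image from the same job; stop at the first failure.
push_tags() {
  local repo_uri="$1"
  shift
  local tag
  for tag in "$@"; do
    docker push "$repo_uri:$tag" || return 1
  done
}
```

Because both tags were applied to one local image, `push_tags "$REPO_URI" "$GITHUB_SHA" latest` uploads the layers once and the second push only adds a tag to the same digest.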
3. Render a new task definition revision
- Start from the current task definition registered in ECS—do not maintain a static JSON file in git unless you have a strong reason.
- Update only the image URI (and optionally env vars) for routine releases.
- Register the new revision and capture the ARN or revision number for the deploy step.
Tools like aws-actions/amazon-ecs-render-task-definition work well here; Terraform-managed task defs need a clear owner (pipeline vs. IaC) to avoid drift.
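For pipelines that do not use the render action, the same idea can be approximated with the AWS CLI and `jq`. Treat this as a sketch under assumptions: `jq` is installed on the runner, the function names are hypothetical, and the list of stripped fields reflects the read-only keys that `describe-task-definition` returns but `register-task-definition` does not accept.

```shell
# Read `describe-task-definition` JSON on stdin, swap one container's image,
# and drop the read-only fields so the result can be registered again.
render_task_def() {
  local container="$1" image="$2"
  jq --arg c "$container" --arg img "$image" '
    .taskDefinition
    | (.containerDefinitions[] | select(.name == $c) | .image) = $img
    | del(.taskDefinitionArn, .revision, .status, .requiresAttributes,
          .compatibilities, .registeredAt, .registeredBy)'
}

# Fetch the live revision of a family, render it, register it,
# and print the ARN of the new revision.
register_new_revision() {
  local family="$1" container="$2" image="$3"
  local tmp
  tmp="$(mktemp)" || return 1
  aws ecs describe-task-definition --task-definition "$family" --output json \
    | render_task_def "$container" "$image" > "$tmp" || return 1
  [ -s "$tmp" ] || return 1   # nothing rendered: the describe call likely failed
  aws ecs register-task-definition \
    --cli-input-json "file://$tmp" \
    --query 'taskDefinition.taskDefinitionArn' --output text
}
```

Capturing the output, for example `TASK_DEF_ARN="$(register_new_revision my-app web "$REPO_URI:$GITHUB_SHA")"`, gives the deploy step an exact revision to point at; `my-app` and `web` are placeholder family and container names.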
4. Deploy the ECS service
- Use `aws ecs update-service` with the new task definition revision.
- Enable the deployment circuit breaker with rollback when your platform supports it; failed deploys should revert, not limp along at 50% capacity.
- Set `minimumHealthyPercent` / `maximumPercent` intentionally; aggressive settings speed deploys but reduce headroom during rollouts.
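The three points above fit into a single `update-service` call. A minimal sketch, assuming a rolling-update service; the function name is hypothetical and the 100/200 percentages are illustrative defaults, not a recommendation for every workload.

```shell
# Point the service at a new task definition revision, with the circuit
# breaker set to roll back automatically if the deployment fails.
deploy_service() {
  local cluster="$1" service="$2" task_def_arn="$3"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$task_def_arn" \
    --deployment-configuration \
      'deploymentCircuitBreaker={enable=true,rollback=true},minimumHealthyPercent=100,maximumPercent=200'
}
```

With `minimumHealthyPercent=100` and `maximumPercent=200`, ECS starts replacement tasks before stopping old ones, which keeps full capacity during the rollout at the cost of temporarily needing room for twice the task count.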
5. Smoke test after stability
- Wait for the service to stabilize: `aws ecs wait services-stable`.
- Hit the health check URL through the load balancer, not localhost on the runner.
- For web apps, verify one authenticated path if auth is easy to automate.
Skip the smoke step once and you will eventually push a broken image that passes ECS “running” but fails application startup.
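A smoke step along these lines can be sketched in a few lines of shell. The function name and the retry and timeout values are assumptions to tune; the health URL should be the public or internal load balancer address, whatever your service actually exposes.

```shell
# Wait for ECS to report the service stable, then require a successful
# HTTP response from the health endpoint behind the load balancer.
smoke_test() {
  local cluster="$1" service="$2" health_url="$3"
  aws ecs wait services-stable --cluster "$cluster" --services "$service" || return 1
  # --fail turns HTTP 4xx/5xx into a non-zero exit code;
  # --retry covers transient errors while targets finish registering.
  curl --fail --silent --show-error --max-time 10 \
       --retry 5 --retry-delay 5 "$health_url" > /dev/null
}
```

A call such as `smoke_test prod web https://app.example.com/health` (placeholder names) fails the job when either the waiter times out or the endpoint does not return a success status.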
Common failures (and fixes)
Wrong IAM on the execution role — Symptom: task stays PENDING or exits immediately with ECR pull errors. Fix: ecr:GetAuthorizationToken on * plus ecr:BatchCheckLayerAvailability / ecr:BatchGetImage / ecr:GetDownloadUrlForLayer on the repo ARN.
Stale task definition in the pipeline — Symptom: deploy “succeeds” but old code is live. Fix: always render from the live family revision, not a checked-in template from last quarter.
Health check grace period too short — Symptom: new tasks killed before the app boots (common with JVM, Magento, Laravel on cold start). Fix: raise healthCheckGracePeriodSeconds on the service and align ALB interval/threshold.
Environment variables only in the GitHub workflow — Symptom: local/staging differ from production mysteriously. Fix: env belongs in the task definition (or SSM/Secrets Manager references), not only in CI YAML.
latest tag in production task defs — Symptom: non-reproducible deploys. Fix: pin the image digest or SHA tag in the task definition revision you register.
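For the last fix, one way to pin by digest is to resolve the freshly pushed tag to its digest and register that URI instead of the tag. A minimal sketch, assuming the image has already been pushed; the function name is hypothetical.

```shell
# Resolve a tag in an ECR repository to an immutable <repo-uri>@sha256:... reference.
image_digest_uri() {
  local repo_name="$1" tag="$2" repo_uri="$3"
  local digest
  digest="$(aws ecr describe-images \
      --repository-name "$repo_name" \
      --image-ids "imageTag=$tag" \
      --query 'imageDetails[0].imageDigest' --output text)" || return 1
  printf '%s@%s\n' "$repo_uri" "$digest"
}
```

The resulting reference keeps pointing at the same image even if someone later moves or overwrites the tag, which is what makes a task definition revision reproducible.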
Minimal workflow skeleton
Abbreviated example—adjust names, regions, and roles for your account. See the amazon-ecs-deploy-task-definition README for a complete sample.
Key steps in your workflow:
- `actions/checkout`
- `aws-actions/configure-aws-credentials` (prefer OIDC `role-to-assume`)
- `aws-actions/amazon-ecr-login`
- `docker build` / `docker push` with `${{ github.sha }}` tag
- `aws ecs describe-task-definition` to fetch the current family
- `aws-actions/amazon-ecs-render-task-definition` to swap the image URI
- `aws-actions/amazon-ecs-deploy-task-definition` with `wait-for-service-stability: true`
Add a final step that curls your ALB health endpoint when you are ready for stricter gates.
What’s next
Infrastructure should match the pipeline: next week I will cover how I structure Terraform modules for multi-environment AWS—the layout that keeps dev, staging, and production from becoming three unrelated copies of the same mistakes.
If you only do one thing: pin every production deploy to an immutable image tag (git SHA), and make services-stable plus one real HTTP health check non-optional steps in the workflow.