Terraform IaC SAST

Scanning Terraform Before It Reaches AWS

Shifting security left is a slogan until you wire it into your CI pipeline. Here’s what it actually means for Terraform: every pull request that touches infrastructure code gets a three-layer security scan before anyone reviews the business logic. Misconfigurations fail the pipeline. Developers fix them before merge, not after terraform apply.

This post covers the three tools I use, what each one catches, and the exact pipeline configuration that makes them work together without generating alert fatigue.

Why Three Layers?

No single scanner catches everything. They use different approaches and have different coverage areas:

  • tfsec — deep AWS/GCP/Azure resource knowledge, catches service-specific misconfigurations fast.
  • Checkov — broader rule set covering more resource types, good for compliance frameworks (CIS, SOC 2, PCI).
  • OPA (Open Policy Agent) — custom policy logic your team writes. Catches the organisation-specific rules that off-the-shelf scanners don’t know about.

Their false-positive profiles also differ. A finding that both tfsec and Checkov flag is almost certainly real; a finding from only one tool might warrant a suppression comment if it’s a documented exception.

Layer 1: tfsec

tfsec is the fastest layer — it runs in under ten seconds on most codebases and gives immediate, human-readable output. Here’s the GitHub Actions step:

- name: tfsec
  uses: aquasecurity/tfsec-action@v1.0.3
  with:
    working_directory: terraform/
    soft_fail: false
    format: sarif
    additional_args: --minimum-severity HIGH

Key decisions here:

  • soft_fail: false — the step fails the pipeline on findings. There is no point running a scanner that doesn’t block the merge.
  • minimum-severity HIGH — I start with HIGH and CRITICAL only. Adding MEDIUM and LOW comes later, once the team has fixed the existing findings and the pipeline isn’t drowning in noise. Low-severity findings on day one are a great way to get the scanner turned off.
  • format: sarif — SARIF output integrates with GitHub Advanced Security, showing findings inline on the PR diff rather than in a separate log.

Common findings tfsec catches: unencrypted S3 buckets, security groups with 0.0.0.0/0 ingress on port 22, RDS instances without deletion protection, CloudTrail not enabled, IAM users with console access and no MFA enforcement.
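To make the second of those concrete, here is a hypothetical security group that trips the open-SSH check, next to a version that passes (the resource names and the corporate CIDR are illustrative):

```hcl
# Fails: SSH ingress open to the entire internet.
resource "aws_security_group" "bad" {
  name = "admin-access"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Passes: SSH restricted to a known internal range (illustrative CIDR).
resource "aws_security_group" "good" {
  name = "admin-access"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.20.0.0/16"]
  }
}
```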

Layer 2: Checkov

Checkov takes longer but covers more ground. I run it with a framework filter to focus on the checks most relevant to the environment:

- name: Checkov
  uses: bridgecrewio/checkov-action@master
  with:
    directory: terraform/
    framework: terraform
    check: CKV_AWS_*
    soft_fail: false
    output_format: sarif
    output_file_path: checkov.sarif

The CKV_AWS_* filter runs all AWS-specific checks and skips the Azure and GCP rule sets; the framework: terraform setting already keeps Kubernetes, Dockerfile, and CloudFormation checks out of the run. Once you expand to other resource types in your IaC, add the relevant check prefixes.

Checkov adds coverage that tfsec misses: EBS volume encryption, RDS snapshot encryption, VPC flow logging, WAF association on ALBs, CloudWatch log retention periods, and a large catalogue of IAM policy checks including the wildcard action and missing resource constraint issues I described in the IAM post.

Layer 3: OPA Custom Policies

This is where you encode your organisation’s specific requirements. The off-the-shelf tools don’t know that your company mandates all resources have a cost-centre tag, or that production EC2 instances must use a specific AMI family, or that no S3 bucket should have a name matching your internal naming convention for backup buckets unless it’s explicitly classified as a backup.

OPA policies are written in Rego and evaluated against Terraform plan JSON. First, generate the plan:

- name: Terraform plan to JSON
  run: |
    terraform init
    terraform plan -out=tfplan.binary
    terraform show -json tfplan.binary > tfplan.json
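The resulting tfplan.json contains a resource_changes array, which is what the policies walk. Trimmed to the fields the tag policy cares about, an entry looks roughly like this:

```json
{
  "resource_changes": [
    {
      "address": "aws_instance.web",
      "type": "aws_instance",
      "change": {
        "actions": ["create"],
        "after": {
          "instance_type": "t3.micro",
          "tags": { "cost-centre": "platform-eng" }
        }
      }
    }
  ]
}
```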

Then evaluate with OPA:

- name: OPA policy check
  run: |
    # --fail-defined exits non-zero when the query produces a result,
    # i.e. when at least one deny rule fires. Querying deny[msg] rather
    # than deny matters: an empty set is still a defined value, but it
    # has no members, so a clean plan leaves deny[msg] undefined and
    # the step succeeds.
    opa eval \
      --data policies/ \
      --input tfplan.json \
      --format pretty \
      --fail-defined \
      "data.terraform.deny[msg]"

A simple Rego policy that enforces mandatory tags:

package terraform

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    not resource.change.after.tags["cost-centre"]
    msg := sprintf("EC2 instance %v is missing required tag: cost-centre",
                   [resource.address])
}

Once you have the pipeline, adding new organisational policies is a pull request to the policies/ directory — no scanner configuration changes needed.
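As an example of that workflow, here is a hypothetical second policy enforcing the production AMI rule mentioned above, written in the same style as the tag policy. The approved_amis allow-list and its IDs are placeholders — substitute your own AMI inventory:

```rego
package terraform

# Hypothetical allow-list; replace with your organisation's AMI IDs.
approved_amis := {"ami-0aaa1111bbb22222c", "ami-0ddd3333eee44444f"}

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    not approved_amis[resource.change.after.ami]
    msg := sprintf("EC2 instance %v uses unapproved AMI %v",
                   [resource.address, resource.change.after.ami])
}
```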

Managing Suppressions

Every scanning pipeline eventually needs a way to handle legitimate exceptions — a publicly accessible S3 bucket that hosts a static website, for example, will fail the “no public S3 bucket” check even though it’s intentional. Suppressions are how you acknowledge the exception without turning off the rule.

In tfsec, add a comment to the resource:

resource "aws_s3_bucket" "website" {
  bucket = "my-public-website"
  # tfsec:ignore:AWS077 - Static website, public access required and intentional
}
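tfsec suppressions can also carry an expiry date, which forces a periodic re-review instead of letting exceptions live forever. A sketch using the same hypothetical bucket:

```hcl
resource "aws_s3_bucket" "website" {
  bucket = "my-public-website"
  # After the expiry date the suppression lapses and the finding resurfaces.
  # tfsec:ignore:AWS077:exp:2025-12-31 - Static website, public access intentional
}
```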

In Checkov, use a suppression comment at the top of the resource block:

resource "aws_s3_bucket" "website" {
  #checkov:skip=CKV_AWS_20:Static website bucket, public read intentional
  bucket = "my-public-website"
}

The key discipline: every suppression comment must include a reason. A suppression without a reason is indistinguishable from one added just to make the pipeline pass. Require reasons in your code review checklist and reject suppressions that say “not applicable” without explaining why.

Keeping the Signal High

The biggest risk with security scanning pipelines is that they become background noise. This happens when:

  • There are too many findings at launch and the team starts adding suppressions indiscriminately to get the pipeline green.
  • The scanner is run in report-only mode (soft fail) and nobody looks at the reports.
  • Findings from months ago accumulate and nobody owns fixing them.

The fix is to start small and ratchet. Begin with HIGH/CRITICAL findings only, enforce the pipeline block, fix everything before moving on. Once the pipeline is clean, add MEDIUM severity. Once that’s clean, add custom OPA policies. This way the team always has a clean baseline and new findings are surfaced immediately rather than buried in a backlog of existing violations.