AI Tools Compared

Infrastructure as Code is one of the highest-risk areas for AI hallucinations. A hallucinated Terraform resource that references a non-existent attribute doesn’t just fail tests — it can leave infrastructure in a broken state or, worse, create insecure configurations silently. This guide evaluates AI tools specifically for IaC accuracy and security defaults.

Why IaC Is Harder for AI

Language models are trained on public Terraform registry data, AWS/Azure documentation, and GitHub examples. But:

The result: AI generates syntactically correct HCL that uses deprecated attributes, misses required encryption settings, or creates resources with overly permissive policies.

Testing Approach

Each tool was tested with 10 IaC tasks across Terraform (AWS), Pulumi (TypeScript), and CDK (Python). Key evaluation criteria:

Terraform Generation

Claude Sonnet

Prompt: “Create a Terraform module for an RDS PostgreSQL instance with: Multi-AZ, encryption at rest, automated backups (7 days), and the security group allowing access only from a specific CIDR.”

# Claude output — security-conscious defaults
resource "aws_db_instance" "main" {
  identifier        = var.identifier
  engine            = "postgres"
  engine_version    = "16.1"
  instance_class    = var.instance_class
  allocated_storage = var.allocated_storage

  # Encryption at rest - required for compliance
  storage_encrypted = true
  kms_key_id       = var.kms_key_arn

  # High availability
  multi_az = true

  # Backup configuration
  backup_retention_period = 7
  backup_window          = "03:00-04:00"
  maintenance_window     = "sun:04:00-sun:05:00"

  # Network
  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]

  # Prevent accidental deletion
  deletion_protection = true
  skip_final_snapshot = false
  final_snapshot_identifier = "${var.identifier}-final-snapshot"

  # Performance Insights (recommended for production)
  performance_insights_enabled          = true
  performance_insights_retention_period = 7

  tags = var.tags
}

resource "aws_security_group" "rds" {
  name_prefix = "${var.identifier}-rds-"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [var.allowed_cidr]
    description = "PostgreSQL access from application tier"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow all outbound"
  }

  tags = var.tags

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_db_subnet_group" "main" {
  name_prefix = "${var.identifier}-"
  subnet_ids  = var.subnet_ids
  tags        = var.tags
}

Claude includes deletion_protection = true, skip_final_snapshot = false, and KMS encryption by default — all things that public examples often omit. It also adds a description to the security group ingress rule, which is a best practice that linters like Checkov flag when missing.

GPT-4o

GPT-4o produced similar output but set deletion_protection = false and omitted performance_insights_enabled. These aren’t security issues but are defaults you’d want to override for production.

GitHub Copilot

Copilot’s inline suggestions for Terraform are useful for attribute completion but poor for generating complete resources from scratch. It fills in resource "aws_db_instance" attribute names correctly but doesn’t apply security best practices to the defaults it suggests.

Pulumi TypeScript

Prompt: “Create a Pulumi program that deploys a Lambda function with a Dead Letter Queue (SQS), CloudWatch alarm for error rate, and least-privilege IAM role.”

// Claude's Pulumi output
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const config = new pulumi.Config();
const functionName = config.require("functionName");

// Dead Letter Queue
const dlq = new aws.sqs.Queue(`${functionName}-dlq`, {
    messageRetentionSeconds: 1209600, // 14 days
    tags: { Environment: pulumi.getStack() },
});

// Least-privilege IAM role
const lambdaRole = new aws.iam.Role(`${functionName}-role`, {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({
        Service: "lambda.amazonaws.com",
    }),
});

// Only grant basic execution + DLQ send permission
// Not AdministratorAccess or even arn:aws:iam::aws:policy/AWSLambdaFullAccess
new aws.iam.RolePolicyAttachment(`${functionName}-basic-execution`, {
    role: lambdaRole.name,
    policyArn: aws.iam.ManagedPolicies.AWSLambdaBasicExecutionRole,
});

const dlqPolicy = new aws.iam.Policy(`${functionName}-dlq-policy`, {
    policy: pulumi.all([dlq.arn]).apply(([dlqArn]) => JSON.stringify({
        Version: "2012-10-17",
        Statement: [{
            Effect: "Allow",
            Action: ["sqs:SendMessage"],
            Resource: dlqArn,
        }],
    })),
});

new aws.iam.RolePolicyAttachment(`${functionName}-dlq-attach`, {
    role: lambdaRole.name,
    policyArn: dlqPolicy.arn,
});

// Lambda function
const lambdaFunction = new aws.lambda.Function(functionName, {
    runtime: aws.lambda.Runtime.NodeJS20dX,
    code: new pulumi.asset.AssetArchive({
        ".": new pulumi.asset.FileArchive("./dist"),
    }),
    handler: "index.handler",
    role: lambdaRole.arn,
    deadLetterConfig: { targetArn: dlq.arn },
    environment: {
        variables: { NODE_ENV: pulumi.getStack() }
    },
    tags: { Environment: pulumi.getStack() },
});

// CloudWatch alarm for error rate
new aws.cloudwatch.MetricAlarm(`${functionName}-error-alarm`, {
    alarmDescription: `${functionName} error rate exceeds threshold`,
    namespace: "AWS/Lambda",
    metricName: "Errors",
    dimensions: { FunctionName: lambdaFunction.name },
    statistic: "Sum",
    period: 300,  // 5 minutes
    evaluationPeriods: 2,
    threshold: 5,
    comparisonOperator: "GreaterThanThreshold",
    treatMissingData: "notBreaching",
});

export const functionArn = lambdaFunction.arn;
export const dlqUrl = dlq.url;

The IAM policy creates only sqs:SendMessage on the specific DLQ ARN — minimal permissions. This is the opposite of what many public examples do (attach AWSLambdaFullAccess or worse).

Ansible Playbooks

For Ansible, Claude’s output is most reliable because it avoids shell module shortcuts in favor of specific modules:

# Claude generates idiomatic Ansible, not shell hacks
# Bad (what many examples show):
- name: Install nginx
  shell: apt-get install -y nginx

# Good (what Claude generates):
- name: Install nginx
  ansible.builtin.apt:
    name: nginx
    state: present
    update_cache: true
  become: true

Claude also adds no_log: true to tasks that handle passwords or tokens, which prevents secrets from appearing in Ansible output logs.

Security Scanning Integration

Regardless of which AI tool you use, run generated IaC through a scanner:

# Install Checkov
pip install checkov

# Scan Terraform
checkov -d ./terraform --framework terraform

# Scan specific file
checkov -f main.tf

# Output JSON for CI integration
checkov -d . --output json > checkov-results.json

For Pulumi, use pulumi preview with --policy-pack to catch security issues before deployment.

A workflow that combines AI generation with automated scanning catches the most common failures (unencrypted storage, overly permissive SGs, missing logging) before they reach production.

  1. Use Claude Sonnet for initial IaC generation — best security defaults
  2. Run Checkov/tfsec immediately after generation
  3. Fix any scanner findings (often AI-generated code has 1-2 findings)
  4. Use Copilot for inline attribute completion while editing in VS Code

Claude’s security-conscious defaults save 15-20 minutes per resource compared to starting with a template that has insecure defaults and fixing them manually.

Built by theluckystrike — More at zovo.one