Claude Code for Capacity Planning Workflow Tutorial

Capacity planning is one of the most challenging aspects of infrastructure management. Teams often struggle with predicting future resource needs, balancing costs, and responding to demand spikes. Claude Code can help automate parts of this work: it can read collected metrics, summarize utilization, and draft scaling recommendations for engineers to review.

This tutorial walks you through building a capacity planning workflow using Claude Code skills, with practical examples you can adapt to your own projects.

Understanding Capacity Planning Challenges

Before diving into the technical implementation, it’s worth understanding what makes capacity planning difficult. Traditional approaches rely on static thresholds and manual analysis, which leads to either over-provisioning (wasting money) or under-provisioning (causing outages).

Modern capacity planning requires:

Historical data analysis to identify trends
Demand forecasting to predict future needs
Cost optimization to balance performance and budget
Automated responses to scale resources appropriately

Claude Code can assist with all of these areas through carefully designed skills that handle data collection, analysis, and even execution of scaling actions.
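To make the forecasting requirement concrete, here is a minimal sketch of a linear-trend estimate. It assumes one utilization sample per day and a roughly linear trend; the function names and sample values are illustrative and not part of any library. Real workloads often have weekly cycles or step changes that a straight line will miss.

```python
from typing import Optional, Sequence


def fit_linear_trend(values: Sequence[float]) -> tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(values: Sequence[float], threshold: float) -> Optional[float]:
    """Days after the last sample until the fitted trend reaches the threshold.

    Returns None when the trend is flat or falling; otherwise returns 0.0
    when the fitted line has already reached the threshold.
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    last_x = len(values) - 1
    crossing_x = (threshold - intercept) / slope
    return max(0.0, crossing_x - last_x)


if __name__ == "__main__":
    daily_peak_cpu = [60.0, 62.0, 64.0, 66.0, 68.0]  # hypothetical daily peaks (%)
    print(days_until_threshold(daily_peak_cpu, threshold=80.0))  # 6.0
```

With peaks rising two points per day from 60%, the fitted line reaches 80% six days after the last sample.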

Setting Up Your Capacity Planning Skill

The first step is creating a dedicated Claude Code skill for capacity planning. This skill will encapsulate all the prompts, tools, and logic needed for your workflow.

Create a new skill file at ~/.claude/skills/capacity-planning/SKILL.md:

---
name: capacity-planning
description: Analyzes resource utilization and generates capacity planning recommendations
allowed-tools: Read, Write, Bash, TodoWrite
---

You are a capacity planning expert. Analyze the provided metrics and generate actionable recommendations for infrastructure scaling.

This minimal skill definition gives you a foundation to build upon. The key here is using allowed-tools to restrict the skill to only what the workflow needs: Read and Write for file operations, Bash for running analysis scripts, and TodoWrite for tracking action items.

Collecting and Analyzing Metrics

The core of any capacity planning workflow is data. Your Claude Code skill needs access to metrics from your infrastructure. Here’s a starting point for the collection phase:

#!/usr/bin/env python3
"""
capacity_metrics_collector.py
Collects key metrics for capacity planning analysis
"""
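import json
from datetime import datetime

def collect_cpu_metrics(hosts, time_range="24h"):
    """Collect CPU utilization metrics from monitored hosts.

    time_range is a placeholder for the query window; the simulated
    collection below does not use it yet.
    """
    metrics = []
    for host in hosts:
        # Simulated metric collection: replace with a query to your monitoring system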
        metrics.append({
            "host": host,
            "avg_cpu": 65.2,
            "peak_cpu": 89.7,
            "p95_cpu": 78.3,
            "timestamp": datetime.now().isoformat()
        })
    return metrics

def collect_memory_metrics(hosts):
    """Collect memory utilization metrics"""
    metrics = []
    for host in hosts:
        metrics.append({
            "host": host,
            "avg_memory": 72.4,
            "peak_memory": 91.2,
            "p95_memory": 84.1,
            "timestamp": datetime.now().isoformat()
        })
    return metrics

if __name__ == "__main__":
    hosts = ["app-server-1", "app-server-2", "db-primary", "db-replica"]
    data = {
        "cpu": collect_cpu_metrics(hosts),
        "memory": collect_memory_metrics(hosts)
    }
    print(json.dumps(data, indent=2))

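As written, this script returns hard-coded sample values for CPU and memory so the rest of the workflow can be tested; replace the simulated collection with queries to your monitoring system. Run it periodically and store the results for trend analysis. Your Claude Code skill can then read these JSON files and analyze them.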
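One way to store periodic runs for trend analysis is to write each run to a timestamped file. The sketch below is an assumption about layout, not a required convention: the `metrics-<timestamp>.json` naming and the helper names are hypothetical, and the data shape matches the collector's output.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def save_snapshot(data: dict, directory: Path, when: Optional[datetime] = None) -> Path:
    """Write one collector run to a timestamped JSON file and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%S%f")
    path = directory / f"metrics-{stamp}.json"
    path.write_text(json.dumps(data, indent=2))
    return path


def load_snapshots(directory: Path) -> list[dict]:
    """Load all snapshots, oldest first (the file names sort chronologically)."""
    return [json.loads(p.read_text()) for p in sorted(directory.glob("metrics-*.json"))]


def peak_cpu_history(snapshots: list[dict], host: str) -> list[float]:
    """Extract one host's peak CPU value from each snapshot, in order."""
    return [
        entry["peak_cpu"]
        for snap in snapshots
        for entry in snap.get("cpu", [])
        if entry["host"] == host
    ]
```

A scheduled job could call `save_snapshot` after each collection, and the skill could then read the directory to compare runs over time.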

Building the Analysis Prompt

The real power of Claude Code comes from its ability to understand context and provide intelligent recommendations. Here’s how to structure the analysis prompt within your skill:

## Analysis Task

Review the collected metrics and provide capacity planning recommendations:

1. Identify utilization patterns and trends
2. Flag any resources approaching critical thresholds (>80%)
3. Recommend scaling actions for the next 7 days
4. Estimate cost implications of recommended changes

Present your findings in a structured format with clear action items.

When you invoke this skill with your metrics data, Claude will analyze patterns and provide recommendations based on its understanding of capacity planning best practices. It may surface correlations you would otherwise miss and suggest actions that balance performance with cost efficiency, but treat its output as a draft to check against your own data.
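If you want to run the same analysis outside an interactive session, the task text and the metrics can be combined into a single prompt. The sketch below is one possible approach: `build_analysis_prompt` and `run_analysis` are hypothetical helper names, and `run_analysis` assumes the `claude` CLI is installed and on your PATH.

```python
import json
import subprocess

ANALYSIS_TASK = """\
## Analysis Task

Review the collected metrics and provide capacity planning recommendations:

1. Identify utilization patterns and trends
2. Flag any resources approaching critical thresholds (>80%)
3. Recommend scaling actions for the next 7 days
4. Estimate cost implications of recommended changes

Present your findings in a structured format with clear action items.
"""


def build_analysis_prompt(metrics: dict) -> str:
    """Append the metrics as indented JSON under the analysis task text."""
    return f"{ANALYSIS_TASK}\n## Metrics\n\n{json.dumps(metrics, indent=2)}\n"


def run_analysis(metrics: dict) -> str:
    """Send the prompt to the Claude Code CLI in print mode and return its reply.

    Assumes the `claude` executable is installed and on PATH.
    """
    result = subprocess.run(
        ["claude", "--print", build_analysis_prompt(metrics)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping prompt construction separate from the CLI call makes it easy to inspect exactly what Claude receives before sending anything.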

Creating Automated Scaling Recommendations

Beyond passive analysis, you can extend your capacity planning skill to generate concrete scaling recommendations. Here’s a practical approach:

#!/usr/bin/env python3
"""
scaling_recommender.py
Generates scaling recommendations based on utilization data
"""
import json
import sys

DEFAULT_THRESHOLDS = {"cpu": 80, "memory": 85}

def analyze_scaling_needs(metrics, thresholds=None):
    # Avoid a mutable default argument; fall back to the shared defaults
    thresholds = thresholds or DEFAULT_THRESHOLDS
    recommendations = []

    for host_data in metrics.get("cpu", []):
        if host_data["peak_cpu"] > thresholds["cpu"]:
            recommendations.append({
                "host": host_data["host"],
                "action": "scale_up",
                "reason": f"Peak CPU {host_data['peak_cpu']}% exceeds threshold",
                "current": host_data["peak_cpu"],
                "threshold": thresholds["cpu"]
            })

    for host_data in metrics.get("memory", []):
        if host_data["peak_memory"] > thresholds["memory"]:
            recommendations.append({
                "host": host_data["host"],
                "action": "scale_up",
                "reason": f"Peak memory {host_data['peak_memory']}% exceeds threshold",
                "current": host_data["peak_memory"],
                "threshold": thresholds["memory"]
            })

    return recommendations

if __name__ == "__main__":
    if not sys.stdin.isatty():
        # Metrics piped in, e.g. from capacity_metrics_collector.py
        metrics = json.load(sys.stdin)
    else:
        # Sample input for running the script by hand
        metrics = {
            "cpu": [{"host": "app-server-1", "peak_cpu": 92.5}],
            "memory": [{"host": "db-primary", "peak_memory": 88.1}]
        }
    results = analyze_scaling_needs(metrics)
    print(json.dumps(results, indent=2))

This script identifies when resources exceed defined thresholds and generates actionable recommendations. Integrate it into your Claude Code workflow by having the skill call this script and then analyze the output.
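Before handing recommendations to Claude, it can help to order them so the largest threshold violations come first. This is a minimal sketch of one possible ordering, using the `current` and `threshold` fields each recommendation already carries; `rank_by_overshoot` is a hypothetical helper, and overshoot is only one of several reasonable ranking signals.

```python
def rank_by_overshoot(recommendations: list[dict]) -> list[dict]:
    """Return a new list sorted by how far 'current' exceeds 'threshold', largest first."""
    return sorted(
        recommendations,
        key=lambda rec: rec["current"] - rec["threshold"],
        reverse=True,
    )


if __name__ == "__main__":
    recs = [
        {"host": "db-primary", "current": 88.1, "threshold": 85},
        {"host": "app-server-1", "current": 92.5, "threshold": 80},
    ]
    for rec in rank_by_overshoot(recs):
        print(rec["host"])
```

Here app-server-1 (12.5 points over its threshold) is listed before db-primary (about 3 points over).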

Implementing a Complete Workflow

Now let’s put it all together into a complete capacity planning workflow:

#!/bin/bash
# capacity-planning-workflow.sh
# Complete capacity planning workflow orchestration

set -euo pipefail

echo "=== Starting Capacity Planning Workflow ==="
echo "Collecting metrics..."
python3 ~/scripts/capacity_metrics_collector.py > /tmp/metrics.json

echo "Analyzing and generating recommendations..."
python3 ~/scripts/scaling_recommender.py < /tmp/metrics.json > /tmp/recommendations.json

echo "Review recommendations with Claude..."
claude --print "
Review the following scaling recommendations:
$(cat /tmp/recommendations.json)

For each recommendation, provide:
1. Priority (P1-P3)
2. Implementation steps
3. Expected outcome
"

This workflow collects metrics, generates initial recommendations, and then uses Claude to add context and prioritization. Because the workflow only prints recommendations and never executes scaling actions itself, a human reviews every decision before anything changes.

Best Practices for Capacity Planning Skills

When building capacity planning workflows with Claude Code, keep these best practices in mind:

Start with clean data. Claude is only as good as the data you provide. Ensure your metric collection is reliable and consistent. Invest time in proper instrumentation before expecting useful recommendations.

Use appropriate thresholds. Not all resources should have the same thresholds. Database servers might need headroom at 70% CPU, while stateless application servers can safely run at 90%. Tailor thresholds to your workload characteristics.
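A minimal sketch of per-role thresholds follows. The CPU values come from the paragraph above; the memory values, the role names, and the idea of deriving a role from the host name prefix are assumptions made for illustration.

```python
DEFAULT_THRESHOLDS = {"cpu": 80, "memory": 85}

# CPU values follow the text above; memory values and role names are assumptions.
ROLE_THRESHOLDS = {
    "db": {"cpu": 70, "memory": 85},
    "app": {"cpu": 90, "memory": 85},
}


def thresholds_for(host: str) -> dict:
    """Look up thresholds by the host name prefix before the first hyphen."""
    role = host.split("-", 1)[0]
    return ROLE_THRESHOLDS.get(role, DEFAULT_THRESHOLDS)


def exceeds_threshold(host: str, metric: str, value: float) -> bool:
    """True when the value is above the threshold for this host's role."""
    return value > thresholds_for(host)[metric]
```

With these values, 78.3% CPU is flagged on db-primary but not on app-server-1. If your host names do not encode a role, an explicit host-to-role mapping is a safer choice.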

Include cost context. Capacity planning is always a trade-off between performance and cost. Include pricing information in your data so Claude can recommend cost-effective solutions.
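As a sketch of what cost context might look like, the example below attaches a monthly cost difference to a proposed resize. The instance sizes and hourly prices are entirely hypothetical placeholders; substitute your provider's actual rates.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months

# Hypothetical hourly prices in USD; replace with your provider's real rates.
HOURLY_PRICE = {"small": 0.05, "medium": 0.10, "large": 0.20}


def monthly_cost_delta(current_size: str, proposed_size: str, prices: dict = HOURLY_PRICE) -> float:
    """Estimated change in monthly cost when moving between instance sizes."""
    delta_per_hour = prices[proposed_size] - prices[current_size]
    return round(delta_per_hour * HOURS_PER_MONTH, 2)


def with_cost(recommendation: dict, current_size: str, proposed_size: str) -> dict:
    """Return a copy of the recommendation annotated with sizes and cost impact."""
    return {
        **recommendation,
        "current_size": current_size,
        "proposed_size": proposed_size,
        "monthly_cost_delta_usd": monthly_cost_delta(current_size, proposed_size),
    }
```

Including a field like `monthly_cost_delta_usd` in the JSON you pass to Claude gives it a concrete number to weigh against the performance benefit.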

Maintain human oversight. Fully automated scaling can be risky. Design your workflow to generate recommendations that humans review before execution, especially for production environments.
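One simple way to enforce review is to track an explicit status on each recommendation and only pass approved items to whatever performs the scaling. The sketch below uses hypothetical helper names and status strings, and approves by host name for brevity, so every recommendation for an approved host is approved together.

```python
from copy import deepcopy


def mark_pending(recommendations: list[dict]) -> list[dict]:
    """Copy each recommendation and tag it as awaiting human review."""
    return [{**deepcopy(rec), "status": "pending_review"} for rec in recommendations]


def approve(actions: list[dict], approved_hosts: set[str]) -> list[dict]:
    """Return copies with status 'approved' only for hosts a reviewer signed off on."""
    return [
        {**action, "status": "approved"} if action["host"] in approved_hosts else dict(action)
        for action in actions
    ]


def ready_to_execute(actions: list[dict]) -> list[dict]:
    """Only approved actions may be handed to whatever performs the scaling."""
    return [action for action in actions if action["status"] == "approved"]
```

Nothing is executable until a reviewer names the host, which keeps the default outcome of the workflow a no-op.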

Iterate and improve. Start simple and add sophistication over time. Monitor what recommendations Claude provides and refine your prompts based on the results.

Conclusion

Claude Code can help turn capacity planning from a reactive, manual process into a more automated workflow. By combining structured data collection with Claude’s analysis capabilities, you can build systems that flag scaling needs early, inform demand forecasts, and support better resource allocation.

The key is starting simple: collect metrics, generate basic recommendations, and gradually add sophistication as you learn what works for your specific infrastructure. With this tutorial’s patterns as a foundation, you’ll be able to build capacity planning workflows that both save money and prevent outages.

Remember that the most effective capacity planning combines automation with human judgment. Use Claude Code to do the heavy lifting of data analysis and recommendation generation, but keep experienced engineers in the loop for critical decisions.