Claude Skills Guide

Building Production AI Agents with Claude Skills in 2026

Claude Code skills are .md files that extend Claude’s behavior for specific tasks When building production AI agents that run autonomously, these skills provide specialized context — for document processing, TDD workflows, memory management, and more. This guide covers practical patterns for composing them into agents that handle real workloads.

Core Skills for Agent Development

TDD Skill for Quality Assurance

The tdd skill guides Claude through red-green-refactor cycles. In an agent pipeline, invoke it on your agent’s own test suite before deployment:

# Run TDD skill analysis on agent modules
claude -p "/tdd analyze the agent modules in ./src and generate missing test cases"

This generates test cases based on your agent’s actual implementation — state management, tool invocation sequences, and response validation.

PDF Skill for Document Processing

Production agents frequently process PDF documents. Invoke the pdf skill when the agent receives a document:

claude -p "/pdf Extract all tables and section headings from /tmp/incoming-contract.pdf"

The skill handles complex layouts, form fields, and multi-column documents.

Supermemory Skill for Context Management

Long-running agents need persistent context between sessions. The supermemory skill provides retrieval of previously stored knowledge:

/supermemory store: Order #12345 processed for customer acme-corp, result: approved
/supermemory What are the previous orders for acme-corp?

This lets agents maintain decision history and reference previous outputs without re-loading full context on every run.

Building a Production Agent Workflow

A practical order-processing agent sequences skills via claude -p calls from a shell script:

#!/bin/bash
# agent-pipeline.sh

PDF_FILE="$1"
CUSTOMER_ID="$2"

# Step 1: Extract order details from PDF
EXTRACTED=$(claude -p "/pdf Extract order details (items, quantities, totals) from $PDF_FILE")

# Step 2: Retrieve customer history
HISTORY=$(claude -p "/supermemory What are the previous orders for $CUSTOMER_ID?")

# Step 3: Process with full context
RESULT=$(claude -p "
Order details: $EXTRACTED

Customer history: $HISTORY

Determine whether to approve this order based on past account standing.
Reply with APPROVED or HOLD and a one-sentence reason.
")

echo "$RESULT"

# Step 4: Store the result for future reference
claude -p "/supermemory store: Order from $PDF_FILE for $CUSTOMER_ID$RESULT"

Each step is discrete and testable. The tdd skill can generate tests for each step’s expected input/output pairs before you deploy.

Skill Composition Strategies

Sequential processing: Chain claude -p calls where each output feeds the next. Use when transformations build on each other.

Parallel execution: Invoke independent skills simultaneously using background shell processes:

# Run pdf extraction and memory retrieval in parallel
claude -p "/pdf Extract order details from $PDF_FILE" > /tmp/pdf-result.txt &
claude -p "/supermemory What is the order history for $CUSTOMER_ID?" > /tmp/mem-result.txt &
wait

# Then process both results together
RESULT=$(claude -p "PDF result: $(cat /tmp/pdf-result.txt)
Memory: $(cat /tmp/mem-result.txt)
Approve or hold this order?")

Conditional routing: Select skills based on input analysis using shell conditionals:

if [[ "$REQUEST_TYPE" == "document" ]]; then
    claude -p "/pdf Process $FILE"
elif [[ "$REQUEST_TYPE" == "query" ]]; then
    claude -p "/supermemory $QUERY"
else
    claude -p "Handle this general request: $REQUEST"
fi

Error Handling and Resilience

Implement retry logic for skill invocations that may fail transiently:

run_skill_with_retry() {
    local PROMPT="$1"
    local OUTPUT="$2"
    local MAX_RETRIES=3
    local ATTEMPT=0

    while [ $ATTEMPT -lt $MAX_RETRIES ]; do
        RESULT=$(claude -p "$PROMPT" 2>/dev/null)
        if [[ $? -eq 0 && -n "$RESULT" ]]; then
            echo "$RESULT" > "$OUTPUT"
            return 0
        fi
        ATTEMPT=$((ATTEMPT + 1))
        echo "Attempt $ATTEMPT failed, retrying..." >&2
        sleep $((ATTEMPT * 2))
    done

    echo "All retries failed" >&2
    return 1
}

When a skill fails repeatedly, fail fast rather than consuming resources on retries.

Monitoring Skill Execution

Track which skills your agent invokes and their outcomes using structured logging:

invoke_skill() {
    local SKILL="$1"
    local PROMPT="$2"
    local START=$(date +%s%N)

    RESULT=$(claude -p "/$SKILL $PROMPT" 2>/dev/null)
    local EXIT_CODE=$?
    local END=$(date +%s%N)
    local DURATION_MS=$(( (END - START) / 1000000 ))

    if [[ $EXIT_CODE -eq 0 ]]; then
        echo "{\"skill\": \"$SKILL\", \"status\": \"success\", \"duration_ms\": $DURATION_MS}" >> /var/log/agent-metrics.jsonl
    else
        echo "{\"skill\": \"$SKILL\", \"status\": \"failure\", \"duration_ms\": $DURATION_MS}" >> /var/log/agent-metrics.jsonl
    fi

    echo "$RESULT"
}

These logs help you identify which skills cause latency and which fail under load.

Deployment Considerations

Version management: Pin the version of Claude Code you’re using in production. Skills are read from your local ~/.claude/skills/ directory — store them in version control and deploy them alongside your agent.

npm install -g @anthropic-ai/claude-code@1.x.x

Resource allocation: Profile which skills are invoked most frequently. If your agent processes thousands of PDFs daily, ensure you have sufficient API quota for those calls.

Testing before deployment: Use the tdd skill to validate that your skill composition produces expected outputs for representative inputs before shipping to production:

claude -p "/tdd Write integration tests for this agent pipeline script: $(cat agent-pipeline.sh)"


*Built by theluckystrike — More at zovo.one *