Claude Skills Guide

Claude Code for Claude Error Handling Patterns Workflow Guide

Error handling is a critical aspect of building robust Claude Code workflows. When you’re orchestrating AI agents to perform complex tasks, failures are inevitable—whether from API timeouts, malformed responses, or unexpected state changes. This guide explores practical error handling patterns that will make your Claude Code workflows more resilient and maintainable.

Understanding Error Types in Claude Code

Before diving into patterns, it’s essential to understand what can go wrong in a Claude Code workflow. Errors typically fall into several categories:

Understanding these error categories helps you design appropriate handling strategies for each scenario.

Pattern 1: Try-Catch with Tool Results

The fundamental error handling pattern in Claude Code involves checking tool execution results. Every tool returns a result object that indicates success or failure.

result = bash(command="npm install", timeout=300)
if result.exit_code != 0:
    print(f"Installation failed: {result.stderr}")
    # Handle the error appropriately

This pattern works for all tools—always check the return value before proceeding. Many developers make the mistake of assuming tools always succeed, which leads to cascading failures.

Pattern 2: Defensive Parameter Validation

Before calling any tool with user-provided parameters, validate inputs thoroughly. This prevents errors from propagating through your workflow.

def execute_with_validation(command):
    # Validate command is safe
    if not command or not isinstance(command, str):
        raise ValueError("Invalid command provided")
    
    # Check for dangerous patterns
    dangerous_patterns = ["rm -rf /", "curl | sh", "&& rm"]
    for pattern in dangerous_patterns:
        if pattern in command:
            raise SecurityError(f"Dangerous pattern detected: {pattern}")
    
    return bash(command=command)

This validation layer catches errors early and provides clear feedback about what went wrong.

Pattern 3: Retry Logic with Exponential Backoff

Transient errors often resolve themselves if you wait and retry. Implement retry logic with exponential backoff for operations that might succeed on a subsequent attempt:

import time

def retry_with_backoff(func, max_retries=3, base_delay=1):
    for attempt in range(max_retries):
        try:
            return func()
        except TemporaryError as e:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay}s...")
            time.sleep(delay)

This pattern is particularly useful for:

Pattern 4: Circuit Breaker for External Services

When working with unreliable external services, implement a circuit breaker pattern to prevent cascading failures:

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = "closed"  # closed, open, half-open
    
    def call(self, func):
        if self.state == "open":
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "half-open"
            else:
                raise CircuitOpenError("Circuit is open")
        
        try:
            result = func()
            self.on_success()
            return result
        except Exception as e:
            self.on_failure()
            raise
    
    def on_success(self):
        self.failures = 0
        self.state = "closed"
    
    def on_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        if self.failures >= self.failure_threshold:
            self.state = "open"

The circuit breaker prevents your workflow from repeatedly hitting a failing service, giving it time to recover.

Pattern 5: Graceful Degradation

Not all errors warrant stopping your workflow. Implement graceful degradation to continue operations with reduced functionality:

def get_user_data(user_id, prefer_cache=True):
    try:
        if prefer_cache:
            return cache.get(f"user:{user_id}")
    except CacheError:
        print("Cache unavailable, falling back to database")
    
    try:
        return database.query(f"SELECT * FROM users WHERE id = {user_id}")
    except DatabaseError:
        print("Database unavailable, returning mock data")
        return {"id": user_id, "name": "Unknown", "status": "degraded"}

This pattern ensures your workflow continues even when some components fail.

Pattern 6: Comprehensive Logging and Error Context

Always log sufficient context to diagnose issues later. Include relevant state information in your error messages:

def safe_execute(command, context):
    try:
        result = bash(command=command)
        return result
    except Exception as e:
        logger.error({
            "error": str(e),
            "command": command,
            "context": context,
            "timestamp": time.time(),
            "working_directory": os.getcwd()
        })
        raise

Good error context dramatically reduces debugging time when things go wrong in production.

Pattern 7: Structured Error Recovery Workflows

Design your workflows with explicit recovery paths for common error scenarios:

def workflow_with_recovery():
    # Primary path
    try:
        data = fetch_data()
        process_data(data)
    except NetworkError:
        # Recovery path 1: Use cached data
        print("Network error, attempting recovery with cached data")
        cached = load_cached_data()
        if cached:
            process_data(cached)
        else:
            # Recovery path 2: Use default values
            print("No cache available, using defaults")
            process_data(DEFAULT_DATA)
    except ValidationError as e:
        # Recovery path: Log and skip invalid data
        logger.warning(f"Validation error: {e}")
        skip_record()

Best Practices Summary

  1. Always check tool results - Never assume success
  2. Validate early, fail fast - Check inputs before processing
  3. Implement retries for transient failures with backoff
  4. Use circuit breakers for external dependencies
  5. Log comprehensively - Include context for debugging
  6. Design recovery paths - Plan for failure scenarios
  7. Test your error handling - Simulate failures to verify robustness

Conclusion

Error handling isn’t about preventing all failures—it’s about responding to them gracefully. By implementing these patterns in your Claude Code workflows, you’ll build systems that recover automatically from common issues, provide clear feedback when human intervention is needed, and continue operating even when components fail.

Start with the basics: always check tool results and log errors with context. Then layer in retry logic, circuit breakers, and graceful degradation as your workflows grow more complex. Your future self (and your users) will thank you when something inevitably goes wrong.

Built by theluckystrike — More at zovo.one