Claude Code for Taint Analysis Workflow Tutorial Guide

Taint analysis is a powerful security technique that tracks untrusted data (tainted input) as it flows through your application, helping you identify potential vulnerabilities like SQL injection, cross-site scripting (XSS), and command injection. When combined with Claude Code, you can create efficient, reproducible taint analysis workflows that integrate smoothly into your development process.

This guide walks you through building a practical taint analysis workflow using Claude Code, with concrete examples you can adapt to your projects.

Understanding Taint Analysis Fundamentals

Before diving into the implementation, let’s establish the core concepts. Taint analysis works by:

Marking sources - Identifying where untrusted data enters your system (user input, file reads, network requests)
Propagating taint - Following data as it flows through variables, functions, and data structures
Checking sinks - Detecting when tainted data reaches dangerous functions (database queries, shell commands, HTML output)

The goal is to automatically flag cases where untrusted input reaches sensitive operations without proper sanitization.

Setting Up Your Claude Code Environment

First, ensure you have Claude Code installed and configured:

# Install Claude Code
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version

Create a new skill for taint analysis:

# Create skill: ~/.claude/skills/taint-analyzer.md

Edit the skill file to include the necessary configuration:

---
name: taint-analyzer
description: "Analyzes code for potential taint flow vulnerabilities"
---

Building the Taint Analysis Workflow

Step 1: Define Taint Sources

The first step in any taint analysis is identifying where untrusted data enters your application. Create a configuration file that defines these sources:

// taint-sources.js
module.exports = {
  // Web application sources
  web: [
    'req.query',
    'req.body',
    'req.params',
    'req.headers',
    'process.env'
  ],
  
  // File system sources
  filesystem: [
    'fs.readFile',
    'fs.readFileSync',
    'fs.readFileAsync',
    'process.argv'
  ],
  
  // Network sources
  network: [
    'fetch',
    'http.request',
    'socket.data',
    'message.body'
  ]
};

Step 2: Identify Dangerous Sinks

Next, define the sinks—functions that become vulnerable when they receive tainted data:

// taint-sinks.js
module.exports = {
  // Database operations
  database: [
    { pattern: 'query', severity: 'high' },
    { pattern: 'execute', severity: 'high' },
    { pattern: 'raw', severity: 'critical' }
  ],
  
  // Command execution
  command: [
    { pattern: 'exec', severity: 'critical' },
    { pattern: 'spawn', severity: 'critical' },
    { pattern: 'system', severity: 'critical' }
  ],
  
  // Output operations
  output: [
    { pattern: 'innerHTML', severity: 'high' },
    { pattern: 'dangerouslySetInnerHTML', severity: 'critical' },
    { pattern: 'eval', severity: 'critical' }
  ]
};

Step 3: Implementing the Analysis Script

Now create the main analysis script that Claude Code will use:

#!/usr/bin/env python3
"""
Taint Analysis Runner
Analyzes code for potential taint flow vulnerabilities
"""

import os
import re
import json
from pathlib import Path

class TaintAnalyzer:
    def __init__(self, sources_file, sinks_file):
        self.sources = self.load_config(sources_file)
        self.sinks = self.load_config(sinks_file)
        self.findings = []
    
    def load_config(self, filepath):
        with open(filepath) as f:
            return json.load(f)
    
    def analyze_file(self, filepath):
        """Analyze a single file for taint flows"""
        with open(filepath, 'r') as f:
            content = f.read()
            lines = content.split('\n')
        
        # Find all potential taint sources
        for line_num, line in enumerate(lines, 1):
            for source_category, sources in self.sources.items():
                for source in sources:
                    if isinstance(source, str) and source in line:
                        self.check_taint_propagation(line_num, line, content)
    
    def check_taint_propagation(self, line_num, line, content):
        """Check if taint from this line reaches any sink"""
        # Simplified analysis: look for sink patterns in subsequent lines
        subsequent_lines = content.split('\n')[line_num:]
        
        for sink_category, sinks in self.sinks.items():
            for sink_info in sinks:
                pattern = sink_info['pattern']
                for idx, subsequent_line in enumerate(subsequent_lines[:20]):  # Check next 20 lines
                    if pattern in subsequent_line:
                        self.findings.append({
                            'source_line': line_num,
                            'sink_line': line_num + idx + 1,
                            'severity': sink_info['severity'],
                            'sink_type': sink_category,
                            'code': subsequent_line.strip()
                        })
    
    def generate_report(self):
        """Output analysis results"""
        print(f"\n{'='*60}")
        print("TAINT ANALYSIS REPORT")
        print(f"{'='*60}\n")
        
        for finding in self.findings:
            severity_emoji = {
                'critical': '🔴',
                'high': '🟠',
                'medium': '🟡'
            }.get(finding['severity'], '⚪')
            
            print(f"{severity_emoji} [{finding['severity'].upper()}]")
            print(f"  Source: Line {finding['source_line']}")
            print(f"  Sink: Line {finding['sink_line']} ({finding['sink_type']})")
            print(f"  Code: {finding['code'][:80]}...")
            print()

if __name__ == '__main__':
    analyzer = TaintAnalyzer('taint-sources.js', 'taint-sinks.py')
    
    # Analyze all JavaScript/Python files in the project
    for ext in ['*.js', '*.py', '*.ts']:
        for filepath in Path('.').rglob(ext):
            analyzer.analyze_file(filepath)
    
    analyzer.generate_report()

Integrating with Claude Code

Create a skill that orchestrates the taint analysis workflow:

# Taint Analysis Skill

This skill runs a comprehensive taint analysis on your codebase.

## Usage

To run a taint analysis:

1. First, I'll scan your project for potential taint sources
2. Then identify dangerous sinks that could be reached by untrusted input
3. Finally, generate a detailed report with severity ratings

## Analysis Coverage

- **SQL Injection**: Tracks user input through database queries
- **XSS Vulnerabilities**: Follows data flow to HTML output
- **Command Injection**: Monitors taint reaching shell commands
- **Path Traversal**: Checks file operations with user-controlled paths

## Recommendations

After analysis, I'll provide:
- Specific vulnerability locations with line numbers
- Severity assessments for each finding
- Remediation suggestions tailored to each issue
- Follow-up tasks to track fixes

Running the Analysis

Execute your taint analysis workflow with Claude Code:

# Run the analysis on your project
claude -s taint-analyzer "Run taint analysis on the src/ directory"

Claude Code will:

Read all source files in the specified directory
Identify taint sources and sinks
Trace data flows between them
Generate a prioritized vulnerability report

Practical Example: Detecting SQL Injection

Consider this vulnerable Node.js code:

// vulnerable.js
app.get('/user', (req, res) => {
  const userId = req.query.id;
  const query = `SELECT * FROM users WHERE id = ${userId}`;
  db.execute(query).then(results => {
    res.json(results);
  });
});

When analyzed, the workflow identifies:

Source: req.query.id (line 2) - untrusted user input
Sink: db.execute(query) (line 3) - database operation
Vulnerability: SQL injection (critical severity)
Remediation: Use parameterized queries instead of string concatenation

Best Practices for Effective Taint Analysis

1. Keep Source/Sink Definitions Updated

As your project evolves, regularly update your taint configuration to include new libraries and patterns:

// Add new sources as you integrate new packages
custom: [
  'router.params',
  'uploadedFile.content',
  'redis.get'
]

2. Set Appropriate Severity Levels

Not all taint flows are equally dangerous. Calibrate your severity settings:

Critical: Direct execution (eval, exec, system)
High: Database queries, file operations
Medium: Logging, string operations

3. Integrate into CI/CD Pipeline

Make taint analysis part of your continuous integration:

# .github/workflows/taint-analysis.yml
name: Taint Analysis
on: [push, pull_request]

jobs:
  taint-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Taint Analysis
        run: python3 taint_analyzer.py
      - name: Upload Results
        uses: actions/upload-artifact@v3
        with:
          name: taint-report
          path: report.json

4. Focus on High-Risk Areas First

Prioritize analysis on:

Authentication and authorization code
Database query builders
Template rendering engines
File upload handlers
API endpoints processing user input

Conclusion

Claude Code transforms taint analysis from a complex static analysis task into an accessible, reproducible workflow. By defining clear source/sink configurations and using Claude’s ability to read and analyze code across your project, you can identify security vulnerabilities early in the development cycle.

The key is starting simple: begin with basic source/sink definitions, run initial analyses, and progressively refine your detection rules as you understand your codebase’s patterns. With this workflow in place, you’ll catch injection vulnerabilities before they reach production.

Remember: Taint analysis is one layer of defense. Combine it with input validation, output encoding, and regular security audits for comprehensive application security.

Built by theluckystrike — More at zovo.one