Claude Code for Taint Analysis Workflow Tutorial Guide
Taint analysis is a powerful security technique that tracks untrusted data (tainted input) as it flows through your application, helping you identify potential vulnerabilities like SQL injection, cross-site scripting (XSS), and command injection. When combined with Claude Code, you can create efficient, reproducible taint analysis workflows that integrate smoothly into your development process.
This guide walks you through building a practical taint analysis workflow using Claude Code, with concrete examples you can adapt to your projects.
Understanding Taint Analysis Fundamentals
Before diving into the implementation, let’s establish the core concepts. Taint analysis works by:
- Marking sources - Identifying where untrusted data enters your system (user input, file reads, network requests)
- Propagating taint - Following data as it flows through variables, functions, and data structures
- Checking sinks - Detecting when tainted data reaches dangerous functions (database queries, shell commands, HTML output)
The goal is to automatically flag cases where untrusted input reaches sensitive operations without proper sanitization.
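
The three phases above can be sketched on a toy program. This is a minimal illustration, not how a production analyzer works: the statement format `(target, function, arguments)` and the names `read_input`, `escape`, `concat`, and `run_query` are hypothetical, chosen only to show marking, propagation, and sink checking in one pass.

```python
# Hypothetical source, sanitizer, and sink names for a toy program.
SOURCES = {"read_input"}
SANITIZERS = {"escape"}
SINKS = {"run_query"}


def find_tainted_sinks(statements):
    """Return (index, function, variable) for each sink call that receives tainted data."""
    tainted = set()
    hits = []
    for index, (target, func, args) in enumerate(statements):
        tainted_args = [arg for arg in args if arg in tainted]
        # Phase 3: check sinks before updating any assignment target.
        if func in SINKS:
            hits.extend((index, func, arg) for arg in tainted_args)
        if target is None:
            continue
        if func in SOURCES:
            tainted.add(target)       # Phase 1: mark data coming from a source
        elif func in SANITIZERS:
            tainted.discard(target)   # A sanitizer returns clean data
        elif tainted_args:
            tainted.add(target)       # Phase 2: taint propagates through the call
        else:
            tainted.discard(target)   # Reassigned from clean data
    return hits


program = [
    ("user_id", "read_input", []),
    ("clause", "concat", ["prefix", "user_id"]),
    ("safe", "escape", ["user_id"]),
    (None, "run_query", ["clause"]),
    (None, "run_query", ["safe"]),
]
print(find_tainted_sinks(program))
```

Only the first `run_query` call is reported, because `clause` was built from the tainted `user_id` while `safe` passed through the sanitizer.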
Setting Up Your Claude Code Environment
First, ensure you have Claude Code installed and configured:
# Install Claude Code
npm install -g @anthropic-ai/claude-code
# Verify installation
claude --version
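Create a new skill for taint analysis. Claude Code expects each skill to live in its own directory containing a SKILL.md file:
# Create the skill directory and its SKILL.md file
mkdir -p ~/.claude/skills/taint-analyzer
touch ~/.claude/skills/taint-analyzer/SKILL.md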
Edit the skill file to include the necessary configuration:
---
name: taint-analyzer
description: "Analyzes code for potential taint flow vulnerabilities"
---
Building the Taint Analysis Workflow
Step 1: Define Taint Sources
The first step in any taint analysis is identifying where untrusted data enters your application. Create a configuration file that defines these sources:
// taint-sources.js
module.exports = {
  // Web application sources
  web: [
    'req.query',
    'req.body',
    'req.params',
    'req.headers',
    'process.env'
  ],
  // File system sources
  filesystem: [
    'fs.readFile',
    'fs.readFileSync',
    'fs.promises.readFile',
    'process.argv'
  ],
  // Network sources
  network: [
    'fetch',
    'http.request',
    'socket.data',
    'message.body'
  ]
};
Step 2: Identify Dangerous Sinks
Next, define the sinks—functions that become vulnerable when they receive tainted data:
// taint-sinks.js
module.exports = {
  // Database operations
  database: [
    { pattern: 'query', severity: 'high' },
    { pattern: 'execute', severity: 'high' },
    { pattern: 'raw', severity: 'critical' }
  ],
  // Command execution
  command: [
    { pattern: 'exec', severity: 'critical' },
    { pattern: 'spawn', severity: 'critical' },
    { pattern: 'system', severity: 'critical' }
  ],
  // Output operations
  output: [
    { pattern: 'innerHTML', severity: 'high' },
    { pattern: 'dangerouslySetInnerHTML', severity: 'critical' },
    { pattern: 'eval', severity: 'critical' }
  ]
};
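
Short patterns such as `raw` or `query` also occur inside unrelated identifiers and inside sources like `req.query`, so plain substring matching tends to over-report. One possible refinement, sketched below under the assumption that a sink is either called as a function or assigned to as a property, is to compile each pattern into a regular expression. The helper names `sink_regex` and `matches_sink` are hypothetical.

```python
import re


def sink_regex(pattern):
    """Build a regex matching `pattern` as a call or as a property assignment."""
    name = re.escape(pattern)
    # A call: the name is not part of a longer identifier and is followed by "(".
    call = r"(?<![\w$])" + name + r"\s*\("
    # A property assignment: ".name = value", but not a "==" or "===" comparison.
    prop_assign = r"\." + name + r"\s*=(?!=)"
    return re.compile(call + "|" + prop_assign)


def matches_sink(pattern, line):
    """Return True when the line appears to use the sink, not merely mention it."""
    return sink_regex(pattern).search(line) is not None


samples = [
    ("query", "db.query(sql)"),
    ("query", "const id = req.query.id;"),
    ("raw", "const drawing = draw(shape);"),
    ("innerHTML", "el.innerHTML = userBio;"),
    ("innerHTML", "if (el.innerHTML === '') {"),
]
for pattern, line in samples:
    print(f"{pattern!r} in {line!r}: {matches_sink(pattern, line)}")
```

This sketch does not cover JSX attributes such as `dangerouslySetInnerHTML={...}`, which would need their own rule.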
Step 3: Implementing the Analysis Script
Now create the main analysis script that Claude Code will use:
#!/usr/bin/env python3
"""
Taint Analysis Runner
Analyzes code for potential taint flow vulnerabilities
"""
import json
import subprocess
from pathlib import Path

# Directories that hold third-party or generated code
SKIP_DIRS = {'node_modules', '.git', 'venv', '.venv', 'dist', 'build'}


class TaintAnalyzer:
    def __init__(self, sources_file, sinks_file):
        self.sources = self.load_config(sources_file)
        self.sinks = self.load_config(sinks_file)
        self.findings = []

    def load_config(self, filepath):
        """Load a CommonJS config module by having Node print it as JSON"""
        script = 'console.log(JSON.stringify(require(process.argv[1])))'
        result = subprocess.run(
            ['node', '-e', script, str(Path(filepath).resolve())],
            capture_output=True, text=True, check=True
        )
        return json.loads(result.stdout)

    def analyze_file(self, filepath):
        """Analyze a single file for taint flows"""
        with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
            lines = f.read().split('\n')
        # Find every line that mentions a taint source (checked once per line)
        for line_num, line in enumerate(lines, 1):
            has_source = any(
                isinstance(source, str) and source in line
                for sources in self.sources.values()
                for source in sources
            )
            if has_source:
                self.check_taint_propagation(line_num, lines)

    def check_taint_propagation(self, line_num, lines):
        """Check if taint from this line may reach a sink"""
        # Simplified heuristic: look for sink patterns in the next 20 lines.
        # This is proximity matching, not real data-flow tracking.
        window = lines[line_num:line_num + 20]
        for sink_category, sinks in self.sinks.items():
            for sink_info in sinks:
                pattern = sink_info['pattern']
                for idx, subsequent_line in enumerate(window):
                    if pattern in subsequent_line:
                        self.findings.append({
                            'source_line': line_num,
                            'sink_line': line_num + idx + 1,
                            'severity': sink_info['severity'],
                            'sink_type': sink_category,
                            'code': subsequent_line.strip()
                        })

    def generate_report(self):
        """Output analysis results"""
        print(f"\n{'='*60}")
        print("TAINT ANALYSIS REPORT")
        print(f"{'='*60}\n")
        for finding in self.findings:
            severity_emoji = {
                'critical': '🔴',
                'high': '🟠',
                'medium': '🟡'
            }.get(finding['severity'], '⚪')
            code = finding['code']
            snippet = code[:80] + ('...' if len(code) > 80 else '')
            print(f"{severity_emoji} [{finding['severity'].upper()}]")
            print(f"  Source: Line {finding['source_line']}")
            print(f"  Sink: Line {finding['sink_line']} ({finding['sink_type']})")
            print(f"  Code: {snippet}")
            print()


if __name__ == '__main__':
    analyzer = TaintAnalyzer('taint-sources.js', 'taint-sinks.js')
    # Analyze all JavaScript/TypeScript/Python files, skipping dependency directories
    for ext in ['*.js', '*.py', '*.ts']:
        for filepath in Path('.').rglob(ext):
            if SKIP_DIRS.isdisjoint(filepath.parts):
                analyzer.analyze_file(filepath)
    analyzer.generate_report()
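
The 20-line window in `check_taint_propagation` reports any sink near a source, whether or not the data actually connects them. A possible refinement, sketched here as an assumption and not as part of the script above, is to track which variable names were assigned from a source and report a sink call only when it mentions one of them. The function `trace_flows` and its regular expressions are hypothetical, and the approach is still a line-based heuristic: it treats words inside string literals as identifiers and ignores scoping.

```python
import re

# Matches "const x = expr", "let x = expr", "var x = expr" or "x = expr",
# but not comparisons ("==") or arrow functions ("=>").
ASSIGNMENT = re.compile(
    r"(?:(?:const|let|var)\s+)?([A-Za-z_$][\w$]*)\s*=(?![=>])\s*(.+)"
)
IDENTIFIER = re.compile(r"[A-Za-z_$][\w$]*")


def trace_flows(lines, sources, sinks):
    """Return (source_line, sink_line, variable) for sink calls using tainted names."""
    tainted = {}  # variable name -> line number where its taint originated
    flows = []
    for number, raw_line in enumerate(lines, 1):
        line = raw_line.strip()
        # Report sink calls that mention a currently tainted name.
        if any(sink + "(" in line for sink in sinks):
            for name in IDENTIFIER.findall(line):
                if name in tainted:
                    flows.append((tainted[name], number, name))
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        target, expression = match.groups()
        if any(source in expression for source in sources):
            tainted[target] = number  # assigned directly from a source
            continue
        origins = [tainted[n] for n in IDENTIFIER.findall(expression) if n in tainted]
        if origins:
            tainted[target] = origins[0]  # taint propagates through the assignment
        else:
            tainted.pop(target, None)     # overwritten with untainted data
    return flows


example = [
    "app.get('/user', (req, res) => {",
    "  const userId = req.query.id;",
    "  const query = `SELECT * FROM users WHERE id = ${userId}`;",
    "  db.execute(query).then(results => {",
    "    res.json(results);",
    "  });",
    "});",
]
print(trace_flows(example, ["req.query"], ["execute"]))
```

Each result links a sink line back to the line where the taint entered, which is more specific than a match based only on proximity.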
Integrating with Claude Code
Create a skill that orchestrates the taint analysis workflow:
# Taint Analysis Skill
This skill runs a comprehensive taint analysis on your codebase.
## Usage
To run a taint analysis:
1. First, I'll scan your project for potential taint sources
2. Then identify dangerous sinks that could be reached by untrusted input
3. Finally, generate a detailed report with severity ratings
## Analysis Coverage
- **SQL Injection**: Tracks user input through database queries
- **XSS Vulnerabilities**: Follows data flow to HTML output
- **Command Injection**: Monitors taint reaching shell commands
- **Path Traversal**: Checks file operations with user-controlled paths
## Recommendations
After analysis, I'll provide:
- Specific vulnerability locations with line numbers
- Severity assessments for each finding
- Remediation suggestions tailored to each issue
- Follow-up tasks to track fixes
Running the Analysis
Execute your taint analysis workflow with Claude Code:
# Run the analysis on your project
claude -p "Use the taint-analyzer skill to run taint analysis on the src/ directory"
Claude Code will:
- Read all source files in the specified directory
- Identify taint sources and sinks
- Trace data flows between them
- Generate a prioritized vulnerability report
Practical Example: Detecting SQL Injection
Consider this vulnerable Node.js code:
// vulnerable.js
app.get('/user', (req, res) => {
  const userId = req.query.id;
  const query = `SELECT * FROM users WHERE id = ${userId}`;
  db.execute(query).then(results => {
    res.json(results);
  });
});
When analyzed, the workflow identifies:
- Source: req.query.id (line 2), untrusted user input
- Sink: db.execute(query) (line 4), a database operation that receives the query string built on line 3
- Vulnerability: SQL injection (high severity, matching the execute sink pattern)
- Remediation: Use parameterized queries instead of string interpolation
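
The remediation can be illustrated with a parameterized query. The example above is Node.js, and the exact placeholder syntax depends on the database driver in use, so the sketch below shows the same idea with Python's built-in `sqlite3` module instead. The `users` table and the `find_user` helper are hypothetical.

```python
import sqlite3


def find_user(conn, user_id):
    """Look up a user with a bound parameter instead of string interpolation."""
    cursor = conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,))
    return cursor.fetchall()


# Hypothetical in-memory table for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ada"), (2, "grace")],
)

print(find_user(conn, 1))
# The injection attempt is bound as a single value, so it cannot alter the SQL.
print(find_user(conn, "1 OR 1=1"))
```

Because the driver sends the value separately from the SQL text, the injected `OR 1=1` is compared as data and matches no rows.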
Best Practices for Effective Taint Analysis
1. Keep Source/Sink Definitions Updated
As your project evolves, regularly update your taint configuration to include new libraries and patterns:
// Add new sources as you integrate new packages
custom: [
  'router.params',
  'uploadedFile.content',
  'redis.get'
]
2. Set Appropriate Severity Levels
Not all taint flows are equally dangerous. Calibrate your severity settings:
- Critical: Direct execution (eval, exec, system)
- High: Database queries, file operations
- Medium: Logging, string operations
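
These levels can also drive how findings are ordered and whether a build should fail. The sketch below assumes findings are dictionaries with a `severity` key, as produced by the analysis script. The helper names `prioritize` and `should_fail_build` and the default threshold are hypothetical choices.

```python
# Higher rank means more urgent; unknown severities rank lowest.
SEVERITY_RANK = {"critical": 3, "high": 2, "medium": 1}


def prioritize(findings):
    """Return findings ordered from most to least severe."""
    return sorted(
        findings,
        key=lambda finding: SEVERITY_RANK.get(finding["severity"], 0),
        reverse=True,
    )


def should_fail_build(findings, threshold="high"):
    """Return True when any finding is at or above the threshold severity."""
    limit = SEVERITY_RANK[threshold]
    return any(SEVERITY_RANK.get(f["severity"], 0) >= limit for f in findings)


sample_findings = [
    {"severity": "medium", "sink_type": "logging"},
    {"severity": "critical", "sink_type": "command"},
    {"severity": "high", "sink_type": "database"},
]
for finding in prioritize(sample_findings):
    print(finding["severity"], finding["sink_type"])
print("fail build:", should_fail_build(sample_findings))
```

A script could call `sys.exit(1)` when `should_fail_build` returns True, so that a CI job stops on serious findings while medium ones are only reported.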
3. Integrate into CI/CD Pipeline
Make taint analysis part of your continuous integration:
# .github/workflows/taint-analysis.yml
name: Taint Analysis
on: [push, pull_request]
jobs:
  taint-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Taint Analysis
        # Save the printed report so the next step can upload it
        run: python3 taint_analyzer.py | tee taint-report.txt
      - name: Upload Results
        uses: actions/upload-artifact@v4
        with:
          name: taint-report
          path: taint-report.txt
4. Focus on High-Risk Areas First
Prioritize analysis on:
- Authentication and authorization code
- Database query builders
- Template rendering engines
- File upload handlers
- API endpoints processing user input
Conclusion
Claude Code makes taint analysis more accessible by wrapping it in a reproducible workflow. By defining clear source/sink configurations and using Claude’s ability to read and analyze code across your project, you can identify security vulnerabilities early in the development cycle.
The key is starting simple: begin with basic source/sink definitions, run initial analyses, and progressively refine your detection rules as you understand your codebase’s patterns. The pattern-based script shown here is a heuristic, so expect both false positives and missed flows, but with this workflow in place you can catch many injection vulnerabilities before they reach production.
Remember: Taint analysis is one layer of defense. Combine it with input validation, output encoding, and regular security audits for comprehensive application security.
Built by theluckystrike — More at zovo.one