Technical debt accumulates in every codebase. AI tools don’t eliminate debt, but they can accelerate the three stages that matter: identifying it systematically, prioritizing what to fix, and generating the refactored code. This guide covers practical workflows for each stage.
## Stage 1: Identification
AI-assisted debt identification goes beyond what linters catch. Linters find style violations; AI identifies architectural problems, outdated patterns, and code that works but was written before modern idioms existed.
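For instance, this hypothetical snippet passes a typical linter untouched, yet an AI audit would flag it as modernization debt: `%`-formatting and `os.path` chains that predate f-strings and `pathlib`.

```python
import os
from pathlib import Path

def report_path(base_dir, user_id, date_str):
    # Lints clean, but predates f-strings and pathlib:
    # an audit would flag this as MINOR modernization debt.
    filename = "%s_%s.csv" % (user_id, date_str)
    return os.path.join(base_dir, "reports", filename)

def report_path_modern(base_dir, user_id, date_str):
    # The equivalent a fix pass would produce.
    return Path(base_dir) / "reports" / f"{user_id}_{date_str}.csv"
```

No linter rule fires on the first version; it simply predates the idioms the second one uses.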
### Codebase Audit Prompt
Paste a file or module into Claude with this prompt:
```text
Analyze this code for technical debt. Categorize each issue as:

1. CRITICAL - likely to cause bugs, security issues, or major maintenance problems
2. IMPORTANT - makes the code harder to maintain or extend
3. MINOR - style or modernization issues that don't affect function

For each issue include:
- The specific line(s) or function
- Why it's a problem
- Effort to fix: Low (< 1 hour), Medium (1-8 hours), High (> 1 day)
- Risk to fix: Low (isolated change), Medium (affects callers), High (system-wide impact)
```
This produces a structured list you can add directly to your issue tracker.
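If the audit is run with a JSON output format (as in the scanning script's schema), the list can be turned into tracker entries mechanically. A sketch that builds, but does not run, GitHub CLI commands — `gh` and a `tech-debt` label are assumptions, and the JSON field names follow the audit schema used in this guide:

```python
import json
import shlex

def issues_to_gh_commands(audit_json: str, repo: str) -> list[str]:
    """Build (but don't execute) `gh issue create` commands from audit output.
    Assumes the GitHub CLI is installed and a `tech-debt` label exists."""
    cmds = []
    for issue in json.loads(audit_json)["issues"]:
        title = f"[{issue['category']}] {issue['description'][:60]}"
        body = f"Effort: {issue['fix_effort']}, Risk: {issue['fix_risk']}"
        cmds.append(
            f"gh issue create --repo {repo} --label tech-debt "
            f"--title {shlex.quote(title)} --body {shlex.quote(body)}"
        )
    return cmds
```

Review the generated commands before piping them to a shell; creating issues is cheap, deduplicating them later is not.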
### Automated Scanning Script
For a large codebase, run AI analysis across multiple files systematically:
```python
#!/usr/bin/env python3
# scripts/debt_audit.py
import json
from datetime import datetime
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

DEBT_PROMPT = """Analyze this {language} file for technical debt.
Output JSON only, no explanation:
{{
  "issues": [
    {{
      "line": <line_number_or_null>,
      "function": "<function_name_or_null>",
      "category": "CRITICAL|IMPORTANT|MINOR",
      "description": "<what the problem is>",
      "fix_effort": "LOW|MEDIUM|HIGH",
      "fix_risk": "LOW|MEDIUM|HIGH"
    }}
  ],
  "debt_score": <1-10, where 10 is worst>
}}"""


def audit_file(file_path: Path) -> dict | None:
    extension_map = {
        '.py': 'Python', '.ts': 'TypeScript', '.js': 'JavaScript',
        '.go': 'Go', '.java': 'Java', '.cs': 'C#',
    }
    language = extension_map.get(file_path.suffix, 'code')
    content = file_path.read_text()
    if len(content) < 50:  # Skip trivial files
        return None

    response = client.messages.create(
        model='claude-haiku-4-5',  # Fast and cheap for bulk scanning
        max_tokens=1024,
        messages=[{
            'role': 'user',
            'content': f"{DEBT_PROMPT.format(language=language)}\n\nFile: {file_path}\n\n{content}",
        }],
    )
    try:
        json_text = response.content[0].text
        if '```json' in json_text:  # Strip a markdown fence if the model added one
            json_text = json_text.split('```json')[1].split('```')[0]
        result = json.loads(json_text)
        result['file'] = str(file_path)
        return result
    except json.JSONDecodeError:
        return {'file': str(file_path), 'issues': [], 'debt_score': 0, 'parse_error': True}


def audit_directory(root: str, extensions: tuple[str, ...] = ('.py', '.ts')) -> list[dict]:
    files = [p for ext in extensions for p in Path(root).rglob(f'*{ext}')
             if 'node_modules' not in str(p) and '.git' not in str(p)]
    print(f"Auditing {len(files)} files...")
    results = []
    for i, f in enumerate(files):
        print(f"  [{i + 1}/{len(files)}] {f}")
        result = audit_file(f)
        if result:
            results.append(result)
    return results


def generate_report(results: list[dict]) -> str:
    critical = sum(1 for r in results for i in r.get('issues', []) if i['category'] == 'CRITICAL')
    important = sum(1 for r in results for i in r.get('issues', []) if i['category'] == 'IMPORTANT')
    avg_score = sum(r.get('debt_score', 0) for r in results) / max(len(results), 1)
    high_debt = sorted(results, key=lambda r: r.get('debt_score', 0), reverse=True)[:10]

    report = f"""# Technical Debt Audit — {datetime.now().strftime('%Y-%m-%d')}

## Summary
- Files scanned: {len(results)}
- Critical issues: {critical}
- Important issues: {important}
- Average debt score: {avg_score:.1f}/10

## Top 10 Most Indebted Files
"""
    for r in high_debt:
        report += f"\n### {r['file']} (Score: {r.get('debt_score', 0)}/10)\n"
        for issue in r.get('issues', []):
            report += f"- [{issue['category']}] {issue['description']} "
            report += f"(Effort: {issue['fix_effort']}, Risk: {issue['fix_risk']})\n"
    return report


if __name__ == '__main__':
    results = audit_directory('./src')
    Path('debt-audit.md').write_text(generate_report(results))
    print("\nReport written to debt-audit.md")
```
On a 200-file codebase, this runs in roughly 10 minutes and costs about $2-3 in API usage with Claude Haiku.
## Stage 2: Prioritization
Raw debt lists are not actionable. Prioritize by combining impact and effort:
```python
def prioritize_debt(audit_results: list[dict]) -> list[dict]:
    """Score each issue by bang-for-buck: high impact, low effort, low risk."""
    impact_score = {'CRITICAL': 3, 'IMPORTANT': 2, 'MINOR': 1}
    effort_cost = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 4}
    risk_penalty = {'LOW': 0, 'MEDIUM': 1, 'HIGH': 3}

    prioritized = []
    for result in audit_results:
        for issue in result.get('issues', []):
            score = (
                impact_score.get(issue['category'], 0) /
                (effort_cost.get(issue['fix_effort'], 2) + risk_penalty.get(issue['fix_risk'], 1))
            )
            prioritized.append({
                **issue,
                'file': result['file'],
                'priority_score': round(score, 2),
            })
    return sorted(prioritized, key=lambda x: x['priority_score'], reverse=True)
```
The formula rewards high-impact issues that are cheap and safe to fix. A CRITICAL issue that takes 30 minutes and affects only one file scores higher than an IMPORTANT issue that touches 20 files.
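The arithmetic behind that claim, as a standalone sketch of the same scoring formula:

```python
# Same weights as the prioritization formula above.
impact = {'CRITICAL': 3, 'IMPORTANT': 2, 'MINOR': 1}
effort = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 4}
risk = {'LOW': 0, 'MEDIUM': 1, 'HIGH': 3}

def score(category: str, fix_effort: str, fix_risk: str) -> float:
    return round(impact[category] / (effort[fix_effort] + risk[fix_risk]), 2)

quick_critical = score('CRITICAL', 'LOW', 'LOW')        # 3 / (1 + 0) = 3.0
sprawling_important = score('IMPORTANT', 'HIGH', 'HIGH')  # 2 / (4 + 3) ≈ 0.29
```

The 30-minute critical fix scores roughly ten times higher than the 20-file refactor, which is the ordering you want at the top of a sprint backlog.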
## Stage 3: Fixing with AI Assistance
For identified debt items, AI generates the refactored code:
```python
def generate_fix(file_path: str, issue: dict) -> str:
    content = Path(file_path).read_text()
    response = client.messages.create(
        model='claude-sonnet-4-5',  # Use a stronger model for actual fixes
        max_tokens=3000,
        messages=[{
            'role': 'user',
            'content': f"""Fix this technical debt issue in the file below.

Issue: {issue['description']}
Location: {issue.get('function') or f"line {issue.get('line')}"}
Category: {issue['category']}

Requirements:
- Fix ONLY this specific issue, don't refactor anything else
- Maintain the same public API and function signatures
- Keep all existing tests passing
- Use the same style as the surrounding code

File: {file_path}

{content}

Return the complete updated file.""",
        }],
    )
    return response.content[0].text
```
The “fix ONLY this specific issue” instruction prevents AI from over-refactoring, which introduces risk.
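Generated fixes still need verification before they land. A minimal sketch of an apply-test-revert loop — the function name, and the assumption that a pytest suite (or any command that exits non-zero on failure) exists, are mine:

```python
import subprocess
from pathlib import Path

def apply_and_verify(file_path: str, fixed_content: str,
                     test_cmd: tuple[str, ...] = ('pytest', '-q')) -> bool:
    """Write an AI-generated fix to disk, run the test suite, and restore
    the original file if anything fails. Assumes test_cmd exits non-zero
    on failure (pytest does)."""
    target = Path(file_path)
    original = target.read_text()
    target.write_text(fixed_content)
    result = subprocess.run(test_cmd, capture_output=True)
    if result.returncode != 0:
        target.write_text(original)  # The fix regressed something: roll it back
        return False
    return True
```

Feeding each generated fix through a loop like this keeps a bad suggestion from ever reaching review.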
## Measuring Progress
Track debt reduction over time:
```shell
# Add to CI pipeline — fail the build when any critical debt issues are found
# (assumes debt_audit.py grows an `--output json` mode that emits the results list)
python scripts/debt_audit.py --output json | python -c "
import sys, json
data = json.load(sys.stdin)
critical = sum(1 for r in data for i in r.get('issues', []) if i['category'] == 'CRITICAL')
if critical > 0:
    print(f'FAIL: {critical} critical debt issues found')
    sys.exit(1)
print('OK: 0 critical issues')
"
```
Set a zero-critical-debt policy: new code can’t introduce critical technical debt. Existing debt is tracked and paid down sprint by sprint.
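One way to enforce that policy without blocking every build on legacy debt is a ratchet: store a baseline count, fail only when new critical issues push above it, and tighten the baseline as debt is paid down. A sketch (the function and baseline-file name are mine; results are assumed to be in the audit script's JSON shape):

```python
import json
from pathlib import Path

def check_ratchet(results: list[dict], baseline_file: str = 'debt-baseline.json') -> bool:
    """Fail when the critical-issue count rises above the stored baseline;
    ratchet the baseline down whenever the count drops."""
    critical = sum(1 for r in results for i in r.get('issues', [])
                   if i['category'] == 'CRITICAL')
    path = Path(baseline_file)
    baseline = json.loads(path.read_text())['critical'] if path.exists() else critical
    if critical > baseline:
        print(f"FAIL: {critical} critical issues (baseline is {baseline})")
        return False
    path.write_text(json.dumps({'critical': critical}))  # Ratchet down
    return True
```

Commit the baseline file so the ratchet travels with the branch; each paydown sprint tightens it automatically.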
## The 20% Time Model
The most successful teams allocate 20% of each sprint to debt reduction. AI makes this viable because:
- AI identifies debt faster than manual review
- AI generates fixes faster than manual refactoring
- The remaining human time goes to review and testing, not writing
A developer who would previously fix 2 debt items in a sprint can address 6-8 with AI assistance.
## Related Reading
- How to Use AI Coding Assistants for Technical Debt Reduction
- AI Code Review Automation Tools Comparison
- Free AI Tools for Code Refactoring That Actually Improve Quality
Built by theluckystrike — More at zovo.one