AI Tools Compared

Large codebase refactoring is the hardest thing to do with AI assistance. The context window fills fast, changes ripple across files in unexpected ways, and a partial refactor is often worse than no refactor. This guide documents what actually works when using Claude Code and Cursor Composer for refactoring 50,000+ line codebases.

Phase 3: Use Cursor for complex individual files

Phase 4: Final verification

claude “Run the full test suite, mypy, and ruff.

The Core Problem: Context Limits

Both tools run into the same constraint: you can’t fit a large codebase into one context window. The strategies differ in how they work around this.

Claude Code approach: Runs in an agentic loop, reads files as needed, maintains a mental model of the codebase over multiple tool calls. Can handle large refactors by decomposing them into layers.

Cursor Composer approach: You manually @ mention files. Cursor has better IDE integration (inline diffs, file tree context) but depends on you knowing which files to include.

Setup: Claude Code for Large Refactors

# Install Claude Code
npm install -g @anthropic-ai/claude-code

# Initialize in your repo
cd /path/to/large-repo
claude

# Give it a CLAUDE.md with codebase context
cat > CLAUDE.md << 'EOF'
# Codebase Overview
- Python 3.12 FastAPI application
- 50k lines across 120 files
- PostgreSQL via SQLAlchemy 2.0 (async)
- Repository pattern: models/ → repositories/ → services/ → routers/
- Tests in tests/ using pytest-asyncio

# Naming Conventions
- All DB operations go through repository classes, never direct session queries in services
- Pydantic v2 for all schemas (no v1 compatibility layer)
EOF

Refactoring Task 1: Migrate from sync to async SQLAlchemy

Claude Code approach:

Refactor this codebase from synchronous SQLAlchemy to async SQLAlchemy 2.0.

The migration involves:
1. Changing Session → AsyncSession everywhere
2. Adding await to all database calls
3. Updating the engine creation in db/session.py
4. Updating all repository methods to be async
5. Updating all service methods that call repositories

Start by listing all files that import from sqlalchemy or db/ directory.
Then create a migration plan before touching any files.
Do NOT make all changes at once — do one layer at a time (db → repositories → services → routers).
After each layer, run: pytest tests/ -x -q and report results.

Claude Code will:

  1. Use shell tools to find all affected files
  2. Create a migration plan
  3. Modify files layer by layer
  4. Run tests after each layer and fix failures before proceeding

Cursor Composer approach for the same task:

@db/session.py @repositories/user_repository.py @services/user_service.py @routers/users.py

Migrate these files from sync to async SQLAlchemy 2.0.
Change Session → AsyncSession, add await to all db calls,
make all methods async.

Cursor does it faster for the files you specify but requires you to identify every affected file manually. For 120 files, this is impractical.

Refactoring Task 2: Extract a Service Layer

Codebase has logic scattered in route handlers. Extract it into service classes.

Claude Code handles this well:

This codebase has business logic directly in FastAPI route handlers.
Extract it following the existing repository pattern:
- Create services/ directory if it doesn't exist
- For each router file, extract non-routing logic into a service class
- Services should use dependency injection (FastAPI Depends)
- Keep route handlers thin: validate input → call service → return response
- Start with routers/payments.py — it's the most complex

Claude reads routers/payments.py, identifies what’s business logic vs routing, creates services/payment_service.py, updates the router to use it, and tests.

Cursor Composer for this task:

@routers/payments.py

Extract the business logic from this router into a new service class.
Create services/payment_service.py. Use dependency injection.
Keep the router thin.

Cursor’s output is higher quality for a single file because it has full file context and generates clean diffs. The tradeoff: you repeat this for every router file.

Refactoring Task 3: Breaking Up God Classes

One common refactoring challenge is a “god class” — a module that has grown to handle too many concerns. Both tools approach this differently.

Claude Code prompt for god class decomposition:

UserService in services/user_service.py has grown to 1,400 lines and handles:
- Authentication (login, token refresh, password reset)
- Profile management (update, avatar upload, preferences)
- Subscription billing (plan changes, invoice retrieval)
- Notification settings

Split it into focused service classes:
- AuthService (auth concerns only)
- ProfileService (profile management)
- BillingService (subscription + invoices)
- NotificationService (notification preferences)

Rules:
- Preserve all existing method signatures (callers must not change)
- Create a thin UserService facade that delegates to the new services
- Each new file gets its own test file mirroring tests/test_user_service.py structure
- Run tests after each class is extracted

Claude Code is particularly good at this because it can scan all callers of the original class to ensure nothing breaks during extraction.

Cursor for the same task:

@services/user_service.py @tests/test_user_service.py

Extract the authentication methods (login, refresh_token, reset_password)
from UserService into a new AuthService class in services/auth_service.py.
Keep UserService delegating to AuthService for backward compatibility.

Cursor produces a cleaner extraction for that one concern. You then repeat the process for each group of methods.

Verification Strategy

Verification is where both tools need explicit guidance.

For Claude Code, add a verification script:

# scripts/verify-refactor.py
import subprocess
import sys

checks = [
    ("Type checking", ["mypy", ".", "--ignore-missing-imports"]),
    ("Tests", ["pytest", "tests/", "-x", "-q"]),
    ("Linting", ["ruff", "check", "."]),
    ("Import check", ["python", "-c", "import app.main"]),
]

failed = []
for name, cmd in checks:
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        failed.append((name, result.stdout, result.stderr))
        print(f"FAIL: {name}")
        print(result.stdout[-500:])
    else:
        print(f"PASS: {name}")

if failed:
    sys.exit(1)

Tell Claude:

After every batch of changes, run: python scripts/verify-refactor.py
Fix all failures before continuing to the next file/module.

Context Management for Claude Code

For very large refactors, decompose into separate sessions:

# Session 1:
"Refactor the db/ layer and repositories/ only.
Create a commit when done: git commit -m 'refactor: async SQLAlchemy db layer'"

# Session 2 (new claude session):
"The db/ and repositories/ layers have been migrated to async SQLAlchemy.
Now migrate services/ to use the async repositories."

# Session 3:
"The db, repository, and service layers are async.
Now update routers/ to use async services and add await where needed."

Breaking the refactor into discrete commits also gives you safe rollback points.

Prompt Engineering Patterns That Work

Both tools benefit from well-structured refactoring prompts. These patterns consistently produce better results:

The “constraint-first” pattern — state what must NOT change before what should change:

Do not change any public method signatures.
Do not modify files outside of services/ and tests/.
Now: extract the billing logic from UserService into BillingService.

The “breadcrumb” pattern — explicitly tell Claude what has already been done in prior sessions:

Context: We have already migrated db/ and repositories/ to async.
The following files are complete: [list]
The following files still need migration: [list]
Start with services/order_service.py.

The “rollback checkpoint” pattern — commit before each risky change:

Before changing routers/payments.py, commit current state:
git add -A && git commit -m "chore: checkpoint before payments router refactor"
Then proceed with the refactor.
If tests fail, we can git reset --hard HEAD to recover.

Cursor’s Strength: Targeted Edits

Where Cursor beats Claude Code: when you know exactly what needs changing and want high-quality targeted diffs.

@models/user.py @schemas/user.py @repositories/user_repository.py

Add soft delete support:
- Add deleted_at: Optional[datetime] = None to User model
- Add is_deleted property
- Update UserRepository.list() to filter WHERE deleted_at IS NULL
- Update UserRepository.delete() to set deleted_at instead of hard delete
- Add UserRepository.restore() method
- Update UserSchema to exclude deleted_at from public responses

Cursor handles this in one shot with clean per-file diffs you can review individually.

Combining Both Tools

The optimal workflow for large refactors:

  1. Claude Code for planning — analyze the codebase and produce a migration plan
  2. Claude Code for mechanical bulk changes — renaming, import updates, adding awaits
  3. Cursor for quality-sensitive individual file edits where you want clean diffs
  4. Claude Code for testing and verification — it can run tests and fix failures iteratively
# Phase 1: Get the plan
claude "Analyze this codebase and create a detailed migration plan for [refactor].
List files in order, dependencies between changes, and risk for each step.
Write the plan to REFACTOR_PLAN.md but do not start making changes yet."

# Phase 2: Execute mechanical changes
claude "Execute steps 1-5 from REFACTOR_PLAN.md.
After each step, run pytest and fix failures before continuing."

# Phase 3: Use Cursor for complex individual files

# Phase 4: Final verification
claude "Run the full test suite, mypy, and ruff. Fix any remaining issues."

Performance Comparison

Dimension Claude Code Cursor Composer
Finding affected files Automated (shell tools) Manual (@mentions)
Multi-file coordination Excellent Requires manual sequencing
Diff quality per file Good Excellent
Test-and-fix loop Automated Manual
Context for 50k+ line repos Handles via tool calls Context window limits
IDE integration Terminal only Full IDE
Speed for single-file edits Moderate Fast
God class decomposition Strong (scans all callers) Good for one class at a time
Prompt-driven planning Excellent Limited

When to Use Which Tool

Use Claude Code when:

Use Cursor Composer when: