Python type annotations have a long tail of complexity: TypeVar bounds, Protocol for structural typing, ParamSpec for decorator type safety, and TypeGuard for narrowing. AI tools that only know str | None annotations will annotate simple code correctly but fail on anything generic. This guide tests Claude Code, Copilot, and MonkeyType against five real-world annotation scenarios.
Table of Contents
- The Annotation Quality Spectrum
- Why Type Annotations Pay Off
- Setting Up Your Annotation Workflow
- Task 1: Generic Functions
- Task 2: Decorator Type Safety with ParamSpec
- Task 3: Protocol for Structural Typing
- Task 4: TypedDict and Overload
- Task 5: TypeGuard for Narrowing Functions
- Automated Annotation with MonkeyType + AI Review
- Choosing the Right Tool by Team Size
- Tool Comparison
- Related Articles
The Annotation Quality Spectrum
Most AI tools handle these well:
def greet(name: str) -> str: ...
def fetch_user(user_id: int) -> User | None: ...
These are where they diverge:
def map_values(d: dict, fn: ???) -> ???: ... # Generic typing
def retry(fn: ???) -> ???: ... # Decorator typing
class Repository(???): ... # Protocol/ABC typing
The difference matters in real codebases. dict without parameters is technically valid but tells the type checker nothing useful. Any is worse — it silently disables checking downstream.
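A minimal sketch of why Any is costly. The helpers first_key_loose and first_key_typed are hypothetical names chosen for illustration. The loosely typed version lets a checker accept a call that fails at runtime:

```python
from typing import Any


def first_key_loose(d: dict) -> Any:
    """Bare dict and Any: the checker learns nothing about the result."""
    return next(iter(d))


def first_key_typed(d: dict[str, int]) -> str:
    """Parameterized dict: the checker knows the result is a str."""
    return next(iter(d))


# A type checker accepts .upper() here because the result is Any, yet the
# call fails at runtime: the key is an int, and int has no .upper() method.
loose_key = first_key_loose({1: "a"})
try:
    loose_key.upper()
    outcome = "no error"
except AttributeError:
    outcome = "AttributeError"

# With the typed version, first_key_typed({1: "a"}) would be flagged by the
# checker before the code ever runs; a valid call is known to return str.
typed_key = first_key_typed({"a": 1}).upper()
```

The runtime behavior of both functions is identical; only the amount of information available to the checker differs.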
Why Type Annotations Pay Off
Type annotations are documentation that can’t go stale. When a function signature says fn: Callable[[V], R], every IDE and type checker understands the relationship between inputs and outputs. When someone changes the callable’s return type, the type checker immediately flags every call site that assumes the old return type.
The investment also compounds. Once a codebase reaches roughly 60% annotation coverage, the type checker catches many refactoring errors before they reach CI. Teams that add annotations incrementally, starting with public APIs and utility functions, tend to see meaningful error detection within weeks because those signatures are the most widely called. AI tools accelerate this process because most annotation work is mechanical and does not require understanding business logic, although the generic patterns tested below still need human review.
Setting Up Your Annotation Workflow
Before comparing tools, a quick setup note. For interactive use, pyright gives instant feedback:
pip install pyright mypy monkeytype
# VS Code: install the Pylance extension (uses pyright internally)
# Check a file for annotation errors
pyright your_module.py
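# Strict mode: pyright has no --strict command-line flag. Put a
# "# pyright: strict" comment at the top of the file, or set
# "typeCheckingMode": "strict" in pyrightconfig.json, then rerun:
pyright your_module.py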
For CI, add mypy to your pipeline:
# .github/workflows/typecheck.yml
- name: Run mypy
  run: mypy --strict src/ --ignore-missing-imports
Task 1: Generic Functions
Prompt: “Add type annotations to this function that maps dict values.”
# Before
def map_values(d, fn):
    return {k: fn(v) for k, v in d.items()}
Copilot’s annotation:
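def map_values(d: dict, fn: callable) -> dict:
    return {k: fn(v) for k, v in d.items()}
The builtin callable is a function, not a type, so type checkers reject it as an annotation; the correct spelling is Callable from collections.abc (or typing). The bare dict also discards the key and value types, so the checker cannot relate the input to the output.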
Claude’s annotation:
from typing import TypeVar
from collections.abc import Callable

K = TypeVar("K")
V = TypeVar("V")
R = TypeVar("R")

def map_values(d: dict[K, V], fn: Callable[[V], R]) -> dict[K, R]:
    """Transform dictionary values using the provided function.

    The keys are preserved; each value is replaced by fn(value).
    """
    return {k: fn(v) for k, v in d.items()}

# Now mypy/pyright knows the full types:
prices: dict[str, int] = {"apple": 100, "banana": 50}
discounted: dict[str, float] = map_values(prices, lambda p: p * 0.9)  # Correct!
Claude’s version allows type checkers to infer the output type based on the callable’s return type. This means map_values(user_ids, fetch_user) correctly infers dict[str, User | None] without any additional annotation.
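The same technique extends to the TypeVar bounds mentioned in the introduction. The following is a sketch only, built around a hypothetical group_by helper that is not part of the comparison above. Bounding the key type by Hashable should let a checker reject key functions that return unhashable values such as lists:

```python
from collections.abc import Callable, Hashable
from typing import TypeVar

# K is bounded: only hashable types are acceptable as dictionary keys.
K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def group_by(items: list[V], key: Callable[[V], K]) -> dict[K, list[V]]:
    """Group items into lists keyed by key(item), preserving input order."""
    groups: dict[K, list[V]] = {}
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return groups


# Inferred as dict[str, list[str]]: K is str, V is str.
by_initial = group_by(["apple", "avocado", "banana"], lambda s: s[0])
```

An unbounded K would also work at runtime; the bound only moves the "unhashable key" failure from runtime to type-check time.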
Task 2: Decorator Type Safety with ParamSpec
Prompt: “Add type annotations to this retry decorator that preserves the wrapped function’s signature.”
# Before
import time

def retry(max_attempts=3, delay=1.0):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    # Re-raise on the final attempt; otherwise wait and retry
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator
Copilot’s annotation:
def retry(max_attempts: int = 3, delay: float = 1.0) -> Callable:
    def decorator(fn: Callable) -> Callable:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            ...
        return wrapper
    return decorator
This discards the wrapped function's signature. Every decorated function becomes a bare Callable, so the checker accepts any arguments at the call site and treats the result as Any. A call with the wrong argument type passes unnoticed.
Claude’s annotation:
from typing import ParamSpec, TypeVar
from collections.abc import Callable
from functools import wraps
import time

P = ParamSpec("P")
T = TypeVar("T")

def retry(
    max_attempts: int = 3,
    delay: float = 1.0,
    exceptions: tuple[type[Exception], ...] = (Exception,)
) -> Callable[[Callable[P, T]], Callable[P, T]]:
    """Retry decorator that preserves the wrapped function's type signature."""
    def decorator(fn: Callable[P, T]) -> Callable[P, T]:
        @wraps(fn)
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
            last_error: Exception | None = None
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except exceptions as e:
                    last_error = e
                    if attempt < max_attempts - 1:
                        time.sleep(delay * (2 ** attempt))  # Exponential backoff
            raise last_error  # type: ignore[misc]
        return wrapper
    return decorator

# Type safety is preserved:
@retry(max_attempts=3)
def fetch_data(url: str, timeout: int = 30) -> bytes:
    ...

result: bytes = fetch_data("https://example.com", timeout=10)  # Correctly typed
fetch_data(123)  # mypy error: Argument 1 has incompatible type "int"; expected "str"
ParamSpec is the key — it preserves the parameter types of the wrapped function. This was introduced in Python 3.10 and backported via typing_extensions for earlier versions.
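A sketch of the version-dependent import, paired with a small hypothetical log_calls decorator (the decorator and the calls list are illustrative, not from the comparison above). It imports Callable from typing rather than collections.abc because that spelling is the safer choice with the typing_extensions backport on older interpreters:

```python
import sys
from functools import wraps
from typing import Callable, TypeVar

if sys.version_info >= (3, 10):
    from typing import ParamSpec
else:  # Older interpreters: requires `pip install typing_extensions`
    from typing_extensions import ParamSpec

P = ParamSpec("P")
T = TypeVar("T")

# Records the name of each decorated function as it is called.
calls: list[str] = []


def log_calls(fn: Callable[P, T]) -> Callable[P, T]:
    """Record each call while keeping fn's parameter and return types."""
    @wraps(fn)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
        calls.append(fn.__name__)
        return fn(*args, **kwargs)
    return wrapper


@log_calls
def add(a: int, b: int = 0) -> int:
    return a + b
```

With this in place, a checker should still see add as taking (a: int, b: int = 0) and returning int, so add("x") is reported just as it would be without the decorator.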
Task 3: Protocol for Structural Typing
Prompt: “Write a function that accepts any object with a .save() method. Use structural typing.”
Copilot’s output:
def persist(obj: Any) -> None:  # Uses Any, so there is no type safety
    obj.save()
Claude’s output:
from typing import Protocol, runtime_checkable

@runtime_checkable
class Saveable(Protocol):
    """Protocol for objects that can be persisted."""
    def save(self) -> None: ...

class SaveWithResult(Protocol):
    """Protocol for objects whose save returns a result."""
    def save(self) -> bool: ...

def persist(obj: Saveable) -> None:
    """Persist any object that implements the Saveable protocol.

    This function works with any class that has a save() method,
    without requiring inheritance from a base class.
    """
    obj.save()

def persist_with_check(obj: SaveWithResult) -> bool:
    return obj.save()

# Works with any class that has save(); no inheritance needed
class UserModel:
    def save(self) -> None:
        print("Saving user to database")

class InMemoryStore:
    def save(self) -> None:
        print("Saving to memory")

persist(UserModel())      # OK
persist(InMemoryStore())  # OK
persist("not saveable")   # mypy error: Argument 1 missing save()

# runtime_checkable enables isinstance checks
assert isinstance(UserModel(), Saveable)  # True at runtime
The practical advantage of Protocol over ABC is that existing classes satisfy it without modification. If you’re consuming third-party objects that happen to have a save() method, Protocol lets you type-check against them without monkey-patching.
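One caveat worth illustrating: isinstance against a @runtime_checkable protocol only checks that the named methods exist, not that their signatures match. A minimal sketch, where WrongSignature is a hypothetical class added for the demonstration:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Saveable(Protocol):
    def save(self) -> None: ...


class WrongSignature:
    """Has a save() method, but with different parameters and return type."""
    def save(self, path: str) -> int:
        return len(path)


# The runtime check passes because only the presence of save() is tested.
passes_runtime_check = isinstance(WrongSignature(), Saveable)

# Objects with no save() attribute at all are rejected.
string_is_saveable = isinstance("not saveable", Saveable)
```

A static checker would reject passing WrongSignature() to a function expecting Saveable, so the static check is the stricter of the two.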
Task 4: TypedDict and Overload
# Before
def process_config(config):
    if config.get("mode") == "batch":
        return {"status": "queued", "job_id": generate_id()}
    return {"status": "running", "result": execute(config)}
Claude’s annotation:
from typing import Any, Literal, TypedDict, overload

class BatchConfig(TypedDict):
    mode: Literal["batch"]
    batch_size: int

class StreamConfig(TypedDict):
    mode: Literal["stream"]
    buffer_size: int

class BatchResult(TypedDict):
    status: Literal["queued"]
    job_id: str

class StreamResult(TypedDict):
    status: Literal["running"]
    result: Any

@overload
def process_config(config: BatchConfig) -> BatchResult: ...
@overload
def process_config(config: StreamConfig) -> StreamResult: ...
def process_config(config: BatchConfig | StreamConfig) -> BatchResult | StreamResult:
    # generate_id() and execute() are the helpers from the original function
    if config["mode"] == "batch":
        return BatchResult(status="queued", job_id=generate_id())
    return StreamResult(status="running", result=execute(config))

# Type checker knows the exact return type based on input:
batch_result = process_config(BatchConfig(mode="batch", batch_size=100))
# batch_result["job_id"] is valid: the type checker knows it's a BatchResult
@overload lets you express that a function’s return type depends on its input type. Without it, the return type would be BatchResult | StreamResult and you’d need a type narrowing assertion every time you use the result.
Task 5: TypeGuard for Narrowing Functions
One pattern where tool quality diverges sharply is TypeGuard — functions that narrow a union type inside conditionals.
# Before: type checker can't narrow inside this branch
def is_valid_user(obj: dict) -> bool:
    return "id" in obj and "email" in obj

# After: Claude adds TypeGuard
from typing import TypedDict, TypeGuard

class UserDict(TypedDict):
    id: str
    email: str
    name: str

def is_valid_user(obj: dict) -> TypeGuard[UserDict]:
    """Return True if obj has all required UserDict fields."""
    return (
        isinstance(obj.get("id"), str) and
        isinstance(obj.get("email"), str) and
        isinstance(obj.get("name"), str)
    )

# Now type checker narrows inside the branch
def process_api_response(data: dict) -> str | None:
    if is_valid_user(data):
        return data["email"]  # data is UserDict here, so no error
    return None
Copilot produces -> bool without the TypeGuard wrapper. This works at runtime, but the type checker still sees a plain dict inside the branch, so data["email"] is typed as Any rather than str, and strict settings flag the return.
Automated Annotation with MonkeyType + AI Review
# Collect runtime type information
pip install monkeytype
monkeytype run your_script.py
monkeytype apply your_module
Then use Claude to review and improve the generated annotations:
from anthropic import Anthropic
from pathlib import Path

client = Anthropic()
source = Path("your_module.py").read_text()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": f"""Review these MonkeyType-generated annotations and improve them:
1. Replace Any with proper generics where possible
2. Add Protocol types where duck typing is used
3. Use Literal for string enums
4. Add TypeVar/ParamSpec where functions are generic
5. Add TypeGuard where isinstance checks narrow types

Source:
{source[:4000]}

Return the improved source with annotations."""
    }]
)
print(response.content[0].text)
MonkeyType captures what types actually flowed through at runtime, which eliminates guesswork about what values a function actually receives. The AI review then upgrades flat str annotations to Literal["pending", "shipped"] where the sample values suggest it, and adds generics where MonkeyType fell back to Any.
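To make the str-to-Literal upgrade concrete, here is a sketch of what a reviewed annotation might look like. OrderStatus and set_status are hypothetical names, and the two status values are taken from the example in the paragraph above:

```python
from typing import Literal, get_args

# Before review, the runtime-derived annotation would simply be `status: str`.
OrderStatus = Literal["pending", "shipped"]


def set_status(order_id: int, status: OrderStatus) -> str:
    """Return a description of the status change, validating at runtime too."""
    # Literal is only enforced statically, so untyped callers still need a check.
    if status not in get_args(OrderStatus):
        raise ValueError(f"unknown status: {status!r}")
    return f"order {order_id} -> {status}"


message = set_status(7, "shipped")
```

Because runtime samples only show the values that happened to occur, a Literal inferred this way may be too narrow; it is worth confirming the full set of allowed values before committing.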
Choosing the Right Tool by Team Size
For solo projects or small teams (1–5 engineers): Use Claude Code interactively. Ask it to annotate one module at a time and review the output. Focus on public function signatures first; internal helpers can wait.
For medium teams (5–20 engineers): Add mypy to CI with --strict on new code only (--exclude existing modules). Use MonkeyType to collect runtime types for existing modules, then AI-review the generated stubs before committing.
For large codebases: Run pyright in watch mode for developers and mypy in CI. Use a staged rollout — annotate utilities and shared libraries first, then work outward. Copilot handles the mechanical volume; Claude handles the complex patterns like ParamSpec and Protocol.
Tool Comparison
| Pattern | Claude Code | Copilot | MonkeyType |
|---|---|---|---|
| Basic annotations | Correct | Correct | Correct |
| Generic TypeVar | Full generics | dict without params | Runtime-inferred |
| ParamSpec decorators | Correct | Uses Any | Partial |
| Protocol types | Full with @runtime_checkable | Uses Any | No |
| TypedDict + overload | Complete | Basic TypedDict | No |
| Literal types | Yes | Sometimes | No |
| TypeGuard | Yes | Rarely | No |
Related Articles
- Best AI Tools for Writing Python Type Hints 2026
- Cursor vs Copilot for Adding Type Hints to Untyped Python
- Best AI Tools for TypeScript Type Inference and Generic
- Best AI Assistant for Fixing TypeScript Strict Mode Type
- How Well Do AI Tools Handle Go Generics Type Parameter