Technical documentation has a quality ceiling that most AI tools don’t reach: accurate code examples, the right level of detail for the audience, and explanations that don’t talk down to experienced engineers. This guide compares Claude and GPT-4 across four documentation types that actually get written in engineering teams.
PYM-001**: Payment Service P95 Latency > 2s
Severity: P2 — escalate to P1 if >5s or error rate >1% On-call contact: #payments-oncall
1.
- No constraints on
amount: no explanation of when to useidempotency_key, no error scenarios. - Restart the service if: needed 5.
- GPT-4’s requires a senior: engineer to translate into actual steps.
Test 1: API Reference from Code
Both models were given the same Python function with a docstring stub and asked to generate API reference documentation.
def create_payment_intent(
amount: int,
currency: str,
customer_id: str | None = None,
metadata: dict | None = None,
idempotency_key: str | None = None
) -> PaymentIntent:
"""
Creates a payment intent for a given amount.
"""
GPT-4’s output:
## create_payment_intent
Creates a payment intent for a given amount.
### Parameters
- `amount` (int): The amount to charge.
- `currency` (str): The currency code.
- `customer_id` (str, optional): The customer ID.
- `metadata` (dict, optional): Metadata.
- `idempotency_key` (str, optional): Idempotency key.
### Returns
`PaymentIntent`: The created payment intent.
Accurate but generic. No constraints on amount, no explanation of when to use idempotency_key, no error scenarios.
Claude’s output:
## create_payment_intent
Creates a payment intent representing a pending charge that can be confirmed
or cancelled. Payment intents are the recommended way to handle complex
payment flows, including 3D Secure authentication.
### Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `amount` | `int` | Yes | Amount in the **smallest currency unit** (cents for USD). Minimum: 50. |
| `currency` | `str` | Yes | ISO 4217 currency code (e.g., `"usd"`, `"eur"`). Lowercase. |
| `customer_id` | `str \| None` | No | Customer ID from `create_customer()`. Attach to track payment history. |
| `metadata` | `dict \| None` | No | Up to 50 key-value pairs. Keys max 40 chars, values max 500 chars. |
| `idempotency_key` | `str \| None` | No | UUID to safely retry on network failure without double-charging. |
### Returns
`PaymentIntent` — See [PaymentIntent object reference](#paymentintent-object).
### Raises
- `InvalidAmountError`: `amount` is below the minimum (50 cents) or exceeds the maximum.
- `CurrencyNotSupportedError`: `currency` is not in the supported list.
- `CustomerNotFoundError`: `customer_id` does not exist.
### Example
```python
import uuid
from payments import create_payment_intent
# Charge $29.99 USD, safe to retry
intent = create_payment_intent(
amount=2999, # $29.99 in cents
currency="usd",
customer_id="cus_abc123",
metadata={"order_id": "ord_xyz"},
idempotency_key=str(uuid.uuid4())
)
print(f"Intent created: {intent.id}, status: {intent.status}")
Notes
- Payment intents expire after 24 hours if not confirmed.
- For recurring charges, use
create_subscription()instead. ```
Claude inferred the amount is in cents, provided a realistic example with uuid4() for idempotency, and added error types that make logical sense even without seeing the implementation.
Test 2: Runbook for On-Call Engineers
Prompt: “Write a runbook section for: payment service P95 latency exceeds 2 seconds.”
Both models produced runbook entries, but the structure differed significantly.
GPT-4’s runbook:
## Payment Service High Latency
If P95 latency > 2 seconds:
1. Check the payment service dashboard
2. Look for errors in the logs
3. Check database connections
4. Restart the service if needed
5. Escalate if issue persists
Generic enough to apply to any service. No specific commands, no metrics to check.
Claude’s runbook:
## PYM-001: Payment Service P95 Latency > 2s
**Severity**: P2 — escalate to P1 if >5s or error rate >1%
**On-call contact**: #payments-oncall
### 1. Triage (2 minutes)
```bash
# Current P95 across all instances
kubectl exec -n payments deploy/payment-svc -- \
curl -s localhost:9090/metrics | grep http_request_duration_p95
# Error rate trend
kubectl logs -n payments -l app=payment-svc --since=10m | \
grep '"level":"error"' | wc -l
Check: Payments Latency Dashboard
2. Identify the Layer (5 minutes)
Database latency?
kubectl exec -n payments deploy/payment-svc -- \
curl -s localhost:9090/metrics | grep db_query_duration_p95
If db_query_duration_p95 > 1.5s → go to [DB Runbook PYM-DB-001]
External API (Stripe) latency?
kubectl exec -n payments deploy/payment-svc -- \
curl -s localhost:9090/metrics | grep stripe_api_duration_p95
If stripe_api_duration_p95 > 500ms → check Stripe Status Page
Pod resource saturation?
kubectl top pods -n payments -l app=payment-svc
If CPU > 80% or memory > 85% → scale horizontally:
kubectl scale deploy/payment-svc -n payments --replicas=6
3. Known Causes
| Symptom | Likely Cause | Fix |
|---|---|---|
| Slow around :00/:30 minute | Cron job running reports | Move cron to off-peak |
| Latency after deploy | N+1 query in new code | Roll back, see PYM-002 |
| Gradual increase over hours | Connection pool exhaustion | Restart payment-svc |
4. Escalation
Escalate to #payments-eng if:
- Issue persists >15 minutes after mitigation
- P95 > 5s
- Transaction failure rate > 0.5% ```
Claude’s runbook is immediately actionable with real commands and thresholds. GPT-4’s requires a senior engineer to translate into actual steps.
Test 3: Tutorial Writing
Prompt: “Write an introduction section for a tutorial on setting up Redis caching in a FastAPI app.”
Both models write clear prose. The difference is in code example quality and conceptual accuracy.
GPT-4 introduction:
In this tutorial, we’ll learn how to add Redis caching to a FastAPI application. Redis is an in-memory database that can significantly speed up your app. We’ll cover installation, configuration, and implementing a simple cache.
Claude introduction:
FastAPI’s async architecture makes it a natural fit for Redis — both are non-blocking by design. Adding a cache layer with
redis-py’s async client reduces database load for read-heavy endpoints without changing your API surface.By the end of this tutorial, you’ll have a working cache-aside pattern: on cache miss, fetch from the database and write to Redis with a TTL; on cache hit, return the cached value and skip the database entirely.
What you’ll build:
- A
CacheManagerclass wrappingredis.asyncio- A
@cacheddecorator for endpoint-level caching- Cache invalidation when data changes
Prerequisites: FastAPI app running, Redis 7+ available (
docker run -p 6379:6379 redis:7)
Claude explains why Redis fits FastAPI (async alignment), describes the cache-aside pattern before showing code, and gives a concrete prerequisite with a runnable Docker command. GPT-4 uses generic tutorial phrasing.
Test 4: README Generation from Code
Claude consistently produces better READMEs by inferring purpose and usage from code structure. Given a Python CLI tool with argparse, Claude generates:
- Accurate installation instructions based on the package structure
- Usage examples that match the actual argument names
- A “Why this tool” section that captures the design intent
GPT-4 produces accurate but generic README templates that require significant editing.
Side-by-Side Score
| Documentation Type | Claude | GPT-4 |
|---|---|---|
| API reference | Parameter constraints, errors, examples | Accurate but generic |
| Runbooks | Specific commands and thresholds | Needs translation to commands |
| Tutorials | Pattern explanation + code | Clear but surface-level |
| README generation | Infers intent from code | Generic template |
| Speed | Slower for long docs | Faster |
| Token efficiency | Longer output | More concise |
When to Use GPT-4 for Docs
GPT-4 is better when you need:
- High volume, good-enough documentation
- Translation or localization of existing docs
- First draft that a human will heavily edit
- Fast iteration on doc structure
Claude is better for:
- Documentation that ships without human review
- Runbooks and operational guides
- API references from complex code
- Explaining non-obvious design decisions
Related Articles
- Claude Code Runbook Documentation Guide
- Best AI Tools for Writing Swagger API Documentation 2026
- Gemini vs Claude for Generating Markdown Documentation
- Best AI Tools for Technical Documentation Writing in 2026
- ChatGPT vs Claude for Writing API Documentation
Built by theluckystrike — More at zovo.one