Claude Skills Guide

Claude Skills Slow Performance: Speed Up Guide

If your Claude Code skill workflows feel sluggish — long waits between tool calls, slow response generation, or sessions that drag on far longer than they should — the cause is almost always one of four things: too many tokens in context, unneeded tool round-trips, overly verbose skill definitions, or poor session scoping. This guide covers each one with actionable fixes.

Why Claude Skills Can Be Slow

Claude Code skill performance has two distinct components:

  1. API latency — how long the model takes to generate a response, which scales with context window size
  2. Tool execution time — how long Bash commands, file reads, or external processes take to complete

Both can make a skill feel slow, but the fixes are different. Most developers focus on tool execution when the real bottleneck is context size.

Fix 1: Trim Your Skill Definitions

Every word in a skill file is loaded into context every time you invoke the skill. A 1,500-word skill definition adds ~2,000 tokens to every request. With three active skills in a session, that is 6,000 tokens of static overhead before you write a single character.

Audit your skill files:

wc -w ~/.claude/skills/*.md

Anything above 400 words is worth reviewing. Common sources of bloat:

Cut to the minimal set of instructions that changes Claude’s behavior. The tdd skill, for example, needs to say “write tests before implementation” and specify the test framework — not explain what TDD is.

Before:

## What is TDD?
Test-Driven Development (TDD) is a software development approach where you write tests
before writing the implementation. This helps ensure that your code is testable and
that all requirements are captured as tests before you start coding...
[200 words of explanation]

After:

## Approach
Write the failing test first. Then write the minimum implementation to make it pass.
Then refactor. Use Jest for JavaScript, pytest for Python.

Fix 2: Reduce Tool Call Round-Trips

Each tool call (Bash execution, file read, file write) requires a separate API request. If your skill results in 20 sequential tool calls to accomplish something that could be done in 5, you are paying for 15 extra round-trips.

Batch reads:

# Slow — three separate read requests
Read src/auth.ts
[response]
Read src/user.ts
[response]
Read src/session.ts

# Fast — one request
Read src/auth.ts, src/user.ts, and src/session.ts. Summarize each in one sentence.

Combine Bash operations:

# Slow — separate commands
npm run lint
npm run typecheck
npm test

# Fast — one Bash call
npm run lint && npm run typecheck && npm test

For tdd workflows, instruct Claude to batch the test run and implementation in fewer cycles rather than checking after every minor change.

Fix 3: Use /compact Before Long Tasks

Running /compact at the start of a session that already has some history compresses prior turns and frees up context budget. This means the model processes less on every subsequent request:

/compact
Now run /tdd and implement the AuthService tests.

For the frontend-design skill, compact after completing each component — not at the end of the whole session.

Fix 4: Scope Each Session Tightly

The more prior conversation history in a session, the longer each new response takes to generate. For complex tasks, structure work into focused sessions rather than one marathon session:

Slow — one long session:

Session: /supermemory + /tdd + /frontend-design + all feature work in one go

Fast — focused sessions:

Session 1: /tdd — write all tests for UserService
Session 2: /tdd — implement UserService
Session 3: /frontend-design — build UserForm component

Each session starts fresh with a small context, producing faster responses throughout.

Fix 5: Cut supermemory Read Scope

The supermemory skill can slow down session starts if it reads a large memory store. If you have been using supermemory for months across many projects, the store may have hundreds of entries.

Fix — scope memory reads:

/supermemory
Load only memory entries tagged with "project:myapp" from the last 30 days.
Ignore everything else.

Or purge stale entries:

/supermemory
Delete all entries older than 60 days.

Regularly trimming the memory store keeps supermemory fast on session restore.

Fix 6: Avoid Verbose Tool Output

Tool outputs become part of the context window. A Bash command that dumps 500 lines of log output adds thousands of tokens to the session and slows every subsequent response.

Suppress unnecessary output in your tool calls:

# Slow — full test output
npm test

# Fast — summary only
npm test 2>&1 | tail -20

# Fast — just pass/fail counts
npm test --silent 2>&1 | grep -E "(Tests:|PASS|FAIL)" | tail -5

Teach your skill to use summary flags:

## Tool Usage
When running tests, use summary output only. Run: `npm test --silent`
For build errors, show only the first 20 lines: `npm run build 2>&1 | head -20`

Fix 7: Scope Skill Instructions Precisely

The more specific your skill instructions, the faster Claude produces focused output. Remove unnecessary guidance from skill files that covers cases your use case never triggers:

# Before — overly broad instructions
---
name: tdd
description: TDD workflow
---
Handle all types of testing: unit, integration, E2E, performance, security...

# After — focused instructions
---
name: tdd
description: TDD workflow
---
Write unit tests for the code I provide. Use the testing framework I specify.

Skills are plain Markdown files with only name: and description: in the front matter — there is no tools: field.

Fix 8: Use the pdf Skill With Page Ranges

The pdf skill processing a 200-page document is inherently slow because of the volume of text being loaded. Always specify page ranges for large documents:

/pdf
Extract the executive summary from pages 1-5 of Q4-report.pdf only.

Process sections in separate focused sessions rather than the entire document in one session.

Fix 9: Profile With Session Timing

To identify which specific part of your skill workflow is slow, time individual steps:

# In a Bash tool call — measure a specific command
time npm run build 2>&1 | tail -5

Then compare whether the slow phase is the tool execution (the time output shows it) or the Claude response generation (the pause before Claude shows its analysis). If tool execution is slow, optimize the command. If response generation is slow, reduce context.

Benchmark: What “Fast” Looks Like

For reference, a well-tuned Claude Code skill session should feel like:

If you are consistently beyond these ranges on simple tasks, context bloat or tool verbosity is the likely cause.


Built by theluckystrike — More at zovo.one