Claude Skills Guide

Claude Code for Go Profile-Guided Optimization

Profile-Guided Optimization (PGO) is one of the most powerful techniques for optimizing Go applications, yet many developers underutilize it because they don’t know how to properly collect and apply profiling data. Claude Code can significantly streamline this process, helping you identify optimization opportunities, generate the right profiling code, and interpret results effectively. This guide shows you how to combine Claude Code’s capabilities with Go’s built-in pprof tooling to achieve meaningful performance improvements.

Understanding Profile-Guided Optimization in Go

Profile-Guided Optimization works by letting the compiler make intelligent decisions about inlining, code placement, and register allocation based on real execution data. When you enable PGO in Go 1.21+, the compiler uses CPU profiles to hot path optimize your code, typically delivering 2-10% performance improvements automatically.

The workflow involves three main stages: collecting representative CPU profiles from production or realistic test scenarios, processing those profiles into a format the compiler understands, and rebuilding your application with the profile data. Each stage presents opportunities for Claude Code to assist you.

Claude Code can help by generating the profiling infrastructure, explaining complex profiling results, and guiding you through the optimization decisions based on what the profiles reveal. The key is ensuring you collect profiles that represent actual production workloads rather than synthetic benchmarks.

Setting Up Profiling Infrastructure

Before you can optimize with PGO, you need reliable profiling infrastructure. Claude Code can help you set this up correctly the first time, avoiding common pitfalls like profiling in development environments that don’t match production behavior.

Start by creating a profiling wrapper that captures CPU samples during representative workloads:

package main

import (
    "os"
    "runtime/pprof"
    "log"
)

func main() {
    // Create CPU profile file
    f, err := os.Create("cpu.pprof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    
    // Start CPU profiling
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()
    
    // Your application logic here
    runApplication()
}

func runApplication() {
    // This represents your actual workload
}

Claude Code can help you extend this to include memory profiling, block profiling, and goroutine profiling. For production environments, you’ll want to use the net/http/pprof package to expose profiling endpoints that can be triggered safely:

import _ "net/http/pprof"

func init() {
    go func() {
        // Note: In production, add authentication!
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
}

The important thing is collecting profiles during periods of representative load. Ask Claude Code to help you design a profiling strategy that captures the right scenarios—peak traffic, complex queries, or batch processing depending on your application.

Collecting and Processing Profiles

Once your profiling infrastructure is in place, you need to collect profiles that represent how your application actually performs in production. This is where many developers go wrong: they profile development workloads that don’t match real usage patterns.

Work with Claude Code to create a comprehensive profiling plan:

  1. Identify your critical user journeys and hot paths
  2. Set up automated profile collection during those scenarios
  3. Aggregate multiple profiles to capture variance in workloads
  4. Validate that collected profiles show consistent patterns

For web services, this typically means collecting profiles during realistic load tests using tools like hey, wrk, or k6. For CLI tools, it means running representative commands. For libraries, it means executing the most common call patterns.

Once collected, process your profiles into the format Go’s PGO system expects:

# Collect CPU profile
go tool pprof -proto cpu.pprof > cpu.prof

# Or use the newer pprof convention
go tool pprof -output=cpu.prof cpu.pprof

Place the resulting .prof file in your module root, named default.pprof for automatic detection by the Go compiler.

Building with PGO Enabled

With your profile data collected and processed, enabling PGO is straightforward. The Go compiler automatically looks for default.pprof in your module root:

# Build with PGO (Go 1.21+)
go build -pgo=auto .

# Or explicitly specify profile
go build -pgo=cpu.prof .

When you rebuild with PGO, you’ll see the compiler making different optimization decisions. The output binary should be faster for your profiled workloads. However, this is only the beginning of the optimization process.

Claude Code can help you interpret what’s happening during the PGO build. Ask it to explain what specific optimizations the compiler is applying and whether there are additional changes you can make to improve the profile quality.

Iterating on Profile Data

Single-profile PGO is good, but iterative profiling produces better results. As you make changes to your code, new hot paths emerge and old ones disappear. Your profile data becomes stale.

Establish a regular profiling cadence:

Claude Code can help you automate profile collection and comparison. Create scripts that capture profiles, compare them against baselines, and alert on significant changes:

package main

import (
    "fmt"
    "os"
    "runtime/pprof"
    "time"
)

func collectProfile(duration time.Duration, output string) error {
    f, err := os.Create(output)
    if err != nil {
        return err
    }
    defer f.Close()
    
    if err := pprof.StartCPUProfile(f); err != nil {
        return err
    }
    
    time.Sleep(duration)
    pprof.StopCPUProfile()
    
    return nil
}

func main() {
    // Collect 30-second profile during load test
    if err := collectProfile(30*time.Second, "cpu.prof"); err != nil {
        fmt.Fprintf(os.Stderr, "Profile collection failed: %v\n", err)
        os.Exit(1)
    }
    
    fmt.Println("Profile saved to cpu.prof")
}

Compare profiles over time to understand whether optimizations are working or if new performance issues are emerging. This iterative approach ensures your PGO data remains current and valuable.

Beyond Basic PGO: Deeper Optimizations

While PGO provides automatic compiler optimizations, the profiling data it generates can guide many other performance improvements. Use pprof visualizations and analysis to identify:

Claude Code excels at helping you analyze pprof output and translate it into actionable code changes. Share pprof text output with Claude Code and ask for specific optimization suggestions:

go tool pprof -text cpu.prof | head -50

This shows the top CPU-consuming functions. Claude Code can then suggest concrete improvements, whether that’s caching frequently computed values, using more efficient data structures, or parallelizing sequential operations.

Best Practices for PGO Success

To get the most from PGO, follow these proven practices:

Collect representative profiles. The优化 effect is only as good as your profiling data. Profile production-like workloads, not artificial benchmarks.

Update profiles regularly. As your code evolves, so do your hot paths. Re-profile after significant changes.

Focus on the biggest wins first. PGO helps most where it matters most—in your actual hot paths. Don’t over-optimize code that barely appears in profiles.

Combine with other optimizations. PGO works alongside other techniques like algorithmic improvements, caching, and concurrency. Use profiling data to prioritize where manual effort will have the most impact.

Test performance regressions. Ensure your optimizations don’t break functionality. Profile before and after changes to verify improvements.

Claude Code can help you implement all of these practices, from generating profiling scripts to analyzing results and implementing suggested optimizations. By combining AI assistance with Go’s built-in tooling, you can achieve significant performance improvements with less trial and error.

The key is treating PGO as part of an ongoing performance engineering process rather than a one-time optimization. With Claude Code’s help, you can establish this cycle and continuously improve your application’s performance over time.

Built by theluckystrike — More at zovo.one