Claude Skills Guide

Claude Code for OpenTelemetry Metrics Workflow Guide

OpenTelemetry has become the go-to standard for observability in modern applications. While distributed tracing gets most of the attention, metrics are equally crucial for understanding system health, capacity planning, and performance optimization. This guide shows you how to use Claude Code to implement OpenTelemetry metrics efficiently in your projects.

Understanding OpenTelemetry Metrics

OpenTelemetry metrics provide quantitative measurements about your system’s behavior. Unlike traces that follow individual requests, metrics aggregate data over time—think request counts, response latencies, memory usage, and error rates. These aggregated values help you understand overall system performance and detect trends before they become critical issues.

OpenTelemetry supports three main metric types: counters ( monotonically increasing values), gauges (point-in-time measurements), and histograms (statistical distributions). Each serves different monitoring purposes, and Claude Code can help you understand which to use when.

When working with Claude Code, start by explaining your monitoring goals clearly. For example: “I need to track API request counts and response times for my Node.js service.” This helps Claude provide targeted guidance for your specific use case.

Setting Up OpenTelemetry Metrics with Claude Code

Initial Project Setup

Begin by describing your requirements to Claude in your CLAUDE.md file or during your session:

I have a Python FastAPI application running on Kubernetes. I need to add OpenTelemetry metrics to track:
1. Request count by endpoint and status code
2. Response time histogram
3. Active connections gauge
4. Custom business metrics for user signups and transactions

Claude will generate the appropriate setup code. For Python, this typically involves installing the OpenTelemetry SDK and exporters:

pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-prometheus opentelemetry-instrumentation-fastapi

Creating Metrics Configuration

Claude can generate a comprehensive metrics setup file tailored to your framework. Here’s an example of what the configuration might look like:

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# Create a resource that describes your service
resource = Resource.create({
    "service.name": "my-fastapi-service",
    "service.version": "1.0.0"
})

# Set up the Prometheus reader for metrics export
reader = PrometheusMetricReader()
provider = MeterProvider(resource=resource, metric_readers=[reader])
metrics.set_meter_provider(provider)

# Get a meter for your service
meter = metrics.get_meter(__name__)

# Create metrics
request_counter = meter.create_counter(
    "http_requests_total",
    description="Total number of HTTP requests",
    unit="requests"
)

request_histogram = meter.create_histogram(
    "http_request_duration_seconds",
    description="HTTP request duration in seconds"
)

Claude will explain each component and help you customize it for your specific needs.

Implementing Custom Metrics

Business Metrics That Matter

Beyond infrastructure metrics, business metrics provide insight into user behavior and revenue. Claude can help you implement metrics that align with your business objectives:

# Track user signups
signup_counter = meter.create_counter(
    "user_signups_total",
    description="Total number of user signups",
    unit="signups"
)

# Track transactions by type
transaction_counter = meter.create_counter(
    "transactions_total",
    description="Total number of transactions",
    unit="transactions"
).bind(type="payment")

# Track queue depth
queue_gauge = meter.create_gauge(
    "queue_depth",
    description="Number of items in processing queue",
    unit="items"
)

When defining custom metrics, consider what decisions they inform. Metrics should answer specific questions about your system’s behavior and business performance.

Adding Labels for Better Granularity

Labels (known as attributes in OpenTelemetry) let you slice and dice metrics meaningfully:

# Add labels to metrics for better segmentation
request_counter.add(1, {
    "endpoint": request.url.path,
    "method": request.method,
    "status_code": response.status_code
})

Claude can help you identify which labels provide the most value without creating cardinality explosion—a common pitfall where too many unique label values overwhelm your metrics backend.

Integrating with Prometheus and Grafana

Prometheus Configuration

OpenTelemetry metrics typically export to Prometheus. Claude can help you configure the integration:

from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server

# Start Prometheus metrics server on port 8000
start_http_server(port=8000, addr="0.0.0.0")

Your Prometheus configuration should include the metrics endpoint:

scrape_configs:
  - job_name: 'my-service'
    static_configs:
      - targets: ['my-service:8000']

Grafana Dashboard Creation

Once metrics flow to Prometheus, Grafana visualizes them effectively. Claude can generate useful dashboard queries:

# Request rate per endpoint
sum(rate(http_requests_total[5m])) by (endpoint)

# P99 latency
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# Error rate
sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

Best Practices for Metrics Implementation

Start Simple, Iterate Based on Needs

Don’t try to instrument everything at once. Begin with basic request metrics and add custom metrics as you identify monitoring gaps. Claude can help you prioritize:

  1. First: Core HTTP metrics (count, latency, errors)
  2. Second: Infrastructure metrics (CPU, memory, disk)
  3. Third: Application-specific metrics (business events, queue depths)
  4. Fourth: Detailed debugging metrics (cache hit rates, external API calls)

Avoid Common Pitfalls

Cardinality explosion happens when label combinations grow exponentially. A common mistake:

# Bad: High cardinality
request_counter.add(1, {"user_id": get_current_user_id()})

# Good: Bounded cardinality  
request_counter.add(1, {"user_tier": get_user_tier()})

Claude will warn you about potential cardinality issues when you describe your metric labeling strategy.

Maintain Consistent Naming

Follow OpenTelemetry semantic conventions for metric names:

# Good: Follows conventions
meter.create_counter("http.server.request.duration", unit="s")

# Avoid: Non-standard naming
meter.create_counter("request_time_ms")

Automating Metrics with Claude Code Skills

For teams using Claude Code extensively, create a skill that automates metrics implementation:

## Metrics Implementation Skill
When asked to add OpenTelemetry metrics:
1. Identify the framework (FastAPI, Express, Spring Boot, etc.)
2. Determine which endpoints or functions need instrumentation
3. Generate appropriate metric types based on use case
4. Add labels following cardinality guidelines
5. Verify metrics are exported correctly

This allows consistent metrics implementation across your codebase without repetitive manual guidance.

Conclusion

OpenTelemetry metrics provide essential visibility into your application’s behavior and business performance. Claude Code accelerates implementation by generating boilerplate, explaining concepts, and helping you avoid common pitfalls. Start with basic infrastructure metrics, add business metrics that inform real decisions, and iterate as your monitoring needs evolve.

Remember: metrics should answer questions that matter for operating and improving your system. Let your monitoring goals guide what you instrument, and use Claude Code to implement efficiently.

Built by theluckystrike — More at zovo.one