Claude Skills Guide

Claude Code for Incident Tracking Workflow Tutorial

Incident tracking is critical for maintaining reliable software systems. When something goes wrong in production, you need a fast, consistent way to log, triage, and resolve issues. Claude Code combined with custom skills provides a powerful framework for building incident management workflows that integrate smoothly into your development environment.

This tutorial walks you through creating an incident tracking system using Claude Code skills. You’ll learn how to log incidents, assign severity levels, track resolution progress, and generate incident reports—all through natural language interactions with Claude.

Setting Up Your Incident Tracking Skill

Before building workflows, you need a skill that understands incident tracking concepts. Create a new skill file at ~/.claude/skills/incident-tracker/skill.md with the following structure:

---
name: incident-tracker
description: Track and manage software incidents with severity levels, assignments, and resolution workflows
---

# Incident Tracker Skill

This skill helps teams track software incidents from detection to resolution.

The front matter defines the skill’s capabilities and available tools. The tools field restricts this skill to file operations and shell commands, ensuring it can’t make unintended network calls.

Creating the Incident Data Store

Your incident tracker needs a place to store incident data. A simple JSON file works well for single-user scenarios, while teams might prefer a SQLite database or integration with external systems like PagerDuty or Jira.

Create an incidents data file at ~/.claude/skills/incident-tracker/incidents.json:

{
  "incidents": [],
  "next_id": 1,
  "statuses": ["open", "investigating", "identified", "monitoring", "resolved"],
  "severities": ["critical", "high", "medium", "low"]
}

This structure keeps incidents organized and allows for quick lookups. The next_id field ensures each incident gets a unique identifier.

Implementing Incident Logging Commands

The core of your incident tracker is the ability to log new incidents. Add command handlers to your skill that parse incident details and write to your data store.

Here’s how to log an incident using Claude Code’s built-in tools:

# Log a new incident
INCIDENT_ID=$(cat ~/.claude/skills/incident-tracker/incidents.json | jq '.next_id')
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

# Create incident record
jq --arg id "$INCIDENT_ID" \
   --arg timestamp "$TIMESTAMP" \
   --arg title "$INCIDENT_TITLE" \
   --arg severity "$INCIDENT_SEVERITY" \
   --arg description "$INCIDENT_DESCRIPTION" \
   '.incidents += [{
     id: ($id | tonumber),
     title: $title,
     severity: $severity,
     description: $description,
     status: "open",
     created_at: $timestamp,
     updated_at: $timestamp,
     assignee: null,
     resolution: null
   }] | .next_id += 1' \
   ~/.claude/skills/incident-tracker/incidents.json > /tmp/incidents.tmp.json

mv /tmp/incidents.tmp.json ~/.claude/skills/incident-tracker/incidents.json

This script generates a unique incident ID, timestamps the creation, and stores all incident details. The jq command handles JSON manipulation cleanly.

Building the Incident Triage Workflow

Once incidents are logged, you need a triage process to prioritize response. Create a triage workflow that evaluates incidents based on severity and assigns appropriate actions.

---
name: incident-triage
description: Triage incidents by severity and assign resolution priorities
---

## Triage Guidelines

When triaging incidents, apply these rules:

1. **Critical (SEV1)**: Service completely down, data loss, security breach
   - Response time: Immediate
   - Assign: On-call senior engineer
   - Actions: Page on-call, create war room, notify leadership

2. **High (SEV2)**: Major feature broken, significant degradation
   - Response time: Within 1 hour
   - Assign: Team lead
   - Actions: Notify stakeholders, begin investigation

3. **Medium (SEV3)**: Minor feature affected, workaround available
   - Response time: Within 4 hours
   - Assign: Available engineer
   - Actions: Schedule fix in current sprint

4. **Low (SEV4)**: Cosmetic issue, minor inconvenience
   - Response time: Within 24 hours
   - Assign: Backlog
   - Actions: Create ticket for future sprint

This skill guides Claude to categorize new incidents consistently. When you describe an incident to Claude, it applies these rules to suggest the appropriate severity level and response actions.

Tracking Incident Resolution

Incident resolution requires documenting what happened, how it was fixed, and what prevention measures were taken. Create a resolution workflow skill:

---
name: incident-resolution
description: Document and resolve tracked incidents with RCA and prevention notes
---

## Resolution Workflow

When resolving an incident, capture these fields:

1. **Root Cause**: What actually caused the issue?
2. **Resolution Steps**: How was it fixed?
3. **Prevention**: What prevents this from recurring?
4. **Lessons Learned**: What did the team learn?

Use the update-incident command to record resolution details:

Update incident #{{incident_id}}:

The double curly braces {{incident_id}} are placeholders that Claude fills in with actual values when you invoke the command.

Generating Incident Reports

Regular incident reviews help teams improve their systems. Create a reporting skill that generates summaries from your incident data:

#!/bin/bash
# incident-report.sh - Generate incident summary report

INCIDENTS_FILE="$HOME/.claude/skills/incident-tracker/incidents.json"

echo "=== Incident Summary Report ==="
echo "Generated: $(date)"
echo ""

echo "## Open Incidents"
jq -r '.incidents[] | select(.status != "resolved") | 
  "- #\(.id): \(.title) [\(_.severity)]"' "$INCIDENTS_FILE"

echo ""
echo "## Incidents by Severity (Last 30 Days)"
jq -r '.incidents[] | .severity' "$INCIDENTS_FILE" | \
  sort | uniq -c | sort -rn

echo ""
echo "## Resolution Time Averages"
# Calculate average time to resolution by severity
jq -r '.incidents[] | select(.status == "resolved") | 
  "\(.severity) \(.created_at) \(.resolved_at)"' "$INCIDENTS_FILE"

This report script provides visibility into incident trends, helping teams identify recurring issues and measure their incident response effectiveness.

Integrating with Notification Systems

For critical incidents, you need to notify the right people immediately. Extend your skills to integrate with Slack, PagerDuty, or email:

---
name: incident-notifications
description: Send incident notifications to on-call teams and stakeholders
---

## Critical Incident Notifications

When a critical (SEV1) incident is logged, automatically notify:

1. On-call engineer via PagerDuty
2. Team Slack channel #incidents
3. Engineering manager for SEV1/SEV2

Use the notify-oncall script:

```bash
#!/bin/bash
# Notify on-call team of critical incident
curl -X POST "$SLACK_WEBHOOK_URL" \
  -H 'Content-Type: application/json' \
  -d "{
    \"text\": \"🚨 CRITICAL INCIDENT: $INCIDENT_TITLE\",
    \"attachments\": [{
      \"color\": \"danger\",
      \"fields\": [
        {\"title\": \"Severity\", \"value\": \"SEV1\"},
        {\"title\": \"Description\", \"value\": \"$INCIDENT_DESCRIPTION\"}
      ]
    }]
  }"

```

This automation ensures critical incidents get immediate attention without manual intervention.

Best Practices for Incident Tracking Workflows

Building an effective incident tracking system requires more than just logging incidents. Follow these best practices:

Document Everything: Every incident should have a clear title, description, and timeline. Future-you will thank present-you for detailed notes.

Automate Repetitive Tasks: Use Claude skills to handle routine operations like assigning IDs, updating statuses, and generating reports.

Integrate Early: Connect your incident tracker to monitoring systems, CI/CD pipelines, and communication tools from the start.

Review Regularly: Schedule regular incident retrospectives to identify patterns and prevent recurrence.

Keep It Simple: Start with a minimal viable incident tracking system and add complexity as your needs evolve.

Conclusion

Claude Code provides a flexible foundation for building incident tracking workflows that fit your team’s needs. By creating focused skills for logging, triaging, resolving, and reporting incidents, you establish consistent processes that improve over time.

The skills and workflows in this tutorial are starting points. Customize them to match your organization’s terminology, integrate with your existing tools, and evolve them as your incident management matures.

With Claude handling the operational aspects of incident tracking, your team can focus on what matters most: building reliable systems and responding effectively when issues arise.

Built by theluckystrike — More at zovo.one