Claude Skills Guide

Claude Code for Detectron2 Workflow Guide

Detectron2, Facebook AI Research’s next-generation object detection and segmentation library, is a powerful tool for computer vision projects. However, working with its complex APIs, custom dataset configurations, and model architectures can be challenging. This guide shows you how to use Claude Code to streamline your Detectron2 development workflow, from project setup to model deployment.

Setting Up Your Detectron2 Environment

The first step in any Detectron2 project is getting your development environment configured correctly. Claude Code can help you create a well-structured project with all necessary dependencies.

Project Initialization

Start by creating a new directory and initializing your project structure:

mkdir detectron2-project && cd detectron2-project

Claude Code can generate a comprehensive requirements.txt for your project:

# requirements.txt
torch>=2.0.0
torchvision>=0.15.0
detectron2>=0.6
opencv-python>=4.8.0
pillow>=10.0.0
numpy>=1.24.0
pyyaml>=6.0

For GPU acceleration, ensure you have the appropriate CUDA version installed. Claude Code can help you generate the correct installation commands based on your system specifications.

Docker Setup for Reproducibility

Containerizing your Detectron2 environment ensures consistency across development and deployment. Here’s a practical Dockerfile:

FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

WORKDIR /app

# Install Detectron2 dependencies
RUN apt-get update && apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    && rm -rf /var/lib/apt/lists/*

# Clone and install Detectron2
RUN git clone https://github.com/facebookresearch/detectron2.git /tmp/detectron2 && \
    cd /tmp/detectron2 && \
    pip install -e . && \
    cd / && rm -rf /tmp/detectron2

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

Data Preparation and Dataset Registration

One of the most time-consuming aspects of Detectron2 projects is preparing and registering datasets. Claude Code can significantly accelerate this process.

Registering Custom Datasets

Detectron2 requires datasets to be registered with its dataset catalog. Here’s how to structure your dataset registration:

from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import register_coco_instances
import os

def register_custom_dataset():
    """Register your custom dataset with Detectron2"""
    
    # Register training set
    register_coco_instances(
        "my_dataset_train",
        {},
        "path/to/train/annotations.json",
        "path/to/train/images"
    )
    
    # Register validation set
    register_coco_instances(
        "my_dataset_val",
        {},
        "path/to/val/annotations.json",
        "path/to/val/images"
    )
    
    # Set up metadata
    MetadataCatalog.get("my_dataset_train").thing_classes = [
        "person", "car", "dog", "cat", "chair"
    ]

if __name__ == "__main__":
    register_custom_dataset()

Claude Code can also help you convert datasets from other formats (YOLO, VOC, labelme) to COCO format, which Detectron2 natively supports.

Data Augmentation Strategies

Effective data augmentation is crucial for model generalization. Here’s a configuration example:

from detectron2.data import DatasetMapper, build_detection_train_loader
from detectron2.data.transforms import (
    RandomFlip, RandomResize, RandomCrop, ColorAugSSDTransform
)

def get_train_augmentation():
    """Define custom augmentation pipeline"""
    return [
        RandomFlip(prob=0.5, horizontal=True, vertical=False),
        RandomResize(min_scale=0.8, max_scale=1.25),
        RandomCrop(crop_type="relative", crop_size=(0.9, 0.9)),
        ColorAugSSDTransform(imgaug=True),
    ]

Model Configuration and Training

Choosing the Right Model Architecture

Detectron2 offers various pre-built models. Claude Code can help you select the appropriate architecture based on your requirements:

Model Type Use Case Speed Accuracy
R50-FPN Balanced Fast Good
R101-FPN Higher accuracy Medium Better
X101-FPN Maximum accuracy Slow Best
R50-FPN-Mask Instance segmentation Fast Good

Training Configuration

Here’s a practical training script:

from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

def setup_training():
    cfg = get_cfg()
    cfg.merge_from_file(
        "detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
    )
    
    # Dataset configuration
    cfg.DATASETS.TRAIN = ("my_dataset_train",)
    cfg.DATASETS.TEST = ("my_dataset_val",)
    
    # Training parameters
    cfg.SOLVER.IMS_PER_BATCH = 4
    cfg.SOLVER.BASE_LR = 0.001
    cfg.SOLVER.MAX_ITER = 10000
    cfg.SOLVER.STEPS = (6000, 8000)
    
    # GPU settings
    cfg.MODEL.DEVICE = "cuda"
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
    
    return cfg

def train_model():
    cfg = setup_training()
    
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()

if __name__ == "__main__":
    train_model()

Handling Common Training Issues

Claude Code can help you diagnose and resolve common training problems:

  1. Out of Memory Errors: Reduce batch size or image size
  2. Slow Training: Enable mixed-precision training with cfg.SOLVER.AMP_ENABLED = True
  3. Overfitting: Adjust learning rate schedule or add more augmentation
# Enable mixed-precision training for faster training with less memory
cfg.SOLVER.AMP_ENABLED = True

Inference and Model Deployment

Running Inference

After training, here’s how to perform inference:

from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
import cv2

def run_inference(model_path, config_path, image_path):
    cfg = get_cfg()
    cfg.merge_from_file(config_path)
    cfg.MODEL.WEIGHTS = model_path
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    
    predictor = DefaultPredictor(cfg)
    image = cv2.imread(image_path)
    outputs = predictor(image)
    
    v = Visualizer(
        image[:, :, ::-1],
        metadata=MetadataCatalog.get("my_dataset_train"),
        scale=0.8
    )
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    
    cv2.imwrite("output.png", v.get_image()[:, :, ::-1])

# Run inference
run_inference(
    model_path="output/model_final.pth",
    config_path="detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
    image_path="test.jpg"
)

Exporting for Production

For production deployment, consider exporting your model to ONNX format:

import torch
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer

def export_to_onnx(cfg, model_path, output_path):
    model = build_model(cfg)
    checkpointer = DetectionCheckpointer(model)
    checkpointer.load(model_path)
    
    model.eval()
    
    # Create dummy input
    dummy_input = torch.randn(1, 3, 800, 1333)
    
    torch.onnx.export(
        model,
        dummy_input,
        output_path,
        export_params=True,
        opset_version=11,
        input_names=['input'],
        output_names=['boxes', 'scores', 'labels', 'masks'],
        dynamic_axes={
            'input': {0: 'batch_size'},
            'boxes': {0: 'batch_size'},
            'scores': {0: 'batch_size'},
            'labels': {0: 'batch_size'}
        }
    )

export_to_onnx(cfg, "output/model_final.pth", "model.onnx")

Best Practices and Tips

Project Structure

Maintain a clean, organized project structure:

detectron2-project/
├── configs/
│   └── my_config.yaml
├── datasets/
│   ├── train/
│   └── val/
├── models/
│   └── output/
├── scripts/
│   ├── train.py
│   ├── inference.py
│   └── export.py
├── src/
│   └── dataset_utils.py
├── tests/
│   └── test_model.py
├── requirements.txt
└── README.md

Claude Code Prompts for Detectron2

Here are effective prompts to use with Claude Code:

Conclusion

Claude Code significantly accelerates Detectron2 development by helping with environment setup, dataset preparation, training configuration, and deployment. By following this workflow guide, you can focus more on the creative aspects of your computer vision projects while Claude Code handles the boilerplate code and helps troubleshoot issues.

Remember to always verify generated code against the official Detectron2 documentation, as library APIs may evolve over time.

Built by theluckystrike — More at zovo.one