Best Deploy Workflow for a Remote Infrastructure Team of 3
Use a four-stage pipeline — local validation, CI testing, staged deployment, and production approval gate — with GitHub Actions environment protection requiring one peer approval before any production push. This workflow gives a three-person remote infrastructure team enough automation to deploy safely across time zones while keeping human oversight where it matters. Pair it with weekly deployment rotation and async runbooks stored in your infrastructure repo so the on-call engineer can execute confidently without hunting for context in Slack.
Core Principles for Small Remote Teams
Before diving into implementation, establish the principles that guide your workflow. Small teams benefit from explicit conventions that larger teams might handle through process overhead.
Automation over memory. Document your deployment steps as code rather than relying on tribal knowledge. When someone deploys at 2 AM across three time zones, they should follow tested steps, not hunt for context in Slack threads.
Fail fast, recover faster. Your workflow should catch problems early in the pipeline and provide clear rollback paths. A three-person team cannot afford debugging production issues while juggling other responsibilities.
Async-first check-ins. Build review gates that work without requiring immediate responses. Use pull request comments, checklist-based approvals, and scheduled deployment windows rather than expecting real-time availability.
Structuring Your Deployment Pipeline
A practical deployment pipeline for a small infrastructure team uses staged gates that escalate appropriately based on change risk.
Stage 1: Local Validation
Every deployment starts with developer workstations running identical validation:
# Pre-commit hook: validate changes before they enter version control
#!/bin/bash
set -e
# Lint infrastructure code
terraform fmt -check -recursive
ansible-lint playbook.yml
hadolint Dockerfile*
# Validate syntax and plan
terraform plan -out=tfplan
ansible-playbook --check playbook.yml
This catches basic errors before code reaches version control, reducing review cycles.
Stage 2: Automated Testing in CI
Your continuous integration pipeline runs comprehensive checks on every branch:
# .github/workflows/validate.yaml
name: Validate Infrastructure Changes
on: [pull_request]
jobs:
terraform:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v2
- run: terraform init
- run: terraform validate
- run: terraform plan -no-color
ansible:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: ansible-lint playbook.yml
- run: ansible-playbook --check playbook.yml
Generate plan output as a pull request artifact. When reviewing infrastructure changes, teammates can examine the exact resource modifications before approval.
Stage 3: Staged Deployment
Deploy to production-facing environments in controlled steps:
#!/bin/bash
# deploy.sh - Production deployment script
ENVIRONMENT=${1:-staging}
APP_VERSION=${2:-latest}
AUTO_APPROVE=${3:-false}
echo "Deploying $APP_VERSION to $ENVIRONMENT"
# Pull latest configuration
git pull origin main
# Run deployment playbook
ansible-playbook deploy.yml \
-e "env=$ENVIRONMENT" \
-e "version=$APP_VERSION" \
--tags=deploy
# Verify deployment health
./scripts/health-check.sh "$ENVIRONMENT"
if [ $? -eq 0 ]; then
echo "Deployment successful"
else
echo "Health check failed - initiating rollback"
ansible-playbook rollback.yml -e "env=$ENVIRONMENT"
exit 1
fi
Stage 4: Production Approval Gate
For a three-person team, require at least one peer approval for production changes:
# .github/workflows/deploy-production.yml
name: Deploy to Production
on:
workflow_dispatch:
inputs:
version:
description: 'Version tag to deploy'
required: true
jobs:
deploy:
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- name: Require approval
run: |
echo "Waiting for approval..."
# Use GitHub's built-in environment protection
- name: Deploy
run: ./deploy.sh production ${{ github.event.inputs.version }}
- name: Notify team
run: |
curl -X POST $SLACK_WEBHOOK \
-d "text='Deployment complete: ${{ github.event.inputs.version }} by ${{ github.actor }}'"
GitHub’s environment protection rules ensure that deployments require approval from a designated team member before proceeding.
Time Zone Coordination Strategies
Remote infrastructure teams spanning multiple time zones need explicit coordination mechanisms.
Scheduled deployment windows. Agree on overlapping hours when at least two team members are available. For a team with members in UTC-5, UTC+1, and UTC+8, the overlap between UTC-5 and UTC+1 (roughly 14:00-18:00 UTC-5) provides a four-hour window where real-time coordination is possible.
Deployment rotation. Rotate deployment responsibility weekly. Each team member owns deployment readiness for their assigned week, including updating runbooks and monitoring alerts.
Async runbooks. Maintain deployment runbooks as markdown files in your infrastructure repository:
# Deployment Runbook: Application Server
## Prerequisites
- [ ] Incident channel created in Slack
- [ ] On-call engineer acknowledged deployment
## Pre-deployment
1. Check active incidents: `kubectl get incidents`
2. Verify database migrations are compatible: `make db:validate`
3. Confirm backup completion: `aws s3 ls s3://backups/$(date +%Y-%m-%d)/`
## Execution
1. Run: `./deploy.sh production <version>`
2. Monitor: `tail -f deployment.log`
3. Verify: `./scripts/ smoke-tests.sh`
## Rollback
If smoke tests fail:
1. Run: `./rollback.sh production <previous-version>`
2. Alert: Notify #incidents channel
3. Document: Create incident report
## Post-deployment
1. Update status page
2. Announce in #releases channel
3. Close incident channel
Handling Emergency Deployments
Sometimes production issues require immediate action outside normal procedures. Define clear escalation paths:
# emergency-deploy.sh - Restricted to on-call personnel
#!/bin/bash
if [ "$1" != "--emergency" ]; then
echo "Use ./emergency-deploy.sh --emergency <version> for emergency deployments"
exit 1
fi
# Verify caller is on-call
ONCALL=$(cat .oncall/current)
if [ "$USER" != "$ONCALL" ]; then
echo "Only $ONCALL can run emergency deployments"
exit 1
fi
# Require second confirmation for emergency mode
read -p "EMERGENCY DEPLOY to production? Type 'yes' to confirm: "
if [ "$REPLY" != "yes" ]; then
echo "Deployment cancelled"
exit 1
fi
./deploy.sh production $2 --emergency
This pattern requires explicit acknowledgment of emergency status while keeping deployment speed acceptable for critical situations.
Continuous Improvement
Review your deployment process monthly. Track metrics that matter for a small team:
- Mean time to recovery (MTTR) for failed deployments
- Deployment frequency and batch sizes
- Review cycle time for pull requests
- Number of rollback incidents
A three-person team can iterate quickly on workflow improvements. When something causes friction, discuss it in your next sync and adjust accordingly.
Related Reading
- Async QA Signoff Process for Remote Teams Releasing Weekly
- ADR Tools for Remote Engineering Teams
- Async Bug Triage Process for Remote QA Teams
Built by theluckystrike — More at zovo.one