Remote Team Runbook Template for SSL Certificate Renewal Process with Distributed Infrastructure Team
SSL certificate expiration remains one of the most preventable causes of service outages. When certificates expire, your applications become inaccessible, customers see security warnings, and your team scrambles to fix the problem under pressure. For distributed infrastructure teams working across time zones, the lack of a standardized renewal process amplifies these risks.
This runbook provides a practical template for managing SSL certificate renewals asynchronously, ensuring your remote infrastructure team stays ahead of expiration dates without last-minute fire drills.
Understanding the Challenge
In a remote team environment, certificate renewals present unique challenges. When your infrastructure spans multiple cloud providers, regions, and team members in different time zones, the coordination overhead increases significantly. Someone in Tokyo might handle DNS configuration while someone in Berlin manages the certificate deployment, and someone in San Francisco maintains the documentation.
Without explicit runbooks, critical knowledge lives only in individual Slack messages or personal notes. When team members take time off or leave, institutional knowledge disappears. A well-structured runbook solves this by making every renewal step explicit, auditable, and transferable.
The Certificate Renewal Runbook Template
Prerequisites
Before initiating a certificate renewal, gather the following information:
- Certificate domain(s): The fully qualified domain names covered by the certificate
- Current certificate expiration date: Available from your monitoring system or certificate details
- Certificate authority (CA): The organization that issued your current certificate
- Cloud provider(s): Where the certificates are currently deployed (AWS, GCP, Azure, or on-premise)
- Primary and secondary owners: Team members responsible for this certificate
- Renewal method: ACM (Automated Certificate Management Environment), manual CSR generation, or DNS validation
Phase 1: Preparation (14 Days Before Expiration)
The preparation phase starts two weeks before certificate expiration. This buffer allows time for troubleshooting without rushing.
Step 1: Create a Renewal Task
Open a task in your project management system with the following template:
## Certificate Renewal Task
- **Domain**: example.com, api.example.com
- **Current Expiration**: 2026-03-30
- **Environment**: Production
- **Owners**: @alice (primary), @bob (secondary)
- **CA**: Let's Encrypt / DigiCert
- **Validation Method**: DNS-01
Step 2: Verify Access Permissions
Confirm that team members have appropriate access:
# Verify AWS IAM certificate access
aws iam get-server-certificate --server-certificate-name example-com-prod
# Verify GCP SSL certificates
gcloud compute ssl-certificates list --filter="name~example"
# Verify Azure certificates
az keyvault certificate list --vault-name "production-keyvault"
If any team member lacks access, grant permissions during this phase rather than during deployment.
Step 3: Notify Stakeholders
Post a message in your infrastructure team’s channel:
🔔 Certificate Renewal Reminder
Domain: example.com
Current cert expires: 2026-03-30
Renewal initiated by: @alice
Timeline: Preparation phase (Day 1-7), Deployment phase (Day 8-10)
Phase 2: Certificate Generation (7 Days Before Expiration)
Step 1: Generate New Certificate
For Let’s Encrypt with Certbot:
certbot certonly --dns-route53 \
-d example.com \
-d api.example.com \
--renew-by-default \
--cert-path /tmp/new-certs/
For manual CSR generation with OpenSSL:
# Generate new private key
openssl genrsa -out example-com-key.pem 4096
# Generate CSR
openssl req -new -key example-com-key.pem \
-out example-com.csr \
-subj "/C=US/ST=California/L=San Francisco/O=Company/CN=example.com"
Step 2: Validate Certificate
Before deploying, verify the certificate details:
# Check certificate expiration
openssl x509 -in /tmp/new-certs/fullchain.pem -noout -dates
# Verify domain coverage
openssl x509 -in /tmp/new-certs/fullchain.pem -noout -text | grep -A1 "Subject Alternative Name"
Step 3: Upload to Secret Management
Store the new certificate in your team’s secrets management system:
# HashiCorp Vault example
vault kv put secret/ssl/example-com-prod \
certificate=@/tmp/new-certs/fullchain.pem \
private_key=@/tmp/new-certs/privkey.pem \
expiration="2027-03-30" \
renewed_by="alice" \
renewal_date="2026-03-16"
Phase 3: Deployment (3 Days Before Expiration)
Deploy certificates to each environment systematically. Test staging before production.
Step 1: Deploy to Staging
# AWS ALB example
aws iam upload-server-certificate \
--server-certificate-name example-com-staging \
--certificate-body file://staging-cert.pem \
--private-key file://staging-key.pem \
--path /infra/staging/
# Update ALB listener
aws elbv2 set-rule-associations \
--rule-associations "[{\"RuleArn\":\"arn:aws:elasticloadbalancing:region:account:rule/rule-id\",\"CertificateArns\":['arn:aws:iam::account:server-certificate/example-com-staging']}]"
Step 2: Validate Staging Deployment
# Test certificate deployment
curl -vI https://staging.example.com 2>&1 | grep -E "SSL|TLS|certificate"
# Verify certificate chain
openssl s_client -connect staging.example.com:443 -showcerts
Step 3: Coordinate Production Deployment
For production deployment, coordinate with team members in overlapping time zones:
Production Deployment Window
- Start: 2026-03-27 18:00 UTC (Alice's end of day)
- End: 2026-03-27 22:00 UTC (Bob's start of day)
- Rollback procedure: [link to rollback documentation]
- Approvals required: 1 primary or 2 team members
Step 4: Deploy to Production
Follow the same deployment steps used for staging, targeting production resources.
Step 5: Verify Production Deployment
# Check certificate is serving correctly
curl -I https://example.com 2>&1 | head -20
# Verify with external checker
curl -s https://www.ssllabs.com/ssltest/analyze.html?d=example.com | grep -i "rating"
# Monitor error rates post-deployment
kubectl get pods -n production -l app=api | grep -v "Running"
Phase 4: Post-Renewal (After Deployment)
Step 1: Update Documentation
Record the renewal in your certificate inventory:
## Certificate Inventory
| Domain | Serial | Expiration | CA | Renewed | Owner |
|--------|--------|------------|-----|---------|-------|
| example.com | 04:A3:... | 2027-03-30 | Let's Encrypt | 2026-03-16 | alice |
Step 2: Close the Task
Mark the task complete and include a summary:
✅ Certificate Renewal Complete
- New certificate deployed to production
- Expiration: 2027-03-30
- Deployed by: alice
- Verified at: 2026-03-27 19:45 UTC
- Next renewal reminder: 2027-03-16
Step 3: Schedule Next Reminder
Set a calendar reminder for 14 days before the next expiration:
🔔 SSL Certificate Renewal Reminder
Date: 2027-03-16 (14 days before expiration)
Certificate: example.com, api.example.com
Action: Begin renewal preparation
Automating Renewal Reminders
For distributed teams, automation reduces manual tracking overhead. Consider implementing certificate expiration monitoring:
# Example: Certificate expiration monitor
import boto3
from datetime import datetime, timedelta
def check_certificate_expiration():
iam = boto3.client('iam')
certs = iam.list_server_certificates()
thirty_days = timedelta(days=30)
for cert in certs['ServerCertificateMetadataList']:
days_until_expiry = (cert['Expiration'] - datetime.now(cert['Expiration'].tzinfo)).days
if days_until_expiry <= 30:
send_slack_notification(
channel="#infrastructure",
message=f"⚠️ Certificate {cert['ServerCertificateName']} expires in {days_until_expiry} days"
)
Key Success Factors
Successful certificate renewal in distributed teams depends on four practices. First, start early with a two-week buffer to allow time for troubleshooting access issues or DNS propagation delays. Second, document everything in writing so any team member can execute the runbook without requiring verbal instructions. Third, test in staging first to catch configuration errors before they affect production. Fourth, maintain an accurate certificate inventory with expiration dates, owners, and renewal procedures.
When your team spans multiple time zones, async-friendly processes prevent single points of failure. Every piece of knowledge should exist in documentation, not just in someone’s head.
Related Articles
- Remote Team Runbook Template for Database Failover
- Remote Team Runbook Template for Deploying Hotfix to
- Get recent workflow run durations
- Certificate Based Authentication Setup for Remote Team VPN
- From your local machine with VPN active
Built by theluckystrike — More at zovo.one