Data Retention Policy Template for Startups
Below is a ready-to-use data retention policy template for startups, with recommended retention periods by data type: 7 years for transaction records, 90 days for application logs, 2 years for marketing data, and active-plus-30-days for account data. Copy the template, adapt the retention schedules to your business, and implement the included PostgreSQL and Python automation scripts to enforce deletion automatically. This prevents GDPR and CCPA penalties, reduces storage costs, and minimizes your breach surface.
Why Startups Need a Data Retention Policy
Without a documented retention policy, startups often accumulate data indefinitely. Every database table, log file, and backup becomes a liability. When regulators like the FTC, state attorneys general, or international authorities request data for investigations, you must produce what you have—irrespective of whether retaining that data was justified.
GDPR Article 5(1)(e) explicitly requires that personal data be kept only for as long as necessary for the purposes for which it was collected. California Consumer Privacy Act (CCPA) and similar state laws impose similar obligations. Beyond compliance, expired data provides no business value but creates ongoing storage costs and breach exposure. A retention policy transforms an abstract legal requirement into an operational system.
Core Components of a Retention Policy
Your policy needs four essential elements: data categorization, retention periods, deletion procedures, and exceptions handling. Each data category in your systems maps to a specific retention period based on legal requirements, business needs, and risk assessment.
Data Categories and Retention Periods
Organize your data into categories with corresponding retention periods:
Transaction and service data should be retained for the duration of the user relationship plus a defined period for tax and accounting purposes (typically 7 years). Authentication and account data stays active plus 30 days after deletion for account recovery. Application logs are retained for 90 days for debugging and security incident investigation. Marketing and analytics data follows consent duration, typically purged after 2 years of inactivity. Communication records are retained for 2 years for customer support and dispute resolution. Backup data should be deleted or anonymized within 30 days of creation.
Map these categories to your actual database tables and implement automated enforcement.
Implementing Automated Data Retention
Manual deletion processes fail. Engineers forget to run cleanup scripts, and data accumulates faster than anyone monitors. Automate retention enforcement at the database level using scheduled jobs or database-native features.
PostgreSQL Implementation
Create a function that identifies and deletes expired records:
CREATE OR REPLACE FUNCTION cleanup_expired_data()
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
AS $$
BEGIN
-- Delete application logs older than 90 days
DELETE FROM application_logs
WHERE created_at < NOW() - INTERVAL '90 days';
-- Delete marketing data where consent expired (2 years inactive)
DELETE FROM marketing_consents
WHERE last_interaction < NOW() - INTERVAL '730 days'
AND withdrawn = true;
-- Anonymize analytics data older than 1 year
UPDATE analytics_events
SET user_id = NULL,
session_id = NULL,
ip_address = NULL
WHERE event_timestamp < NOW() - INTERVAL '365 days'
AND user_id IS NOT NULL;
END;
$$;
Schedule this function to run daily:
-- Run cleanup daily at 3:00 AM
SELECT cron.schedule(
'daily-data-retention-cleanup',
'0 3 * * *',
'SELECT cleanup_expired_data()'
);
Python Implementation for Cross-Table Cleanup
For application-level cleanup across multiple data stores:
import asyncio
from datetime import datetime, timedelta
from dataclasses import dataclass
from typing import List
@dataclass
class RetentionRule:
table: str
column: str
retention_days: int
action: str # 'delete' or 'anonymize'
class DataRetentionEnforcer:
def __init__(self, db_pool):
self.rules: List[RetentionRule] = []
self.db_pool = db_pool
def add_rule(self, table: str, column: str,
retention_days: int, action: str = 'delete'):
self.rules.append(RetentionRule(
table=table,
column=column,
retention_days=retention_days,
action=action
))
async def run_cleanup(self):
cutoff_date = datetime.utcnow()
for rule in self.rules:
async with self.db_pool.acquire() as conn:
if rule.action == 'delete':
query = f"""
DELETE FROM {rule.table}
WHERE {rule.column} < $1
RETURNING id
"""
else: # anonymize
query = f"""
UPDATE {rule.table}
SET user_id = NULL, ip_address = NULL
WHERE {rule.column} < $1
RETURNING id
"""
cutoff = cutoff_date - timedelta(days=rule.retention_days)
deleted = await conn.fetch(query, cutoff)
print(f"Processed {len(deleted)} records from {rule.table}")
# Usage
enforcer = DataRetentionEnforcer(db_pool)
enforcer.add_rule('application_logs', 'created_at', 90, 'delete')
enforcer.add_rule('analytics_events', 'event_timestamp', 365, 'anonymize')
enforcer.add_rule('marketing_consents', 'last_interaction', 730, 'delete')
Handling Data Subject Requests
When users request deletion under GDPR Article 17 or CCPA, your retention system must identify all data associated with their identity across every storage system. Build a data lineage map that tracks user identifiers across tables.
async def handle_deletion_request(user_id: str, db_pool):
"""
Execute a data subject deletion request across all data stores.
"""
# Define deletion scope for each data type
deletion_queries = [
("users", "DELETE FROM users WHERE id = $1"),
("sessions", "DELETE FROM sessions WHERE user_id = $1"),
("auth_tokens", "DELETE FROM auth_tokens WHERE user_id = $1"),
("customer_data", "DELETE FROM customer_data WHERE user_id = $1"),
("support_tickets", "UPDATE support_tickets SET user_id = NULL, user_email = NULL WHERE user_id = $1"),
]
async with db_pool.acquire() as conn:
async with conn.transaction():
for table, query in deletion_queries:
result = await conn.execute(query, user_id)
print(f"Deleted from {table}: {result}")
return {"status": "completed", "user_id": user_id}
Documentation and Compliance Evidence
Your retention policy must be documented and accessible. Create a RETENTION_POLICY.md file in your codebase that every engineer can reference:
# Data Retention Policy
## Purpose
This document defines retention periods for all data processed by our platform.
## Policy Owner
Legal Team (legal@yourstartup.com)
## Last Updated
March 2026
## Data Categories
| Data Type | Retention Period | Legal Basis | Storage Location |
|-----------|------------------|-------------|------------------|
| User Account Data | Active + 30 days | Service provision | PostgreSQL |
| Transaction Records | 7 years | Tax compliance | PostgreSQL |
| Application Logs | 90 days | Security monitoring | CloudWatch/Splunk |
| Marketing Consent | 2 years inactive | Consent | PostgreSQL |
| Support Communications | 2 years | Legitimate interest | Zendesk |
## Enforcement
- Automated deletion runs daily at 3:00 AM UTC
- Manual deletion requires approval from Policy Owner
- All deletions are logged with timestamps
Key Implementation Steps
Start by auditing your data stores to identify every table containing personal data. Map each table to a retention category and determine the appropriate retention period. Implement automated cleanup using the database-native approach or application-level code shown above. Schedule regular audits to verify that cleanup jobs execute successfully and retention periods remain accurate as your product evolves.
Related Reading
Built by theluckystrike — More at zovo.one