Best AI Tools for Writing Database Seed Scripts 2026
Generating realistic test data is tedious. You need valid email formats, proper password hashes, realistic addresses, correct foreign key relationships, and edge cases—all matching your schema. AI tools now understand ORM conventions (Prisma, Django, Rails) and generate production-like seed data in seconds. This guide compares specific tools, real-world implementations, and ORM-specific patterns.
The Data Generation Problem
Manual seed scripts are error-prone. Hardcoding 50 users, 100 posts, and related comments takes hours and rarely includes edge cases (deleted users, orphaned posts, boundary values). Fake data generators (Faker, Fakerjs) reduce friction but still require you to write boilerplate loops and relationship logic. AI tools understand your schema from comments or type hints and generate complete, relationships-aware scripts.
Tool Comparison
1. Claude + Claude.dev / Cursor
Overview: Claude (via Claude.dev browser interface or Cursor IDE) understands ORM syntax natively and generates well-structured seed scripts with realistic data.
Cost: Claude.dev free tier (limited), $20/month Pro. Cursor $20/month.
Prisma Example:
Paste your schema into Claude with this prompt:
I have this Prisma schema:
model User {
id Int @id @default(autoincrement())
email String @unique
username String @unique
password String
createdAt DateTime @default(now())
posts Post[]
}
model Post {
id Int @id @default(autoincrement())
title String
content String
published Boolean @default(false)
authorId Int
author User @relation(fields: [authorId], references: [id])
comments Comment[]
}
model Comment {
id Int @id @default(autoincrement())
text String
postId Int
post Post @relation(fields: [postId], references: [id])
}
Generate a seed script using Prisma Client that creates:
- 5 users with realistic data
- 15 posts across users
- 30 comments across posts
- Include at least one unpublished post
- Hash passwords with bcrypt
- Use valid email formats
- Include timestamps with realistic dates
Output only TypeScript code.
Claude generates:
// prisma/seed.ts
import { PrismaClient } from '@prisma/client';
import bcrypt from 'bcrypt';
const prisma = new PrismaClient();
async function main() {
// Clear existing data
await prisma.comment.deleteMany({});
await prisma.post.deleteMany({});
await prisma.user.deleteMany({});
// Create users
const users = await Promise.all([
prisma.user.create({
data: {
email: 'alice@example.com',
username: 'alice_wonder',
password: await bcrypt.hash('password123', 10),
},
}),
prisma.user.create({
data: {
email: 'bob@example.com',
username: 'bob_builder',
password: await bcrypt.hash('securepass456', 10),
},
}),
prisma.user.create({
data: {
email: 'carol@example.com',
username: 'carol_singer',
password: await bcrypt.hash('mysecret789', 10),
},
}),
prisma.user.create({
data: {
email: 'david@example.com',
username: 'david_jones',
password: await bcrypt.hash('davidpass123', 10),
},
}),
prisma.user.create({
data: {
email: 'eve@example.com',
username: 'eve_explorer',
password: await bcrypt.hash('evepass456', 10),
},
}),
]);
console.log(`Created ${users.length} users`);
// Create posts
const posts = [
{
title: 'Getting Started with Next.js',
content: 'Next.js is a powerful React framework for building full-stack applications...',
published: true,
authorId: users[0].id,
},
{
title: 'Understanding TypeScript Generics',
content: 'Generics allow you to write reusable code that works with multiple types...',
published: true,
authorId: users[0].id,
},
{
title: 'Draft: My Unpublished Thoughts',
content: 'This post is still being written and not ready for publication.',
published: false,
authorId: users[1].id,
},
{
title: 'Web Performance Optimization Tips',
content: 'Learn practical techniques to make your web apps faster...',
published: true,
authorId: users[1].id,
},
{
title: 'Database Design Patterns',
content: 'Explore common patterns for designing scalable databases...',
published: true,
authorId: users[2].id,
},
];
const createdPosts = await Promise.all(
posts.map((post) =>
prisma.post.create({
data: post,
})
)
);
console.log(`Created ${createdPosts.length} posts`);
// Create comments
const comments = [
{ text: 'Great article! Helped me understand this concept.', postId: createdPosts[0].id },
{ text: 'Can you elaborate on the performance implications?', postId: createdPosts[0].id },
{ text: 'I disagree with this approach...', postId: createdPosts[1].id },
{ text: 'Thanks for the tips!', postId: createdPosts[3].id },
{ text: 'Looking forward to more posts like this.', postId: createdPosts[4].id },
];
const createdComments = await Promise.all(
comments.map((comment) =>
prisma.comment.create({
data: comment,
})
)
);
console.log(`Created ${createdComments.length} comments`);
}
main()
.then(async () => {
await prisma.$disconnect();
})
.catch(async (e) => {
console.error(e);
await prisma.$disconnect();
process.exit(1);
});
package.json integration:
{
"scripts": {
"seed": "npx ts-node prisma/seed.ts"
}
}
Pros: Native ORM understanding, generates production-like data, handles relationships automatically, works in IDE via Cursor.
Cons: Manual prompting required, no built-in templates, limited batch size (Claude context).
2. GitHub Copilot (IDE Autocomplete)
Overview: Copilot completes seed scripts as you type in VS Code or JetBrains. Fast feedback loop but requires you to define structure.
Cost: $10-20/month.
Django Example:
Start typing in fixtures/seed.py:
from django.core.management.base import BaseCommand
from django.contrib.auth.models import User
from myapp.models import Post, Comment
from django.utils import timezone
from faker import Faker
fake = Faker()
class Command(BaseCommand):
def handle(self, *args, **options):
# Clear existing
User.objects.all().delete()
Post.objects.all().delete()
# Create users
users = []
for i in range(5):
user = User.objects.create_user(
username=fake.user_name(),
email=fake.email(),
password='testpass123' # <-- Copilot suggests structure
)
users.append(user)
Copilot auto-completes the rest:
# Create posts (Copilot suggests loop structure)
for user in users:
for i in range(3):
Post.objects.create(
title=fake.sentence(nb_words=6),
content=fake.paragraphs(nb=3),
author=user,
published=fake.boolean(),
created_at=fake.date_time_this_year()
)
# Create comments (Copilot suggests)
posts = Post.objects.all()
for post in posts:
for i in range(random.randint(1, 5)):
Comment.objects.create(
text=fake.paragraph(),
post=post,
created_at=fake.date_time_this_year()
)
Pros: Fast, integrated in editor, works with any framework, learns from your codebase.
Cons: Requires skeleton structure, generic suggestions, less context-aware than Claude.
3. GPT-4 (OpenAI API)
Overview: Full API control for custom seed generation with schema validation.
Cost: $0.03 per 1K input tokens, $0.06 per 1K output tokens.
Rails Example:
# lib/seed_generator.rb
require 'openai'
class SeedGenerator
def initialize
@client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
end
def generate_seeds(schema_description, count: 10)
response = @client.chat(
parameters: {
model: 'gpt-4-turbo',
messages: [
{
role: 'user',
content: <<~PROMPT
Generate a Rails seed script for this schema:
#{schema_description}
Requirements:
- Create #{count} users with valid emails and BCrypt-hashed passwords
- Create 30 posts across users
- Create 50 comments across posts
- Include published and draft posts
- Use Rails conventions and Faker gem
- Add timestamps with realistic dates
- Output only Ruby code, no explanation
Constraints:
- Use `User.create!` not `build`
- Hash passwords with BCrypt
- Include error handling
PROMPT
}
],
temperature: 0.2
}
)
response.dig('choices', 0, 'message', 'content')
end
end
Call it:
schema = File.read('db/schema.rb')
generator = SeedGenerator.new
code = generator.generate_seeds(schema, count: 10)
File.write('db/seeds.rb', code)
Pros: High-quality generation, schema-aware, can handle complex relationships.
Cons: API dependency, latency (2-3s), cost per generation.
4. Cursor IDE (Built-in AI)
Overview: AI-native editor with Claude built-in. You can chat with your schema and generate seeds.
Cost: $20/month Pro.
Workflow:
- Open
prisma/schema.prisma - Select schema, open Cursor Chat (Cmd+K)
- Type: “Generate a seed script using Prisma Client with realistic data for all these models”
- Cursor generates complete, executable code
- Edit directly in editor
Pros: Fastest iteration, integrated with your editor, Claude-powered.
Cons: Subscription required, less flexible than manual prompting.
5. Specialized Tools: Factory Bot (Rails), Model Factories (Django)
Overview: AI doesn’t generate these; instead, use factories + AI completion.
Rails with FactoryBot:
# spec/factories/users.rb
FactoryBot.define do
factory :user do
email { Faker::Internet.email }
username { Faker::Internet.username }
password { SecureRandom.hex(16) }
created_at { Faker::Time.between(from: 2.years.ago, to: Time.now) }
factory :user_with_posts do
transient do
post_count { 3 }
end
after(:create) do |user, evaluator|
create_list(:post, evaluator.post_count, author: user)
end
end
end
factory :post do
title { Faker::Lorem.sentence(word_count: 8) }
content { Faker::Lorem.paragraphs(number: 5).join("\n\n") }
author { create(:user) }
published { [true, false].sample }
end
factory :comment do
text { Faker::Lorem.paragraph }
post { create(:post) }
end
end
# Seeds using factories
# db/seeds.rb
5.times { create(:user_with_posts, post_count: 3) }
Use Claude to generate factory definitions from schema:
Prompt: "Generate a FactoryBot factory for this Rails model..."
Pros: Reusable, DRY, works with tests.
Cons: Requires setup, less flexible than custom scripts.
Tool Comparison Table
| Tool | Cost | ORM Support | Relationship Handling | Speed | Ease | Best For |
|---|---|---|---|---|---|---|
| Claude | $20/mo | All | Excellent | 1-2s | Manual prompting | Complex schemas, realistic data |
| Copilot | $10/mo | All | Good | Instant | Editor integration | Quick scripts, IDE workflow |
| GPT-4 | $0.03/K tokens | All | Excellent | 2-3s | API calls | High-volume generation, APIs |
| Cursor | $20/mo | All | Excellent | 1-2s | Chat in editor | Rapid iteration, full IDE |
| FactoryBot | Free | Rails | Excellent | Instant | Reusable | Testing + seeding hybrid |
Practical Patterns
Pattern 1: Schema-Aware Generation (Prisma)
#!/bin/bash
# Generate seed from schema with Claude
SCHEMA=$(cat prisma/schema.prisma)
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d "{
\"model\": \"claude-3-5-sonnet-20241022\",
\"max_tokens\": 2048,
\"messages\": [{
\"role\": \"user\",
\"content\": \"Generate a TypeScript Prisma seed script from this schema:\n$SCHEMA\n\nOutput only code.\"
}]
}" | jq -r '.content[0].text' > prisma/seed.ts
Pattern 2: Incremental Seeding (Django)
# management/commands/seed.py
from django.core.management.base import BaseCommand
from django.db import transaction
class Command(BaseCommand):
@transaction.atomic
def handle(self, *args, **options):
# Only seed if database is empty
if User.objects.exists():
self.stdout.write("Database already seeded")
return
# Use Claude to generate batch creation
# ... (Claude generates bulk_create list)
User.objects.bulk_create([user1, user2, ...])
self.stdout.write(f"Seeded {User.objects.count()} users")
Pattern 3: Linked Data Generation (Rails)
# Use Claude to generate seed that respects foreign keys
# db/seeds.rb
# AI-generated structure maintaining referential integrity
User.transaction do
users = 5.times.map do |i|
User.create!(
email: "user#{i}@test.com",
password: SecureRandom.hex(16)
)
end
# Posts reference user IDs correctly
users.each do |user|
3.times { Post.create!(user_id: user.id, title: "Post #{rand(1000)}") }
end
# Comments reference post IDs correctly
Post.all.each do |post|
2.times { Comment.create!(post_id: post.id, text: "Great post!") }
end
end
Edge Cases & Solutions
Edge Case: Circular References
Problem: User has profile, profile references user.
Claude Solution:
// Create users first
const users = await Promise.all([...]);
// Then create profiles with user references
const profiles = await Promise.all(
users.map(user =>
prisma.profile.create({
data: {
bio: 'Test bio',
userId: user.id,
},
})
)
);
Edge Case: Unique Constraints
Problem: Email must be unique; random data may collide.
// Use counters or UUIDs
const users = await Promise.all(
Array.from({ length: 100 }, (_, i) =>
prisma.user.create({
data: {
email: `user${i}@test.com`, // Guaranteed unique
username: `user_${nanoid()}`, // Random unique
},
})
)
);
Edge Case: Soft Deletes
Problem: Need deleted users in seed to test filters.
// Include both active and deleted
const users = await Promise.all([
prisma.user.create({
data: { email: 'active@test.com', deletedAt: null },
}),
prisma.user.create({
data: { email: 'deleted@test.com', deletedAt: new Date('2024-01-01') },
}),
]);
Integration Checklist
- Choose your ORM (Prisma, Django, Rails, Sequelize)
- Get API keys ready (Claude, OpenAI, or use IDE Copilot/Cursor)
- Create initial schema document or paste schema into AI
- Generate seed script using your chosen tool
- Review generated code for:
- Correct relationships and foreign keys
- Realistic data (emails, dates, values)
- Handling of unique constraints
- Transaction safety and error handling
- Add seed script to
.gitignoreor commit to repo (team decision) - Test seed script:
npm run seed/python manage.py seed/rake db:seed - Verify data in database client
- Add seed command to CI/CD for test database setup
Related Articles
- ChatGPT vs Claude for Creating Database Migration Scripts
- AI Code Completion for Writing Shell Commands Inside Scripts
- Best AI Tools for Writing Shell Scripts
- Free AI Tools for Writing Bash Scripts and Automation
- AI Tools for Writing pytest Tests for Alembic Database
Built by theluckystrike — More at zovo.one