Skip to content

Data Seeding

Data seeding populates your database with initial data—whether it’s reference data for production, realistic test data for staging, or development fixtures for local testing. This guide covers environment-specific seed scripts and idempotent patterns.

Why Seed Data?

Different environments need different data:

flowchart LR
  subgraph Local["Local Development"]
      L1["Test users"]
      L2["Sample posts"]
      L3["Mock data"]
  end

  subgraph Staging["Staging"]
      S1["QA test accounts"]
      S2["Realistic data sets"]
      S3["Edge case data"]
  end

  subgraph Production["Production"]
      P1["Admin users"]
      P2["Default categories"]
      P3["System config"]
  end

EnvironmentSeed PurposeData Characteristics
LocalDevelopment testingFake data, many records, diverse scenarios
StagingQA and integration testingRealistic data, test accounts, edge cases
ProductionInitial setupMinimal essential data, real admin accounts

Seed Scripts Per Environment

Create separate seed scripts for each environment to handle their unique requirements.

Project Structure

Organize your seed scripts in a dedicated directory:

scripts/
├── seed/
│ ├── index.ts # Seed runner (entry point)
│ ├── local.ts # Local development seeds
│ ├── staging.ts # Staging environment seeds
│ ├── production.ts # Production seeds (minimal)
│ └── data/
│ ├── users.ts # User seed data
│ └── posts.ts # Post seed data

Base Seed Types

Define shared types for your seed data:

scripts/seed/types.ts
import type { InferInsertModel } from 'drizzle-orm';
import type { users, posts } from '../../src/db/schema';
export type NewUser = InferInsertModel<typeof users>;
export type NewPost = InferInsertModel<typeof posts>;
export interface SeedContext {
environment: 'local' | 'staging' | 'production';
verbose: boolean;
}

Local Development Seeds

Local seeds include extensive test data for development:

scripts/seed/local.ts
import { drizzle } from 'drizzle-orm/d1';
import { users, posts } from '../../src/db/schema';
import type { NewUser, NewPost } from './types';
const localUsers: NewUser[] = [
{
name: 'Alice Developer',
email: 'alice@localhost.test',
createdAt: Date.now()
},
{
name: 'Bob Tester',
email: 'bob@localhost.test',
createdAt: Date.now()
},
{
name: 'Charlie Admin',
email: 'charlie@localhost.test',
createdAt: Date.now()
},
];
const localPosts: NewPost[] = [
{
title: 'Getting Started with Drizzle',
content: 'This is a sample post for local development...',
authorId: 1,
createdAt: Date.now()
},
{
title: 'D1 Database Tips',
content: 'Here are some tips for working with D1...',
authorId: 1,
createdAt: Date.now()
},
{
title: 'Testing Your API',
content: 'Learn how to test your API endpoints...',
authorId: 2,
createdAt: Date.now()
},
];
export async function seedLocal(db: ReturnType<typeof drizzle>) {
console.log('🌱 Seeding local database...');
// Insert users
const insertedUsers = await db.insert(users).values(localUsers).returning();
console.log(` ✓ Inserted ${insertedUsers.length} users`);
// Insert posts
const insertedPosts = await db.insert(posts).values(localPosts).returning();
console.log(` ✓ Inserted ${insertedPosts.length} posts`);
console.log('✅ Local seeding complete');
}

Staging Seeds

Staging seeds include realistic test accounts and QA data:

scripts/seed/staging.ts
import { drizzle } from 'drizzle-orm/d1';
import { users, posts } from '../../src/db/schema';
import type { NewUser, NewPost } from './types';
const stagingUsers: NewUser[] = [
{
name: 'QA Tester',
email: 'qa@staging.example.com',
createdAt: Date.now()
},
{
name: 'Staging Admin',
email: 'admin@staging.example.com',
createdAt: Date.now()
},
];
export async function seedStaging(db: ReturnType<typeof drizzle>) {
console.log('🌱 Seeding staging database...');
const insertedUsers = await db.insert(users).values(stagingUsers).returning();
console.log(` ✓ Inserted ${insertedUsers.length} users`);
console.log('✅ Staging seeding complete');
}

Production Seeds

Production seeds are minimal—only essential reference data:

scripts/seed/production.ts
import { drizzle } from 'drizzle-orm/d1';
import { users } from '../../src/db/schema';
import type { NewUser } from './types';
// Only essential data for production
const productionUsers: NewUser[] = [
{
name: 'System Admin',
email: 'admin@yourdomain.com',
createdAt: Date.now()
},
];
export async function seedProduction(db: ReturnType<typeof drizzle>) {
console.log('🌱 Seeding production database...');
const insertedUsers = await db.insert(users).values(productionUsers).returning();
console.log(` ✓ Inserted ${insertedUsers.length} admin users`);
console.log('✅ Production seeding complete');
}

Idempotent Seeding Pattern

Idempotent seeds can run multiple times without creating duplicate data. This is crucial for reliability.

Why Idempotency Matters

flowchart TD
  A["Run seed script"] --> B{Data exists?}
  B -->|No| C["Insert seed data"]
  B -->|Yes| D["Skip or update"]
  C --> E["✅ Seeding complete"]
  D --> E

Without idempotency:

  • Running seeds twice creates duplicate data
  • Failed partial seeds leave inconsistent state
  • Team members get different data

With idempotency:

  • Seeds are safe to run anytime
  • Recovery from failures is straightforward
  • Consistent state across all environments

Pattern 1: Check Before Insert

The simplest approach—check if data exists before inserting:

scripts/seed/idempotent.ts
import { eq } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/d1';
import { users } from '../../src/db/schema';
export async function seedUsersIdempotent(db: ReturnType<typeof drizzle>) {
// Check if seed data already exists
const existingUsers = await db.select().from(users).limit(1);
if (existingUsers.length > 0) {
console.log(' ⏭️ Users already seeded, skipping...');
return;
}
// Insert seed data
await db.insert(users).values([
{ name: 'Alice', email: 'alice@example.com', createdAt: Date.now() },
{ name: 'Bob', email: 'bob@example.com', createdAt: Date.now() },
]);
console.log(' ✓ Users seeded');
}

Pattern 2: Upsert by Unique Key

Use unique constraints to update existing records or insert new ones:

scripts/seed/upsert.ts
import { eq, sql } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/d1';
import { users } from '../../src/db/schema';
import type { NewUser } from './types';
async function upsertUser(
db: ReturnType<typeof drizzle>,
user: NewUser
) {
// Check if user exists by email (unique key)
const existing = await db
.select()
.from(users)
.where(eq(users.email, user.email))
.limit(1);
if (existing.length > 0) {
// Update existing user
await db
.update(users)
.set({ name: user.name })
.where(eq(users.email, user.email));
console.log(` ↻ Updated user: ${user.email}`);
} else {
// Insert new user
await db.insert(users).values(user);
console.log(` ✓ Created user: ${user.email}`);
}
}
export async function seedUsersUpsert(db: ReturnType<typeof drizzle>) {
const seedUsers: NewUser[] = [
{ name: 'Alice', email: 'alice@example.com', createdAt: Date.now() },
{ name: 'Bob', email: 'bob@example.com', createdAt: Date.now() },
];
for (const user of seedUsers) {
await upsertUser(db, user);
}
}

Pattern 3: Using SQLite’s INSERT OR REPLACE

For simpler cases, use SQLite’s conflict resolution:

scripts/seed/conflict-resolution.ts
import { sql } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/d1';
export async function seedWithConflictResolution(db: ReturnType<typeof drizzle>) {
// Using raw SQL for INSERT OR IGNORE
// This skips records that would violate unique constraints
await db.run(sql`
INSERT OR IGNORE INTO users (name, email, created_at)
VALUES
('Alice', 'alice@example.com', ${Date.now()}),
('Bob', 'bob@example.com', ${Date.now()})
`);
console.log(' ✓ Users seeded (duplicates ignored)');
}

Pattern 4: Marker-Based Seeding

Track which seed versions have been applied:

scripts/seed/marker-based.ts
import { eq, sql } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/d1';
import { sqliteTable, text, integer } from 'drizzle-orm/sqlite-core';
// Create a seed tracking table
const seedMarkers = sqliteTable('_seed_markers', {
id: integer('id').primaryKey({ autoIncrement: true }),
name: text('name').notNull().unique(),
appliedAt: integer('applied_at').notNull(),
});
async function isSeedApplied(
db: ReturnType<typeof drizzle>,
seedName: string
): Promise<boolean> {
const result = await db
.select()
.from(seedMarkers)
.where(eq(seedMarkers.name, seedName))
.limit(1);
return result.length > 0;
}
async function markSeedApplied(
db: ReturnType<typeof drizzle>,
seedName: string
) {
await db.insert(seedMarkers).values({
name: seedName,
appliedAt: Date.now(),
});
}
export async function seedWithMarkers(db: ReturnType<typeof drizzle>) {
const SEED_NAME = 'initial-users-v1';
// Check if this seed was already applied
if (await isSeedApplied(db, SEED_NAME)) {
console.log(` ⏭️ Seed "${SEED_NAME}" already applied, skipping...`);
return;
}
// Apply seed data
await db.run(sql`
INSERT INTO users (name, email, created_at)
VALUES
('Alice', 'alice@example.com', ${Date.now()}),
('Bob', 'bob@example.com', ${Date.now()})
`);
// Mark seed as applied
await markSeedApplied(db, SEED_NAME);
console.log(` ✓ Seed "${SEED_NAME}" applied`);
}

Complete Seed Script Example

Here’s a complete, production-ready seed runner:

scripts/seed/index.ts
import { drizzle } from 'drizzle-orm/d1';
import { eq } from 'drizzle-orm';
import { users, posts } from '../../src/db/schema';
import type { NewUser, NewPost } from './types';
// =============================================================================
// Seed Data
// =============================================================================
const seedData = {
local: {
users: [
{ name: 'Alice Developer', email: 'alice@localhost.test', createdAt: Date.now() },
{ name: 'Bob Tester', email: 'bob@localhost.test', createdAt: Date.now() },
{ name: 'Charlie Admin', email: 'charlie@localhost.test', createdAt: Date.now() },
] as NewUser[],
posts: [
{ title: 'Getting Started', content: 'Welcome to the app!', authorId: 1, createdAt: Date.now() },
{ title: 'Advanced Tips', content: 'Here are some tips...', authorId: 2, createdAt: Date.now() },
] as NewPost[],
},
staging: {
users: [
{ name: 'QA Tester', email: 'qa@staging.example.com', createdAt: Date.now() },
{ name: 'Staging Admin', email: 'admin@staging.example.com', createdAt: Date.now() },
] as NewUser[],
posts: [] as NewPost[],
},
production: {
users: [
{ name: 'System Admin', email: 'admin@yourdomain.com', createdAt: Date.now() },
] as NewUser[],
posts: [] as NewPost[],
},
};
// =============================================================================
// Idempotent Seed Functions
// =============================================================================
async function seedUsers(
db: ReturnType<typeof drizzle>,
usersToSeed: NewUser[]
): Promise<number> {
let seededCount = 0;
for (const user of usersToSeed) {
// Check if user exists by email
const existing = await db
.select()
.from(users)
.where(eq(users.email, user.email))
.limit(1);
if (existing.length === 0) {
await db.insert(users).values(user);
seededCount++;
}
}
return seededCount;
}
async function seedPosts(
db: ReturnType<typeof drizzle>,
postsToSeed: NewPost[]
): Promise<number> {
let seededCount = 0;
for (const post of postsToSeed) {
// Check if post exists by title and author
const existing = await db
.select()
.from(posts)
.where(eq(posts.title, post.title))
.limit(1);
if (existing.length === 0) {
await db.insert(posts).values(post);
seededCount++;
}
}
return seededCount;
}
// =============================================================================
// Main Seed Runner
// =============================================================================
type Environment = 'local' | 'staging' | 'production';
export async function seed(
db: ReturnType<typeof drizzle>,
environment: Environment = 'local'
) {
console.log(`\n🌱 Starting ${environment} database seed...\n`);
const data = seedData[environment];
if (!data) {
throw new Error(`Unknown environment: ${environment}`);
}
// Seed users
const usersSeeded = await seedUsers(db, data.users);
if (usersSeeded > 0) {
console.log(` ✓ Seeded ${usersSeeded} new user(s)`);
} else {
console.log(` ⏭️ Users already seeded`);
}
// Seed posts (if any)
if (data.posts.length > 0) {
const postsSeeded = await seedPosts(db, data.posts);
if (postsSeeded > 0) {
console.log(` ✓ Seeded ${postsSeeded} new post(s)`);
} else {
console.log(` ⏭️ Posts already seeded`);
}
}
console.log(`\n${environment} seeding complete\n`);
}
// =============================================================================
// CLI Entry Point
// =============================================================================
// This would be called from your package.json scripts
// Example: ts-node scripts/seed/index.ts local

Package.json Scripts

Add seed commands to your package.json:

package.json
{
"scripts": {
"seed:local": "wrangler d1 execute DB --local --file=scripts/seed-local.sql",
"seed:staging": "wrangler d1 execute DB --remote --env staging --file=scripts/seed-staging.sql",
"seed:production": "wrangler d1 execute DB --remote --file=scripts/seed-production.sql"
}
}

SQL-Based Seeds

For simpler projects, use SQL files directly:

scripts/seed-local.sql
-- Local development seed data
-- Idempotent: uses INSERT OR IGNORE to skip existing records
-- Seed users
INSERT OR IGNORE INTO users (id, name, email, created_at)
VALUES
(1, 'Alice Developer', 'alice@localhost.test', 1705312200000),
(2, 'Bob Tester', 'bob@localhost.test', 1705312200000),
(3, 'Charlie Admin', 'charlie@localhost.test', 1705312200000);
-- Seed posts
INSERT OR IGNORE INTO posts (id, title, content, author_id, created_at)
VALUES
(1, 'Getting Started with Drizzle', 'This is a sample post...', 1, 1705312200000),
(2, 'D1 Database Tips', 'Here are some tips...', 1, 1705312200000),
(3, 'Testing Your API', 'Learn how to test...', 2, 1705312200000);
scripts/seed-staging.sql
-- Staging seed data
-- Idempotent: uses INSERT OR IGNORE to skip existing records
INSERT OR IGNORE INTO users (id, name, email, created_at)
VALUES
(1, 'QA Tester', 'qa@staging.example.com', 1705312200000),
(2, 'Staging Admin', 'admin@staging.example.com', 1705312200000);
scripts/seed-production.sql
-- Production seed data (minimal)
-- Idempotent: uses INSERT OR IGNORE to skip existing records
INSERT OR IGNORE INTO users (id, name, email, created_at)
VALUES
(1, 'System Admin', 'admin@yourdomain.com', 1705312200000);

Running Seeds

flowchart TD
  A["Choose environment"] --> B{Which env?}
  B -->|Local| C["pnpm seed:local"]
  B -->|Staging| D["pnpm seed:staging"]
  B -->|Production| E["pnpm seed:production"]

  C --> F["Uses --local flag"]
  D --> G["Uses --env staging"]
  E --> H["Uses --remote flag"]

  F --> I["Seeds local SQLite"]
  G --> J["Seeds staging D1"]
  H --> K["Seeds production D1"]

Local Development

Terminal
# Apply seed data to local database
pnpm seed:local

Expected output:

🌀 Executing on local database DB...
Running query: INSERT OR IGNORE INTO users...
✅ Query succeeded
Running query: INSERT OR IGNORE INTO posts...
✅ Query succeeded

Staging Environment

Terminal
# Apply seed data to staging
pnpm seed:staging

Expected output:

🌀 Executing on remote database DB (f1e2d3c4-...)
Running query: INSERT OR IGNORE INTO users...
✅ Query succeeded

Production Environment

Terminal
# Apply seed data to production (use with caution!)
pnpm seed:production

Best Practices

1. Always Use Idempotent Patterns

Every seed script should be safe to run multiple times:

// ✅ Good: Checks before inserting
const existing = await db.select().from(users).where(eq(users.email, email));
if (existing.length === 0) {
await db.insert(users).values(user);
}
// ❌ Bad: Will fail or create duplicates on re-run
await db.insert(users).values(user);

2. Keep Production Seeds Minimal

Production seeds should only include essential reference data:

// ✅ Good: Only essential data
const productionSeeds = [
{ name: 'System Admin', email: 'admin@company.com' },
];
// ❌ Bad: Test data in production
const productionSeeds = [
{ name: 'Test User 1', email: 'test1@example.com' },
{ name: 'Test User 2', email: 'test2@example.com' },
// ... 50 more test users
];

3. Use Realistic Data for Staging

Staging should mirror production scenarios:

// ✅ Good: Realistic staging data
const stagingUsers = [
{ name: 'QA Lead', email: 'qa-lead@staging.company.com' },
{ name: 'Edge Case User', email: 'special+chars@staging.company.com' },
{ name: '', email: 'empty-name@staging.company.com' }, // Test edge case
];

4. Version Your Seed Data

Track seed versions for reproducibility:

scripts/seed/versions.ts
export const SEED_VERSION = '1.0.0';
// Include version in seed markers
await markSeedApplied(db, `users-${SEED_VERSION}`);

5. Log Seed Operations

Always provide feedback during seeding:

console.log(`🌱 Starting ${environment} seed...`);
console.log(` ✓ Seeded ${count} users`);
console.log(` ⏭️ Posts already exist, skipping`);
console.log(`✅ Seeding complete`);

Quick Reference

TaskCommand
Seed local databasepnpm seed:local
Seed staging databasepnpm seed:staging
Seed production databasepnpm seed:production
Execute raw SQL locallypnpm wrangler d1 execute DB --local --command "..."
Execute raw SQL on stagingpnpm wrangler d1 execute DB --remote --env staging --command "..."

Summary

Data seeding is essential for consistent development and testing:

  • Environment-specific seeds ensure each environment has appropriate data
  • Idempotent patterns make seeds safe to run multiple times
  • Production seeds should be minimal and contain only essential data
  • Staging seeds should include realistic test scenarios
  • Local seeds can be extensive for thorough development testing

By following these patterns, you’ll have reliable, repeatable database seeding across all environments.