Normalization rules, naming conventions, soft deletes, audit trails, multi-tenancy patterns, versioning strategies, and the schema decisions I've regretted. PostgreSQL-focused.
A schema is a contract with your future self. Every column you add, every constraint you skip, every "we'll clean this up later" — it all compounds. I've worked on systems where a single bad schema decision made three years ago now costs the team a full sprint every quarter in workarounds.
The database outlives everything. Your frontend framework will change. Your API layer will get rewritten. Your deployment strategy will evolve. But the data? The data stays. And the shape you gave it on day one follows you forever, because migrating a table with 200 million rows is not the same as refactoring a React component.
This is what I've learned about making schema decisions that don't haunt you. PostgreSQL-focused, because that's what I use and trust, but most of these patterns apply to any relational database.
I've seen more arguments about naming conventions than about actual architecture decisions. Here's what I've settled on after years of context-switching between projects:
Snake_case for everything. Tables, columns, indexes, constraints. No camelCase, no PascalCase. PostgreSQL folds unquoted identifiers to lowercase anyway, so createdAt becomes createdat unless you double-quote it everywhere. Don't fight the database.
Plural table names. A table holds many rows. users not user. orders not order. order_items not order_item. This reads naturally in queries: SELECT * FROM users WHERE ... — you're selecting from a collection.
_id suffix for foreign keys. user_id, order_id, tenant_id. The primary key is just id. This is unambiguous. When you join tables, users.id = orders.user_id reads like English.
_at suffix for timestamps. created_at, updated_at, deleted_at, published_at, expires_at. You always know it's a point in time.
is_ prefix for booleans. is_active, is_verified, is_published. Some people use has_ for ownership booleans (has_mfa_enabled), but I keep it simple and use is_ for everything.
_count suffix for denormalized counters. comment_count, follower_count. Makes it clear this is a cached number, not a live calculation.
Here's the thing: consistency beats perfection. I've seen teams spend weeks debating whether it should be email_address or email or email_addr. Just pick one pattern and enforce it everywhere. The worst naming convention is the one that's inconsistently applied.
-- Good: consistent, readable, no surprises
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL,
display_name TEXT NOT NULL,
password_hash TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT true,
is_verified BOOLEAN NOT NULL DEFAULT false,
login_count INTEGER NOT NULL DEFAULT 0,
last_login_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Bad: inconsistent casing, mixed conventions, ambiguous names
CREATE TABLE User (
UserID SERIAL PRIMARY KEY,
EmailAddress VARCHAR(255),
name VARCHAR(100),
passwd VARCHAR(255),
active BOOLEAN,
verified INT, -- is this 0/1? a verification level?
lastLogin TIMESTAMP, -- camelCase in a snake_case world
created TIMESTAMP -- created what? when?
);
One more thing: never use reserved words as column names. type, order, user, group, table — they all work if you quote them, but they'll burn you in ORMs, query builders, and every dynamic SQL generation tool. Use kind instead of type, sort_order instead of order. Your future self will thank you.
Every table in my schemas starts with the same skeleton. No exceptions.
CREATE TABLE things (
id BIGSERIAL PRIMARY KEY,
-- ... domain columns ...
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
This is one of those decisions that generates way more heat than light. Here's the actual trade-off:
BIGSERIAL (auto-incrementing integer): compact, fast to generate, index-friendly, and you can sort by id to get insertion order. The downside: the values are guessable and leak how many rows you have.
UUID v4 (random): safe to expose publicly and generatable from anywhere, but the randomness scatters inserts across the index, which hurts locality on large tables.
UUID v7 (time-sorted, RFC 9562): globally unique like v4, but time-ordered, so inserts stay index-friendly. PostgreSQL's built-in gen_random_uuid() generates v4, and you can use uuid_generate_v7() with extensions.
My current stance: BIGSERIAL for internal tables, UUID v7 for anything exposed to the outside world. If an ID ever appears in a URL, an API response, or a webhook payload, use UUIDs. If it's a pure join table that users never see, BIGSERIAL is fine.
-- For an API-facing resource
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
owner_id UUID NOT NULL REFERENCES users(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- For an internal mapping table
CREATE TABLE project_members (
id BIGSERIAL PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id),
user_id UUID NOT NULL REFERENCES users(id),
role TEXT NOT NULL DEFAULT 'member',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (project_id, user_id)
);
Use TIMESTAMPTZ, not TIMESTAMP. The "TZ" version stores the value in UTC and converts on read based on the session timezone. The non-TZ version stores whatever you give it with no timezone context — which means if two servers in different timezones write to the same table, you get silent data corruption.
-- Always this
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
-- Never this
created_at TIMESTAMP NOT NULL DEFAULT NOW()
PostgreSQL doesn't have MySQL's ON UPDATE CURRENT_TIMESTAMP. You need a trigger:
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Apply to every table that has updated_at
CREATE TRIGGER set_updated_at
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER set_updated_at
BEFORE UPDATE ON projects
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();
Yes, you need one trigger per table. Yes, it's tedious. Write a migration helper that creates the trigger automatically when you add a table. It's worth it, because the alternative is remembering to set updated_at = NOW() in every UPDATE query across your entire application — and you will forget.
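Here's a minimal sketch of such a helper, assuming the conventions above — every table that needs the trigger has an updated_at column and lives in the public schema. The trigger and function names match the earlier examples; everything else is illustrative:
-- Hypothetical one-shot helper: create set_updated_at on every public table
-- that has an updated_at column and doesn't already have the trigger.
DO $$
DECLARE
  tbl TEXT;
BEGIN
  FOR tbl IN
    SELECT c.table_name
    FROM information_schema.columns c
    WHERE c.table_schema = 'public'
      AND c.column_name = 'updated_at'
      AND NOT EXISTS (
        SELECT 1
        FROM pg_trigger t
        JOIN pg_class r ON r.oid = t.tgrelid
        WHERE r.relname = c.table_name::text
          AND t.tgname = 'set_updated_at'
      )
  LOOP
    EXECUTE format(
      'CREATE TRIGGER set_updated_at BEFORE UPDATE ON %I
         FOR EACH ROW EXECUTE FUNCTION update_updated_at_column()',
      tbl
    );
  END LOOP;
END;
$$;
Run it at the end of every migration and new tables pick up the trigger automatically.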
Every CS course teaches normalization up to 3NF (Third Normal Form). The rules, roughly: first normal form means atomic column values (no lists packed into a string); second normal form means no column depends on only part of a composite key; third normal form means no column depends on another non-key column.
In practice, 3NF is the sweet spot for transactional tables. You should start there and deviate only when you have a specific, measurable reason.
Here's a properly normalized order system:
CREATE TABLE customers (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
display_name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
sku TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
price_cents INTEGER NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT true,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled')),
total_cents INTEGER NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE order_items (
id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
product_id BIGINT NOT NULL REFERENCES products(id),
quantity INTEGER NOT NULL CHECK (quantity > 0),
unit_price_cents INTEGER NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Notice unit_price_cents on order_items. This is intentional denormalization. We snapshot the price at the time of order, because the product's price might change later. This is one of the most common and correct reasons to denormalize.
Reporting tables. If your analytics dashboard needs to join 8 tables to render, create a denormalized reporting table and populate it with a background job. Your transactional schema stays clean, and your reporting queries stay fast.
-- Denormalized reporting table, populated by a cron job
CREATE TABLE order_reports (
id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL,
customer_email TEXT NOT NULL,
customer_name TEXT NOT NULL,
product_names TEXT[] NOT NULL, -- array is fine here
item_count INTEGER NOT NULL,
total_cents INTEGER NOT NULL,
order_status TEXT NOT NULL,
ordered_at TIMESTAMPTZ NOT NULL,
report_generated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Materialized views. PostgreSQL's MATERIALIZED VIEW is underrated. It's a denormalized snapshot that you refresh on demand. Perfect for dashboards.
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
DATE_TRUNC('month', o.created_at) AS month,
COUNT(DISTINCT o.id) AS order_count,
SUM(o.total_cents) AS revenue_cents,
COUNT(DISTINCT o.customer_id) AS unique_customers
FROM orders o
WHERE o.status != 'cancelled'
GROUP BY DATE_TRUNC('month', o.created_at)
ORDER BY month DESC;
-- CONCURRENTLY requires a unique index on the view, so add one first
CREATE UNIQUE INDEX uq_monthly_revenue_month ON monthly_revenue (month);
-- Refresh it nightly
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;
JSONB columns for dynamic attributes. When different rows need different fields — product variants, form submissions, user preferences — a JSONB column is often better than a nightmare EAV (Entity-Attribute-Value) table. More on this later.
Never denormalize your source-of-truth tables. Denormalize copies, snapshots, reports, and caches. The canonical data stays normalized. When the denormalized copy gets stale or corrupted (and it will), you rebuild it from the normalized source.
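For example, rebuilding the order_reports table above from the normalized tables is a single INSERT ... SELECT. A sketch of what the background job runs (column list assumed from the definition above):
-- Rebuild the denormalized copy from the normalized source of truth
TRUNCATE order_reports;
INSERT INTO order_reports
  (order_id, customer_email, customer_name, product_names,
   item_count, total_cents, order_status, ordered_at)
SELECT
  o.id,
  c.email,
  c.display_name,
  ARRAY_AGG(p.name ORDER BY p.name),
  SUM(oi.quantity)::INTEGER,
  o.total_cents,
  o.status,
  o.created_at
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
GROUP BY o.id, c.email, c.display_name, o.total_cents, o.status, o.created_at;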
I've heard every excuse for skipping foreign keys. "They slow down writes." "We enforce it in the application." "We need flexibility."
These are all wrong.
Foreign keys are the single most impactful thing you can add to a schema. They prevent entire categories of bugs that no amount of application code can catch — race conditions, partial failures, orphaned rows from failed transactions. A foreign key is a guarantee from the database engine itself, enforced at the storage level. Your ORM's beforeDelete hook is a suggestion.
-- Always do this
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES customers(id);
-- The "we'll handle it in code" alternative:
-- Hope. Hope is not a strategy.
This is where it gets nuanced. What happens when you delete a parent row?
RESTRICT (effectively the default — PostgreSQL's actual default is NO ACTION, which behaves the same unless the constraint check is deferred): The delete fails if child rows exist. Use this for most relationships. You can't delete a customer who has orders — that's business logic encoded in the schema.
-- Customer can't be deleted while they have orders
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES customers(id)
ON DELETE RESTRICT;
CASCADE: Child rows are automatically deleted. Use this sparingly and deliberately. Good for "part-of" relationships where the child has no meaning without the parent.
-- Deleting an order deletes its line items — they're meaningless alone
ALTER TABLE order_items
ADD CONSTRAINT fk_order_items_order
FOREIGN KEY (order_id) REFERENCES orders(id)
ON DELETE CASCADE;
-- Deleting a project deletes its memberships
ALTER TABLE project_members
ADD CONSTRAINT fk_project_members_project
FOREIGN KEY (project_id) REFERENCES projects(id)
ON DELETE CASCADE;
SET NULL: The foreign key column is set to NULL. Use this when the relationship is optional and the child row is still meaningful on its own.
-- If a manager leaves, their reports still exist — just unassigned
ALTER TABLE employees
ADD CONSTRAINT fk_employees_manager
FOREIGN KEY (manager_id) REFERENCES employees(id)
ON DELETE SET NULL;
My rule of thumb: default to RESTRICT, use CASCADE for composition relationships, use SET NULL for optional associations. If you're unsure, RESTRICT is always the safe choice — it's easier to relax a constraint than to recover deleted data.
Check constraints cost almost nothing at write time and prevent garbage data from entering your system forever:
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
name TEXT NOT NULL CHECK (LENGTH(name) > 0),
price_cents INTEGER NOT NULL CHECK (price_cents >= 0),
discount_pct NUMERIC(5,2) CHECK (discount_pct BETWEEN 0 AND 100),
weight_kg NUMERIC(10,3) CHECK (weight_kg > 0),
status TEXT NOT NULL CHECK (status IN ('draft', 'active', 'archived')),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Every check constraint you add is one fewer validation bug you'll ever have to debug in production. The database is the last line of defense. Use it.
Unique constraints are straightforward for simple cases:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
But they get interesting with partial indexes — unique constraints that only apply to certain rows:
-- Only one active subscription per user (but they can have many cancelled ones)
CREATE UNIQUE INDEX uq_active_subscription
ON subscriptions (user_id)
WHERE status = 'active';
-- Only one primary address per user
CREATE UNIQUE INDEX uq_primary_address
ON addresses (user_id)
WHERE is_primary = true;
This is one of PostgreSQL's killer features. Use it.
Soft deletes are simple in concept: instead of DELETE FROM users WHERE id = 42, you do UPDATE users SET deleted_at = NOW() WHERE id = 42. The row stays in the database but is filtered out of normal queries.
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL,
display_name TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT true,
deleted_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- The partial index: filter out deleted rows efficiently
CREATE INDEX idx_users_active ON users (email) WHERE deleted_at IS NULL;
Every query in your application now needs WHERE deleted_at IS NULL. Every. Single. Query. ORMs help with this (Prisma has middleware, Drizzle has .where(isNull(deletedAt))), but it's a tax on every read operation. Miss it once and you're showing "deleted" data to users.
Here's where soft deletes get ugly. If you have UNIQUE (email) and a user soft-deletes their account, their email is still in the table. They can't re-register with the same email. A new user with that email can't sign up either.
The fix is a partial unique index:
-- Email must be unique, but only among non-deleted users
CREATE UNIQUE INDEX uq_users_email_active
ON users (email)
WHERE deleted_at IS NULL;
This works, but now you need to remember this pattern for every unique column on every soft-deletable table. It's manageable with discipline, but it's complexity you've chosen to take on.
Soft deletes interact badly with foreign keys. If orders.user_id references users.id with ON DELETE RESTRICT, and you soft-delete a user... nothing happens. The FK doesn't fire because you didn't actually delete the row. The user is "gone" from the application's perspective but still very much present in the database.
This means your application code has to handle the case where a referenced entity is soft-deleted. Every join, every lookup, every time you follow a foreign key — you need to check deleted_at IS NULL on the referenced table too. Or you don't, and your application shows "Order by [deleted user]" which is either a bug or a feature depending on who you ask.
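In practice that looks something like this — a sketch using the hypothetical orders.user_id relationship from the previous paragraph, with "[deleted user]" as one way to handle the missing owner:
-- Surface the order even when its owner is soft-deleted,
-- but don't pretend the account still exists
SELECT
  o.id,
  o.total_cents,
  CASE
    WHEN u.deleted_at IS NULL THEN u.display_name
    ELSE '[deleted user]'
  END AS placed_by
FROM orders o
JOIN users u ON u.id = o.user_id;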
I use soft deletes only for user-facing entities where recovery is a business requirement — user accounts, projects, documents. Things where a support agent might need to restore a deletion. For everything else, I use hard deletes with an audit trail (more on this next).
-- Soft delete: user-facing, recoverable
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title TEXT NOT NULL,
content TEXT NOT NULL,
owner_id UUID NOT NULL REFERENCES users(id),
deleted_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_documents_active ON documents (owner_id) WHERE deleted_at IS NULL;
-- Hard delete: internal, not recoverable from UI (but audited)
CREATE TABLE api_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
key_hash TEXT NOT NULL,
last_used_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- When deleted, it's gone. The audit_log records that it existed.
Every non-trivial application needs an answer to "what happened to this record?" at some point. Audit trails are how you provide that answer without digging through application logs.
The simplest approach: a single audit_log table that records every change.
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
table_name TEXT NOT NULL,
record_id TEXT NOT NULL, -- text to handle both UUID and BIGINT PKs
operation TEXT NOT NULL CHECK (operation IN ('INSERT', 'UPDATE', 'DELETE')),
old_values JSONB, -- NULL for INSERT
new_values JSONB, -- NULL for DELETE
changed_fields TEXT[], -- which columns changed (for UPDATE)
changed_by UUID REFERENCES users(id),
changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
ip_address INET,
user_agent TEXT
);
CREATE INDEX idx_audit_log_table_record ON audit_log (table_name, record_id);
CREATE INDEX idx_audit_log_changed_by ON audit_log (changed_by);
CREATE INDEX idx_audit_log_changed_at ON audit_log (changed_at);
The challenge is populating it. You can do it in application code (explicit, but easy to forget) or with triggers (automatic, but harder to pass context like changed_by).
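The application-code version is nothing more than an extra INSERT in the same transaction as the change. A sketch with placeholder values — explicit about who did what, but only as reliable as the discipline of whoever writes the next UPDATE:
BEGIN;

UPDATE users
SET display_name = 'New Name'
WHERE id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';  -- placeholder id

-- A real implementation would also capture old_values and changed_fields
INSERT INTO audit_log (table_name, record_id, operation, new_values, changed_by, ip_address, user_agent)
VALUES ('users',
        'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
        'UPDATE',
        '{"display_name": "New Name"}'::JSONB,
        'b1ffcd99-9c0b-4ef8-bb6d-6bb9bd380a22',     -- the acting user, known to the app
        '203.0.113.7',
        'example-app/1.0');

COMMIT;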
Triggers capture every change automatically, even from raw SQL or database admin operations:
CREATE OR REPLACE FUNCTION audit_trigger_function()
RETURNS TRIGGER AS $$
DECLARE
old_data JSONB;
new_data JSONB;
changed TEXT[];
col TEXT;
BEGIN
IF TG_OP = 'DELETE' THEN
old_data := to_jsonb(OLD);
INSERT INTO audit_log (table_name, record_id, operation, old_values, changed_at)
VALUES (TG_TABLE_NAME, OLD.id::TEXT, 'DELETE', old_data, NOW());
RETURN OLD;
ELSIF TG_OP = 'UPDATE' THEN
old_data := to_jsonb(OLD);
new_data := to_jsonb(NEW);
-- Find changed fields
FOR col IN SELECT key FROM jsonb_each(new_data)
LOOP
IF old_data->col IS DISTINCT FROM new_data->col THEN
changed := array_append(changed, col);
END IF;
END LOOP;
-- Only log if something actually changed
IF array_length(changed, 1) > 0 THEN
INSERT INTO audit_log (table_name, record_id, operation, old_values, new_values, changed_fields, changed_at)
VALUES (TG_TABLE_NAME, NEW.id::TEXT, 'UPDATE', old_data, new_data, changed, NOW());
END IF;
RETURN NEW;
ELSIF TG_OP = 'INSERT' THEN
new_data := to_jsonb(NEW);
INSERT INTO audit_log (table_name, record_id, operation, new_values, changed_at)
VALUES (TG_TABLE_NAME, NEW.id::TEXT, 'INSERT', new_data, NOW());
RETURN NEW;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Apply to tables you want audited
CREATE TRIGGER audit_users
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION audit_trigger_function();
CREATE TRIGGER audit_orders
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION audit_trigger_function();
The downside: triggers don't know which application user made the change. You can work around this with session variables:
-- In your application, before the query:
SET LOCAL app.current_user_id = 'uuid-of-current-user';
-- In the trigger function:
changed_by_id := current_setting('app.current_user_id', true)::UUID;
This works but feels fragile. In practice, I use a hybrid approach: triggers for the data capture, and application code to set the session context.
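A sketch of that hybrid wiring, with placeholder UUIDs: the application sets the variable with SET LOCAL so it scopes to the transaction, and the trigger reads it with missing_ok = true so changes made without context are still audited, just with a NULL changed_by.
-- Application side: one transaction per request
BEGIN;
SET LOCAL app.current_user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
UPDATE projects SET name = 'Renamed project'
WHERE id = 'b1ffcd99-9c0b-4ef8-bb6d-6bb9bd380a22';
COMMIT;

-- Trigger side: read the setting defensively (inside audit_trigger_function,
-- before each INSERT INTO audit_log):
--   v_changed_by := NULLIF(current_setting('app.current_user_id', true), '')::UUID;
--   ... and include v_changed_by in the changed_by column ...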
For tables where you need the full version history (not just "what changed" but "what was the state at time T"), a dedicated history table is cleaner:
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title TEXT NOT NULL,
content TEXT NOT NULL,
version INTEGER NOT NULL DEFAULT 1,
owner_id UUID NOT NULL REFERENCES users(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE document_history (
id BIGSERIAL PRIMARY KEY,
document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
title TEXT NOT NULL,
content TEXT NOT NULL,
version INTEGER NOT NULL,
changed_by UUID REFERENCES users(id),
changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (document_id, version)
);
Before every update to documents, copy the current state to document_history and increment the version. Now you can reconstruct the document at any point in time, show diffs between versions, and even restore old versions.
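The copy itself can live in a BEFORE UPDATE trigger so the application can't forget it. A sketch against the two tables above (the function and trigger names are mine):
CREATE OR REPLACE FUNCTION snapshot_document_version()
RETURNS TRIGGER AS $$
BEGIN
  -- Only snapshot when something meaningful changed
  IF OLD.title IS DISTINCT FROM NEW.title
     OR OLD.content IS DISTINCT FROM NEW.content THEN
    INSERT INTO document_history (document_id, title, content, version, changed_at)
    VALUES (OLD.id, OLD.title, OLD.content, OLD.version, NOW());
    NEW.version := OLD.version + 1;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER snapshot_document
BEFORE UPDATE ON documents
FOR EACH ROW EXECUTE FUNCTION snapshot_document_version();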
The trade-off is storage. If your content column is large and changes frequently, the history table can grow fast. For most applications, this is fine — storage is cheap and you can archive old versions to cold storage if needed.
Multi-tenancy is one of those things that's easy to add at the start and nearly impossible to add later. If there's any chance your application will serve multiple organizations, build it in from day one.
Row-level tenancy is the most common approach. Every table has a tenant_id column, and every query filters by it.
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
slug TEXT NOT NULL UNIQUE,
plan TEXT NOT NULL DEFAULT 'free' CHECK (plan IN ('free', 'pro', 'enterprise')),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
project_id UUID NOT NULL REFERENCES projects(id),
title TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'todo',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Every index should include tenant_id for query performance
CREATE INDEX idx_projects_tenant ON projects (tenant_id);
CREATE INDEX idx_tasks_tenant_project ON tasks (tenant_id, project_id);
The risk is data leakage — one missed WHERE tenant_id = ... and you're showing Tenant A's data to Tenant B. PostgreSQL's Row-Level Security (RLS) eliminates this class of bugs:
-- Enable RLS on the table
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Create a policy based on a session variable
CREATE POLICY tenant_isolation ON projects
USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Force RLS even for table owners
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
Now, even if your application code forgets the WHERE tenant_id = ... clause, PostgreSQL adds it automatically. This is defense in depth, and it's one of the strongest arguments for PostgreSQL in multi-tenant systems.
-- In your application's connection middleware:
SET LOCAL app.current_tenant_id = 'tenant-uuid-here';
-- Now all queries on RLS-enabled tables are automatically filtered
SELECT * FROM projects;
-- PostgreSQL internally adds: WHERE tenant_id = 'tenant-uuid-here'
Pros: Single database, simple ops, efficient resource usage, easy cross-tenant queries for admin. Cons: Requires discipline (or RLS), every query touches tenant_id, harder to give tenants their own backup/restore.
Each tenant gets their own PostgreSQL schema. All schemas share the same database, but the tables are isolated by namespace.
-- Create a schema for each tenant
CREATE SCHEMA tenant_acme;
CREATE SCHEMA tenant_globex;
-- Tables live in the tenant's schema
CREATE TABLE tenant_acme.projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Set the search_path for the current tenant
SET search_path TO tenant_acme, public;
-- Now unqualified queries hit the tenant's schema
SELECT * FROM projects; -- queries tenant_acme.projects
Pros: Strong isolation, no risk of cross-tenant data leakage, easy per-tenant backup/restore, tenants can have schema variations. Cons: Schema migration complexity (you need to migrate N schemas), connection pool management, PostgreSQL has practical limits around ~10,000 schemas.
Each tenant gets their own database. Maximum isolation.
Pros: Complete isolation, independent scaling, easy backup/restore, can place big tenants on dedicated hardware. Cons: Connection management nightmare, cross-tenant queries impossible, migration must run N times, significant operational overhead.
For most SaaS applications: start with row-level + RLS. It's the simplest to operate, and RLS gives you strong-enough isolation for the vast majority of use cases. Move to schema-per-tenant only if you have contractual isolation requirements (enterprise customers, regulated industries). Database-per-tenant is for when you absolutely must guarantee physical isolation — and even then, consider managed databases where the operational burden is handled for you.
PostgreSQL's JSONB is remarkable. It gives you document-database flexibility inside a relational system. But like any powerful tool, it's easy to abuse.
Dynamic user-defined attributes. An e-commerce platform where each product category has different attributes — shoes have size and color, electronics have voltage and wattage. Rather than an EAV table or a column for every possible attribute:
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
category_id UUID NOT NULL REFERENCES categories(id),
price_cents INTEGER NOT NULL CHECK (price_cents >= 0),
attributes JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Example data:
-- Shoe: {"size": "10", "color": "black", "material": "leather"}
-- Laptop: {"ram_gb": 16, "storage_gb": 512, "screen_inches": 14.0}
Application settings and configuration. User preferences, feature flags, notification settings — things that change shape frequently and don't need relational integrity.
CREATE TABLE user_settings (
user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
preferences JSONB NOT NULL DEFAULT '{
"theme": "system",
"notifications": {"email": true, "push": false},
"locale": "en",
"timezone": "UTC"
}',
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Webhook payloads, API responses, event data. Anything where the schema is controlled by an external system and might change without notice.
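A sketch of what that looks like for inbound webhooks — promote the couple of fields you always filter on into real columns, and keep the raw payload as JSONB (table and column names here are illustrative):
CREATE TABLE webhook_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  provider TEXT NOT NULL,            -- e.g. 'stripe', 'github'
  event_type TEXT NOT NULL,          -- promoted to a column: you filter on it constantly
  payload JSONB NOT NULL,            -- raw body; the provider owns this schema
  received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  processed_at TIMESTAMPTZ
);

CREATE INDEX idx_webhook_events_provider_type ON webhook_events (provider, event_type);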
Without indexes, querying inside JSONB requires a full table scan. GIN indexes make it fast:
-- Index the entire JSONB column (good for @> containment queries)
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
-- Query: find all products with color = "black"
SELECT * FROM products
WHERE attributes @> '{"color": "black"}';
-- Query: find products with any of these attributes
SELECT * FROM products
WHERE attributes ?| ARRAY['color', 'size'];
For specific paths you query often, a targeted index is more efficient:
-- Index a specific path
CREATE INDEX idx_products_color ON products ((attributes->>'color'));
-- Now this is a regular B-tree lookup
SELECT * FROM products
WHERE attributes->>'color' = 'black';
Here's where JSONB bites you: you can't easily add NOT NULL constraints or default values to fields inside JSON. If you add a new required field to your product attributes, you need to backfill every existing row. With a regular column, a migration handles this atomically. With JSONB, you're writing an UPDATE that touches every row and hoping your application code handles missing fields gracefully until the backfill completes.
-- Adding a new column: clean, atomic, one statement
ALTER TABLE products ADD COLUMN weight_kg NUMERIC(10,3);
-- Adding a new JSONB field: messy, requires a backfill
UPDATE products
SET attributes = attributes || '{"weight_kg": null}'
WHERE NOT (attributes ? 'weight_kg');
-- Plus update application code to handle missing weight_kg
-- Plus update validation logic
-- Plus update every API response that includes attributes
My rule: if you're querying a JSONB field in a WHERE clause more than twice a week, it should probably be a column. JSONB is a great escape hatch. It's a terrible default.
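When a field crosses that line, PostgreSQL 12+ can promote it without touching the write path: a stored generated column stays in sync with the JSONB automatically. A sketch — note that adding a stored generated column rewrites the table, so treat it like any other locking migration:
-- Promote attributes->>'color' to a real, indexable column
ALTER TABLE products
  ADD COLUMN color TEXT GENERATED ALWAYS AS (attributes->>'color') STORED;

CREATE INDEX idx_products_color_generated ON products (color);

-- Reads become ordinary column queries
SELECT * FROM products WHERE color = 'black';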
Schema migrations are where theory meets reality. Your schema looks great on paper, but now you need to alter a table with 50 million rows during business hours without any downtime.
I've used most of the popular migration tools. Brief takes:
Drizzle (TypeScript): My current favorite. Schema-as-code, type-safe queries generated from the schema, clean migration SQL. The push command for development is fast.
Prisma (TypeScript): Great DX for simple schemas. Struggles with advanced PostgreSQL features (partial indexes, custom types, RLS). The migration engine can make surprising decisions.
Flyway (Java/CLI): Rock-solid, battle-tested, SQL-first. If you're writing raw SQL migrations, Flyway tracks them reliably. No magic, no surprises.
golang-migrate (Go/CLI): Similar to Flyway but lighter. Great for Go projects or when you just want a simple up/down migration runner.
The most dangerous schema changes are the ones that lock the table. In PostgreSQL, ALTER TABLE ... ADD COLUMN with a default value used to lock the entire table for the duration of the rewrite. Since PostgreSQL 11, simple defaults (constants) are metadata-only and instant. But other operations still lock:
-- SAFE: metadata-only, instant (PostgreSQL 11+)
ALTER TABLE users ADD COLUMN bio TEXT;
ALTER TABLE users ADD COLUMN is_premium BOOLEAN NOT NULL DEFAULT false;
-- DANGEROUS: rewrites the entire table, full table lock
ALTER TABLE users ALTER COLUMN email TYPE VARCHAR(500);
ALTER TABLE users ADD COLUMN score INTEGER NOT NULL DEFAULT compute_score();
-- DANGEROUS: scans the entire table to validate
ALTER TABLE orders ADD CONSTRAINT check_total CHECK (total_cents >= 0);
Adding a NOT NULL column to a table that already has data is the classic case: without a default it fails outright, because existing rows have no value for it. The naive approach:
-- On PostgreSQL 10 and earlier, this locked and rewrote the whole table
ALTER TABLE users ADD COLUMN avatar_url TEXT NOT NULL DEFAULT 'https://example.com/default.png';
Since PostgreSQL 11, this is actually safe for constant defaults — it's metadata-only. But if your default is a function or you need to backfill with computed values, use the expand-contract pattern.
This is the gold standard for zero-downtime schema changes. Three phases:
Phase 1: Expand — Add the new column as nullable. Deploy application code that writes to both old and new columns.
-- Migration 1: Add the nullable column
ALTER TABLE users ADD COLUMN normalized_email TEXT;
Phase 2: Migrate — Backfill existing rows in batches. Your application is already writing to the new column for new data.
-- Migration 2: Backfill in batches (don't do this in one statement for large tables).
-- Note: UPDATE has no LIMIT in PostgreSQL, so batch via a subquery on the primary key.
UPDATE users SET normalized_email = LOWER(TRIM(email))
WHERE id IN (
  SELECT id FROM users
  WHERE normalized_email IS NULL
  LIMIT 10000
);
-- Repeat until all rows are backfilled
Phase 3: Contract — Once all rows are backfilled, add the NOT NULL constraint and remove the old column (if applicable).
-- Migration 3: Add constraint (use NOT VALID to avoid full table scan, then validate separately)
ALTER TABLE users ADD CONSTRAINT users_normalized_email_not_null
CHECK (normalized_email IS NOT NULL) NOT VALID;
-- Migration 4: Validate the constraint (takes a ShareUpdateExclusiveLock, not an AccessExclusiveLock)
ALTER TABLE users VALIDATE CONSTRAINT users_normalized_email_not_null;
The NOT VALID + VALIDATE CONSTRAINT trick is crucial. Adding a check constraint normally scans the entire table while holding a heavy lock. NOT VALID adds the constraint without scanning (it only applies to new writes), and VALIDATE CONSTRAINT scans with a lighter lock that doesn't block reads or writes.
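On PostgreSQL 12 and newer there's a final step worth knowing: once the check constraint is validated, SET NOT NULL can use it as proof and skip its own full-table scan, after which the helper constraint is redundant. A sketch:
-- PostgreSQL 12+: skips the scan because the validated CHECK already proves it
ALTER TABLE users ALTER COLUMN normalized_email SET NOT NULL;

-- The scaffolding constraint is no longer needed
ALTER TABLE users DROP CONSTRAINT users_normalized_email_not_null;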
Creating indexes on large tables blocks writes by default. Always use CONCURRENTLY:
-- BLOCKS WRITES: don't do this on a live table
CREATE INDEX idx_users_email ON users (email);
-- NON-BLOCKING: use this instead
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);
CONCURRENTLY takes longer and can't run inside a transaction, but it doesn't block writes to the table. The trade-off is always worth it in production.
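One operational caveat: if a CONCURRENTLY build fails partway (a deadlock, a duplicate key during a unique build), it leaves behind an INVALID index that still adds write overhead. You have to drop it and retry — a sketch:
-- Find indexes left INVALID by a failed concurrent build
SELECT i.relname
FROM pg_index x
JOIN pg_class i ON i.oid = x.indexrelid
WHERE NOT x.indisvalid;

-- Drop the broken index (also without blocking writes), then retry
DROP INDEX CONCURRENTLY idx_users_email;
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);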
Every schema has regrets. Here are mine.
Early in a project, I used SERIAL primary keys and exposed them directly in URLs: /users/42, /orders/1337. This leaked information (competitors could guess our user count), made pagination guessable, and broke when we needed to merge databases from two regions. Switching to UUIDs for external-facing IDs required a multi-month migration.
Lesson: use UUIDs for anything exposed outside your database. Use SERIAL/BIGSERIAL for internal join tables if you want, but never let an auto-incrementing integer appear in a URL.
We launched a table without check constraints because "we're moving fast and we'll add them later." Within two weeks, the data had negative prices, empty names, and an order_status value of "oof" that someone typed during testing and never cleaned up.
Adding the constraints after the fact meant auditing every bad row, deciding what the correct values should have been, cleaning them up, and only then validating the constraints against production data. This took longer than adding the constraint on day one would have. Start with constraints. Loosen them if you must. Never the other way around.
I once stored tags as a comma-separated string: "javascript,react,nextjs". Querying was a nightmare:
-- This is how you query comma-separated values. Don't do this.
SELECT * FROM posts WHERE tags LIKE '%react%';
-- This also matches "react-native" and "preact"
-- What I should have done:
CREATE TABLE post_tags (
post_id UUID NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id UUID NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
PRIMARY KEY (post_id, tag_id)
);
Or at minimum, use PostgreSQL's native array type:
CREATE TABLE posts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title TEXT NOT NULL,
tags TEXT[] NOT NULL DEFAULT '{}',
-- ...
);
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);
-- Clean queries
SELECT * FROM posts WHERE 'react' = ANY(tags);
SELECT * FROM posts WHERE tags @> ARRAY['react', 'nextjs'];
Arrays are acceptable for simple lists that don't need their own attributes. The moment you need metadata on the relationship (like "who added this tag" or "when was it added"), you need a join table.
-- Seemed fine at the time
CREATE TABLE notifications (
id BIGSERIAL PRIMARY KEY,
type TEXT NOT NULL, -- "email", "sms", "push"
-- ...
);
-- Then you try to query it in literally any ORM:
-- notification.type ← conflicts with every language's type system
-- "type" is a reserved word in most SQL dialects
-- You end up quoting it everywhere: SELECT "type" FROM notifications
Use kind, category, or notification_type. Anything but type. Similarly, avoid order (use sort_order or position), user (use account or prefix it), and group (use team or group_name).
One table didn't get created_at because "we don't need it." Three months later, we needed to debug an issue and had no idea when records were created. Adding it retroactively meant all existing rows got the same timestamp (the migration timestamp), making the historical data useless.
Every table gets created_at. No exceptions. The cost is one column. The cost of not having it is unknowable until you need it.
I once created a chain of views — active_users filtered users, premium_active_users filtered active_users, and a reporting view joined all of them. It worked great until someone changed the users table and all three views broke silently. The query planner also struggled to optimize through multiple view layers.
Views are great for convenience and read-only access patterns. They're terrible as a place to encode business logic that changes. Keep business rules in application code where they're versioned, tested, and deployable. Use views for reporting shortcuts, not as architectural building blocks.
Here's what a well-designed schema looks like when you combine these patterns. A simplified project management system:
-- Extensions
CREATE EXTENSION IF NOT EXISTS "pgcrypto"; -- for gen_random_uuid()
-- Updated_at trigger function (reusable)
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Tenants
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL CHECK (LENGTH(name) > 0),
slug TEXT NOT NULL UNIQUE CHECK (slug ~ '^[a-z0-9-]+$'),
plan TEXT NOT NULL DEFAULT 'free'
CHECK (plan IN ('free', 'pro', 'enterprise')),
settings JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TRIGGER set_updated_at BEFORE UPDATE ON tenants
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
-- Users
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
email TEXT NOT NULL,
display_name TEXT NOT NULL CHECK (LENGTH(display_name) > 0),
password_hash TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT true,
is_verified BOOLEAN NOT NULL DEFAULT false,
last_login_at TIMESTAMPTZ,
deleted_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE UNIQUE INDEX uq_users_email_per_tenant
ON users (tenant_id, email)
WHERE deleted_at IS NULL;
CREATE INDEX idx_users_tenant ON users (tenant_id) WHERE deleted_at IS NULL;
CREATE TRIGGER set_updated_at BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
-- Enable RLS
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON users
USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Projects
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
name TEXT NOT NULL CHECK (LENGTH(name) > 0),
description TEXT NOT NULL DEFAULT '',
status TEXT NOT NULL DEFAULT 'active'
CHECK (status IN ('active', 'archived', 'deleted')),
owner_id UUID NOT NULL REFERENCES users(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_projects_tenant ON projects (tenant_id);
CREATE INDEX idx_projects_owner ON projects (owner_id);
CREATE TRIGGER set_updated_at BEFORE UPDATE ON projects
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects
USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Tasks
CREATE TABLE tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
title TEXT NOT NULL CHECK (LENGTH(title) > 0),
description TEXT NOT NULL DEFAULT '',
status TEXT NOT NULL DEFAULT 'todo'
CHECK (status IN ('todo', 'in_progress', 'review', 'done')),
priority INTEGER NOT NULL DEFAULT 0 CHECK (priority BETWEEN 0 AND 4),
assignee_id UUID REFERENCES users(id) ON DELETE SET NULL,
due_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
sort_order INTEGER NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_tasks_tenant_project ON tasks (tenant_id, project_id);
CREATE INDEX idx_tasks_assignee ON tasks (assignee_id) WHERE assignee_id IS NOT NULL;
CREATE INDEX idx_tasks_status ON tasks (tenant_id, status) WHERE status != 'done';
CREATE TRIGGER set_updated_at BEFORE UPDATE ON tasks
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON tasks
USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Task comments
CREATE TABLE task_comments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
author_id UUID NOT NULL REFERENCES users(id),
body TEXT NOT NULL CHECK (LENGTH(body) > 0),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_task_comments_task ON task_comments (task_id);
CREATE TRIGGER set_updated_at BEFORE UPDATE ON task_comments
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
ALTER TABLE task_comments ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON task_comments
USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Audit log (no RLS — admin-only table)
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
tenant_id UUID NOT NULL REFERENCES tenants(id),
table_name TEXT NOT NULL,
record_id TEXT NOT NULL,
operation TEXT NOT NULL CHECK (operation IN ('INSERT', 'UPDATE', 'DELETE')),
old_values JSONB,
new_values JSONB,
changed_fields TEXT[],
changed_by UUID REFERENCES users(id),
changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_audit_log_tenant_table ON audit_log (tenant_id, table_name, record_id);
CREATE INDEX idx_audit_log_changed_at ON audit_log (changed_at);
Notice the patterns:
Every tenant-scoped table carries tenant_id and has RLS enabled (except tenants and audit_log).
Every table gets created_at and updated_at with triggers.
Deliberate ON DELETE behavior on the foreign keys that need it.
Soft deletes only on users (where account recovery is needed), status-based archival on projects.
Schema design isn't glamorous. Nobody has ever given a conference talk titled "I Added Check Constraints to Every Table and It Saved Us Six Months of Debugging." But that's exactly what good schema design does — it prevents problems so quietly that you never even know they would have existed.
The patterns in this post aren't novel. They're the result of years of writing migrations, debugging data corruption, and refactoring schemas under production load. Every one of them exists because I, or someone I worked with, did it the other way first and paid the price.
Start with constraints. Use foreign keys. Add created_at to everything. Pick a naming convention and enforce it ruthlessly. Use RLS for multi-tenancy. Be careful with JSONB. Test your migrations against production-sized data before deploying them.
The database is the foundation. Get it right, and everything built on top becomes simpler. Get it wrong, and no amount of clever application code can save you.