Skip to main content
    Courses/SQL/Multi-Tenant Databases

    Lesson 47 • Advanced Track

    Designing Multi-Tenant Databases

    Almost every SaaS product serves many customers ("tenants") from one system. By the end of this lesson you'll be able to choose between the three standard isolation models — shared schema, schema-per-tenant, and database-per-tenant — and implement the most common one safely with a tenant_id column and PostgreSQL Row-Level Security so the database itself stops one customer from ever seeing another's data.

    What You'll Learn

    • What multi-tenancy is and why it matters for SaaS
    • The shared-schema model with a tenant_id discriminator
    • Why a forgotten filter is a cross-tenant data leak
    • Row-Level Security as an automatic safety net
    • Schema-per-tenant and database-per-tenant trade-offs
    • How to pick the right model: isolation vs. cost vs. scale

    Our Sample Table: tasks (shared by two tenants)

    Every example in this lesson uses one tasks table shared by two customers: tenant 1 = Acme and tenant 2 = Globex. Notice the tenant_id column — that single column is what keeps their data apart.

    Result:

    task_idtenant_idtitlestatus
    11Design logodone
    21Write docstodo
    31Fix login bugtodo
    42Ship invoicetodo
    52Renew SSLtodo
    62Plan roadmapdone

    Rows 1–3 belong to Acme; rows 4–6 belong to Globex. They sit side by side in one physical table.

    1. What "Multi-Tenant" Means

    A tenant is one customer of your SaaS — usually a whole company and all of its users. A multi-tenant system serves many tenants from shared infrastructure. The central challenge is isolation: tenant A must never see, change, or even count tenant B's rows, even though they might be running the very same product on the very same servers.

    🏢 Real-world analogy

    Think of an apartment building. Shared schema = everyone shares the building and the lobby, but each flat has its own lock (the tenant_id). Schema-per-tenant = each family gets a private floor. Database-per-tenant = each family gets their own separate building. More walls means more isolation — but also more cost to build and maintain.

    🔗 Shared Schema

    One set of tables, a tenant_id on each. Cheapest, easiest to scale to thousands of tenants. Risk: a forgotten filter leaks data.

    📂 Schema-per-Tenant

    Each tenant gets its own schema (namespace) of tables. Stronger isolation, moderate cost. Migrations must run N times.

    🏠 Database-per-Tenant

    A whole separate database per tenant. Strongest isolation, easiest per-tenant backup. Most expensive; hard past a few hundred tenants.

    2. The Shared-Schema Model (tenant_id discriminator)

    This is the default choice for most SaaS apps because it's the cheapest and scales to thousands of tenants on one set of tables. The trick is a discriminator column — conventionally tenant_id — added to every tenant-owned table. The golden rule: every query must filter on tenant_id, and every multi-column index should list tenant_id first.

    Shared schema with tenant_id

    One tasks table, a tenant_id on every row, queries scoped by tenant.

    Try it Yourself »
    SQL
    -- Model 1: SHARED SCHEMA (a.k.a. "pooled" multi-tenancy)
    -- Every tenant's rows live in the SAME tables. A single
    -- "discriminator" column — tenant_id — says who each row belongs to.
    
    CREATE TABLE tenants (
        tenant_id   SERIAL PRIMARY KEY,
        name        VARCHAR(100) NOT NULL,
        subdomain   VARCHAR(50) UNIQUE NOT NULL   -- acme.app.com
    );
    
    CREATE TABLE tasks (
        task_id     SERIAL PRIMARY KEY,
        tenant_id   INT NOT NULL REFERENCES tenants(tenant_id),  -- 👈 the discriminator
        titl
    ...

    Result — WHERE tenant_id = 1 → Acme only:

    titlestatus
    Design logodone
    Write docstodo
    Fix login bugtodo

    Compare that to what happens when you forget the filter. Because every tenant's rows live in one table, an unscoped query quietly returns everyone — the single most dangerous mistake in multi-tenant SQL.

    Scoped vs. leaked

    See how a missing tenant_id filter returns both tenants' rows.

    Try it Yourself »
    SQL
    -- Why tenant_id matters: the same table holds EVERYONE'S data.
    -- Tenant 1 = Acme, Tenant 2 = Globex. They share the "tasks" table.
    
    -- ✅ Scoped — Acme sees only Acme:
    SELECT title FROM tasks WHERE tenant_id = 1;
    
    -- ❌ Forgot the filter — this returns BOTH tenants' rows.
    --    In a SaaS app that is a cross-tenant DATA LEAK.
    SELECT title FROM tasks;

    Result — ❌ no filter → BOTH tenants (a leak):

    title
    Design logo
    Write docs
    Fix login bug
    Ship invoice
    Renew SSL
    Plan roadmap

    Your Turn: scope a query to one tenant

    Fill in the blanks to return only Globex's open (todo) tasks. The expected result is in the comments so you can check yourself.

    🎯 Your Turn: filter by tenant_id

    Replace the ___ blanks to isolate tenant 2's todo tasks.

    Try it Yourself »
    SQL
    -- 🎯 YOUR TURN — fill in the two blanks, then press "Try it Yourself"
    -- Goal: get the 'todo' tasks for tenant 2 (Globex) ONLY.
    
    SELECT title, status
    FROM tasks
    WHERE ___ = 2          -- 👉 the discriminator column that isolates tenants
      AND status = '___';  -- 👉 the status we want
    
    -- ✅ Expected result: only Globex's open tasks
    --    e.g.  Ship invoice | todo ,  Renew SSL | todo

    3. Row-Level Security — the Automatic Safety Net

    Relying on every developer to remember WHERE tenant_id = … on every query forever is a bet you will eventually lose. Row-Level Security (RLS) moves the filter into the database. You set the current tenant once per connection, write one POLICY per table, and PostgreSQL transparently appends the tenant filter to every SELECT, UPDATE, and DELETE. A forgotten WHERE clause is no longer a leak.

    RLS policy on tasks

    Enable RLS, set the tenant, write a USING policy — queries auto-scope.

    Try it Yourself »
    SQL
    -- Model 1 + a safety net: ROW-LEVEL SECURITY (RLS).
    -- RLS makes the DATABASE itself add the tenant filter, so a
    -- forgotten WHERE clause can no longer leak data.
    
    -- 1) Turn RLS on for the table
    ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
    
    -- 2) Tell each connection which tenant it is acting as
    SET app.current_tenant = '1';   -- set once when the request starts
    
    -- 3) The policy: a row is only VISIBLE when its tenant_id matches
    CREATE POLICY tenant_isolation ON tasks
        USING (tenant_id = c
    ...

    Result — app.current_tenant = '1' → naked SELECT still returns Acme only:

    titlestatus
    Design logodone
    Write docstodo
    Fix login bugtodo

    Change the tenant context and re-run the same query — the database now shows a completely different set of rows. Tenant 2 never sees tenant 1's data, and the SQL never even mentions tenant_id.

    Result — app.current_tenant = '2' → same query, Globex only:

    titlestatus
    Ship invoicetodo
    Renew SSLtodo
    Plan roadmapdone

    4. Schema-per-Tenant & Database-per-Tenant

    Sometimes shared tables aren't enough — a big enterprise customer wants its data provably separated, or a regulation demands a dedicated backup per customer. Two stronger models trade cost for isolation:

    Schema-per-tenant: each tenant gets its own schema (a namespace of tables) inside one database. You route a connection with SET search_path TO tenant_acme;, then plain SELECT * FROM projects hits that tenant's tables. Isolation is strong and there's no tenant_id column to forget — but every schema change must be applied to every schema, so migrations fan out and the catalogue gets large past a few thousand tenants.

    Database-per-tenant: a whole separate database (or server) per tenant. The strongest isolation, the simplest "delete a customer" and "restore one customer" story, and a natural fit for compliance (HIPAA, data residency). The price: provisioning, connection-pool pressure, and migrations that must run across hundreds of databases. It rarely scales past the low hundreds of tenants without serious tooling.

    The "noisy neighbour" angle: in shared schema, one tenant running a huge report can slow everyone down (shared CPU, cache, locks). More physical separation (schema → database) reduces noisy-neighbour blast radius, which is another reason big customers pay for it.

    Your Turn: pick the right model

    Read the scenario in the comments and choose the model that fits. There's one blank — write the model name.

    🎯 Your Turn: choose a model

    Match the compliance scenario to shared-schema, schema-per-tenant, or database-per-tenant.

    Try it Yourself »
    SQL
    -- 🎯 YOUR TURN — fill in the blank with ONE model name.
    -- Scenario: a healthcare SaaS. Each hospital is legally required to
    -- keep its patient data in a physically separate database for
    -- compliance, and is happy to pay a premium for that guarantee.
    --
    -- Which multi-tenancy model fits best?
    
    -- 👉 Replace ___ with: shared-schema  |  schema-per-tenant  |  database-per-tenant
    -- MODEL: ___
    
    -- ✅ Expected answer: database-per-tenant
    --    (strongest isolation + per-tenant compliance/backup, co
    ...

    Common Errors (and the fix)

    • Forgetting tenant_id in a query → data leak: in shared schema, SELECT * FROM tasks returns every tenant. Always add WHERE tenant_id = … — and back it up with RLS so a slip can't leak data.
    • No RLS as a safety net: trusting application code alone to add the filter is fragile — one new query, one raw SQL console, one bug, and you have a breach. Enable RLS and connect the app with a non-owner role (owners and superusers bypass policies).
    • Cross-tenant joins: joining two tenant tables without carrying tenant_id through the ON clause can splice one tenant's rows onto another's. Join on a.tenant_id = b.tenant_id as well as the foreign key.
    • Per-tenant DB migration sprawl: in database/schema-per-tenant, a schema change must succeed across hundreds of targets. A migration that half-applies leaves tenants on different versions — use a migration runner that tracks and retries each tenant, and roll out in batches.
    • current_setting errors: current_setting('app.current_tenant') throws "unrecognized configuration parameter" if you never SET it. Use current_setting('app.current_tenant', true) (the true = "missing is OK") or set it at connection start.

    📘 Quick Reference — Model Comparison

    ModelIsolationCostScales toMigrations
    Shared schema + tenant_idLow (app-enforced)💰1000sRun once
    Shared schema + RLSMedium (DB-enforced)💰1000sRun once
    Schema-per-tenantHigh💰💰100s–1000sRun per schema
    Database-per-tenantComplete💰💰💰10s–100sRun per database

    Rule of thumb: start with shared schema + RLS; promote your biggest/most-regulated tenants to schema- or database-per-tenant only when they need it.

    Frequently Asked Questions

    Q: Isn't a tenant_id on every table wasteful?

    No — it's a single integer column, and it's also exactly what your indexes need. Put tenant_id first in composite indexes ((tenant_id, status)) so the database can jump straight to one tenant's rows.

    Q: Does Row-Level Security slow queries down?

    The policy is just an extra AND tenant_id = … the planner adds. With a tenant_id-leading index the cost is negligible. Keep policy expressions simple (avoid sub-queries) and they stay fast.

    Q: Can I mix models?

    Yes, and large SaaS companies do. Most tenants share one schema with RLS, while a handful of huge or regulated customers are promoted to their own database. The app routes each request to the right place based on the tenant.

    Q: How do I run a report across all tenants if RLS hides everything?

    Use a separate privileged role with a BYPASSRLS attribute or a policy granting it full access, and run analytics with that role only. Never reuse it for normal request handling — that defeats the isolation.

    Mini-Challenge: Lock Down the projects Table

    Put it all together — a brief, a blank canvas, and the expected result in the comments. Write it, then run it in a PostgreSQL playground to confirm the table auto-scopes.

    🎯 Mini-Challenge

    Enable RLS and add an isolation policy to the projects table.

    Try it Yourself »
    SQL
    -- 🎯 MINI-CHALLENGE — Lock down a second table
    -- The "projects" table also has a tenant_id column. Right now any
    -- tenant can read every project. Make the database enforce isolation.
    --
    -- 1. Enable Row-Level Security on the projects table
    -- 2. Create a policy named tenant_isolation_projects so a row is only
    --    visible when its tenant_id matches app.current_tenant
    -- 3. Then SET app.current_tenant = '1' and SELECT * FROM projects
    --
    -- ✅ Expected: after the policy, SELECT * FROM projects 
    ...

    🎉 Lesson Complete

    • ✅ Multi-tenancy = many customers, one system, strict data isolation
    • ✅ Shared schema uses a tenant_id discriminator and must scope every query
    • ✅ A forgotten tenant_id filter is a cross-tenant data leak
    • ✅ Row-Level Security makes the database enforce the tenant filter automatically
    • ✅ Schema- and database-per-tenant trade cost for stronger isolation and easier compliance
    • Next: data warehousing — star and snowflake schemas for analytics

    Sign up for free to track which lessons you've completed and get learning reminders.

    Previous

    Cookie & Privacy Settings

    We use cookies to improve your experience, analyze traffic, and show personalized ads. You can manage your preferences below.

    By clicking "Accept All", you consent to our use of cookies for analytics and personalized advertising. You can customize your preferences or reject non-essential cookies.

    Privacy PolicyTerms of Service