Lesson 47 • Advanced Track
Designing Multi-Tenant Databases
Almost every SaaS product serves many customers ("tenants") from one system. By the end of this lesson you'll be able to choose between the three standard isolation models — shared schema, schema-per-tenant, and database-per-tenant — and implement the most common one safely with a tenant_id column and PostgreSQL Row-Level Security so the database itself stops one customer from ever seeing another's data.
What You'll Learn
- ✓What multi-tenancy is and why it matters for SaaS
- ✓The shared-schema model with a tenant_id discriminator
- ✓Why a forgotten filter is a cross-tenant data leak
- ✓Row-Level Security as an automatic safety net
- ✓Schema-per-tenant and database-per-tenant trade-offs
- ✓How to pick the right model: isolation vs. cost vs. scale
CREATE TABLE with foreign keys. The in-browser editor lets you write and edit SQL; to run the RLS examples you'll need PostgreSQL specifically — copy queries into a free playground like db-fiddle.com (pick PostgreSQL) or a local Postgres. Every example below shows the expected result so you can check yourself. Our Sample Table: tasks (shared by two tenants)
Every example in this lesson uses one tasks table shared by two customers: tenant 1 = Acme and tenant 2 = Globex. Notice the tenant_id column — that single column is what keeps their data apart.
Result:
| task_id | tenant_id | title | status |
|---|---|---|---|
| 1 | 1 | Design logo | done |
| 2 | 1 | Write docs | todo |
| 3 | 1 | Fix login bug | todo |
| 4 | 2 | Ship invoice | todo |
| 5 | 2 | Renew SSL | todo |
| 6 | 2 | Plan roadmap | done |
Rows 1–3 belong to Acme; rows 4–6 belong to Globex. They sit side by side in one physical table.
1. What "Multi-Tenant" Means
A tenant is one customer of your SaaS — usually a whole company and all of its users. A multi-tenant system serves many tenants from shared infrastructure. The central challenge is isolation: tenant A must never see, change, or even count tenant B's rows, even though they might be running the very same product on the very same servers.
🏢 Real-world analogy
Think of an apartment building. Shared schema = everyone shares the building and the lobby, but each flat has its own lock (the tenant_id). Schema-per-tenant = each family gets a private floor. Database-per-tenant = each family gets their own separate building. More walls means more isolation — but also more cost to build and maintain.
🔗 Shared Schema
One set of tables, a tenant_id on each. Cheapest, easiest to scale to thousands of tenants. Risk: a forgotten filter leaks data.
📂 Schema-per-Tenant
Each tenant gets its own schema (namespace) of tables. Stronger isolation, moderate cost. Migrations must run N times.
🏠 Database-per-Tenant
A whole separate database per tenant. Strongest isolation, easiest per-tenant backup. Most expensive; hard past a few hundred tenants.
2. The Shared-Schema Model (tenant_id discriminator)
This is the default choice for most SaaS apps because it's the cheapest and scales to thousands of tenants on one set of tables. The trick is a discriminator column — conventionally tenant_id — added to every tenant-owned table. The golden rule: every query must filter on tenant_id, and every multi-column index should list tenant_id first.
Result — WHERE tenant_id = 1 → Acme only:
| title | status |
|---|---|
| Design logo | done |
| Write docs | todo |
| Fix login bug | todo |
Compare that to what happens when you forget the filter. Because every tenant's rows live in one table, an unscoped query quietly returns everyone — the single most dangerous mistake in multi-tenant SQL.
Scoped vs. leaked
See how a missing tenant_id filter returns both tenants' rows.
-- Why tenant_id matters: the same table holds EVERYONE'S data.
-- Tenant 1 = Acme, Tenant 2 = Globex. They share the "tasks" table.
-- ✅ Scoped — Acme sees only Acme:
SELECT title FROM tasks WHERE tenant_id = 1;
-- ❌ Forgot the filter — this returns BOTH tenants' rows.
-- In a SaaS app that is a cross-tenant DATA LEAK.
SELECT title FROM tasks;Result — ❌ no filter → BOTH tenants (a leak):
| title |
|---|
| Design logo |
| Write docs |
| Fix login bug |
| Ship invoice |
| Renew SSL |
| Plan roadmap |
Your Turn: scope a query to one tenant
Fill in the blanks to return only Globex's open (todo) tasks. The expected result is in the comments so you can check yourself.
🎯 Your Turn: filter by tenant_id
Replace the ___ blanks to isolate tenant 2's todo tasks.
-- 🎯 YOUR TURN — fill in the two blanks, then press "Try it Yourself"
-- Goal: get the 'todo' tasks for tenant 2 (Globex) ONLY.
SELECT title, status
FROM tasks
WHERE ___ = 2 -- 👉 the discriminator column that isolates tenants
AND status = '___'; -- 👉 the status we want
-- ✅ Expected result: only Globex's open tasks
-- e.g. Ship invoice | todo , Renew SSL | todo3. Row-Level Security — the Automatic Safety Net
Relying on every developer to remember WHERE tenant_id = … on every query forever is a bet you will eventually lose. Row-Level Security (RLS) moves the filter into the database. You set the current tenant once per connection, write one POLICY per table, and PostgreSQL transparently appends the tenant filter to every SELECT, UPDATE, and DELETE. A forgotten WHERE clause is no longer a leak.
USING clause controls which rows are visible for reads. Add a WITH CHECK clause to also stop a tenant from inserting rows with someone else's tenant_id. RLS is enforced for normal roles but bypassed by the table owner and superusers — so connect your app with a non-owner role.RLS policy on tasks
Enable RLS, set the tenant, write a USING policy — queries auto-scope.
-- Model 1 + a safety net: ROW-LEVEL SECURITY (RLS).
-- RLS makes the DATABASE itself add the tenant filter, so a
-- forgotten WHERE clause can no longer leak data.
-- 1) Turn RLS on for the table
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
-- 2) Tell each connection which tenant it is acting as
SET app.current_tenant = '1'; -- set once when the request starts
-- 3) The policy: a row is only VISIBLE when its tenant_id matches
CREATE POLICY tenant_isolation ON tasks
USING (tenant_id = c
...Result — app.current_tenant = '1' → naked SELECT still returns Acme only:
| title | status |
|---|---|
| Design logo | done |
| Write docs | todo |
| Fix login bug | todo |
Change the tenant context and re-run the same query — the database now shows a completely different set of rows. Tenant 2 never sees tenant 1's data, and the SQL never even mentions tenant_id.
Result — app.current_tenant = '2' → same query, Globex only:
| title | status |
|---|---|
| Ship invoice | todo |
| Renew SSL | todo |
| Plan roadmap | done |
4. Schema-per-Tenant & Database-per-Tenant
Sometimes shared tables aren't enough — a big enterprise customer wants its data provably separated, or a regulation demands a dedicated backup per customer. Two stronger models trade cost for isolation:
Schema-per-tenant: each tenant gets its own schema (a namespace of tables) inside one database. You route a connection with SET search_path TO tenant_acme;, then plain SELECT * FROM projects hits that tenant's tables. Isolation is strong and there's no tenant_id column to forget — but every schema change must be applied to every schema, so migrations fan out and the catalogue gets large past a few thousand tenants.
Database-per-tenant: a whole separate database (or server) per tenant. The strongest isolation, the simplest "delete a customer" and "restore one customer" story, and a natural fit for compliance (HIPAA, data residency). The price: provisioning, connection-pool pressure, and migrations that must run across hundreds of databases. It rarely scales past the low hundreds of tenants without serious tooling.
The "noisy neighbour" angle: in shared schema, one tenant running a huge report can slow everyone down (shared CPU, cache, locks). More physical separation (schema → database) reduces noisy-neighbour blast radius, which is another reason big customers pay for it.
Your Turn: pick the right model
Read the scenario in the comments and choose the model that fits. There's one blank — write the model name.
🎯 Your Turn: choose a model
Match the compliance scenario to shared-schema, schema-per-tenant, or database-per-tenant.
-- 🎯 YOUR TURN — fill in the blank with ONE model name.
-- Scenario: a healthcare SaaS. Each hospital is legally required to
-- keep its patient data in a physically separate database for
-- compliance, and is happy to pay a premium for that guarantee.
--
-- Which multi-tenancy model fits best?
-- 👉 Replace ___ with: shared-schema | schema-per-tenant | database-per-tenant
-- MODEL: ___
-- ✅ Expected answer: database-per-tenant
-- (strongest isolation + per-tenant compliance/backup, co
...Common Errors (and the fix)
- Forgetting
tenant_idin a query → data leak: in shared schema,SELECT * FROM tasksreturns every tenant. Always addWHERE tenant_id = …— and back it up with RLS so a slip can't leak data. - No RLS as a safety net: trusting application code alone to add the filter is fragile — one new query, one raw SQL console, one bug, and you have a breach. Enable RLS and connect the app with a non-owner role (owners and superusers bypass policies).
- Cross-tenant joins: joining two tenant tables without carrying
tenant_idthrough theONclause can splice one tenant's rows onto another's. Join ona.tenant_id = b.tenant_idas well as the foreign key. - Per-tenant DB migration sprawl: in database/schema-per-tenant, a schema change must succeed across hundreds of targets. A migration that half-applies leaves tenants on different versions — use a migration runner that tracks and retries each tenant, and roll out in batches.
current_settingerrors:current_setting('app.current_tenant')throws "unrecognized configuration parameter" if you neverSETit. Usecurrent_setting('app.current_tenant', true)(thetrue= "missing is OK") or set it at connection start.
📘 Quick Reference — Model Comparison
| Model | Isolation | Cost | Scales to | Migrations |
|---|---|---|---|---|
| Shared schema + tenant_id | Low (app-enforced) | 💰 | 1000s | Run once |
| Shared schema + RLS | Medium (DB-enforced) | 💰 | 1000s | Run once |
| Schema-per-tenant | High | 💰💰 | 100s–1000s | Run per schema |
| Database-per-tenant | Complete | 💰💰💰 | 10s–100s | Run per database |
Rule of thumb: start with shared schema + RLS; promote your biggest/most-regulated tenants to schema- or database-per-tenant only when they need it.
Frequently Asked Questions
Q: Isn't a tenant_id on every table wasteful?
No — it's a single integer column, and it's also exactly what your indexes need. Put tenant_id first in composite indexes ((tenant_id, status)) so the database can jump straight to one tenant's rows.
Q: Does Row-Level Security slow queries down?
The policy is just an extra AND tenant_id = … the planner adds. With a tenant_id-leading index the cost is negligible. Keep policy expressions simple (avoid sub-queries) and they stay fast.
Q: Can I mix models?
Yes, and large SaaS companies do. Most tenants share one schema with RLS, while a handful of huge or regulated customers are promoted to their own database. The app routes each request to the right place based on the tenant.
Q: How do I run a report across all tenants if RLS hides everything?
Use a separate privileged role with a BYPASSRLS attribute or a policy granting it full access, and run analytics with that role only. Never reuse it for normal request handling — that defeats the isolation.
Mini-Challenge: Lock Down the projects Table
Put it all together — a brief, a blank canvas, and the expected result in the comments. Write it, then run it in a PostgreSQL playground to confirm the table auto-scopes.
🎯 Mini-Challenge
Enable RLS and add an isolation policy to the projects table.
-- 🎯 MINI-CHALLENGE — Lock down a second table
-- The "projects" table also has a tenant_id column. Right now any
-- tenant can read every project. Make the database enforce isolation.
--
-- 1. Enable Row-Level Security on the projects table
-- 2. Create a policy named tenant_isolation_projects so a row is only
-- visible when its tenant_id matches app.current_tenant
-- 3. Then SET app.current_tenant = '1' and SELECT * FROM projects
--
-- ✅ Expected: after the policy, SELECT * FROM projects
...🎉 Lesson Complete
- ✅ Multi-tenancy = many customers, one system, strict data isolation
- ✅ Shared schema uses a
tenant_iddiscriminator and must scope every query - ✅ A forgotten
tenant_idfilter is a cross-tenant data leak - ✅ Row-Level Security makes the database enforce the tenant filter automatically
- ✅ Schema- and database-per-tenant trade cost for stronger isolation and easier compliance
- ✅ Next: data warehousing — star and snowflake schemas for analytics
Sign up for free to track which lessons you've completed and get learning reminders.