JSON & Semi-Structured Data in SQL
Store, query, modify, and index JSON data in PostgreSQL and MySQL for flexible schema designs.
🎯 What You'll Learn
- JSON vs JSONB and when to use each
- Querying nested JSON with operators and JSON path
- Modifying JSONB: set, remove, merge, and append
- GIN indexing strategies for fast JSON queries
- Hybrid schema design: columns + JSONB for flexibility
📦 JSONB Basics
Think of JSONB as a flexible drawer where each row can store differently-shaped data. An event log might have purchase events with items and shipping, signup events with user info, and error events with stack traces — all in the same table.
💡 Pro Tip — Always Use JSONB, Not JSON
JSONB is stored in a decomposed binary format that's faster to query, supports indexing, and removes duplicate keys. The only advantage of JSON is preserving exact whitespace and key order — which you almost never need.
JSONB Fundamentals
Store and access JSON data with arrow operators
-- PostgreSQL: JSON vs JSONB
-- JSON: stored as text, preserves formatting, slower queries
-- JSONB: binary format, faster queries, supports indexing
-- Rule: Almost always use JSONB
CREATE TABLE events (
id SERIAL PRIMARY KEY,
event_type VARCHAR(50),
payload JSONB NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO events (event_type, payload) VALUES
('purchase', '{
"customer_id": 42,
"items": [
{"product": "Laptop", "price": 999.99, "qty": 1},
...🔎 Advanced Querying
JSONB supports powerful operators for containment checks, key existence, and JSON path expressions — all of which can use GIN indexes for speed.
JSONB Query Operators
Containment, existence, and JSON path queries
-- Advanced JSONB querying
-- Containment operator @> (does JSON contain this subset?)
SELECT * FROM events
WHERE payload @> '{"shipping": {"method": "express"}}';
-- Existence operator ? (does key exist?)
SELECT * FROM events
WHERE payload ? 'coupon'; -- has a coupon field
-- ?| any of these keys exist
SELECT * FROM events
WHERE payload ?| array['coupon', 'discount'];
-- ?& all of these keys exist
SELECT * FROM events
WHERE payload ?& array['customer_id', 'items'];
-- JSON path queries (P
...✏️ Modifying JSONB
Unlike regular columns, JSONB lets you update individual fields without rewriting the entire value. You can set, remove, merge, and append to nested structures.
⚠️ Common Mistake
jsonb_set creates the key if the path exists but the key doesn't. However, it won't create intermediate objects. jsonb_set('{}', '{a,b,c}', '"val"') fails if a or a.b don't exist.
Modifying JSONB
Set, remove, merge, and build JSON
-- Modifying JSONB data
-- Set/update a field:
UPDATE events
SET payload = jsonb_set(payload, '{status}', '"processed"')
WHERE id = 1;
-- Set a nested field:
UPDATE events
SET payload = jsonb_set(payload, '{shipping,tracked}', 'true')
WHERE id = 1;
-- Remove a key:
UPDATE events
SET payload = payload - 'coupon'
WHERE id = 1;
-- Remove a nested key:
UPDATE events
SET payload = payload #- '{shipping,cost}'
WHERE id = 1;
-- Merge/concatenate JSONB (|| operator):
UPDATE events
SET payload = pay
...⚡ Indexing JSONB
Without indexes, JSONB queries do full table scans. GIN indexes make containment and existence operators fast. Expression indexes target specific fields.
JSONB Indexing
GIN, expression, and partial indexes for JSON
-- Indexing JSONB for performance
-- 1. GIN index on entire JSONB column
-- Supports: @>, ?, ?|, ?&, @@ operators
CREATE INDEX idx_events_payload
ON events USING gin (payload);
-- Now this is fast:
SELECT * FROM events
WHERE payload @> '{"shipping": {"method": "express"}}';
-- 2. GIN with jsonb_path_ops (smaller, faster for @>)
CREATE INDEX idx_events_payload_path
ON events USING gin (payload jsonb_path_ops);
-- 3. B-Tree index on a specific JSON field
-- Best for: equality and range queries
...🏗️ Schema Design Patterns
The best designs are hybrid: structured columns for data you query and join on, JSONB for flexible metadata. This gives you the performance of relational + the flexibility of document databases.
Hybrid Schema Design
When to use columns vs JSONB
-- Schema design: when to use JSONB vs columns
-- ✅ USE JSONB when:
-- • Schema varies per row (event payloads, form submissions)
-- • You're storing third-party API responses
-- • Nested/array data that doesn't fit relational model
-- • Rapid prototyping (schema changes without migrations)
-- ❌ USE COLUMNS when:
-- • Data is queried/filtered frequently
-- • Data has a consistent structure
-- • You need foreign keys or constraints
-- • Joins reference these fields
-- HYBRID approach (best of
...📋 Quick Reference
| Operator | Purpose |
|---|---|
| -> | Get JSON element (returns JSON) |
| ->> | Get JSON element as text |
| @> | Contains (left ⊇ right) |
| ? | Key exists |
| || | Concatenate/merge JSONB |
| jsonb_set() | Set a value at a path |
🎉 Lesson Complete!
You now know how to store, query, and index JSON data in SQL. Next, explore advanced data types like arrays, HSTORE, and geography!
Sign up for free to track which lessons you've completed and get learning reminders.