Advanced Track
Advanced Data Types: Arrays, HSTORE, UUID, ENUM & Ranges
PostgreSQL goes far beyond text and numbers. By the end of this lesson you'll store lists, key-value pairs, globally-unique ids, fixed label sets, and intervals natively — and query them with the special operators (ANY, @>, &&, ->) that make these types worth using.
What You'll Learn
- ✓Store lists in one column with ARRAY types
- ✓Test membership with ANY, @> and overlap with &&
- ✓Expand arrays into rows with unnest()
- ✓Keep flat key-value data in an HSTORE column
- ✓Generate globally-unique ids with UUID + gen_random_uuid()
- ✓Constrain values with ENUM and model intervals with range types (&&)
CREATE TABLE, and INSERT. These types are PostgreSQL features — most don't exist in SQLite or MySQL (engine notes are called out as you go). The in-browser editor lets you write SQL; to actually run these examples, copy them into a Postgres playground such as db-fiddle.com (pick PostgreSQL) or onecompiler.com/postgresql. Every example shows its expected result so you can check yourself. Our Sample Table: articles
The array, unnest, and challenge examples all run against this small articles table. The tags column is a TEXT[] — an array of text inside a single cell.
Result:
| id | title | tags |
|---|---|---|
| 1 | Intro to SQL | {sql,beginner} |
| 2 | Postgres Arrays | {sql,postgres,advanced} |
| 3 | Cooking Pasta | {food,italian,dinner} |
| 4 | Untagged Draft | {} |
1. ARRAY Columns — Many Values in One Cell
An array column stores a list of values inside a single cell. Add [] to any type — int[], TEXT[], BOOLEAN[] — and that column now holds a list instead of one value. It's perfect for tags, role names, or any short list that "belongs" to a row.
🏷️ Real-world analogy
A blog post's tags are like the stickers on a parcel — several labels stuck to one box. An array column lets you keep all those labels on the row itself, instead of building a whole separate "tags" table.
Create a table with an array column
A TEXT[] column with two ways to write array literals.
-- An array column stores MANY values in one cell.
-- Add [] to any type to make it an array of that type.
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT,
tags TEXT[] NOT NULL DEFAULT '{}' -- a text array; default = empty {}
);
INSERT INTO articles (title, tags) VALUES
('Intro to SQL', ARRAY['sql','beginner']), -- ARRAY[...] literal
('Postgres Arrays', '{sql,postgres,advanced}'), -- '{...}' literal
('Cooking Pasta', ARRAY['food','italia
...Result:
| title | tags |
|---|---|
| Intro to SQL | {sql,beginner} |
| Postgres Arrays | {sql,postgres,advanced} |
| … | … |
The real power is in querying arrays. The two questions you'll ask most are "is this value in the array?" and "do these two arrays share anything?" — answered by ANY/@> and && respectively.
💡 Arrays are 1-indexed
Unlike most programming languages, PostgreSQL arrays start at index 1, not 0. So tags[1] is the first element, and tags[0] just returns NULL.
Query arrays: ANY, @> and &&
Membership, containment, overlap, and array functions.
-- Read an element by POSITION (Postgres arrays start at 1, not 0!):
SELECT title, tags[1] AS first_tag FROM articles;
-- "Is this value anywhere in the array?" — two equivalent ways:
SELECT title FROM articles WHERE 'sql' = ANY(tags); -- ANY: value vs each element
SELECT title FROM articles WHERE tags @> ARRAY['sql']; -- @> : array CONTAINS array
-- "Do the arrays share ANY value?" — the && overlap operator:
SELECT title FROM articles WHERE tags && ARRAY['food','sql'];
-- Handy array fu
...Result — WHERE 'sql' = ANY(tags):
| title |
|---|
| Intro to SQL |
| Postgres Arrays |
unnest() turns an array into rows — one row per element. This is the bridge back to "normal" SQL: once values are rows, you can GROUP BY, JOIN, or count them like any other column.
unnest() — array to rows
Explode tags into rows, then count tag usage.
-- unnest() explodes an array into ONE ROW PER ELEMENT.
-- Useful for grouping, counting, or joining on individual values.
SELECT title, unnest(tags) AS tag
FROM articles;
-- Intro to SQL | sql
-- Intro to SQL | beginner
-- Postgres Arrays | sql
-- Postgres Arrays | postgres ...
-- Combine with GROUP BY to count how often each tag is used:
SELECT tag, COUNT(*) AS uses
FROM articles, unnest(tags) AS tag
GROUP BY tag
ORDER BY uses DESC, tag;Result — tag usage counts:
| tag | uses |
|---|---|
| sql | 2 |
| advanced | 1 |
| beginner | 1 |
| … | … |
GIN index: CREATE INDEX ON articles USING gin (tags);. It makes @>, &&, and ANY lookups quick instead of scanning every row.Your Turn: rows where the array contains a value
Fill in the blank so the query returns every article tagged 'sql'. The expected result is in the comments so you can check yourself.
🎯 Your Turn: tagged 'sql'
Replace the ___ with the membership operator.
-- 🎯 YOUR TURN — fill in the blank, then press "Try it Yourself"
-- Goal: list every article that is tagged 'sql'.
SELECT title
FROM articles
WHERE 'sql' = ___(tags); -- 👉 the operator that tests value-vs-each-element
-- ✅ Expected result (2 rows): Intro to SQL, Postgres Arrays
-- (Bonus: the same result comes from WHERE tags @> ARRAY['sql'])2. HSTORE — Flat Key-Value Pairs
HSTORE stores a set of key => value string pairs in one column — think of it as a tiny dictionary per row. It's great for sparse, optional settings (a theme here, a language there) where adding a real column for each would be wasteful.
HSTORE is an extension, so you enable it once per database with CREATE EXTENSION IF NOT EXISTS hstore;. Read a value with ->, test for a key with ?, and merge/update keys with ||.
JSONB — it's the modern default for semi-structured data. Reach for HSTORE only for simple, flat key-value lookups.HSTORE key-value column
Store, read (->), test (?), and update (||) keys.
-- HSTORE stores flat KEY => VALUE string pairs in one column.
-- It is a Postgres extension, so enable it once per database:
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE TABLE profiles (
user_id INT PRIMARY KEY,
preferences HSTORE -- e.g. theme, language, font_size
);
INSERT INTO profiles (user_id, preferences) VALUES
(1, 'theme => dark, language => en'),
(2, 'theme => light, language => fr, beta => on');
-- -> reads the value for a key (NULL if the key is missing
...Result — preferences -> 'theme' / 'language':
| user_id | theme | lang |
|---|---|---|
| 1 | dark | en |
| 2 | light | fr |
3. UUID — Globally Unique Identifiers
A UUID is a 128-bit identifier such as a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11. The point is that it's unique everywhere, with no central counter — two different servers can each mint ids that will never collide. Set a column's DEFAULT to gen_random_uuid() and Postgres fills in a fresh value on every insert.
Sequential SERIAL ids leak information (id 1042 tells the world you have ~1042 rows) and require the database to hand out the next number. UUIDs are unguessable and can be generated in your app before the row even exists.
UUID primary key with a default
gen_random_uuid() as a DEFAULT; trade-offs vs SERIAL.
-- A UUID is a 128-bit id like a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11.
-- gen_random_uuid() (built in on modern Postgres) makes a fresh one.
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(), -- auto-filled id
customer TEXT,
total NUMERIC(10,2),
created_at TIMESTAMP DEFAULT now()
);
-- You don't supply id — the DEFAULT generates it for you:
INSERT INTO orders (customer, total) VALUES ('Ada', 49.99)
RETURNING id; -- returns the new UUID, e.g.
...⚠️ Random UUIDs can hurt index performance
Random UUIDv4 values scatter across the B-Tree index, so inserts touch random pages and slow down on huge tables. Time-ordered UUIDv7 (Postgres 18 has uuidv7() built in) inserts in order like SERIAL while staying globally unique — the best default for new primary keys.
4. ENUM — A Fixed Set of Labels
An ENUM is a custom type whose value must be one of a fixed list of labels — like 'pending', 'shipped', 'delivered'. It both documents the allowed values and rejects anything else at write time, so a typo like 'shiped' errors instead of quietly saving bad data.
A nice bonus: ENUM values sort in the order you defined them, not alphabetically — so ORDER BY status naturally goes pending → shipped → delivered.
ENUM custom type
Define labels, enforce them, and sort in definition order.
-- An ENUM is a custom type limited to a fixed list of labels.
-- It documents the allowed values AND enforces them.
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');
CREATE TABLE shipments (
id SERIAL PRIMARY KEY,
status order_status NOT NULL DEFAULT 'pending'
);
INSERT INTO shipments (status) VALUES ('pending'), ('shipped'), ('delivered');
-- ENUMs compare and SORT in their DEFINITION order, not alphabetically:
SELECT status FROM shipments ORDER
...Result — ORDER BY status (definition order):
| status |
|---|
| pending |
| shipped |
| delivered |
5. Range Types — Intervals as One Value
A range type stores an interval — a low and a high bound — in a single value. Built-ins include int4range, numrange, daterange, and tsrange. The bracket style sets inclusivity: [ includes the bound, ) excludes it, so '[2024-06-01,2024-06-05)' covers June 1–4.
The killer operator is && (overlap): "do these two ranges intersect?" That single check answers "is this room already booked for those dates?" — the heart of every scheduling and booking system.
EXCLUDE USING gist (room WITH =, stay WITH &&) rejects any new row whose room matches and whose stay overlaps an existing one.Range types: @> and && overlap
daterange bookings, contains a date, and overlap detection.
-- A range type stores an interval (a low and high bound) in one value.
-- Built-ins include int4range, numrange, daterange, tsrange.
-- Brackets set inclusivity: [ = inclusive, ) = exclusive.
SELECT
int4range(1, 10) AS ints, -- 1..9 (10 excluded)
'[2024-06-01,2024-06-05)'::daterange AS stay; -- Jun 1..4
-- A booking table where each stay is a daterange:
CREATE TABLE bookings (
id SERIAL PRIMARY KEY,
room INT,
stay daterange
);
INSERT INTO bookings (room
...Result — WHERE stay && '[2024-06-04,2024-06-06)':
| room | stay |
|---|---|
| 101 | [2024-06-01,2024-06-05) |
| 101 | [2024-06-05,2024-06-10) |
Your Turn: UUID default + range overlap
Two blanks: give the table a self-filling UUID id, then find the slots that overlap a given range. The expected result is in the comments.
🎯 Your Turn: UUID default & &&
Fill the UUID function and the overlap operator.
-- 🎯 YOUR TURN — fill in the two blanks, then press "Try it Yourself"
-- Part A: give the table a UUID id that fills itself in automatically.
CREATE TABLE tickets (
id UUID PRIMARY KEY DEFAULT ___(), -- 👉 the random-UUID function
slot daterange
);
INSERT INTO tickets (slot) VALUES
('[2024-07-01,2024-07-03)'),
('[2024-07-05,2024-07-08)');
-- Part B: find slots that clash with Jul 2 – Jul 6.
SELECT slot FROM tickets
WHERE slot ___ '[2024-07-02,2024-07-06)'::daterange; -- 👉
...A quick word on Geography (PostGIS)
For maps and location data there's the PostGIS extension, which adds GEOGRAPHY and GEOMETRY types for points, lines, and polygons on the Earth's surface. After CREATE EXTENSION postgis; you can store a coordinate as GEOGRAPHY(POINT, 4326) and ask spatial questions:
ST_DWithin(a, b, 2000)— is pointawithin 2,000 metres ofb? (great for "near me")ST_Distance(a, b)— how far apart, in metres.- A
GiSTindex plus the<->distance operator powers fast "nearest 5" queries.
PostGIS is a deep topic of its own — just know it exists so you never store lat/lng as two plain numbers and reinvent distance maths by hand.
Common Errors (and the fix)
- Array off-by-one: reaching for
tags[0]returnsNULL, not the first element. PostgreSQL arrays are 1-indexed — usetags[1]. - NULL hiding in an array:
'x' = ANY(ARRAY['a', NULL])is notfalse— a NULL element makes the resultNULL(treated as not-true), so rows can silently vanish. Keep arrays NULL-free, or filter witharray_remove(tags, NULL). - "function gen_random_uuid() does not exist": on older Postgres, enable it with
CREATE EXTENSION IF NOT EXISTS pgcrypto;(it's built in from Postgres 13+). - UUID vs SERIAL trade-off: don't default to UUID for everything — they're 4× the storage and random v4 hurts index locality. Use
SERIAL/IDENTITYfor internal tables; reach for UUID (ideally v7) when ids cross systems or appear in URLs. - ENUM migration pain: you can
ALTER TYPE ... ADD VALUE, but you cannot easily remove or reorder a label — that needs recreating the type and rewriting the column. Plan the label list up front, or use a lookup table + foreign key when it'll change often. - "operator does not exist: text[] @> text":
@>compares an array to an array. Writetags @> ARRAY['sql'], nottags @> 'sql'.
📘 Quick Reference
| Syntax | Purpose |
|---|---|
| TEXT[] / int[] | An array column (a list in one cell) |
| val = ANY(arr) | Is val anywhere in the array? |
| arr @> ARRAY[...] | Array contains all of these values |
| arr && ARRAY[...] | Arrays overlap (share any value) |
| unnest(arr) | Expand an array into one row per element |
| hs -> 'key' | Read an HSTORE value (NULL if missing) |
| hs ? 'key' | Does the HSTORE key exist? |
| gen_random_uuid() | Generate a random UUID (use as DEFAULT) |
| CREATE TYPE … AS ENUM | Custom type limited to fixed labels |
| daterange / int4range | An interval value (low..high) |
| range && range | Do two ranges overlap? (booking check) |
| lower(r) / upper(r) | Pull the bounds out of a range |
Frequently Asked Questions
Q: Should I use an array column or a separate table?
Use an array for a short, simple list that's read together and rarely queried on its own (e.g. tags). Use a separate "child" table when the items need their own columns, constraints, or heavy querying. Arrays trade relational flexibility for convenience.
Q: HSTORE or JSONB for settings?
Prefer JSONB for almost everything new — it supports nesting, arrays, numbers, and richer operators. Choose HSTORE only when your data is genuinely flat string-to-string and you want the absolute simplest key-value store.
Q: Are UUIDs always better than auto-increment ids?
No. UUIDs win when ids must be unique across systems, generated client-side, or exposed in URLs. For a single internal database, SERIAL/IDENTITY is smaller and faster. If you do want UUIDs, prefer time-ordered UUIDv7 to keep index performance.
Q: Why does tags[0] give me NULL?
Because PostgreSQL arrays start at index 1, not 0. The first element is tags[1]; index 0 is simply out of range and returns NULL.
Q: Do these types work in MySQL or SQLite?
Mostly no — arrays, HSTORE, ENUM types, ranges and PostGIS are Postgres features. MySQL has its own ENUM and JSON columns; SQLite has neither. If portability matters, model lists/settings with JSON or join tables instead.
Mini-Challenge: Tag Detective
Put it together — a brief, a blank canvas, and the expected result in the comments. Write it, then run it in a Postgres playground to confirm.
🎯 Mini-Challenge
Find multi-tag articles with @>, then count tag usage with unnest.
-- 🎯 MINI-CHALLENGE
-- Using ONLY this lesson's ideas (arrays, ANY/@>, unnest, ranges/&&):
-- 1. From the 'articles' table, find every article tagged 'sql'
-- AND tagged 'beginner' at the same time.
-- (Hint: @> can take an array with MORE than one value — tags @> ARRAY[...].)
-- 2. Then, separately, list each distinct tag with how many articles use it,
-- most-used first.
--
-- ✅ Expected #1 (1 row): Intro to SQL
-- ✅ Expected #2: sql | 2 , then the rest with 1 each
-- your
...🎉 Lesson Complete
- ✅
type[]array columns store lists; query withANY,@>,&&and explode withunnest() - ✅
HSTOREholds flat key-value pairs (->,?,||) — but JSONB is the modern default - ✅
UUID+gen_random_uuid()gives globally-unique ids; prefer UUIDv7 for primary keys - ✅
ENUMconstrains a column to fixed labels that sort in definition order - ✅ Range types model intervals;
&&detects overlaps (and exclusion constraints stop double-bookings) - ✅ Next: Database Security — roles, privileges, and row-level security
Sign up for free to track which lessons you've completed and get learning reminders.