Lesson 27 • Advanced
Data Classes & Advanced Class Patterns
Master the powerful @dataclass decorator and advanced class patterns that professional engineers use when building large Python systems. Learn to create efficient, immutable, and production-ready data models used in APIs, ML pipelines, and enterprise architectures.
What You'll Learn
This lesson covers:
- ✔ Mastering @dataclass
- ✔ Slots, immutability & performance
- ✔ Default values & factories
- ✔ Post-init processing
- ✔ Comparison & ordering
- ✔ Frozen models for safety
- ✔ Class patterns used in real architectures
- ✔ Mixing dataclasses with OOP, typing, inheritance
- ✔ How frameworks like FastAPI, Pydantic & ORMs use these ideas
🔥 1. Why Dataclasses Exist
Before Python 3.7, writing classes was repetitive:
Traditional Class (Before Dataclasses)
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"User(name={self.name}, age={self.age})"

    def __eq__(self, other):
        return self.name == other.name and self.age == other.age

Dataclasses remove the boilerplate:
Modern Dataclass
from dataclasses import dataclass

@dataclass
class User:
    name: str
    age: int

Automatically provides:
- ✔ __init__
- ✔ __repr__
- ✔ __eq__
- ✔ type-hint support
- ✔ defaults
- ✔ ordering (opt-in with order=True)
- ✔ immutability support (opt-in with frozen=True)
This is why dataclasses became standard in production systems.
⚙️ 2. Creating Dataclasses
The simplest structure:
Basic Dataclass
from dataclasses import dataclass

@dataclass
class Product:
    id: int
    price: float
    name: str

You get:
- full constructor
- debug-friendly repr
- equality comparison
🧠 3. Default Values
Just assign values:
Default Values
@dataclass
class User:
    name: str
    active: bool = True

But never use mutable defaults!
❌ BAD:
Bad Mutable Default
@dataclass
class Bad:
    tags: list[str] = []  # ValueError at class definition: dataclasses reject list, dict, and set defaults

✔ FIX:
Correct Default Factory
from dataclasses import dataclass, field

@dataclass
class Good:
    tags: list[str] = field(default_factory=list)

default_factory is critical for safe dataclass design: it is called once per instance, so every object gets its own list.
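As a minimal sketch of the point above (the class names Tagged and Broken are made up for illustration), the following shows that each instance gets its own list and that a bare list default is rejected when the class is defined:

```python
from dataclasses import dataclass, field


@dataclass
class Tagged:
    # default_factory is called once per instance
    tags: list[str] = field(default_factory=list)


a = Tagged()
b = Tagged()
a.tags.append("x")
print(a.tags, b.tags)  # ['x'] []

# A bare mutable default never reaches runtime: the decorator raises ValueError.
try:
    @dataclass
    class Broken:
        tags: list = []
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```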
🧩 4. Post-Init Processing
Sometimes you need validation or computed attributes.
Use __post_init__:
Post-Init Processing
@dataclass
class User:
    name: str
    age: int

    def __post_init__(self):
        if self.age < 0:
            raise ValueError("Age cannot be negative")

Used in:
- schema validation
- API models
- ML configs
- database models
⚡ 5. Making Dataclasses Immutable (Frozen Models)
Frozen Dataclass
@dataclass(frozen=True)
class Config:
    host: str
    port: int

Now:
- No attribute changes allowed (assignment raises FrozenInstanceError)
- Hashable, as long as every field value is hashable
- Can be dict keys
- Safe for caching
Frozen dataclasses behave like lightweight value objects (DDD concept).
Used for:
- ✔ settings
- ✔ user identity models
- ✔ cache keys
- ✔ configuration objects
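A short sketch of how a frozen model behaves in practice. The variable names are illustrative; dataclasses.replace() is the standard-library way to derive a modified copy instead of mutating:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Config:
    host: str
    port: int


base = Config("localhost", 8000)

try:
    base.port = 9000  # any assignment on a frozen instance fails
except FrozenInstanceError:
    print("frozen: assignment rejected")

# Build a changed copy rather than editing in place.
staging = replace(base, port=9000)

# Frozen instances with hashable fields work as dictionary keys.
lookup = {base: "local", staging: "staging"}
print(lookup[Config("localhost", 8000)])  # local
```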
🔄 6. Ordering & Comparison
Add ordering:
Ordered Dataclass
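@dataclass(order=True)
class Score:
    points: int
    player: str

Automatically gives: <, <=, >, >=
Instances compare like tuples of their fields in definition order, so points is compared first and player breaks ties.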
Used in:
- ✔ leaderboards
- ✔ sorting jobs
- ✔ priority queues
- ✔ scheduling systems
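For instance, a leaderboard could be sketched like this (the sample players are invented). Because comparison follows field order, equal points fall back to comparing player names:

```python
from dataclasses import dataclass


@dataclass(order=True)
class Score:
    points: int
    player: str


scores = [Score(70, "zoe"), Score(95, "amir"), Score(70, "bea")]

# Highest first; ties on points are broken by player name (also descending).
leaderboard = sorted(scores, reverse=True)
print([s.player for s in leaderboard])  # ['amir', 'zoe', 'bea']
```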
📦 7. Dataclasses + Type Hints (Power Combo)
Dataclasses work perfectly with typing tools like MyPy, Pyright, and IDE autocomplete.
Typed Dataclass
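@dataclass
class Item:
    id: int
    price: float
    tags: list[str]

Your entire codebase becomes clearer, safer, and faster to maintain. Note that dataclasses do not enforce these annotations at runtime; the checking comes from the static tools.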
🧬 8. Slots Dataclasses (Big Performance Boost)
Python normally stores instance values in a dictionary (__dict__).
Slots remove the dict and store variables in fixed memory locations.
Slots Dataclass
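@dataclass(slots=True)  # requires Python 3.10+
class User:
    id: int
    name: str

Benefits:
- ✔ noticeably less memory per instance (the exact saving depends on field count and Python version)
- ✔ faster attribute access
- ✔ ideal for thousands/millions of objects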
Used in:
- game engines
- ML vector operations
- high-performance APIs
- real-time systems
🔥 9. Inheritance With Dataclasses
Simple:
Dataclass Inheritance
@dataclass
class Base:
    id: int

@dataclass
class User(Base):
    name: str

Rules to know:
- Parent fields go first
- Child fields come after
- Use keyword-only fields if needed
🧱 10. Frozen + Slots (Enterprise Pattern)
Frozen + Slots
@dataclass(frozen=True, slots=True)
class Vector2:
    x: float
    y: float

This pattern gives:
- ✔ immutability
- ✔ low memory
- ✔ high performance
- ✔ thread safety
- ✔ predictable behavior
This is used heavily in:
- physics engines
- rendering pipelines
- finance systems
- crypto hashing models
📊 11. Dataclasses vs NamedTuple vs Pydantic
Dataclasses
- good defaults
- easy
- flexible
- great for general Python
NamedTuple
- immutable
- small memory footprint
- tuple-like
Pydantic
- full validation
- serialization
- great for APIs
A backend system often uses all three, depending on needs.
🎮 12. Real Project Example — Inventory Item
Inventory Item
from dataclasses import dataclass, field

@dataclass(slots=True)
class InventoryItem:
    id: int
    name: str
    quantity: int = 0
    tags: list[str] = field(default_factory=list)

    def add_stock(self, amount: int):
        self.quantity += amount

This structure is suitable for:
- games
- ecommerce
- warehouse systems
- TikTok Shop automation
🧪 13. Real Project Example — API Request Model
API Request Model
@dataclass(frozen=True)
class CreateUserRequest:
    email: str
    password: str

This mirrors the shape of a FastAPI/Pydantic request model using pure dataclasses, though without Pydantic's runtime validation.
🎯 14. Real Project Example — ML Config
ML Config
@dataclass
class TrainConfig:
    batch_size: int
    lr: float
    epochs: int = 10
    optimizer: str = "adam"

Dataclass-style configuration objects are common in ML tooling such as:
- PyTorch Lightning
- HuggingFace Transformers
- TensorFlow configs
🔥 15. Field Customization (metadata, repr, compare, init control)
Every field in a dataclass can be finely controlled:
Field Customization
from dataclasses import dataclass, field

@dataclass
class User:
    id: int
    name: str = field(repr=True)
    password: str = field(repr=False)  # hide sensitive info
    cache: dict = field(default_factory=dict, compare=False)

What these do:
- repr=False → hides field in debug prints
- compare=False → excluded from equality
- init=False → not settable in constructor
- default_factory=... → safe mutable default
- metadata={...} → pass additional data for frameworks
Metadata example (used in FastAPI/Pydantic-style schemas):
Field Metadata
@dataclass
class Product:
    id: int = field(metadata={"description": "Unique ID"})
    price: float = field(metadata={"currency": "GBP"})

Dataclasses ignore metadata themselves; libraries can read it through fields() to generate documentation or schemas.
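A minimal sketch of how a tool might read that metadata back, using dataclasses.fields(). The docs dictionary is only an illustration, not the output format of any particular library:

```python
from dataclasses import dataclass, field, fields


@dataclass
class Product:
    id: int = field(metadata={"description": "Unique ID"})
    price: float = field(metadata={"currency": "GBP"})


# Each Field object exposes its metadata as a read-only mapping.
docs = {f.name: dict(f.metadata) for f in fields(Product)}
print(docs)
# {'id': {'description': 'Unique ID'}, 'price': {'currency': 'GBP'}}
```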
⚙️ 16. Keyword-Only & Positional-Only Fields
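Since Python 3.10, dataclasses can force fields to be keyword-only with the KW_ONLY sentinel (or with kw_only=True on the decorator or on field()). Dataclasses have no equivalent switch for positional-only fields.
Keyword-Only Fields
from dataclasses import dataclass, KW_ONLY

@dataclass
class Config:
    host: str
    port: int
    _: KW_ONLY  # every field after this line is keyword-only
    secure: bool = False

# Usage:
Config("localhost", 8000, secure=True)
# Not allowed:
# Config("localhost", 8000, True)  # ❌ TypeError: secure must be passed by keyword

Why useful?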
- ✔ prevents mistakes
- ✔ improves API clarity
- ✔ used heavily in frameworks
🧠 17. Dataclass Factories (Dynamic Dataclass Creation)
You can generate dataclasses at runtime:
Dynamic Dataclass Creation
from dataclasses import field, make_dataclass

User = make_dataclass(
    "User",
    [("name", str), ("age", int, field(default=0))]
)
u = User("Bob", 20)

Useful for:
- dynamic APIs
- schema generation
- plugin systems
- reading DB table structure and generating models
This is an enterprise-level technique.
🔄 18. Inheritance Pitfalls & Solutions
Dataclasses + inheritance can get tricky.
PROBLEM 1: Parent fields come before child fields
Inheritance Works
@dataclass
class A:
    x: int = 1

@dataclass
class B(A):
    y: int = 2

# Works fine: the generated __init__ takes x first, then y.

PROBLEM 2: Parent has default values but child doesn't
Inheritance Pitfall Fix
@dataclass
class A:
    x: int = 1

@dataclass
class B(A):
    y: int  # ❌ TypeError: non-default argument 'y' follows default argument

# FIX: Use keyword-only fields or provide defaults
@dataclass
class B(A):
    y: int = field(default=0)

🧱 19. Mixing Dataclasses With OOP
Dataclasses are not a replacement for OOP — they enhance it.
Example with methods + state:
Dataclass with Methods
@dataclass
class BankAccount:
    owner: str
    balance: float = 0.0

    def deposit(self, amount: float):
        self.balance += amount

    def withdraw(self, amount: float):
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        self.balance -= amount

Used in:
- business logic
- simulations
- game mechanics
- backend models
📦 20. Dataclasses + Abstract Base Classes (ABC)
Combine clean models with abstraction:
Dataclass with ABC
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass

class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        ...

@dataclass
class Circle(Shape):
    radius: float

    def area(self) -> float:
        return math.pi * self.radius**2

This pattern powers plugin systems, physics engines, rendering systems, etc.
🧩 21. Immutable Value Objects (Enterprise Architecture)
In Domain-Driven Design (DDD), models like Money, Weight, Coordinates, Identity, Version should be immutable.
Dataclasses make it easy:
Immutable Value Object
from decimal import Decimal

@dataclass(frozen=True, slots=True)
class Money:
    amount: Decimal  # Decimal avoids binary floating-point rounding errors
    currency: str

Benefits:
- ✔ thread-safe
- ✔ hashable
- ✔ no accidental changes
- ✔ predictable logic
Used in finance & high-risk systems.
🧬 22. Dataclasses for Validation-Like Behavior
While not as powerful as Pydantic, you can build lightweight validators:
Validation in Dataclass
@dataclass
class User:
    email: str
    age: int

    def __post_init__(self):
        if "@" not in self.email:
            raise ValueError("Invalid email")
        if self.age < 0:
            raise ValueError("Age cannot be negative")

Useful for:
- form data
- API requests
- game character creation
- configuration files
📚 23. Dataclasses as DTOs (Data Transfer Objects)
DTOs move data between layers:
DTO Pattern
@dataclass
class UserDTO:
    id: int
    email: str
    premium: bool

Applications built on Django, Flask, or FastAPI often pass DTO-style objects like this between layers.
🚀 24. Dataclasses + JSON Serialization
Using asdict():
JSON Serialization
from dataclasses import dataclass, asdict

@dataclass
class User:
    name: str
    age: int

u = User("Sam", 30)
print(asdict(u))
# Output: {'name': 'Sam', 'age': 30}

For nested dataclasses:
- ✔ asdict() recurses into nested dataclasses, lists, tuples, and dicts
- ✔ the result is ready for APIs or file storage once every value is a JSON-compatible type
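A small round-trip sketch using the standard json module. The Customer and Address classes are illustrative; note that asdict() has no built-in inverse, so rebuilding nested dataclasses is done by hand here:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str
    postcode: str


@dataclass
class Customer:
    name: str
    address: Address


customer = Customer("Bob", Address("London", "SW1"))

# asdict() produces plain dicts, which json.dumps can serialize directly.
payload = json.dumps(asdict(customer))
print(payload)
# {"name": "Bob", "address": {"city": "London", "postcode": "SW1"}}

# Going back is manual: nested dicts must be turned into dataclasses again.
raw = json.loads(payload)
restored = Customer(name=raw["name"], address=Address(**raw["address"]))
print(restored == customer)  # True
```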
🧵 25. Frozen Dataclasses + Hashing
Use frozen to make objects hashable:
Frozen + Hashing
@dataclass(frozen=True)
class Point:
    x: int
    y: int

# Now:
s = {Point(1, 2), Point(1, 2)}
# Only one entry exists.

Used in:
- caching layers
- memoization
- graph algorithms
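To see why frozen matters here, compare it with a regular dataclass. In this sketch (MutablePoint is a made-up name), the default eq=True without frozen=True leaves the class unhashable:

```python
from dataclasses import dataclass


@dataclass
class MutablePoint:
    x: int
    y: int


@dataclass(frozen=True)
class Point:
    x: int
    y: int


# A regular dataclass defines __eq__ and therefore sets __hash__ to None.
try:
    hash(MutablePoint(1, 2))
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False

# Frozen instances hash by field values, so equal points collapse in a set.
unique = {Point(1, 2), Point(1, 2)}
print(mutable_is_hashable, len(unique))  # False 1
```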
🕹 26. Advanced Pattern — Rich Models With Methods + Validation
Example combining: slots, frozen, methods, computed properties
Rich Frozen Model
@dataclass(slots=True, frozen=True)
class Rectangle:
    width: float
    height: float

    @property
    def area(self):
        return self.width * self.height

    def scale(self, factor: float):
        return Rectangle(self.width * factor, self.height * factor)

Immutable models like this are ideal for:
- geometry
- UI layout engines
- game engines
- simulation models
🎮 27. Real Project Example — E-Commerce Order Model
E-Commerce Order Model
@dataclass
class OrderItem:
    product_id: int
    quantity: int
    price: float

    @property
    def total(self):
        return self.quantity * self.price

Used in:
- Shopify clones
- TikTok Shop bots
- Amazon FBA automation
🔥 28. Slots + Dataclasses — High-Performance Python
Adding slots=True reduces per-instance memory and can speed up attribute access.
High-Performance Slots
@dataclass(slots=True)
class Particle:
    x: float
    y: float
    z: float

Benefits of slots:
- ✔ Objects use less memory because they carry no per-instance __dict__ (the saving varies, so measure on your own workload)
- ✔ Faster attribute lookup
- ✔ Prevents accidental new attributes
- ✔ Ideal for millions of objects (games, simulations, ML features)
Used in:
- physics engines
- particle systems
- real-time simulations
- large-scale data models
⚙️ 29. Combining Frozen + Slots (Ultimate Efficiency)
A frozen & slotted dataclass is: immutable, hashable, extremely memory efficient, and extremely fast.
Ultimate Efficiency Pattern
@dataclass(frozen=True, slots=True)
class Vector:
    x: float
    y: float
    z: float

Used in:
- ✔ AI vector embeddings
- ✔ 3D game engines
- ✔ robotics simulations
- ✔ mathematical modeling
This is production-grade performance tuning.
🧠 30. Overriding post_init in Frozen Dataclasses
Frozen normally blocks all changes, but you can bypass immutability inside __post_init__:
Frozen post_init Override
@dataclass(frozen=True)
class User:
    name: str
    email: str

    def __post_init__(self):
        if "@" not in self.email:
            raise ValueError("Invalid email")
        # Plain assignment would raise FrozenInstanceError, so go through object.__setattr__.
        object.__setattr__(self, "email", self.email.lower())

This trick allows:
- ✔ normalization
- ✔ validation
- ✔ canonical formatting
- ✔ hidden transformations
The same normalize-on-construction idea is common in validation libraries, ORMs, and serializers.
object.__setattr__() bypasses the frozen check wherever it is called, not only during initialization, so by convention keep it inside __post_init__.
🔧 31. Rich Comparison & Ordering
Dataclasses let you customize how objects compare.
Basic Ordering
@dataclass(order=True)
class Score:
    points: int
    player: str

Now objects have: <, >, <=, >=, ==
Used in: ranking systems, leaderboards, sorting algorithms, priority queues
For custom logic:
Custom Sort Index
@dataclass(order=True)
class Product:
    sort_index: float = field(init=False, repr=False)
    price: float
    rating: float

    def __post_init__(self):
        # The class is not frozen, so plain assignment works here.
        self.sort_index = self.price / self.rating

Sorts products by price per rating point, because sort_index is the first field compared; ties fall back to price, then rating.
📦 32. Converting Between Models (DTO ↔ Entity)
Dataclasses shine when mapping: database rows → Python objects, API requests → models, ML preprocessing → features
DTO to Entity Conversion
@dataclass
class UserEntity:
    id: int
    name: str
    email: str

@dataclass
class UserDTO:
    name: str
    email: str

    def to_entity(self, id: int):
        return UserEntity(id=id, name=self.name, email=self.email)

Used in: backend microservices, data ingestion pipelines, enterprise systems
🧬 33. Nested Dataclasses (Deep Structured Data)
Nested Dataclasses
from dataclasses import dataclass, asdict

@dataclass
class Address:
    city: str
    postcode: str

@dataclass
class Customer:
    name: str
    address: Address

# asdict() automatically handles nested structures
print(asdict(Customer("Bob", Address("London", "SW1"))))
# Output: {'name': 'Bob', 'address': {'city': 'London', 'postcode': 'SW1'}}

Perfect for JSON APIs.
asdict() works recursively with nested dataclasses.
🧵 34. Dataclasses + Thread Safety
Dataclasses are not thread-safe by default. To create safe models:
Thread-Safe Dataclass
from threading import Lock
from dataclasses import dataclass, field

@dataclass
class Counter:
    value: int = 0
    # Keep the lock out of repr and equality; it is plumbing, not data.
    lock: Lock = field(default_factory=Lock, repr=False, compare=False)

    def increment(self):
        with self.lock:
            self.value += 1

Used in: async job systems, game engine ticks, analytics counters, concurrent caches
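A sketch of the counter under contention, using standard threading. The worker function and the thread and iteration counts are arbitrary choices for illustration:

```python
from dataclasses import dataclass, field
from threading import Lock, Thread


@dataclass
class Counter:
    value: int = 0
    lock: Lock = field(default_factory=Lock, repr=False, compare=False)

    def increment(self) -> None:
        # The lock makes the read-modify-write step atomic across threads.
        with self.lock:
            self.value += 1


def worker(counter: Counter, times: int) -> None:
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```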
🧩 35. Advanced Pattern — Config Objects (Immutable + Validated)
Real systems use typed configuration objects.
Immutable Config Object
@dataclass(frozen=True)
class AppConfig:
    env: str
    debug: bool
    db_url: str

    def __post_init__(self):
        if self.env not in {"dev", "prod"}:
            raise ValueError("Invalid environment")

Benefits:
- ✔ safer than dictionaries
- ✔ fully typed
- ✔ validated once at startup
Used in: FastAPI projects, internal developer tools, cloud services
⚡ 36. Dataclasses + Caching Layers
Make models hashable → usable as cache keys.
Caching with Dataclasses
from dataclasses import dataclass
from functools import lru_cache

@dataclass(frozen=True)
class Query:
    user_id: int
    limit: int

@lru_cache(maxsize=1000)
def get_user_feed(query: Query):
    ...

Used in: feed ranking systems, recommendation engines, caching APIs
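To make the caching behavior visible, here is a sketch with a stand-in function body (the generated post names are hypothetical). Two equal Query objects hash the same, so the second call is a cache hit:

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Query:
    user_id: int
    limit: int


@lru_cache(maxsize=1000)
def get_user_feed(query: Query) -> list[str]:
    # Hypothetical stand-in for an expensive lookup.
    return [f"post-{query.user_id}-{i}" for i in range(query.limit)]


first = get_user_feed(Query(user_id=1, limit=2))
second = get_user_feed(Query(user_id=1, limit=2))  # equal key, served from cache

info = get_user_feed.cache_info()
print(info.hits, info.misses)  # 1 1
```

One design caveat: lru_cache returns the same list object on every hit, so callers should treat the result as read-only or return a tuple instead.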
🧠 37. Dataclasses in Clean Architecture (DDD)
Python code written in a Domain-Driven Design style often uses dataclasses for: Value objects, Entities, Aggregates, DTOs, Commands, Events
Example event object:
DDD Event Object
@dataclass(frozen=True)
class UserRegistered:
    user_id: int
    email: str
    timestamp: float

Used in: Kafka event streams, cloud-native apps, CQRS systems
🔥 38. Dataclasses as Event Objects (Message Buses)
Perfect for internal event buses:
Event Object for Message Bus
@dataclass(frozen=True)
class OrderPlaced:
    order_id: int
    user_id: int
    total: float

Event handlers consume these structured dataclass messages.
🧱 39. Serialization Hooks (post_init + getstate)
Customize pickling/serialization:
Serialization Hooks
@dataclass
class Session:
    user: str
    token: str

    def __getstate__(self):
        # Only this dict is pickled, so the token never leaves the process.
        return {"user": self.user}

Used in: caching, distributed systems, multiprocessing
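A round-trip sketch with the standard pickle module. The __setstate__ method and the empty-string placeholder are additions for illustration, so the restored object still has every field:

```python
import pickle
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    token: str

    def __getstate__(self):
        # Leave the secret token out of the pickled state.
        return {"user": self.user}

    def __setstate__(self, state):
        # Restore what was saved and give the missing field a placeholder.
        self.__dict__.update(state)
        self.token = ""


original = Session("sam", "s3cret")
restored = pickle.loads(pickle.dumps(original))
print(restored)  # Session(user='sam', token='')
```

Without __setstate__, the unpickled object would simply lack the token attribute, and even repr() would fail on it.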
🚀 40. Combining Dataclasses With Polymorphism
Polymorphic Dataclasses
@dataclass
class Vehicle:
    speed: int

@dataclass
class Car(Vehicle):
    seats: int

@dataclass
class Truck(Vehicle):
    capacity: int

Useful for: game engines, simulation systems, logistics modeling
🎮 41. Dataclasses in Game Development
Used for:
- ✔ entity stats
- ✔ world state
- ✔ physics data
- ✔ event messages
- ✔ networked packets
Game NPC Dataclass
@dataclass(slots=True)
class NPC:
    name: str
    hp: int
    position: tuple

Compact enough for large worlds with many entities.
🧊 42. Dataclasses for Tensor Metadata (ML Workflows)
Track preprocessing:
Tensor Metadata
@dataclass
class TensorMeta:
    shape: tuple
    dtype: str
    source: str

Used in: ML pipelines, dataset loaders, feature engineering
📌 43. Best Practices Summary (Elite Level)
- ✔ Use slots=True for performance
- ✔ Use frozen=True for immutability & hashability
- ✔ Validation belongs in __post_init__
- ✔ Use dataclasses for DTOs, configs, events, domain models
- ✔ Avoid heavy logic → keep models lightweight
- ✔ Use factories or ABCs for polymorphism
- ✔ Prefer nested dataclasses for structured data
- ✔ Avoid mutating fields in frozen models
- ✔ Use default_factory for mutable types
🎉 Conclusion — You Now Write Enterprise-Grade Python Models
You've mastered:
- ✔ slots
- ✔ frozen models
- ✔ validation
- ✔ DTO patterns
- ✔ events
- ✔ polymorphism
- ✔ serialization
- ✔ domain-driven architecture
- ✔ high-performance data structures
You're building at professional software engineer level.
📋 Quick Reference — Data Classes
| Syntax | What it does |
|---|---|
| @dataclass | Auto-generate __init__, __repr__, __eq__ |
| @dataclass(frozen=True) | Make class immutable (hashable) |
| field(default_factory=list) | Mutable default values safely |
| dataclasses.asdict(obj) | Convert dataclass to dict |
| @dataclass(order=True) | Auto-generate comparison methods |
🎉 Great work! You've completed this lesson.
You can now use @dataclass to build clean data containers with auto-generated methods, validation, and serialisation.
Up next: Magic Methods — control exactly how your objects behave with Python's dunder protocol.