    Lesson 32 • Advanced

    Module & Package Architecture for Large Codebases

    Learn how to structure and scale large Python applications. Master the architectural patterns used by Django, FastAPI, Airflow, and enterprise teams to build maintainable codebases with thousands of files.

    When your project grows beyond a few files, the MOST important factor for long-term success is structure. A great architecture makes your code:

    • ✔ easier to understand
    • ✔ easier to test
    • ✔ easier to extend
    • ✔ easier to debug
    • ✔ easier to onboard new developers
    • ✔ scale to thousands of files

    This lesson teaches you how professional Python teams structure and scale large applications.

    🔥 1. Why Python Projects Need Good Architecture

    Small scripts can look like this:

    app.py
    utils.py
    database.py

    But large systems suffer without structure:

    • ❌ circular imports
    • ❌ duplicated logic
    • ❌ unclear module responsibilities
    • ❌ "god files" with 5,000–20,000 lines
    • ❌ impossible navigation
    • ❌ hard-to-test logic
    • ❌ breaking one feature breaks everything

    Large codebases need:

    • modular isolation
    • clear domain boundaries
    • strong naming conventions
    • proper folder hierarchy
    • dependency direction
    • package-level APIs

    ⚙️ 2. How Python Imports Actually Work

    Understanding the import system is key.

    When Python imports a module:

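    1. It checks sys.modules and reuses the module if it is already loaded
    2. Otherwise, it searches the directories on sys.path for a package folder or .py file
    3. It registers the new module in sys.modules
    4. It executes the module's top-level code once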

    Meaning:

    • ✔ imports are cached
    • ✔ circular imports can fail at runtime, because one module sees the other only partially initialised
    • ✔ module execution on import can be expensive

    Rule:

    👉 Keep import side-effects to zero. (No DB connections, no heavy computation.)
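
    The caching behaviour described above can be observed directly. The following is a minimal sketch, assuming a throwaway module called demo_counted_module (a hypothetical name) written to a temporary directory. Its body bumps a counter in os.environ (the DEMO_IMPORT_COUNT key is also made up for this demo) so we can see how many times it actually runs.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "demo_counted_module"  # hypothetical name, for illustration only

# The module body increments a counter every time it is executed.
MODULE_SOURCE = (
    "import os\n"
    "count = int(os.environ.get('DEMO_IMPORT_COUNT', '0')) + 1\n"
    "os.environ['DEMO_IMPORT_COUNT'] = str(count)\n"
)

os.environ.pop("DEMO_IMPORT_COUNT", None)
with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        first = importlib.import_module(MODULE_NAME)   # executes the module body
        second = importlib.import_module(MODULE_NAME)  # served from sys.modules
    finally:
        sys.path.remove(tmp_dir)

executions = int(os.environ["DEMO_IMPORT_COUNT"])
print(first is second)             # True: the same module object both times
print(MODULE_NAME in sys.modules)  # True: cached under its name
print(executions)                  # 1: the body ran only once
```

    Anything at module top level (a DB connection, a large file read) would run at that first import, which is why side-effect-free modules are cheaper and safer to import.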

    📦 3. What Is a Package? (And Why It Matters)

    A regular package is a folder containing an __init__.py file.

    myapp/
        __init__.py
        users/
            __init__.py
            models.py
            routes.py

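    Without __init__.py, Python (3.3+) treats the folder as a namespace package, which can be split across several directories on sys.path.

    With it, the folder is a regular package: one directory whose __init__.py runs on first import.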

    Use regular packages unless you need distributed namespace packages.
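
    To see the difference, here is a small sketch that builds both kinds of folder in a temporary directory and imports them. The names demo_regular_pkg and demo_namespace_pkg are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)

    # Regular package: the folder contains an __init__.py.
    (root / "demo_regular_pkg").mkdir()
    (root / "demo_regular_pkg" / "__init__.py").write_text("NAME = 'regular'\n")

    # Namespace package: a folder with a module but no __init__.py.
    (root / "demo_namespace_pkg").mkdir()
    (root / "demo_namespace_pkg" / "models.py").write_text("NAME = 'namespace'\n")

    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        regular = importlib.import_module("demo_regular_pkg")
        namespace = importlib.import_module("demo_namespace_pkg")
        models = importlib.import_module("demo_namespace_pkg.models")
    finally:
        sys.path.remove(tmp_dir)

# A regular package is backed by its __init__.py file.
regular_file = Path(regular.__file__).name
# A namespace package has no file of its own.
namespace_file = getattr(namespace, "__file__", None)

print(regular_file)    # __init__.py
print(namespace_file)  # None
print(models.NAME)     # namespace
```

    Both import fine, but only the regular package has a single, well-defined home on disk, which is usually what you want inside one application.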

    🧱 4. Standard Large-Scale Project Structure

    Many professional Python projects, including ones built on Django, Flask, Airflow, or FastAPI, use a layout along these lines:

    project/
        app/
            __init__.py
            config/
                __init__.py
                settings.py
            core/
                __init__.py
                exceptions.py
                interfaces.py
            domain/
                __init__.py
                users/
                    __init__.py
                    models.py
                    services.py
                payments/
                    __init__.py
                    models.py
                    services.py
            infrastructure/
                __init__.py
                db/
                    __init__.py
                    repository.py
                    connection.py
            cache/
                __init__.py
                redis_client.py
            api/
                __init__.py
                routes/
                    __init__.py
                    users.py
                    payments.py
            tests/
                ...
        scripts/
        docs/

    This is clean because:

    • domain layer contains business logic
    • infrastructure contains external systems
    • api layer exposes HTTP / CLI interface
    • core holds shared primitives
    • config holds settings

    This scales to 100K+ lines.

    🧩 5. The Layered Architecture (Most Common Design)

    Layers:

    1. API Layer

    FastAPI routers, Flask routes, CLI commands.

    2. Service Layer

    Business logic, domain rules, coordination.

    3. Data/Infrastructure Layer

    Database, cache, filesystem, external APIs.

    4. Core / Shared Components

    Cross-cutting concerns:

    • exceptions
    • interfaces
    • abstractions
    • helpers

    Benefits:

    • ✔ isolates business logic
    • ✔ minimizes circular imports
    • ✔ pluggable infrastructure (swap DB easily)
    • ✔ easier unit testing

    🔌 6. Dependency Direction (The #1 Rule)

    High-level modules (business rules) must NOT depend on low-level modules (database, cache, HTTP clients).

    Instead → the high-level side defines an interface, and the low-level module implements it.

    Example:

    domain/service.py  → depends on → interface
    infrastructure/db_repository.py → implements interface

    This enables:

    • ✔ dependency injection
    • ✔ testing with mock DB
    • ✔ clean separation
    • ✔ avoiding circular imports
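
    The rule above can be sketched in a single file. This is a minimal illustration, not a prescribed design: PaymentGateway, CheckoutService, and FakeGateway are hypothetical names. The service knows only the interface, so a test double can be injected where a real adapter would go.

```python
from typing import Protocol


class PaymentGateway(Protocol):
    """Interface owned by the high-level (domain) side."""

    def charge(self, amount_cents: int) -> bool: ...


class CheckoutService:
    """High-level business logic: depends only on the interface."""

    def __init__(self, gateway: PaymentGateway) -> None:
        self._gateway = gateway

    def checkout(self, amount_cents: int) -> str:
        if amount_cents <= 0:
            return "rejected"
        return "paid" if self._gateway.charge(amount_cents) else "declined"


class FakeGateway:
    """Low-level stand-in; a real adapter would call an external API here."""

    def __init__(self, succeed: bool = True) -> None:
        self.succeed = succeed
        self.charged = []  # amounts passed to charge(), recorded for inspection

    def charge(self, amount_cents: int) -> bool:
        self.charged.append(amount_cents)
        return self.succeed


gateway = FakeGateway()
service = CheckoutService(gateway)
print(service.checkout(1999))  # paid
print(service.checkout(0))     # rejected (the gateway is never called)
print(gateway.charged)         # [1999]
```

    Swapping FakeGateway for a real implementation requires no change to CheckoutService, which is the practical payoff of pointing dependencies at the interface.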

    🔍 7. Avoiding Circular Imports

    Circular imports happen when two modules import each other:

    a.py → imports b.py
    b.py → imports a.py

    Fix by:

    • ✔ moving shared logic into core
    • ✔ using local imports inside functions
    • ✔ introducing interfaces
    • ✔ separating pure logic from IO

    Example Fix:

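    Avoiding Circular Imports

    Use interfaces to break circular dependencies

    # Instead of importing a concrete implementation across layers:
    # from domain.user_service import get_user

    # ...restructure around an interface.
    from typing import Protocol


    class UserGetter(Protocol):
        def get_user(self, user_id: str): ...


    # Now the domain depends on the interface, not a concrete implementation.
    class UserService:
        def __init__(self, user_getter: UserGetter):
            self.user_getter = user_getter

        def get_user_profile(self, user_id: str):
            return self.user_getter.get_user(user_id)


    # Hypothetical stand-in for illustration: any object with a matching
    # get_user method can be injected.
    class InMemoryUserGetter:
        def get_user(self, user_id: str):
            return {"id": user_id, "name": "Alice"}


    service = UserService(InMemoryUserGetter())
    print(service.get_user_profile("123"))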
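
    The "local imports inside functions" fix from the list above is easiest to see with real files. The sketch below writes four tiny modules to a temporary directory (cycle_a, cycle_b, fixed_a, and fixed_b are hypothetical names): the first pair import each other at top level and fail, while the second pair defers one import until the function is called.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module sources, written to disk only for this demonstration.
SOURCES = {
    # Broken pair: both import each other at module top level.
    "cycle_a.py": "from cycle_b import b_value\na_value = 1\n",
    "cycle_b.py": "from cycle_a import a_value\nb_value = 2\n",
    # Fixed pair: fixed_a imports fixed_b only when total() is called.
    "fixed_a.py": (
        "a_value = 1\n"
        "def total():\n"
        "    from fixed_b import b_value  # deferred until first call\n"
        "    return a_value + b_value\n"
    ),
    "fixed_b.py": "from fixed_a import a_value\nb_value = a_value + 1\n",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    for filename, source in SOURCES.items():
        Path(tmp_dir, filename).write_text(source)
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        try:
            importlib.import_module("cycle_a")
            cycle_error = None
        except ImportError as exc:
            # cycle_b asked for a_value before cycle_a had defined it.
            cycle_error = exc
        fixed_a = importlib.import_module("fixed_a")
        fixed_total = fixed_a.total()
    finally:
        sys.path.remove(tmp_dir)

print(type(cycle_error).__name__)  # ImportError
print(fixed_total)                 # 3
```

    A local import is a pragmatic patch; when the same cycle keeps reappearing, moving the shared piece into core or introducing an interface is usually the more durable fix.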

    🧠 8. Internal Package APIs (__all__ and public API)

    Every package should define what it exposes:

    Package Public API

    Define what your package exposes with __all__

    # __init__.py example:
    
    # Simulating what would be in separate files
    class UserService:
        def get_user(self, user_id):
            return f"User {user_id}"
    
    class User:
        def __init__(self, name):
            self.name = name
    
    # Export only the public API
    __all__ = ["UserService", "User"]
    
    # Usage demonstration
    service = UserService()
    user = User("Alice")
    print(service.get_user("123"))
    print(f"Created user: {user.name}")

    Benefits:

    • ✔ clean external imports
    • ✔ marks internal details as private by convention
    • ✔ stable API for other modules
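
    It helps to know what __all__ actually enforces. The sketch below builds a throwaway in-memory module (demo_users is a hypothetical name) and shows that __all__ limits what a star import brings in, while an explicit import of another name still works.

```python
import sys
import types

# Build a small module in memory so the example is self-contained.
module = types.ModuleType("demo_users")
exec(
    "__all__ = ['UserService']\n"
    "class UserService: pass\n"
    "def internal_helper(): return 'still importable'\n",
    module.__dict__,
)
sys.modules["demo_users"] = module

# A star import only copies the names listed in __all__.
star_namespace = {}
exec("from demo_users import *", star_namespace)
star_names = sorted(name for name in star_namespace if not name.startswith("__"))

# An explicit import is not restricted by __all__.
from demo_users import internal_helper

print(star_names)         # ['UserService']
print(internal_helper())  # still importable
```

    In other words, __all__ documents and signals the public API; keeping other modules away from internals still relies on convention and code review.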

    🚀 9. Configuration Architecture

    NEVER scatter config constants across files.

    Use:

    app/config/settings.py

    Support overrides:

    • settings_local.py
    • .env files
    • environment variables

    Avoid hard-coding secrets, URLs, DB credentials.

    Configuration Management

    Centralized config with environment variables

    # config/settings.py
    import os
    from pathlib import Path

    BASE_DIR = Path(__file__).resolve().parent.parent

    # Boolean flags: compare case-insensitively so "true" and "True" both work.
    DEBUG = os.getenv("DEBUG", "False").lower() == "true"
    DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///db.sqlite3")
    # The fallback is for local development only; set SECRET_KEY in production.
    SECRET_KEY = os.getenv("SECRET_KEY", "change-me-in-production")

    # Feature flags
    ENABLE_ANALYTICS = os.getenv("ENABLE_ANALYTICS", "True").lower() == "true"
    ENABLE_CACHING = os.getenv("ENABLE_CACHING", "False").lower() == "true"

    print(f"DEBUG mode: {DEBUG}")
    print(f"Database: {DATABASE_URL}")
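
    The override order listed earlier (defaults, then .env, then environment variables) can be sketched with collections.ChainMap. This is a simplified illustration under stated assumptions: parse_dotenv is a hypothetical helper that handles only plain KEY=VALUE lines, and Settings and load_settings are made-up names rather than part of any library.

```python
from collections import ChainMap
from dataclasses import dataclass
from typing import Dict, Mapping

DEFAULTS = {"DEBUG": "False", "DATABASE_URL": "sqlite:///db.sqlite3"}


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


@dataclass(frozen=True)
class Settings:
    debug: bool
    database_url: str


def load_settings(environ: Mapping[str, str], dotenv_text: str = "") -> Settings:
    # Lookup order: real environment first, then .env contents, then defaults.
    merged = ChainMap(dict(environ), parse_dotenv(dotenv_text), DEFAULTS)
    return Settings(
        debug=merged["DEBUG"].lower() == "true",
        database_url=merged["DATABASE_URL"],
    )


dotenv_text = "# local overrides\nDEBUG=True\nDATABASE_URL=postgresql://localhost/dev\n"
settings = load_settings({"DATABASE_URL": "postgresql://db.internal/prod"}, dotenv_text)
print(settings.debug)         # True (from the .env text)
print(settings.database_url)  # postgresql://db.internal/prod (environment wins)
```

    In a real project you would pass os.environ and the contents of the .env file; taking them as arguments keeps the loader easy to test.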

    📚 10. Module Naming Conventions (Professional Standard)

    Use:

    • models.py — classes representing data
    • services.py — business logic
    • repository.py — DB access
    • routes.py — API routes
    • tasks.py — background jobs
    • exceptions.py — error definitions
    • utils.py — ONLY for generic helpers

    Avoid:

    • ❌ utils2.py
    • ❌ helpers_mixed.py
    • ❌ random_functions.py

    Consistency wins.

    🧪 11. Testing Architecture

    Tests mirror the app structure:

    tests/
        domain/
            test_users.py
        api/
            test_routes.py
        infrastructure/
            test_repository.py

    Use factories for test data.

    Separate unit and integration tests.

    Principles:

    • domain tested heavily with unit tests
    • infrastructure tested with mocks
    • API tested with integration tests
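
    As a rough sketch of a domain unit test, the example below uses the standard unittest module with an in-memory fake repository and a small factory function. All names here (RegistrationService, InMemoryUserRepository, make_user, DuplicateEmailError) are hypothetical and chosen for illustration.

```python
import unittest


class DuplicateEmailError(Exception):
    """Domain error raised when an email is already registered."""


class InMemoryUserRepository:
    """Fake repository: same methods as a DB-backed one, but dict-backed."""

    def __init__(self):
        self._users_by_email = {}

    def save(self, user):
        self._users_by_email[user["email"]] = user

    def find_by_email(self, email):
        return self._users_by_email.get(email)


class RegistrationService:
    """Domain service under test; it only sees the repository's methods."""

    def __init__(self, repository):
        self._repository = repository

    def register(self, user):
        if self._repository.find_by_email(user["email"]) is not None:
            raise DuplicateEmailError(user["email"])
        self._repository.save(user)
        return user


def make_user(**overrides):
    """Test-data factory: sensible defaults, overridden per test."""
    user = {"email": "alice@example.com", "name": "Alice"}
    user.update(overrides)
    return user


class RegistrationServiceTests(unittest.TestCase):
    def setUp(self):
        self.repository = InMemoryUserRepository()
        self.service = RegistrationService(self.repository)

    def test_register_saves_user(self):
        self.service.register(make_user())
        self.assertIsNotNone(self.repository.find_by_email("alice@example.com"))

    def test_duplicate_email_is_rejected(self):
        self.service.register(make_user())
        with self.assertRaises(DuplicateEmailError):
            self.service.register(make_user(name="Alice Again"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegistrationServiceTests)
result = unittest.TextTestRunner(verbosity=1).run(suite)
print(result.wasSuccessful())  # True
```

    Because the service never touches a real database, these tests run in milliseconds and would live under tests/domain/ in the layout above.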

    Part 2: Domain-Driven Design & Advanced Patterns

    🔥 12. Domain-Driven Design (DDD) in Python

    DDD is a system-design method focused on modelling business rules, not framework limitations.

    Its main idea:

    • ✔ Code structure mirrors business structure
    • ✔ Each domain is isolated
    • ✔ Logic belongs to the domain, not to API or DB layers
    • ✔ Domain objects represent real-world concepts

    Example domain structure:

    domain/
        users/
            models.py
            services.py
            validators.py
        payments/
            models.py
            services.py
            policies.py
        inventory/
            models.py
            rules.py

    Why DDD fits Python:

    • Python's dynamic nature simplifies domain modelling
    • Dataclasses + type hints make models clean
    • Package isolation prevents circular imports
    • Encourages clean business rules without tech dependencies

    Goal: Domains should NOT depend on frameworks or external APIs. The API and infrastructure layers depend on the domain, never the reverse.

    ⚙️ 13. Domain Layer Responsibilities

    The domain layer should contain:

    ✔ Entities

    Objects that have identity across time (e.g., User, Order).

    ✔ Value Objects

    Objects identified by value, not identity (e.g., Price, Email, Coordinates).

    ✔ Domain Services

    Logic that doesn't naturally belong to any one entity.

    ✔ Policies & Rules

    Business validation, decision logic.

    ✔ Domain Exceptions

    Errors specific to the domain.

    ✔ Events

    e.g., UserRegistered, PaymentCompleted

    What domain must NOT contain:

    • ❌ database code
    • ❌ API framework code
    • ❌ external library calls
    • ❌ logging / caching
    • ❌ infrastructure details

    This keeps the code clean, portable, and testable.
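
    The building blocks above map naturally onto dataclasses. The following is a minimal sketch with hypothetical names (Email, User, UserEmailChanged, InvalidEmailError), and the "@" check is a deliberately crude stand-in for real validation.

```python
from dataclasses import dataclass


class InvalidEmailError(ValueError):
    """Domain exception: raised when an email address is malformed."""


@dataclass(frozen=True)
class Email:
    """Value object: immutable and compared by value."""

    address: str

    def __post_init__(self):
        if "@" not in self.address:
            raise InvalidEmailError(self.address)


@dataclass(frozen=True)
class UserEmailChanged:
    """Domain event: a record of something that happened."""

    user_id: str
    new_email: Email


@dataclass(eq=False)
class User:
    """Entity: identity comes from user_id, not from current attributes."""

    user_id: str
    email: Email

    def __eq__(self, other):
        return isinstance(other, User) and other.user_id == self.user_id

    def __hash__(self):
        return hash(self.user_id)

    def change_email(self, new_email: Email) -> UserEmailChanged:
        self.email = new_email
        return UserEmailChanged(self.user_id, new_email)


alice = User("u-1", Email("alice@example.com"))
event = alice.change_email(Email("alice@work.example"))
same_user = alice == User("u-1", Email("old@example.com"))
same_value = Email("a@example.com") == Email("a@example.com")
print(same_user, same_value)    # True True
print(event.new_email.address)  # alice@work.example
```

    Nothing here imports a framework, a database driver, or a logger, so the whole module can be tested in isolation.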

    🧱 14. Infrastructure Layer (DB, Cache, External Systems)

    Infrastructure is where all "real-world" systems live:

    infrastructure/
        db/
            repository.py
            connection.py
        cache/
            redis_client.py
        external/
            payment_gateway.py
            sms_service.py

    Responsibilities:

    • actual SQL queries
    • ORM models
    • Redis connections
    • HTTP clients
    • integrations with Stripe, AWS, etc.

    Must NOT:

    • ❌ contain business rules
    • ❌ call domain services

    Instead → infrastructure implements interfaces defined in the domain.

    Example interface:

    Repository Pattern

    Infrastructure implements domain interfaces

    # domain/users/interfaces.py
    from typing import Protocol


    class UserRepository(Protocol):
        def save(self, user): ...
        def find_by_email(self, email: str): ...


    # Infrastructure implementation:
    # infrastructure/db/user_repository.py
    class SQLUserRepository:
        """Implements the UserRepository interface."""

        def save(self, user):
            # actual SQL logic would go here
            print(f"Saving user to database: {user}")

        def find_by_email(self, email: str):
            # actual SQL query would go here; this placeholder finds nothing
            return None

    🧩 15. The API Layer

    This is the "edge" of the app. Usually contains:

    api/
        routes/
            users.py
            payments.py
        schemas.py
        dependencies.py

    Framework examples:

    • Flask blueprints
    • FastAPI routers
    • Django views
    • CLI commands (Click, Typer)
    • Websocket handlers

    Responsibilities:

    • ✔ parsing requests
    • ✔ converting domain errors → HTTP codes
    • ✔ authentication
    • ✔ response formatting

    Must NOT:

    • ❌ contain business decisions
    • ❌ contain SQL queries
    • ❌ talk directly to infrastructure
    • ❌ hold domain logic

    API should talk ONLY to domain services.
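
    One of the responsibilities above, converting domain errors into HTTP codes, can be sketched without any framework. The handler below returns a (status, body) tuple; UserService, get_user_handler, and the error classes are hypothetical names, and the 400 fallback is an assumption for errors without an explicit mapping.

```python
class DomainError(Exception):
    """Base class for errors raised by the domain layer."""


class UserNotFound(DomainError):
    pass


class EmailAlreadyTaken(DomainError):
    pass


# The API layer owns the translation from domain errors to status codes.
ERROR_STATUS = {UserNotFound: 404, EmailAlreadyTaken: 409}


class UserService:
    """Stand-in domain service backed by a dict."""

    def __init__(self):
        self._names = {"1": "Alice"}

    def get_user_name(self, user_id):
        try:
            return self._names[user_id]
        except KeyError:
            raise UserNotFound(user_id) from None


def get_user_handler(service, user_id):
    """Framework-agnostic handler: returns (status_code, body)."""
    try:
        name = service.get_user_name(user_id)
    except DomainError as exc:
        status = ERROR_STATUS.get(type(exc), 400)
        return status, {"error": type(exc).__name__}
    return 200, {"id": user_id, "name": name}


service = UserService()
print(get_user_handler(service, "1"))   # (200, {'id': '1', 'name': 'Alice'})
print(get_user_handler(service, "42"))  # (404, {'error': 'UserNotFound'})
```

    The handler contains no business decision of its own: it parses input, calls the service, and formats the result, which is the whole job of this layer.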

    Part 3: Real-World Architectures

    🔥 16. The Three Master Architectures for Large Python Projects

    Three architectures are common among engineering teams once codebases grow past roughly 50K lines:

    ✔ 1. Layered Architecture (most common)

    api/
    domain/
    infrastructure/
    core/

    Clear vertical layers with strict dependency rules.

    ✔ 2. Clean/Hexagonal Architecture (enterprise-grade)

    domain/
        models/
        services/
        events/
    adapters/
        db/
        cache/
        external/
    application/
    api/

    Domain is isolated and stable. Adapters wrap external systems. Application orchestrates flows.

    ✔ 3. Plugin / Modular Monolith (like Django)

    users/
    payments/
    inventory/
    notifications/
    analytics/

    Each feature is an "app" with its own mini-architecture inside.

    This is the architecture used by:

    • Django
    • Airflow
    • Open edX
    • Odoo
    • Many enterprise monoliths
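
    A plugin-style monolith usually discovers its feature packages by name at runtime. As a minimal sketch of that idea, the loader below uses importlib.import_module on a list of dotted names. The list and the load_features helper are hypothetical, and standard-library modules stand in for feature packages such as users or payments so the example runs anywhere.

```python
import importlib

# Hypothetical registry of enabled features; stdlib modules stand in for
# real feature packages, and the last entry is deliberately missing.
ENABLED_FEATURES = ["json", "csv", "not_a_real_feature_pkg"]


def load_features(names):
    """Import each feature by dotted name; collect the ones that are missing."""
    loaded, missing = {}, []
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ModuleNotFoundError:
            missing.append(name)
    return loaded, missing


loaded, missing = load_features(ENABLED_FEATURES)
print(sorted(loaded))  # ['csv', 'json']
print(missing)         # ['not_a_real_feature_pkg']
```

    Keeping the list of features in configuration means a feature can be enabled or disabled without touching the code that loads it.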

    ⚙️ 17. How Django Organises Massive Codebases

    Django uses a modular app structure:

    project/
        settings/
        core/
        users/
        payments/
        api/
        dashboard/

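    Each "app" typically has the following files (models.py, views.py, and admin.py come from Django's startapp template; services.py and signals.py are common team conventions):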

    • models.py
    • views.py
    • services.py
    • signals.py
    • admin.py

    Benefits:

    • ✓ isolation
    • ✓ easy testing
    • ✓ independent teams
    • ✓ plugin marketplace (reusable apps)

    Why this matters: Because each app owns its own models, views, and tests, a site structured this way can grow to 200+ pages and thousands of functions while most changes stay local to one app.

    🔥 18. How FastAPI Organises Modern Backend Projects

    FastAPI does not enforce a project layout, but larger FastAPI projects commonly use a clean, layered one like this:

    app/
        main.py
        api/
            v1/
                routes/
                schemas/
        services/
        repositories/
        models/
        core/
            config.py
            security.py
            events.py
        db/
            session.py
            migrations/

    Strengths:

    • Fast startup
    • Async-first
    • Domain + repository pattern
    • Event-driven hooks

    Perfect for scalable SaaS backends.

    🎉 Final Conclusion

    Across the three parts, you've now learned:

    • ✔ Clean architecture
    • ✔ Domain-driven design
    • ✔ Layered module organization
    • ✔ Plugin-based modular monolith
    • ✔ Event-driven communication
    • ✔ Dependency inversion
    • ✔ Container-based dependency wiring
    • ✔ API versioning for long-term stability
    • ✔ Scaling to hundreds of modules
    • ✔ How real companies structure Python systems

    You now have the knowledge to architect and scale Python applications from startup MVPs to enterprise systems handling millions of users.

    📋 Quick Reference — Module Architecture

    Concept                      What it means
    __init__.py                  Makes a directory a regular Python package
    __all__ = [...]              Controls what "from module import *" exports
    from . import module         Relative import within a package
    importlib.import_module()    Dynamic import at runtime
    src/ layout                  Best-practice project structure

    🎉 Great work! You've completed this lesson.

    You can now architect large Python codebases with proper packages, relative imports, and clean module boundaries.

    Up next: Logging & Debugging — add professional observability to your Python applications.
