9. Databases & ORM

Model data carefully, avoid N+1, tune indexes, and use transactions deliberately.

Question: What is an "N+1" query problem, and how do you solve it with an ORM like SQLAlchemy?

Answer: The N+1 problem occurs when an ORM executes one query to fetch a list of parent objects, and then executes N additional queries inside a loop to fetch a related child object for each parent.

Explanation: This is extremely inefficient at scale. The solution is eager loading: in SQLAlchemy, apply selectinload or joinedload to the initial query. selectinload is often preferred for collections because it issues a second query with an IN clause to fetch all related objects at once (two queries total), which is usually more efficient than one large JOIN that duplicates each parent row for every child.

from sqlalchemy import ForeignKey, select
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, Session, selectinload

class Base(DeclarativeBase): pass

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    posts: Mapped[list["Post"]] = relationship(back_populates="author")

class Post(Base):
    __tablename__ = "posts"
    id: Mapped[int] = mapped_column(primary_key=True)
    author_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
    author: Mapped[User] = relationship(back_populates="posts")

def get_users_with_posts(session: Session) -> list[User]:
    # Solves N+1 by pre-loading all 'posts' relationships in one extra query
    stmt = select(User).options(selectinload(User.posts))
    return list(session.scalars(stmt))
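
A quick usage sketch (the helper name print_post_counts is illustrative): because the posts collections are already loaded, iterating them issues no further SQL.

def print_post_counts(session: Session) -> None:
    # No lazy loads fire inside this loop; posts were fetched eagerly above
    for user in get_users_with_posts(session):
        print(user.id, len(user.posts))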

Question: How do database indexes impact performance, and how do you verify usage?

Answer: Indexes speed up reads (typically O(log n) lookups via B-tree structures) but slow down writes and consume storage. Use EXPLAIN / EXPLAIN ANALYZE to inspect query plans and confirm the planner actually chooses an index scan.

Explanation: Create composite indexes whose column order matches your query's filter predicates and sort order; beware low-selectivity columns (e.g. boolean flags), which rarely help reads but still add write overhead.
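
For example, a composite index on posts(author_id, id) matches a query that filters by author and orders by id; the index and helper names below are illustrative, and the EXPLAIN ANALYZE syntax is PostgreSQL's.

from sqlalchemy import Index, text

# Composite index whose column order matches the query: equality filter on
# author_id first, then the ORDER BY column (id). The name is illustrative.
ix_posts_author_id_id = Index("ix_posts_author_id_id", Post.author_id, Post.id)

def explain_posts_by_author(session: Session, author_id: int) -> None:
    # EXPLAIN ANALYZE shows whether the planner chose an index scan
    plan = session.execute(
        text("EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = :aid ORDER BY id"),
        {"aid": author_id},
    )
    for line in plan.scalars():
        print(line)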

Question: What is optimistic concurrency control and how do you implement it?

Answer: Use a version (or timestamp) column and include it in WHERE on update; if 0 rows affected, retry or surface a conflict.

Explanation: This prevents lost updates without holding locks: writers detect a conflict when the expected version no longer matches, instead of blocking each other.
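
A minimal sketch of the manual pattern; the version and title columns are assumptions not present in the Post model above, and SQLAlchemy can also manage this automatically via the mapper's version_id_col setting.

from sqlalchemy import update

def rename_post(session: Session, post_id: int, expected_version: int, new_title: str) -> bool:
    # The WHERE clause includes the version read earlier; a concurrent writer
    # that already bumped the version makes this UPDATE match zero rows.
    result = session.execute(
        update(Post)
        .where(Post.id == post_id, Post.version == expected_version)  # 'version' is assumed
        .values(title=new_title, version=expected_version + 1)        # 'title' is assumed
    )
    session.commit()
    return result.rowcount == 1  # False => conflict: retry or surface an error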

Question: How do you use SQLAlchemy 2.0 in async mode?

Answer: Create an async engine via create_async_engine, build sessions with async_sessionmaker, and manage them with async with.

Explanation: Async engines use drivers like asyncpg. Keep sessions short-lived and tied to a request or task.

from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession

engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db", echo=False)
SessionFactory = async_sessionmaker(engine, expire_on_commit=False, class_=AsyncSession)

async def get_user(user_id: int) -> User | None:
    # The async context manager closes the session when the request/task is done
    async with SessionFactory() as session:
        return await session.get(User, user_id)

Question: What are common isolation levels and why do they matter?

Answer: READ COMMITTED, REPEATABLE READ, and SERIALIZABLE prevent progressively more anomalies (non-repeatable reads, phantom reads, serialization anomalies) at the cost of more contention and retries; pick the weakest level that still satisfies your correctness needs.

# SQLAlchemy sync example
from sqlalchemy import create_engine, text

sync_engine = create_engine("postgresql+psycopg2://user:pass@localhost/db")

with Session(sync_engine) as s, s.begin():
    # Must be the first statement in the transaction (PostgreSQL syntax)
    s.execute(text("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ"))
    ...
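
Alternatively, the isolation level can be set through the engine's execution options rather than issuing SET TRANSACTION by hand; a brief sketch:

# Engine copy whose connections open at REPEATABLE READ
repeatable_engine = sync_engine.execution_options(isolation_level="REPEATABLE READ")

with Session(repeatable_engine) as s, s.begin():
    ...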

Question: What is the recommended transaction pattern?

Answer: Keep transactions short-lived and use context managers, e.g. a sessionmaker's begin() block (with session_factory.begin() as session: ...), which commits on success and rolls back on error; keep critical sections small.
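
A short sketch, assuming a sessionmaker bound to the synchronous engine above; the factory and function names are illustrative.

from sqlalchemy.orm import sessionmaker

SessionLocal = sessionmaker(sync_engine)

def reassign_post(post_id: int, new_author_id: int) -> None:
    # begin() commits on success and rolls back on any exception
    with SessionLocal.begin() as session:
        post = session.get(Post, post_id)
        if post is not None:
            post.author_id = new_author_id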

Question: What are database transactions and what does ACID stand for?

Answer: A transaction is a sequence of database operations that are treated as a single, logical unit of work. All operations in a transaction must succeed; if any fails, the entire transaction is rolled back. ACID is an acronym that describes the four key properties of a reliable transaction system.

Explanation:

  • Atomicity: Guarantees that all operations within a transaction are completed as a single, indivisible unit. It's all or nothing.

  • Consistency: Ensures that a transaction brings the database from one valid state to another, maintaining database invariants.

  • Isolation: Ensures that concurrently executing transactions do not interfere with each other. At the strictest isolation level (SERIALIZABLE), the result is the same as if the transactions had been executed serially.

  • Durability: Guarantees that once a transaction has been committed, it will remain committed even in the event of a system failure.

Question: Why is database connection pooling important in a web application?

Answer: Connection pooling is a technique for managing and reusing database connections. It is critical for performance and stability in a web application because the overhead of establishing a new database connection for every request is very high.

Explanation: A connection pool maintains a set of open, ready-to-use database connections. When the application needs to talk to the database, it borrows a connection from the pool. When it's done, it returns the connection to the pool instead of closing it. This avoids the costly setup/teardown process (TCP handshake, authentication) for each request, significantly reducing latency and resource consumption on both the application and database servers. Most production-grade database libraries (like SQLAlchemy) have built-in support for connection pooling.
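
For instance, SQLAlchemy's create_engine accepts pool settings directly; the values below are illustrative rather than recommendations.

from sqlalchemy import create_engine

pooled_engine = create_engine(
    "postgresql+psycopg2://user:pass@localhost/db",
    pool_size=5,         # connections kept open in the pool
    max_overflow=10,     # extra connections allowed during bursts
    pool_timeout=30,     # seconds to wait for a free connection
    pool_pre_ping=True,  # check connections are alive before handing them out
)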

Question: Why are database migrations important?

Answer: Database migrations allow you to manage and version your database schema in code. This makes schema changes repeatable, reversible, and safe to apply across different environments.

Explanation: Tools like Alembic (for SQLAlchemy) track changes to your ORM models and generate migration scripts. These scripts can then be run as part of a deployment process to update the database schema, avoiding manual, error-prone SQL changes and keeping your application code and database schema in sync.
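
A typical workflow is alembic revision --autogenerate -m "add post title" followed by alembic upgrade head; a generated script looks roughly like the sketch below (the revision id and column are illustrative).

from alembic import op
import sqlalchemy as sa

revision = "abc123"      # illustrative revision id
down_revision = None

def upgrade() -> None:
    op.add_column("posts", sa.Column("title", sa.String(200), nullable=True))

def downgrade() -> None:
    op.drop_column("posts", "title")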