Engineering Playbook
Databases

SQL Tuning

Indexing, Explain Analyze, and Optimization strategies.

SQL Tuning

Writing SQL is easy. Writing SQL that scales to 100 million rows requires understanding the engine.

Explain Analyze

Never guess why a query is slow. Ask the database. EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@test.com';

It tells you:

  1. Seq Scan: Did it scan every single row? (Bad).
  2. Index Scan: Did it jump to the exact row using a B-Tree? (Good).
  3. Cost: An arbitrary unit of effort (CPU + Disk I/O).

Indexing Strategies

B-Tree (The Standard)

Good for =, >, <, ORDER BY.

  • Mental Model: A sorted phone book. You can find "Smith" instantly.
  • Gotcha: WHERE name LIKE '%mith' breaks the index (because it starts with a wildcard).

Composite Indexes (Multi-column)

CREATE INDEX on users (last_name, first_name);

  • The Order Matters: The database can use this index for last_name alone, OR last_name + first_name.
  • It cannot use this index for first_name alone. (Think of the phone book: it's sorted by Last Name first).

BRIN (Block Range Index)

For massive time-series data. Instead of indexing every row (huge), it indexes "Chunks" of physical disk.

  • Use Case: "Find logs from yesterday." (Since logs are written sequentially, they are physically close on disk).

Common Killers

  1. SELECT *: Fetches huge columns (like JSON blobs) you might not need. Increases IOPS.
  2. Functions on Columns:
    • Bad: WHERE YEAR(created_at) = 2023. (Database has to calculate YEAR() for every row).
    • Good: WHERE created_at >= '2023-01-01'. (Can use the index).
  3. N+1 Queries: Running one query per row in a loop. (See GraphQL/ORM section).

The Index Tax

Indexes make Reads fast but Writes slow. Every time you INSERT or UPDATE, the database has to rearrange the B-Tree. Don't index every column.