Engineering Playbook
System Patterns

Microservices

Decomposition strategies, Database-per-Service, and the Microservice Tax.

Microservices Architecture

Microservices are not simply about making services "small." They are about making services independently deployable.

If you have to coordinate the deployment of Service A and Service B together because they break each other, you do not have microservices. You have something much worse.

The Anti-Pattern: Distributed Monolith

What is a Distributed Monolith?

A Distributed Monolith is a system that has the complexity of microservices (network latency, the need for distributed tracing) but none of the benefits (independent scaling and deployment).

Symptoms:

  1. Lock-step Deploys: "We have to deploy the Order Service and the User Service at the exact same time."
  2. Shared Database: Multiple services reading/writing to the same tables.
  3. Chatty Interfaces: Service A calls Service B 100 times to render one page.

Decomposition Strategies

How do you actually split the system?

By Business Capability

Split by what the business *does* (e.g., Order Management, Inventory). This often aligns with organizational structure, following Conway's Law.

By Subdomain (DDD)

The preferred method. Align services with Bounded Contexts (e.g., the 'Product Catalog' context is separate from the 'Inventory Level' context).


The Golden Rule: Database per Service

This is the hardest rule to follow, but the most important.

  • The Rule: Service A cannot access Service B's database tables directly. It must go through Service B's API.
  • The Why: If they share tables, you cannot change the schema without breaking the other service. You are coupled at the data layer.
  • The Cost: Cross-service SQL joins are no longer possible. You must perform "API Joins" (fetch an ID from A, then fetch the details from B) or replicate data using Event-Driven patterns, as sketched below.
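
A minimal sketch of the event-replication option, assuming the Products Service publishes a ProductPriceChanged event. The event shape, handler name, and in-memory store are illustrative; a real system would consume the events from a broker such as Kafka or RabbitMQ and persist the copy in the Orders database.

    // The Orders Service keeps a local, read-only copy of the product data it needs,
    // updated asynchronously from events instead of querying Products at read time.
    interface ProductPriceChanged {
      productId: string;
      newPrice: number;
      occurredAt: string; // ISO timestamp
    }

    // Local replica owned by the Orders Service (its own table in a real system).
    const productPriceReplica = new Map<string, number>();

    // Invoked by the message-broker consumer for each incoming event.
    function onProductPriceChanged(event: ProductPriceChanged): void {
      productPriceReplica.set(event.productId, event.newPrice);
    }

    // Order pricing can now be computed without a synchronous call to Products.
    function priceOrderItem(productId: string, quantity: number): number {
      const unitPrice = productPriceReplica.get(productId);
      if (unitPrice === undefined) {
        throw new Error(`No replicated price for product ${productId}`);
      }
      return unitPrice * quantity;
    }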

Service Discovery

In a dynamic environment (like containers scaling up and down), IP addresses change constantly. You cannot hardcode IPs.

1. Client-Side Discovery

The client holds the logic. It queries a Service Registry to ask "Where is the Payment Service?" and then picks an available instance to call.
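
A minimal sketch of client-side discovery, assuming a registry that exposes a hypothetical HTTP endpoint listing healthy instances; the registry URL, path, and response shape are assumptions, not any specific registry's API.

    interface ServiceInstance {
      host: string;
      port: number;
    }

    // Ask the registry for all healthy instances of a service, by name.
    async function discover(serviceName: string): Promise<ServiceInstance[]> {
      const res = await fetch(`http://registry.internal:8500/services/${serviceName}/healthy`);
      if (!res.ok) throw new Error(`Registry lookup failed: ${res.status}`);
      return (await res.json()) as ServiceInstance[];
    }

    // The client owns the load-balancing decision; here, a naive random pick.
    async function callPaymentService(path: string): Promise<Response> {
      const instances = await discover("payment-service");
      if (instances.length === 0) throw new Error("No healthy payment-service instances");
      const target = instances[Math.floor(Math.random() * instances.length)];
      return fetch(`http://${target.host}:${target.port}${path}`);
    }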

2. Server-Side Discovery (Platform Native)

The client calls a generic Virtual IP or DNS name, and the platform (Load Balancer) handles the routing to a healthy instance. The client is unaware of the backend complexity.


Practical Example: E-Commerce Platform

Monolith to Microservices Decomposition

Starting Point: Monolithic E-Commerce

Problems with Monolith:

  • A Checkout team deployment blocks Search team updates
  • A Black Friday traffic spike forces scaling the entire application
  • Payment processing cannot be scaled independently

Decomposition Strategy: Business Capabilities

Split the monolith along what the business does: Orders, Products, Payments, and Search each become a separately deployed service owned by a single team.

Database-per-Service Example

Orders Service Schema

  • Table: orders with columns for id, user_id, status, total_amount, created_at
  • Table: order_items with order_id, product_id, quantity, price
  • Includes foreign key relationships and appropriate constraints
  • Optimized for order-related queries and business logic

Products Service Schema

  • Table: products with columns for id, name, price, inventory_count
  • Independent database with its own connection pool
  • No shared tables with the Orders Service
  • Optimized for product catalog operations (both schemas are sketched below)
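
A sketch of the two data models in TypeScript to make the ownership boundary explicit; the column names follow the descriptions above, while the types and status values are assumptions.

    // Orders Service: owns orders and order_items, knows nothing about product details.
    interface Order {
      id: string;
      userId: string;
      status: "pending" | "paid" | "shipped" | "cancelled"; // assumed status values
      totalAmount: number;
      createdAt: Date;
    }

    interface OrderItem {
      orderId: string;   // foreign key to orders.id inside the same database
      productId: string; // opaque reference; NOT a foreign key into another service's tables
      quantity: number;
      price: number;     // price captured at purchase time, copied rather than joined
    }

    // Products Service: owns the catalog, in its own, separate database.
    interface Product {
      id: string;
      name: string;
      price: number;
      inventoryCount: number;
    }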

API Join Example (Cross-Service Query)

Problem: Need order details with product information

BAD Approach: Direct database access across service boundaries

  • Query the Orders database directly and join against the Products tables
  • Violates service autonomy and the database-per-service rule
  • Creates tight coupling at the data layer

GOOD Approach: API Join pattern

  • Call Orders Service API to get order details
  • For each order item, call Products Service API to get product details
  • Combine the results in the application layer, as sketched below
  • Maintains service boundary integrity
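
A minimal sketch of the API join, assuming both services expose simple REST endpoints on the stable DNS names used later in this document (orders-service:3000, products-service:3000); the paths and response shapes are illustrative.

    // Hypothetical response shapes; the real APIs define their own contracts.
    interface OrderResponse {
      id: string;
      items: { productId: string; quantity: number; price: number }[];
    }

    interface ProductResponse {
      id: string;
      name: string;
    }

    // "API join": fetch the order, then enrich each line item via the Products API.
    async function getOrderWithProducts(orderId: string) {
      const orderRes = await fetch(`http://orders-service:3000/orders/${orderId}`);
      if (!orderRes.ok) throw new Error(`Orders Service returned ${orderRes.status}`);
      const order = (await orderRes.json()) as OrderResponse;

      // Fan the product lookups out in parallel instead of calling one-by-one.
      const enrichedItems = await Promise.all(
        order.items.map(async (item) => {
          const productRes = await fetch(`http://products-service:3000/products/${item.productId}`);
          const product = (await productRes.json()) as ProductResponse;
          return { ...item, productName: product.name };
        })
      );

      return { ...order, items: enrichedItems };
    }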

Trade-offs:

  • More network calls (several API requests instead of a single SQL query)
  • Better service autonomy and maintainability
  • Can use caching or data replication to optimize performance

Service Discovery Implementation

Using Kubernetes (Server-Side Discovery):

  • Services communicate using internal DNS names (e.g., orders-service:3000)
  • Kubernetes Service objects handle load balancing across pods
  • No need to know individual pod IP addresses
  • Automatic service registration and health checks

Implementation Pattern:

  1. Define Kubernetes Service for each microservice
  2. Use stable service names in inter-service communication (see the sketch below)
  3. Platform handles pod discovery and load distribution
  4. Services can scale up/down without changing configuration
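
A minimal sketch of what server-side discovery means for calling code, assuming the service name and port from the example above; the environment-variable override is an assumption, not a Kubernetes requirement.

    // With server-side discovery the caller only needs the stable Service DNS name;
    // the platform's virtual IP spreads requests across whatever pods are healthy.
    const ORDERS_BASE_URL = process.env.ORDERS_BASE_URL ?? "http://orders-service:3000";

    async function getOrder(orderId: string): Promise<unknown> {
      // No registry lookup and no instance selection: the platform does the routing.
      const res = await fetch(`${ORDERS_BASE_URL}/orders/${orderId}`);
      if (!res.ok) throw new Error(`orders-service returned ${res.status}`);
      return res.json();
    }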

Benefits:

  • Simplified service communication
  • Automatic failover to healthy instances
  • No manual IP management required
  • Built-in health checking

Real-World Scenarios

Scenario 1: Black Friday Traffic Spikes

Problem: The Orders service receives 100x its normal traffic, while Products service traffic stays flat

Solution: Selective Scaling Strategy

  • Scale Orders Service to 20 replicas to handle checkout traffic
  • Keep Products Service at 2 replicas (normal catalog browsing)
  • Configure appropriate CPU and memory requests for each service
  • Use horizontal pod autoscaler based on metrics

Implementation Steps:

  1. Create deployment with higher replica count for Orders Service
  2. Set resource requests and limits appropriately
  3. Configure autoscaling policies based on CPU/memory metrics (see the manifest sketch below)
  4. Monitor and adjust based on actual traffic patterns
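
Kubernetes manifests are normally written in YAML; the sketch below expresses the same configuration as plain TypeScript objects purely for illustration. The replica counts, resource values, image tag, and the 70% CPU target are assumptions to be tuned against real traffic, and selectors/labels are omitted for brevity.

    // Orders Service: scaled up for checkout traffic, with an autoscaler as a safety net.
    const ordersDeployment = {
      apiVersion: "apps/v1",
      kind: "Deployment",
      metadata: { name: "orders-service" },
      spec: {
        replicas: 20, // Black Friday baseline; the Products Service stays at 2
        template: {
          spec: {
            containers: [
              {
                name: "orders-service",
                image: "example/orders-service:1.0.0", // illustrative image tag
                resources: {
                  requests: { cpu: "500m", memory: "512Mi" },
                  limits: { cpu: "1", memory: "1Gi" },
                },
              },
            ],
          },
        },
      },
    };

    // Horizontal Pod Autoscaler: add replicas when average CPU utilization passes 70%.
    const ordersAutoscaler = {
      apiVersion: "autoscaling/v2",
      kind: "HorizontalPodAutoscaler",
      metadata: { name: "orders-service" },
      spec: {
        scaleTargetRef: { apiVersion: "apps/v1", kind: "Deployment", name: "orders-service" },
        minReplicas: 20,
        maxReplicas: 50,
        metrics: [
          {
            type: "Resource",
            resource: { name: "cpu", target: { type: "Utilization", averageUtilization: 70 } },
          },
        ],
      },
    };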

Key Benefits:

  • Cost-effective scaling (only scale what needs it)
  • Maintains performance under load
  • Independent scaling decisions per service

Scenario 2: Database Migration in Orders Service

Problem: Need to add a new column with zero downtime and without breaking service instances still running the old code

Solution: Incremental Database Migration Strategy (sketched in code after the four steps below)

Step 1: Add nullable column

  • Add new column as optional/nullable
  • Deploy code that handles both old and new schema
  • No immediate data requirements

Step 2: Update application logic

  • Deploy code that writes the new column for new records and tolerates nulls in old rows
  • Add validation for new field where appropriate
  • Ensure backwards compatibility

Step 3: Backfill existing data

  • Populate null values with default or computed values
  • Run in batches to avoid performance impact
  • Monitor progress and handle failures gracefully

Step 4: Make column required

  • Add NOT NULL constraint once data is populated
  • Deploy final version requiring the field
  • Remove old conditional logic
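
A minimal sketch of the SQL side of the migration, assuming PostgreSQL and the node-postgres (pg) client; the column name shipping_method, the 'standard' default, and the batch size are illustrative. Step 2 lives in application code, so only steps 1, 3, and 4 appear here, and each one is deployed and verified on its own.

    import { Client } from "pg";

    // Step 1: expand. A nullable column breaks neither old rows nor old code.
    const ADD_NULLABLE_COLUMN = `
      ALTER TABLE orders ADD COLUMN shipping_method TEXT NULL`;

    // Step 3: backfill in small batches; re-run until it reports 0 updated rows.
    const BACKFILL_BATCH = `
      UPDATE orders
         SET shipping_method = 'standard'
       WHERE id IN (
         SELECT id FROM orders WHERE shipping_method IS NULL LIMIT 1000
       )`;

    // Step 4: contract. Only after the backfill is complete and verified.
    const MAKE_COLUMN_REQUIRED = `
      ALTER TABLE orders ALTER COLUMN shipping_method SET NOT NULL`;

    async function runMigrationStep(sql: string): Promise<number> {
      const client = new Client({ connectionString: process.env.ORDERS_DB_URL });
      await client.connect();
      try {
        const result = await client.query(sql);
        return result.rowCount ?? 0; // rows touched; 0 for pure DDL statements
      } finally {
        await client.end();
      }
    }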

Benefits:

  • Zero downtime migrations
  • Rollback capability at each step
  • No impact on other services

Scenario 3: Debugging Cross-Service Issues

Problem: A user reports "Payment failed". Which service is at fault?

Solution: Distributed Tracing Implementation

Key Components:

  • Trace ID: Unique identifier for entire request flow
  • Span ID: Individual operation within a trace
  • Parent/Child relationships showing call hierarchy
  • Context propagation across service boundaries

Implementation Pattern:

  1. Start the trace at the entry point (API Gateway)
  2. Create a span for each service call
  3. Propagate the trace context in HTTP headers (see the sketch below)
  4. Child services continue the same trace
  5. Collect spans into a centralized tracing backend
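
A minimal sketch using the OpenTelemetry JavaScript API (@opentelemetry/api); SDK setup and exporter configuration are omitted, and the payments-service URL and span names are assumptions.

    import { trace, context, propagation, SpanStatusCode } from "@opentelemetry/api";

    const tracer = trace.getTracer("checkout-service");

    // One span per outbound call; the trace context rides along in the HTTP headers,
    // so the Payments Service can attach its own spans to the same trace ID.
    async function chargeCustomer(orderId: string, amountCents: number): Promise<void> {
      await tracer.startActiveSpan("payments.charge", async (span) => {
        try {
          // Inject the current trace context (trace ID, span ID) into the carrier headers.
          const headers: Record<string, string> = { "content-type": "application/json" };
          propagation.inject(context.active(), headers);

          const res = await fetch("http://payments-service:3000/charges", {
            method: "POST",
            headers,
            body: JSON.stringify({ orderId, amountCents }),
          });
          if (!res.ok) {
            span.setStatus({ code: SpanStatusCode.ERROR, message: `HTTP ${res.status}` });
          }
        } finally {
          span.end(); // the finished span is exported to the tracing backend (Jaeger, Zipkin, ...)
        }
      });
    }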

Benefits:

  • Complete request visibility across services
  • Easy identification of bottlenecks and failures
  • Performance optimization opportunities
  • Root cause analysis capability

Popular Tools:

  • Jaeger, Zipkin for open-source tracing
  • AWS X-Ray, Google Cloud Trace for managed solutions
  • OpenTelemetry for vendor-neutral instrumentation

The Microservice Tax

Microservices solve organizational scaling (100 developers working in parallel), not technical scaling. They introduce massive complexity:

  1. Latency: Every in-process function call becomes a network hop.
  2. Consistency: No ACID transactions across service boundaries (you need patterns like Sagas).
  3. Observability: You need distributed tracing just to find where an error occurred.