Networking & Content Delivery

    Elastic Load Balancing (ALB/NLB)

    Automatically distributes incoming traffic across multiple targets

    A load balancer is like a traffic cop at a busy intersection, directing cars (requests) into different lanes (servers) so that no single lane is overwhelmed. Picture a restaurant with 5 waiters: if every customer went to the same waiter, that waiter would be swamped while the others stood idle. A load balancer spreads customers across all the waiters. AWS offers four types: Application Load Balancer (ALB) for HTTP/HTTPS traffic with content-based routing, Network Load Balancer (NLB) for very high-throughput TCP/UDP traffic, Gateway Load Balancer (GWLB) for routing traffic through virtual network appliances, and Classic Load Balancer (legacy). Load balancers also run health checks: if a server fails, they automatically stop sending traffic to it, so users reach only healthy servers.

    ELB distributes traffic across multiple targets (EC2 instances, containers, IP addresses, and, for ALB, Lambda functions) in one or more AZs. ALB operates at Layer 7 (application) and supports path-based routing (/api → backend, /images → image server), host-based routing (api.example.com → API servers), HTTP/2, and WebSocket. NLB operates at Layer 4 (transport), providing ultra-low latency, one static IP address per AZ, and the capacity to handle millions of requests per second.
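
    The host-based and path-based routing described above can be sketched as a small rule matcher. This is an illustrative model only, not the ALB implementation or an AWS API: the `Rule` class, the `route` function, and the target group names (`api-servers`, `backend`, `image-servers`, `web-servers`) are hypothetical. It assumes rules are evaluated in priority order (lowest number first) with a default action as the fallback, and it uses Python's `fnmatch` as an approximation of ALB wildcard patterns.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int                       # lower number is evaluated first
    target_group: str                   # hypothetical target group name
    host_pattern: Optional[str] = None  # e.g. "api.example.com"
    path_pattern: Optional[str] = None  # e.g. "/api/*"

    def matches(self, host: str, path: str) -> bool:
        # Hostnames are compared case-insensitively, paths case-sensitively.
        if self.host_pattern and not fnmatchcase(host.lower(), self.host_pattern.lower()):
            return False
        if self.path_pattern and not fnmatchcase(path, self.path_pattern):
            return False
        return True


def route(rules: list[Rule], host: str, path: str, default: str = "web-servers") -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path):
            return rule.target_group
    return default


# Illustrative rules mirroring the examples in the text.
RULES = [
    Rule(priority=10, target_group="api-servers", host_pattern="api.example.com"),
    Rule(priority=20, target_group="backend", path_pattern="/api/*"),
    Rule(priority=30, target_group="image-servers", path_pattern="/images/*"),
]

if __name__ == "__main__":
    print(route(RULES, "www.example.com", "/api/orders"))  # backend
    print(route(RULES, "API.example.com", "/"))            # api-servers
    print(route(RULES, "www.example.com", "/about"))       # web-servers
```

    Because the first matching rule wins, a request to api.example.com/images/x goes to `api-servers` in this sketch: the host rule has the lower priority number. Rule ordering matters as soon as patterns overlap.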

    Key Capabilities

    • Application Load Balancer (ALB, Layer 7): routes HTTP/HTTPS traffic by host header, path, query string, and HTTP method, and supports WebSockets, HTTP/2, gRPC, and Lambda targets
    • Network Load Balancer (NLB, Layer 4): handles millions of TCP/UDP requests per second with ultra-low latency, supports static Elastic IP addresses per Availability Zone, and preserves client source IPs
    • Gateway Load Balancer (GWLB): transparently routes traffic through third-party virtual network appliances (firewalls, IDS/IPS) using the GENEVE protocol for inline security inspection
    • Health checks: continuously test target availability and automatically stop routing traffic to unhealthy instances or containers until they recover
    • ALB native authentication: integrate with Amazon Cognito or any OIDC provider to authenticate users at the load balancer before requests reach the application
    • Access logs: detailed per-request logs delivered to S3 capturing client IP, latency, request path, server response, and target response for traffic analysis and debugging
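
    The health-check behavior in the list above can be modeled as a counter of consecutive check results per target. The sketch below is a simplified illustration, not AWS code: `Target`, `TargetGroup`, and their method names are hypothetical, the threshold defaults (5 healthy, 2 unhealthy) are assumptions chosen for the example, and plain round-robin stands in for whatever routing algorithm the load balancer is configured to use.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class Target:
    name: str
    healthy: bool = True
    streak: int = 0  # consecutive check results that contradict `healthy`


@dataclass
class TargetGroup:
    targets: list[Target]
    healthy_threshold: int = 5    # assumed: passes needed to recover
    unhealthy_threshold: int = 2  # assumed: failures needed to be removed
    _counter: count = field(default_factory=count, repr=False)

    def record_check(self, target: Target, passed: bool) -> None:
        """Update a target's state from one health-check result."""
        if passed == target.healthy:
            target.streak = 0  # result agrees with current state
            return
        target.streak += 1
        needed = self.healthy_threshold if passed else self.unhealthy_threshold
        if target.streak >= needed:
            target.healthy = passed
            target.streak = 0

    def next_target(self) -> Target:
        """Round-robin over healthy targets only."""
        healthy = [t for t in self.targets if t.healthy]
        # Simplification: if nothing is healthy, fall back to every target.
        pool = healthy or self.targets
        return pool[next(self._counter) % len(pool)]


if __name__ == "__main__":
    group = TargetGroup([Target("a"), Target("b"), Target("c")])
    b = group.targets[1]
    group.record_check(b, passed=False)  # 1 failure: still in rotation
    group.record_check(b, passed=False)  # 2 failures: removed from rotation
    print([group.next_target().name for _ in range(4)])  # ['a', 'c', 'a', 'c']
```

    Requiring several consecutive results before changing state keeps a single slow response from ejecting a target, and keeps a single lucky response from returning a flapping target to rotation.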

    Gotchas & Constraints

    Gotcha #1: ALB adds latency (milliseconds) due to Layer 7 processing; use NLB for latency-sensitive workloads. Gotcha #2: Load balancers are regional; for global load balancing, use Route 53 with health checks. Constraints: ALB supports HTTP/HTTPS only (no TCP/UDP), NLB doesn't support path-based or other content-based routing, and an ALB requires subnets in at least 2 AZs (an NLB can run in a single AZ, though multiple AZs are recommended for availability).

    An e-commerce site runs 20 EC2 instances across 3 AZs. Without a load balancer, they'd need to distribute traffic manually or use DNS round-robin (no health checks, no SSL termination). They deploy an ALB in front of the instances and configure target groups for different services: /api → API servers, /checkout → payment servers, /images → image servers. The ALB performs health checks every 30 seconds; with an unhealthy threshold of 2 consecutive failures, it stops routing traffic to a failed instance after about 1 minute. For SSL, they attach an ACM certificate to the ALB, which handles SSL termination (instances receive plain HTTP, reducing CPU load). During Black Friday, traffic spikes 10x; Auto Scaling launches 100 more instances, and the ALB automatically starts routing to them once they register with the target group and pass health checks. The ALB distributes traffic evenly, preventing any instance from being overwhelmed. They enable access logs to S3 for analytics and associate an AWS WAF web ACL with the ALB to block malicious traffic.
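
    The "about 1 minute" failover figure in the scenario follows from the check interval and the number of consecutive failures required. The arithmetic can be sketched as below. The function name is hypothetical, the unhealthy threshold of 2 is an assumption, and the estimate ignores the per-check timeout, which can add a few seconds when a target hangs rather than refusing connections.

```python
def detection_window_seconds(interval: int = 30, unhealthy_threshold: int = 2) -> tuple[int, int]:
    """Return (best_case, worst_case) seconds until a failed target is marked unhealthy.

    Best case: the target fails just before a check, so only
    (threshold - 1) further intervals are needed.
    Worst case: it fails just after a check, so a full
    `threshold` intervals pass before the last failed check.
    """
    best = (unhealthy_threshold - 1) * interval
    worst = unhealthy_threshold * interval
    return best, worst


if __name__ == "__main__":
    print(detection_window_seconds())       # (30, 60)
    print(detection_window_seconds(10, 2))  # (10, 20)
```

    Shortening the interval narrows the window in which requests can still reach a failed instance, at the cost of more health-check traffic to every target.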

    The Result

    99.99% availability, automatic failover, and simplified SSL management.
