Compute

    Amazon EC2

    Virtual servers in the cloud with configurable compute capacity

    Imagine you need a computer to run your software, but you don't want to buy one, maintain it, or worry about it breaking. EC2 is like renting a computer by the hour from Amazon's massive warehouse of computers. You pick the size: maybe you need a tiny laptop's worth of power, or maybe you need a supercomputer. You turn it on, use it, and turn it off, paying only for the time you use it, just like renting a car. If the hardware underneath your computer fails, Amazon simply hands you another one. And if you suddenly need 100 computers instead of one, you get them in minutes, not months.

    EC2 instances are virtual machines running on AWS's physical servers; current-generation instance types run on the Nitro hypervisor (older generations use Xen). When you launch an instance, you're selecting an instance type (compute, memory, storage, or GPU optimized), an AMI (Amazon Machine Image: your OS and software), and a VPC subnet (your network location). Key configurations include security groups (stateful firewalls), IAM roles (permissions without credentials), and EBS volumes (persistent block storage).
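A minimal sketch of that launch call using boto3, the AWS SDK for Python. Every resource ID below (AMI, subnet, security group, IAM instance profile name) is a placeholder assumption, not a real value; the call only succeeds with valid credentials and IDs from your own account:

```python
import boto3  # AWS SDK for Python (third-party: pip install boto3)

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch one t3.medium instance; every ID here is a placeholder.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",             # AMI: OS + preinstalled software
    InstanceType="t3.medium",                    # instance type (compute/memory/...)
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",         # VPC subnet: network location
    SecurityGroupIds=["sg-0123456789abcdef0"],   # security group: stateful firewall
    IamInstanceProfile={"Name": "my-app-role"},  # IAM role: permissions, no credentials
    BlockDeviceMappings=[{                       # EBS volume: persistent block storage
        "DeviceName": "/dev/xvda",
        "Ebs": {"VolumeSize": 20, "VolumeType": "gp3"},
    }],
    MetadataOptions={"HttpTokens": "required"},  # enforce IMDSv2 (see gotchas below)
)
print(response["Instances"][0]["InstanceId"])
```

Each keyword argument maps one-to-one onto a configuration named in the paragraph above, which is why the launch call is a useful mental checklist for what an instance actually is.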

    Gotchas & Constraints

    Gotcha #1: An instance in a public subnet needs a public IP or Elastic IP to be internet-accessible, but it also needs an Internet Gateway attached to the VPC and a route table entry pointing to that gateway. Missing any one piece breaks connectivity.

    Gotcha #2: Instance Metadata Service (IMDS) v1 answers any plain GET request, which makes it reachable through SSRF vulnerabilities in your application; always enforce IMDSv2, which requires a session token and defaults to a hop limit of 1.

    Constraints: An instance is bound to a single Availability Zone (not the whole region), so an AZ failure takes it down. For multi-AZ resilience, run an Auto Scaling Group with instances spread across AZs behind an Elastic Load Balancer.
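The IMDSv2 flow behind Gotcha #2 is a two-step session: obtain a token with an HTTP PUT, then attach it to every metadata read. A sketch using only the standard library; the 169.254.169.254 endpoint is link-local, so this only works when run on an EC2 instance:

```python
import urllib.request

IMDS = "http://169.254.169.254"

# Step 1: PUT a token request; the TTL header is mandatory.
# SSRF bugs typically let attackers issue GETs, not PUTs with custom
# headers, which is what makes the v2 handshake a defense.
token_req = urllib.request.Request(
    f"{IMDS}/latest/api/token",
    method="PUT",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},  # 6 h max
)
token = urllib.request.urlopen(token_req).read().decode()

# Step 2: read metadata with the session token attached.
meta_req = urllib.request.Request(
    f"{IMDS}/latest/meta-data/instance-id",
    headers={"X-aws-ec2-metadata-token": token},
)
print(urllib.request.urlopen(meta_req).read().decode())
```

With `HttpTokens: required` set on the instance, the tokenless v1 path returns 401, closing the classic SSRF route to instance credentials.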

    A SaaS startup runs a web application that sees traffic spikes during business hours (9 AM to 5 PM Eastern) but is quiet at night. Previously, they ran physical servers sized for peak load, wasting 70% of capacity off-hours. With EC2 and Auto Scaling, they run 2 t3.medium instances during off-peak (the minimum for redundancy) and scale out to 20 c5.large instances during peak hours. Auto Scaling watches CloudWatch metrics: when CPU exceeds 70% for 2 minutes, it launches more instances; when it drops below 30%, it terminates extras. They attach instances to an Application Load Balancer for traffic distribution and health checks.
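The scaling rule in that scenario can be sketched as a pure function. The 70%/30% thresholds and the 2-to-20 instance bounds come from the scenario; the step size of one instance per evaluation period is an assumption, matching a simple-scaling policy rather than step or target-tracking scaling:

```python
MIN_INSTANCES = 2    # off-peak floor: minimum for redundancy
MAX_INSTANCES = 20   # peak ceiling from the scenario

def desired_capacity(current: int, avg_cpu: float) -> int:
    """Return the instance count after one evaluation period."""
    if avg_cpu > 70.0:           # sustained high CPU: scale out
        return min(current + 1, MAX_INSTANCES)
    if avg_cpu < 30.0:           # low CPU: scale in, never below the floor
        return max(current - 1, MIN_INSTANCES)
    return current               # within the band: hold steady

# Morning ramp-up: CPU stays hot, so the fleet grows one instance
# per evaluation period, from the off-peak floor of 2 up to 6.
fleet = 2
for cpu in [85, 90, 80, 75]:
    fleet = desired_capacity(fleet, cpu)
print(fleet)  # 6
```

The min/max clamps are what keep the group inside its configured bounds no matter how extreme the metric gets, which is exactly the role of an Auto Scaling Group's minimum and maximum capacity settings.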

    The Result

    60% cost reduction, better performance during spikes, and automatic recovery from instance failures.

    Official AWS Documentation