Amazon Glacier
Low-cost archival storage for long-term data retention
Glacier is like a deep freeze for your data: incredibly cheap storage for stuff you rarely need but must keep for years (think legal documents, old backups, compliance records). The trade-off? Retrieving data usually takes time, anywhere from minutes to hours, depending on how much you're willing to pay. It's like storing boxes in a warehouse basement: super cheap rent, but when you need something, it takes a while to dig it out. Perfect for 'write once, read never (or almost never)' scenarios. You trade access speed for big savings: storage can cost up to 95% less than S3 Standard.
Amazon Glacier (now part of the S3 Glacier storage classes) provides ultra-low-cost archival storage with retrieval times ranging from milliseconds to hours, depending on the storage class. S3 Glacier offers three storage classes: S3 Glacier Instant Retrieval (millisecond access, for data accessed about once a quarter), S3 Glacier Flexible Retrieval (minutes to hours, for data accessed about once a year), and S3 Glacier Deep Archive (12-48 hours, for data retained 7-10 years or longer). Data is stored either as archives within vaults (Glacier API) or as S3 objects with a Glacier storage class (S3 API).
Key Capabilities
- S3 Glacier Flexible Retrieval offers three retrieval tiers: Expedited (1-5 minutes), Standard (3-5 hours), and Bulk (5-12 hours); the faster the tier, the higher the retrieval cost
- S3 Glacier Instant Retrieval provides millisecond access to archived data while still delivering low archival storage costs
- Integrates with S3 lifecycle rules to automatically archive objects from Standard or IA storage classes after a defined number of days
- Vault Lock enforces WORM (Write Once Read Many) compliance policies that are immutable once applied, supporting regulatory retention requirements
- S3 Glacier Flexible Retrieval and Deep Archive carry a 90-day and 180-day minimum storage duration respectively; early deletion incurs a prorated fee
- Data is redundantly stored across multiple AZs within a region, providing the same durability guarantee (11 nines) as standard S3
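The lifecycle integration listed above can be sketched as a small helper that builds a rule set. This is a minimal sketch, not an official template: the function name `build_archive_lifecycle` and the `imaging/` prefix are made up for illustration. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument, where `GLACIER` is the storage class name for Flexible Retrieval and `DEEP_ARCHIVE` for Deep Archive. Nothing here calls AWS, so it runs offline.

```python
import json


def build_archive_lifecycle(prefix, glacier_after_days, deep_archive_after_days):
    """Build an S3 lifecycle configuration that archives objects in two steps.

    Hypothetical helper: objects under `prefix` move to Glacier Flexible
    Retrieval first, then to Glacier Deep Archive.
    """
    if glacier_after_days < 0:
        raise ValueError("glacier_after_days must not be negative")
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("Deep Archive transition must come after the Glacier one")

    return {
        "Rules": [
            {
                "ID": "archive-" + (prefix.strip("/") or "all"),
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # "GLACIER" = S3 Glacier Flexible Retrieval
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Assumed example values: archive after 30 days, deep-archive after a year.
    config = build_archive_lifecycle("imaging/", 30, 365)
    print(json.dumps(config, indent=2))
```

To apply it, the resulting dictionary would be passed as `LifecycleConfiguration` alongside a bucket name of your own; check the current AWS documentation for any constraints on transition timing before relying on specific day counts.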
Gotchas & Constraints
- Gotcha #1: Glacier has minimum storage duration charges. Deleting data before 90 days (Flexible Retrieval) or 180 days (Deep Archive) incurs early deletion fees.
- Gotcha #2: Retrieval costs money. Bulk retrievals are cheap but slow; expedited retrievals are fast but expensive.
- Constraints: Not suitable for frequently accessed data. Use S3 Intelligent-Tiering if access patterns are unpredictable.
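Both gotchas are easy to put into numbers. The sketch below is a rough illustration, not a billing calculator: the minimum durations and tier time ranges come from the notes above, while the per-GB storage rates are assumed placeholder values and the 30-day month is a simplification. The function names are hypothetical.

```python
# Minimum storage durations in days, as stated in the notes above.
MIN_STORAGE_DAYS = {"GLACIER": 90, "DEEP_ARCHIVE": 180}

# ASSUMED illustrative storage rates in USD per GB-month; check current pricing.
ASSUMED_RATE_PER_GB_MONTH = {"GLACIER": 0.0036, "DEEP_ARCHIVE": 0.00099}

# Worst-case retrieval time in hours for each Flexible Retrieval tier,
# ordered from cheapest to most expensive.
TIERS_CHEAPEST_FIRST = [
    ("Bulk", 12.0),
    ("Standard", 5.0),
    ("Expedited", 5 / 60),
]


def early_deletion_fee(storage_class, size_gb, days_stored):
    """Approximate prorated fee for deleting before the minimum duration.

    Charges the storage rate for the days remaining in the minimum period,
    treating a month as 30 days (a simplification).
    """
    remaining_days = max(0, MIN_STORAGE_DAYS[storage_class] - days_stored)
    rate = ASSUMED_RATE_PER_GB_MONTH[storage_class]
    return size_gb * rate * remaining_days / 30


def cheapest_tier_within(deadline_hours):
    """Return the cheapest tier whose worst-case time fits the deadline."""
    for name, worst_case_hours in TIERS_CHEAPEST_FIRST:
        if worst_case_hours <= deadline_hours:
            return name
    return None  # no tier is fast enough


if __name__ == "__main__":
    # 1,000 GB deleted from Flexible Retrieval after 30 days: 60 days remain.
    print(early_deletion_fee("GLACIER", 1000, 30))
    print(cheapest_tier_within(24))  # Bulk
    print(cheapest_tier_within(1))   # Expedited
```

The tier picker walks from cheapest to fastest on purpose: if the deadline allows it, the slow tier is the right default, and paying for Expedited should be the exception.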
A healthcare provider must retain patient records for 10 years to meet its regulatory retention requirements. They have 2PB of medical imaging data (X-rays, MRIs, CT scans) that's rarely accessed after 1 year but must be available if needed. Storing this in S3 Standard would cost about $46,000/month. They implement a lifecycle policy: new data stays in S3 Standard for 30 days (frequent access during active treatment), transitions to S3 Glacier Flexible Retrieval after 30 days (occasional access for follow-ups), and moves to S3 Glacier Deep Archive after 1 year (long-term archival). Once the data has aged into Deep Archive, storage cost drops to about $2,000/month, a 96% reduction. When a patient requests records from 5 years ago, they initiate a standard retrieval from Deep Archive (up to 12 hours, roughly $0.02/GB), and the data is available the next day. Because the records live in S3 rather than in Glacier vaults, they enforce immutability with S3 Object Lock in compliance mode (the S3 counterpart to Glacier Vault Lock): records can't be deleted or overwritten for 10 years, ensuring compliance.
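The cost figures in this scenario can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch under stated assumptions: a flat $0.023/GB-month for S3 Standard and $0.00099/GB-month for Deep Archive (assumed rates, not quoted from the notes), 1 PB counted as 1,000,000 GB, and all data already sitting in Deep Archive. It ignores volume discounts, request and retrieval fees, and the newer data still in Standard or Flexible Retrieval.

```python
GB_PER_PB = 1_000_000  # decimal petabyte, an assumption for round numbers

# ASSUMED flat rates in USD per GB-month; real pricing is tiered and regional.
ASSUMED_STANDARD_RATE = 0.023
ASSUMED_DEEP_ARCHIVE_RATE = 0.00099


def monthly_cost(size_pb, rate_per_gb_month):
    """Monthly storage cost in USD for `size_pb` petabytes at a flat rate."""
    return size_pb * GB_PER_PB * rate_per_gb_month


def percent_reduction(before, after):
    """Percentage saved when going from `before` to `after`."""
    return (before - after) / before * 100


if __name__ == "__main__":
    standard = monthly_cost(2, ASSUMED_STANDARD_RATE)
    deep_archive = monthly_cost(2, ASSUMED_DEEP_ARCHIVE_RATE)
    print(round(standard))      # 46000
    print(round(deep_archive))  # 1980
    print(round(percent_reduction(standard, deep_archive)))  # 96
```

Under these assumptions the numbers line up with the scenario: roughly $46,000 versus just under $2,000 per month, which rounds to a 96% reduction.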