Amazon Neptune
Fully managed graph database for connected datasets
Neptune is a database designed around relationships: it not only stores data but also models how things connect. Imagine Facebook's friend network, or Amazon's product recommendations ('customers who bought X also bought Y'). Traditional relational databases struggle with these 'who knows whom' or 'what's related to what' queries, because every additional hop usually means another join. Neptune is built for this case; it stores data as a graph (nodes and edges), so a query follows stored connections directly instead of recomputing them at query time. A request such as 'show me all friends of friends of friends', which in SQL needs a chain of self-joins that can take minutes on a large dataset, can often be answered in milliseconds. It's like having a map of connections on which you can quickly trace any path.
Neptune is a fully managed graph database supporting three query languages: Gremlin (Apache TinkerPop) and openCypher for property graphs, and SPARQL for RDF data. In the property-graph model, data is stored as vertices (nodes) and edges (relationships), a layout optimized for traversal queries. Neptune keeps six copies of the data across three Availability Zones, provides up to 15 read replicas, and supports point-in-time recovery.
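As a rough illustration of the vertex-and-edge model described above, the following Python sketch builds a tiny in-memory property graph and follows edges one hop at a time. It is not the Neptune API: the TinyGraph class, its methods, and the sample data are hypothetical names chosen for this example, and the Gremlin in the comment is only an approximate equivalent.

```python
from collections import defaultdict


class TinyGraph:
    """A toy in-memory property graph (hypothetical; not the Neptune API)."""

    def __init__(self):
        self.vertices = {}                  # vertex id -> {"label": ..., "props": {...}}
        self.out_edges = defaultdict(list)  # source id -> [(edge label, target id)]

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, "props": props}

    def add_edge(self, src, label, dst):
        self.out_edges[src].append((label, dst))

    def out(self, vid, label):
        """Return ids reachable from vid over one outgoing edge with the given label."""
        return [dst for (lbl, dst) in self.out_edges[vid] if lbl == label]


g = TinyGraph()
g.add_vertex("alice", "person", name="Alice")
g.add_vertex("bob", "person", name="Bob")
g.add_vertex("carol", "person", name="Carol")
g.add_edge("alice", "knows", "bob")
g.add_edge("bob", "knows", "carol")

# Two hops: friends of Alice's friends.
# Roughly comparable Gremlin (an assumption): g.V('alice').out('knows').out('knows')
friends_of_friends = [fof for f in g.out("alice", "knows") for fof in g.out(f, "knows")]
print(friends_of_friends)
```

Each hop is a lookup of the edges stored on a vertex rather than a join across tables, which is the property that graph stores optimize for.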
Key Capabilities
ACID transactions, full-text search through integration with Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), and machine-learning predictions on graph data via Neptune ML.
Gotchas & Constraints
Gotcha #1: Graph databases require different thinking than relational databases. You model relationships explicitly as edges, not implicitly via foreign keys.
Gotcha #2: Neptune pricing is based on instance hours and storage, so it is more expensive than DynamoDB for simple key-value lookups. Use Neptune only when you need graph traversals.
Constraints: Maximum cluster storage is 128 TiB. A cluster lives in a single region unless you configure a Neptune global database; without one, DR relies on copying snapshots to another region and restoring them. Query language support is limited to Gremlin, SPARQL, and openCypher; SQL is not supported.
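To make Gotcha #1 concrete, the sketch below takes relational-style rows, where a relationship is only implied by a foreign key column, and turns each one into an explicit, labeled edge. The table layout, column names, and the PLACED_BY edge label are made up for illustration; a real Neptune load would use its bulk loader or a query language rather than Python lists.

```python
# Relational style: the relationship lives implicitly in a foreign key column.
customers = [{"id": "c1", "name": "Ada"}, {"id": "c2", "name": "Lin"}]
orders = [
    {"id": "o1", "customer_id": "c1", "total": 40},
    {"id": "o2", "customer_id": "c1", "total": 15},
    {"id": "o3", "customer_id": "c2", "total": 99},
]


def rows_to_graph(customers, orders):
    """Turn rows into vertices and make each foreign key an explicit edge."""
    vertices = {}
    edges = []  # (source id, edge label, target id)
    for c in customers:
        vertices[c["id"]] = {"label": "customer", "name": c["name"]}
    for o in orders:
        vertices[o["id"]] = {"label": "order", "total": o["total"]}
        # The foreign key becomes a first-class edge instead of a column.
        edges.append((o["id"], "PLACED_BY", o["customer_id"]))
    return vertices, edges


vertices, edges = rows_to_graph(customers, orders)
for edge in edges:
    print(edge)
```

Note that the order vertices no longer carry a customer_id property: the connection is stored once, as an edge that a traversal can follow in either direction.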
A fraud detection team analyzes transaction networks to identify suspicious patterns. In their relational database, a query to find 'all accounts connected within 3 hops of a flagged account' requires recursive joins and takes 10 minutes. The team migrates to Neptune, modeling accounts as vertices and transactions as edges. The same query in Gremlin completes in 200ms.
When a transaction is flagged, a traversal starting from the affected account finds the connected accounts and helps identify fraud rings (multiple accounts controlled by the same person). The team uses Neptune ML to train a model that predicts fraud likelihood from graph features such as the number of connections and transaction patterns. For social features, they implement 'friend recommendations' by finding common connections: 'users who know your friends but aren't your friends yet.' In this deployment, Neptune handles 10,000 graph queries per second with sub-second latency.
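The two traversals in this scenario can be sketched in plain Python over an in-memory adjacency map. This is only an illustration of the logic, not how Neptune executes queries: the function names, account ids, and user names are hypothetical, and the Gremlin in the comment is an approximate counterpart whose labels and property names are assumptions.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def within_hops(adj, start, max_hops):
    """Breadth-first search: return {vertex: hop distance} for vertices within max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[start]  # the starting vertex is not its own neighbor
    return dist


def recommend(adj, user):
    """Rank non-friends by how many friends they share with user."""
    friends = adj.get(user, set())
    counts = defaultdict(int)
    for friend in friends:
        for candidate in adj.get(friend, ()):
            if candidate != user and candidate not in friends:
                counts[candidate] += 1
    # Most shared friends first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical transaction edges between accounts; A1 is the flagged account.
transactions = [("A1", "A2"), ("A2", "A3"), ("A3", "A4"), ("A4", "A5"), ("A1", "A6")]
accounts = build_adjacency(transactions)
# Approximate Gremlin counterpart (labels and property names are assumptions):
# g.V().has('account', 'accountId', 'A1')
#  .repeat(both('TRANSACTION').simplePath()).times(3).emit().dedup()
print(sorted(within_hops(accounts, "A1", 3).items()))

# Hypothetical friendship edges.
friendships = [("ann", "bob"), ("ann", "cy"), ("bob", "dee"), ("cy", "dee"), ("cy", "eve")]
social = build_adjacency(friendships)
print(recommend(social, "ann"))
```

With the sample data, account A5 is four hops from A1 and is therefore excluded from the 3-hop result, while dee ranks above eve for ann because dee shares two friends with her and eve shares one.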
The Result
Fraud detection accuracy improves by 40%, false positives drop by 60%, and investigation time drops from hours to minutes.