MapReduce
MapReduce: Simplified Data Processing on Large Clusters
Chord
The Chord Protocol and Consistent Hashing
The Google File System Design Condensed
The Google File System paper was a lot of fun to read. This is what I got out of it while trying to figure out its main design points; parts are summarized from the paper.
Amazon's Dynamo Design Condensed
This article is on Amazon’s Dynamo, a highly distributed key-value storage system designed to be always available. Here is the paper if you want to read it in full! -> Amazon’s Dynamo <-
Google's Bigtable Design Condensed
The first thing that jumps out at you when you read the paper is Bigtable’s flexibility in terms of the data sizes and latency requirements it supports. It handles data from web indexing, Google Finance and Google Earth, from URLs to images. (And that Orkut was a Google product, who knew.)
Cassandra's Design Condensed
Cassandra is a popular decentralized, distributed key-value store designed for write-heavy workloads. Cassandra was designed at Facebook to meet its needs for reliability and scalability in the Inbox Search problem. Cassandra’s design seems to be heavily inspired by Amazon’s Dynamo.
Appendix
ACID
Atomicity, Consistency, Isolation and Durability. Relational DBs