This post is a work in progress.
Inspired by a recent purchase of the Red Book, which provides
a curated list of important papers on database systems, I’ve decided
to begin assembling a list of important papers in distributed systems.
As in the Red Book, I’ve grouped the papers into a series of
categories, each highlighting a progression of related ideas
over time within a specific area of research in the field.
In keeping with the tradition of the Red Book, I’ve included both papers
that resulted in very successful systems or techniques and papers
that introduced a concept that was either immediately dismissed or
proven incorrect. Including both emphasizes the progression of ideas
that led to the development of these systems.
The problems of establishing consensus in a distributed system.
Types of consistency, and practical solutions for ensuring atomic
operations across a set of replicas.
- Highly Available Transactions: Virtues and Limitations
Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
- Consistency Tradeoffs in Modern Distributed Database System Design
Daniel J. Abadi
- CAP Twelve Years Later: How the “Rules” Have Changed
Eric Brewer
- Optimistic Replication
Yasushi Saito and Marc Shapiro
- Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
Seth Gilbert, Nancy Lynch
- Harvest, Yield, and Scalable Tolerant Systems
Armando Fox, Eric A. Brewer
- Linearizability: A Correctness Condition for Concurrent Objects
Maurice P. Herlihy, Jeannette M. Wing
- Time, Clocks, and the Ordering of Events in a Distributed System
Leslie Lamport
Conflict-free data structures
Studies on data structures which do not require coordination to ensure
convergence to the correct value.
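To make "convergence without coordination" concrete, here is a minimal sketch of a grow-only counter, one of the simplest structures of this kind. It is an illustration only and is not taken from any paper on this list; the names `GCounter`, `replica_id`, and `merge` are my own assumptions. Each replica increments only its own slot, and merging takes the element-wise maximum, so replicas can exchange state in any order, any number of times, and still agree.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter sketch: one non-negative slot per replica (hypothetical names)."""
    replica_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever advances its own slot, so no locking
        # or agreement with other replicas is needed.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over all known slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # which is what lets replicas converge regardless of delivery order.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas update independently, then exchange state.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # merging the same state again changes nothing
```

After the exchange, both replicas hold the same per-replica slots and report the same total, even though neither waited on the other before accepting an increment.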
Distributed programming
Languages aimed towards disorderly distributed programming as well as
case studies on problems in distributed programming.
- Logic and Lattices for Distributed Programming
Neil Conway, William Marczak, Peter Alvaro, Joseph M. Hellerstein, David Maier
- Dedalus: Datalog in Time and Space
Peter Alvaro, William R. Marczak, Neil Conway, Joseph M. Hellerstein, David Maier, Russell Sears
- MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean, Sanjay Ghemawat
- A Note On Distributed Computing
Samuel C. Kendall, Jim Waldo, Ann Wollrath, Geoff Wyant
Implemented and theoretical distributed systems.
- A History Of The Virtual Synchrony Replication Model
Ken Birman
- Cassandra - A Decentralized Structured Storage System
Avinash Lakshman, Prashant Malik
- Dynamo: Amazon’s Highly Available Key-Value Store
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels
- Stasis: Flexible Transactional Storage
Russell Sears, Eric Brewer
- Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber
- The Google File System
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
- Lessons from Giant-Scale Services
Eric A. Brewer
- Towards Robust Distributed Systems
Eric A. Brewer
- Cluster-Based Scalable Network Services
Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier
- The Process Group Approach to Reliable Distributed Computing
Ken Birman
Overviews and details covering many of the above papers and concepts compiled into single resources.
I’m hoping to make this into a living document, so please submit pull
requests or leave comments!