Distributed And Cloud-Based Storage Systems
TTh 12:30-1:45, CSI 3120

The guiding philosophy of this course is that the best way to learn about real systems is to build one. We will gain an in-depth understanding of the issues involved in designing and deploying large-scale distributed file systems. In the course of this investigation we will be tackling a variety of topics, such as peer-to-peer systems, remote procedure calls, multi-threading, consensus protocols, cloud systems, layered systems (supporting high-level consistency guarantees on top of cloud services), and security as it relates to such systems.



Pete Keleher <keleher@cs.umd.edu> (include "818" in all correspondence)
Office hours: By appt.


The class will consist of lectures by the instructor, student project presentations, a final, and a series of probably four programming projects, all in the language Go (fear not if you don't know anything about Go, we'll all be learning together). The end goal is to have built a full-scale reliable, highly-available, and secure distributed file system, using both local disks and cloud services as backing stores. My lectures will be split between those describing the tools we will use to build our file systems, and those based on recent research in the literature (such as papers from FAST, OSDI, NSDI, and SOSP).

Examples of technologies we may use include FUSE (and MacFUSE), key-value stores like Bolt, gkvlite, diskv, or leveldb-go, the Amazon Simple Storage Service (and its Go bindings), Google's Protocol Buffers or JSON (from Go), Google's Go language, Paxos, SQLite, and Snappy.

      Note: this paper list will change by the end of the second week.

Tuesday Thursday
Aug 31
Reading: A Tour of Go, and Effective Go

Solve the following puzzle, copy your solution into a fresh playground, and send me the "Share" url before class Thursday.


Sep 2

"Immutability Changes Everything"


Sep 7
"The Design and Implementation of a Log-Structured File System"
"A Low-bandwidth Network File System"



Sep 9
Global system event orderings.

Project 1: Learning Go, In-Memory File System due Sunday night.

Sep 14
"MapReduce: simplified data processing on large clusters."
"Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing."



Sep 16
"The Google File System"
"GFS: Evolution on Fast-forward"



Sep 21
"Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
"Deciding when to forget in the Elephant file system"



Sep 23
"File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution."
"Lineage stash: fault tolerance off the critical path"

Project 2: Serialization, Persistence, and Immutability, due Sunday night.

Sep 28
"Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications" - kaitlyn
"Session types for Rust" - aaron

(talks: chord, rust)

Sep 30
Oct 5
"The worldwide computer"
"OceanStore: An Architecture for Global-Scale Persistent Storage"


Oct 7
"Paxos Made Simple"
"Paxos Made Live: an Engineering Perspective"


Project 3: Log Synchronization due Sunday night.

Oct 12
"In search of an understandable consensus algorithm"
"Egalitarian paxos"

(epaxos and raft)

Oct 14
"Scalable Causal Consistency for Wide-Area Storage with COPS"
 "Bolt-on causal consistency"

(cops and bolton)

Oct 19
Databases. No reading

(slides both days)

Oct 21
More Databases. No reading


(slides see above)

Oct 26
"Dynamo: Amazon's highly available key-value store" - akilesh
"Monarch: Google's planet-scale in-memory time series database" - elliot
Oct 28
PaPoC '21 workshop papers
"Read-Write Quorum Systems Made Practical"
"Convergent Causal Consistency for Social Media Posts" - kelsey

Project 4: Distributed Consensus: EPaxos due Sunday night.

Nov 2
PaPoC '21 workshop papers
"Totally-Ordered Prefix Parallel Snapshot Isolation" - anubhav
"Towards the Synthesis of Coherence/Replication Protocols from Consistency Models via Real-Time Orderings" - claude
Nov 4
"Tango: Distributed data structures over a shared log"
"Salt: Combining ACID and BASE in a Distributed Database"

(notes: tango and salt)

Nov 9
"Fast and secure global payments with Stellar" - william
"CALM: when distributed consistency is easy"
Nov 11
"Fast crash recovery in RAMCloud" - nikhil
"Implementing Linearizability at Large Scale and Low Latency"

(ram durable, lin notes)

Nov 16
"Spanner: Google's Globally-Distributed Database" - jerry
Background (no blog): "Living Without Atomic Clocks" (CockroachDB)
Nov 18
"State-machine replication for planet-scale systems"
"High performance I/O for large scale deep learning" - Dhanvee
(AIStore slides, Atlas notes)
Nov 23
Thanksgiving

Nov 25
Thanksgiving

Project 5a: Consistent Shared Objects due Sunday night-ish.

Nov 30
"The Fuzzylog: a Partially Ordered Shared Log"
"Calvin: fast distributed transactions for partitioned database systems"
Dec 2
Fault Tolerance and Security
"Practical Byzantine Fault Tolerance"
"Separating agreement from execution for byzantine fault tolerant services"
Dec 7
Hash Chains
"Bitcoin: A Peer-to-Peer Electronic Cash System"
"Architecture of the hyperledger blockchain fabric"
Dec 9
More Databases. Correctness Anomalies Under Serializable Isolation (no blog)

Project 5b: Transactions

Late Policies

All projects will have a due date, and a late due date two days later.
  • Do each project by yourself. Sadly, each semester we detect and fail students who do not abide by this policy. You may ask, and answer, general questions on Piazza.
  • Your grade loses 20% of the max score if the project is turned in after the due date but by the late due date. Anything turned in after the late due date receives a zero.

Attendance and general grading policies

Students are responsible for all material covered, and all announcements, deadlines, policies, etc., discussed in lecture and discussion section, regardless of whether they were in class to hear the information or not. It’s understood that students may occasionally have to miss class for various reasons, but email and office hours are not intended as a replacement for class attendance. Consequently, only students who typically and regularly attend class will receive assistance during office hours.

Coursework will count toward the final grade according to the following percentages:

  1. Projects: 65%
    • There will be six projects, the first worth 10%, the rest 11% each.
    • Must get at least half credit on each project to pass the course.
  2. Blog entries: 10%
    • You are required to upload a blog entry before each class except the first. More details in class.
  3. Paper presentation / class participation: 5%
  4. Final exam: 20%
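
As a rough illustration of how the percentages above combine, here is a minimal Go sketch. The names finalGrade and ErrProjectFloor, the representation of each score as a fraction between 0 and 1, and reporting the half-credit rule as an error are all assumptions made for the example; actual grades may be computed differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Weights in percent of the final grade, taken from the list above.
// Six projects: the first is worth 10%, the rest 11% each (65% total).
var projectWeights = [6]float64{10, 11, 11, 11, 11, 11}

const (
	blogWeight          = 10.0
	participationWeight = 5.0
	examWeight          = 20.0
)

// ErrProjectFloor reports that some project earned less than half credit.
// Assumption: the "must get at least half credit" rule is modeled as an error.
var ErrProjectFloor = errors.New("less than half credit on at least one project")

// finalGrade combines fractional scores (0.0 to 1.0) into a percentage.
func finalGrade(projects [6]float64, blog, participation, exam float64) (float64, error) {
	total := blog*blogWeight + participation*participationWeight + exam*examWeight
	for i, p := range projects {
		if p < 0.5 {
			return 0, ErrProjectFloor
		}
		total += p * projectWeights[i]
	}
	return total, nil
}

func main() {
	// Hypothetical scores, purely for illustration.
	grade, err := finalGrade([6]float64{1, 0.9, 0.8, 1, 0.75, 0.5}, 0.9, 1, 0.85)
	if err != nil {
		fmt.Println("cannot pass:", err)
		return
	}
	fmt.Printf("final grade: %.1f%%\n", grade)
}
```

The project floor is checked before anything else matters: a single project below half credit fails the course no matter how strong the other components are.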

Academic integrity

The Campus Senate has adopted a policy asking students to include the following statement on each examination or assignment in every course: “I pledge on my honor that I have not given or received any unauthorized assistance on this examination (or assignment).” Consequently, you will be requested to include this pledge on each exam and project. You may review the University’s Code of Academic Integrity for yourself at
