This class will introduce students to the theory and practice of
building large-scale computer systems that harness hundreds or thousands
of machines to attack problems of enormous scale. Such distributed
systems are necessary to solve problems so large that they
cannot be completed in any reasonable time on a single machine.
These systems are known variously as clusters, clouds, and grids.
Students in this class will gain experience using several
large-scale distributed systems deployed at Notre Dame and other partner
institutions around the country. Each assignment will involve writing
code or constructing a system that harnesses large numbers of machines.
This will be a highly practical class, and should be enjoyable to any student
who likes to write lots of code and make real systems work. Many students
who take this class end up using these tools in their daily work. The class is open to juniors, seniors, and graduate students.
Final talk instructions are now available.
A4 is now available, due on April 15th.
Old exams to help you study: midterm-2007, midterm-2008, final-2007, final-2008
A3 is now available.
Instructions for the course project are now available.
A2 is now available.
Due to the various network and AFS outages, the due date for A1 has been pushed back to Feb 4th.
A1 is now available.
Prof. Thain will have office hours 1-3 PM on Wednesdays.
Class Mailing List
A0 - Warm Up Assignment
A1 - High Throughput Ray Tracing with Condor
A2 - Fast Othello Using Work Queue
A3 - Chirp Performance Study
A4 - Web Indexing with Map-Reduce
Overview Paper of Condor (Lecture Notes)
Condor at Notre Dame
Condor Tutorial and Slides
Condor Reference Manual
Work Queue and Makeflow
Work Queue Web Page
Work Queue API
Makeflow Web Page
Abstractions for Cloud Computing with Condor
Overview Paper of Chirp (Lecture Notes)
Chirp Web Pages
Parrot Web Pages
Outline of Map-Reduce Notes (Lecture Notes)
Research Paper Describing Map-Reduce
Hadoop at Notre Dame
Jimmy Lin and Chris Dyer, Data-Intensive Text Processing with MapReduce, book in draft form, Feb 2009.
A. Pavlo, E. Paulson, A. Rasin, D. Abadi, D. DeWitt, S. Madden, and M. Stonebraker, A Comparison of Approaches to Large-Scale Data Analysis, in Proceedings of SIGMOD 2009.
M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks, in Proceedings of EuroSys 2007.