Course Project for Distributed Systems

The final project in this course will be open ended. You will propose, carry out, and report upon a project singly or in groups of two students. The project should be about twice the size or difficulty of one of the assigned class projects. Your project must involve each of the following three elements:
  • Build. Your project must involve building a system of some kind. You are encouraged to make use of existing systems and software packages, particularly those used in class. However, your project must involve coding of some kind, in whatever language is suited for the task.
  • Evaluate. You must evaluate what you have built for both functional correctness and quantitative performance. For correctness, you must develop a testing procedure that shows that your system operates correctly under all expected conditions. For performance, you must define an appropriate metric -- such as latency, bandwidth, or throughput -- and measure it across a range of configurations.
  • Communicate. You must present your work cogently by writing a paper and making an oral presentation. The paper should describe the motivation, architecture, technical details, evaluation methods, and quantitative results of your project. The oral presentation, fifteen minutes long and given during the last week of class, should summarize the most important aspects of the paper. More information on the requirements of each will be forthcoming.
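As one illustration of the Evaluate requirement, the sketch below times a series of calls and reports latency, throughput, and bandwidth together. The function name measure and its parameters are hypothetical; operation stands for whatever request your own system serves, and payloads must be a non-empty list of byte strings.

```python
import statistics
import time


def measure(operation, payloads):
    """Call operation(payload) for each payload and summarize the timings.

    `operation` and `payloads` are placeholders for your own system's
    request function and test data; `payloads` must not be empty.
    """
    latencies = []
    total_bytes = 0
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        operation(payload)
        latencies.append(time.perf_counter() - t0)
        total_bytes += len(payload)
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "operations": len(latencies),
        "mean_latency_s": statistics.mean(latencies),
        "max_latency_s": max(latencies),
        "throughput_ops_per_s": len(latencies) / elapsed,
        "bandwidth_bytes_per_s": total_bytes / elapsed,
    }
```

Running the same measurement across a range of configurations (for example, different numbers of nodes or payload sizes) produces the kind of data the paper should report.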
Project Ideas

    The following are rough ideas for possible projects. You are strongly encouraged to make use of software and systems employed in the required projects. Part of your job will be to flesh out the details of the project before you begin work. You may do a project that is not on this list, but discuss it with Prof. Thain first.
  • Map-Reduce on Condor. Design and implement a program called condor_map_reduce that accepts a set of files, a map program, and a reduce program. Connect everything together to perform a map-reduce using multiple CPUs and local disk to parallelize the task. Evaluate the performance of your system on a variety of workloads. (Introduction to Map-Reduce here.)
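To make the data flow concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases. It is not condor_map_reduce itself: in the project, each map and reduce step would presumably run as a separate Condor job working on files. The function names and the word-count workload are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_reduce(inputs, map_fn, reduce_fn):
    """Apply map_fn to every input, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for item in inputs:
        # Map phase: map_fn yields (key, value) pairs for one input.
        for key, value in map_fn(item):
            # Shuffle phase: collect all values that share a key.
            groups[key].append(value)
    # Reduce phase: collapse each key's values into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(text):
    """Hypothetical map program: emit (word, 1) for every word."""
    for word in text.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    """Hypothetical reduce program: total the counts for one word."""
    return sum(counts)
```

For example, map_reduce(["a b a", "b c"], word_count_map, word_count_reduce) counts each word across both inputs.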

  • Distributed Virtual Machine Facility. In both academic research and business computing, it is often necessary to precisely reproduce an execution environment: A particular program may only be compatible with a certain operating system version, or a particular application may require root privileges. To make it easy to reproduce a given environment, design a virtual machine facility that allows the user to simply issue something like "runvm redhat73 emacs", causing a virtual machine of the given type to be created, submitted to Condor to allocate a CPU, and then connected to the submitting user's display via VNC. The challenge is to figure out how to efficiently manage all of the large VM images without overloading the caller's workstation.

  • A Cloud of Web Servers. Suppose that you operate a data center for a large corporation that has highly varying loads in web traffic. When traffic is low, you only need a few servers behind your multiplexer, but when big news hits, the system should automatically expand by adding servers on idle machines, and then remove them again when load drops off. Build a system that does this, using your existing multiplexer as a front end that submits web servers to Condor (and removes them) as load comes and goes. Use Condor to generate load on the web server by submitting jobs that invoke wget to fetch web pages.
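The scaling decision at the heart of this idea can be sketched as a small pure function. Everything here is an assumption for illustration: the names, the per-server capacity figure, and the bounds on pool size are hypothetical, and a real system would need to measure load itself and then submit or remove Condor jobs accordingly.

```python
import math


def desired_servers(requests_per_s, capacity_per_server,
                    min_servers=1, max_servers=16):
    """Return how many servers the current load calls for, within bounds."""
    needed = math.ceil(requests_per_s / capacity_per_server)
    return max(min_servers, min(max_servers, needed))


def scaling_action(current, requests_per_s, capacity_per_server,
                   min_servers=1, max_servers=16):
    """Return the change in pool size: positive to add, negative to remove."""
    target = desired_servers(requests_per_s, capacity_per_server,
                             min_servers, max_servers)
    return target - current
```

A production design would likely add hysteresis, so that a load hovering near a threshold does not cause servers to be started and stopped repeatedly.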

  • Peer to Peer Preservation. Design a peer-to-peer preservation system. Suppose that a user wishes to preserve a document forever. If the user delivers it to any one node in the system, then that node should take active steps to communicate with other known peers to make further copies of the document. Be careful to ensure that the system can recover from the loss of multiple nodes, but does not cause continuous unnecessary network traffic. You may borrow ideas from other systems such as Gnutella, Freenet, or Chord, provided that you implement the system yourself. Start by reviewing section 4.5.2 in the textbook.
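One way to balance recovery against network traffic is to copy a document only when the number of live replicas falls below a target. The sketch below shows that decision for a single document; the function name, the set-based view of peers, and the target_copies parameter are all hypothetical, and how a node learns which peers are alive is left to your design.

```python
def plan_replication(holders, live_peers, target_copies):
    """Pick peers that should receive a new copy of one document.

    holders: set of peer names believed to store the document.
    live_peers: set of peer names currently reachable.
    target_copies: desired number of live replicas (an assumed policy knob).
    """
    live_holders = holders & live_peers
    missing = target_copies - len(live_holders)
    if missing <= 0:
        # Enough live replicas already exist, so send no traffic.
        return []
    # Sorting makes the choice deterministic, which simplifies testing.
    candidates = sorted(live_peers - live_holders)
    return candidates[:missing]
```

If fewer candidates are available than copies missing, the function returns as many as it can and the shortfall is picked up on a later pass.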

  • Chained Message Queueing System. Build a chained message queueing system like that described in section 4.3. It should consist of a message forwarding process that reads files from a directory and then delivers them to an identical process running on another machine, which stores them in a directory where they could be read by another forwarding process. Make sure that your system can handle expected failure modes such as server crashes, network outages, and full disks. Evaluate the throughput of your system on a large amount of data.
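A single pass of such a forwarding process might look like the sketch below. It is only an outline under stated assumptions: forward_once, store_atomically, and the deliver callback are hypothetical names, the network transfer is abstracted away behind deliver, and failures are assumed to surface as OSError. The key idea is that a message is deleted from the spool only after delivery succeeds, and the receiver writes to a temporary name before renaming so that a crash never leaves a half-written message visible.

```python
from pathlib import Path


def forward_once(spool_dir, deliver):
    """Deliver each spooled message in name order; return the names sent."""
    sent = []
    for path in sorted(Path(spool_dir).iterdir()):
        # Skip directories and files that are still being written.
        if not path.is_file() or path.name.endswith(".tmp"):
            continue
        try:
            deliver(path.name, path.read_bytes())
        except OSError:
            # Leave this and later messages in place for the next pass.
            break
        path.unlink()  # Remove only after a successful delivery.
        sent.append(path.name)
    return sent


def store_atomically(inbox_dir, name, data):
    """Receiver side: write to a temporary file, then rename into place."""
    inbox = Path(inbox_dir)
    tmp = inbox / (name + ".tmp")
    tmp.write_bytes(data)
    tmp.replace(inbox / name)
```

Note that if the sender crashes after delivery but before the unlink, the message is sent again on the next pass, so the receiver should tolerate duplicates.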
Milestones

  • Monday, November 3rd - Turn in a project proposal that describes the project that you intend to do, what languages and resources will be necessary to carry it out, and how you intend to evaluate the work. The proposal should be one to two pages of text. The instructor will follow up with you to make sure that the project is of appropriate size and difficulty.

  • Wednesday, December 3rd - Give a ten-minute presentation on your project. The talk should include an overview of the goal or problem, a detailed description of the structure of your system, an example of how your system operates, and an evaluation of the correctness and performance of your system. Your project does not have to be 100% complete, but it should be close. Your talk should be accompanied by 5-10 carefully designed and edited PowerPoint slides.

  • Friday, December 12th, Noon - Turn in your code and the final paper. The code should be structured such that the instructor can build and execute it independently. The paper should give an overview of the goal or the problem, a detailed description of the structure of your system, including a good diagram where appropriate, and an evaluation of the correctness and performance of the system. There is no specific length requirement; the paper should be long enough to explain all of the necessary details. That said, anything less than five pages is probably too short; anything longer than fifteen pages is probably too long. Deposit your code in the dropbox directory and your paper in my mailbox in Room 384 Fitzpatrick.