The final project in this course will be open ended. You will propose,
carry out, and report upon a project in groups of one or two students.
Overall, the project should demonstrate how to make use of a cloud system
to scale up an application to approximately 100X the size or speed of a single machine.
While a truly novel project idea would be nice, it's ok to recreate something that already exists, so that you can learn from the process. If something similar already exists, you should be aware of it and read about how it works, but then write your own code.
The project has three main components:
Build it on the cloud. Your project must involve harnessing a cloud computing system of some kind. The system may be one of those that we have used or studied in class, or it may be some other commercial cloud computing or storage system.
Evaluate it critically. You must evaluate what you have built for correctness, performance, and scalability. For correctness, you must develop a procedure to verify that the system actually accomplishes what it is intended to do. (For example, if you store some data in the cloud, you must verify that it is the same when it is read back.) For performance and scalability, you must select an appropriate measure -- simulations/day, GB/s, transactions/hour -- and then
evaluate the performance as the system scales up to 100X or higher.
Communicate it clearly. You must present your work cogently by writing a paper and making an oral presentation. The paper should describe the motivation, architecture, technical details, evaluation methods, and quantitative results of your project. The oral presentation should summarize the most important aspects of the paper and give a demo of how it works, during the last week of class.
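As an illustration of the correctness and performance checks described under "Evaluate it critically," here is a minimal Python sketch of a store-and-read-back test. It is only a sketch under stated assumptions: InMemoryStore, round_trip_check, and the key name are hypothetical names invented for this example, and the in-memory dictionary merely stands in for whatever cloud storage system you actually use.

```python
import hashlib
import time


class InMemoryStore:
    """Hypothetical stand-in for a cloud object store (not a real cloud API)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


def sha256_hex(data):
    """Return the SHA-256 checksum of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def round_trip_check(store, key, data):
    """Write data, read it back, and report (matches, MB/s moved)."""
    expected = sha256_hex(data)
    start = time.perf_counter()
    store.put(key, data)
    returned = store.get(key)
    elapsed = time.perf_counter() - start
    matches = sha256_hex(returned) == expected
    # Count bytes written plus bytes read; guard against a zero elapsed time.
    moved_mb = 2 * len(data) / 1e6
    mb_per_s = moved_mb / elapsed if elapsed > 0 else float("inf")
    return matches, mb_per_s


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # about 1 MB of test data
    ok, rate = round_trip_check(InMemoryStore(), "sample-object", payload)
    print("data intact:", ok)
    print("throughput (MB/s):", round(rate, 1))
```

In a real project, you would replace InMemoryStore with a thin wrapper around your chosen service and repeat the measurement as the number of clients and the data size grow.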
If your project will make use of Amazon Web Services, you can use an academic grant we have received from Amazon.
Each member of the class can receive a credit in Amazon Web Services, to be used to run virtual machines
and other services for the class. You will have to register with Amazon, enter your own credit card, and
then enter a special credit code. The TA will be distributing the credit codes shortly.
The following are rough ideas for possible projects:
Scalable Software Engineering. Many software engineering tasks
that were previously infeasible become much easier with access to cloud resources. For example, suppose that you want to evaluate how well your software works
on twenty different flavors of Linux. Build a system which can build and test a piece of software on a large number of virtual machines simultaneously. Or, build a machine that evaluates a test procedure on a range of commits to see where a bug is present. (Be careful: The tricky part is to track your machines and work accurately so that you don't lose anything or run up a big bill.)
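To make the commit-testing idea concrete, here is a rough Python sketch that runs a test on many commits at once and reports the first one that fails. It rests on several assumptions: local threads stand in for virtual machines, the commit names are made up, and run_test is a hypothetical placeholder that pretends the bug appeared in commit number 6. A real system would check out and build each commit on its own machine.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test(commit):
    """Hypothetical test: pretend the bug was introduced in commit number 6.

    Returns True if the test passes on this commit.
    """
    return int(commit[1:]) < 6


def find_first_failure(commits, test, max_workers=4):
    """Test every commit in parallel and return the earliest failing one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input commits.
        results = list(pool.map(test, commits))
    for commit, passed in zip(commits, results):
        if not passed:
            return commit
    return None  # every commit passed


if __name__ == "__main__":
    history = ["c%d" % i for i in range(10)]  # oldest first
    print(find_first_failure(history, run_test))  # prints c6
```

Collecting every result in one place before drawing a conclusion also makes it easier to notice a worker that never reported back.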
Convert an Existing Application to the Cloud.
Take a real application that you use in your research, classes, or for fun,
convert it to parallel form using a cloud programming model
such as Makeflow, Lambda, or Spark, and get it to run on as many processors as possible.
Be sure to carefully measure the performance at a varying number of processors
to produce a good speedup graph. What is the scalability limit of the system, and why?
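The numbers behind a speedup graph can be computed with a few lines of Python. This is only a sketch: speedup_table is a hypothetical helper, and the timings in the usage example are invented placeholders chosen to show the arithmetic, not measurements of any real system.

```python
def speedup_table(timings):
    """Compute speedup and efficiency from wall-clock timings.

    timings maps a processor count to elapsed seconds and must contain
    an entry for 1 processor, which serves as the baseline.
    """
    base = timings[1]
    rows = []
    for procs in sorted(timings):
        speedup = base / timings[procs]   # how many times faster than 1 processor
        efficiency = speedup / procs      # 1.0 would be perfect linear scaling
        rows.append((procs, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings in seconds, for illustration only.
    example = {1: 1000.0, 10: 125.0, 100: 40.0}
    for procs, speedup, efficiency in speedup_table(example):
        print(procs, speedup, efficiency)
```

Efficiency that falls well below 1.0 as processors are added is the signal to go looking for the scalability limit.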
Evaluate a NoSQL System. Select a NoSQL database like HBase, MongoDB, or Cassandra, and deploy it on the cloud using Amazon EC2. Learn how to upload data and issue queries, then evaluate the performance of the system as
you increase the number of clients and storage nodes. Note that these
systems are largely designed to handle multiple clients at once, so part
of the challenge is to run many clients via Condor or Amazon.
Build a Scalable Website. Design a simple interactive
web site that lets you upload stories and images, like a social networking site.
Now, use Condor to evaluate how many requests per second that one little site
can supply. Then, modify the site to use cloud services for scalability --
store your data in HBase or SimpleDB, and use load balancers to scale up the
web servers. Show that your site can grow to handle hundreds or thousands of users at once!
A natural limitation of a master-worker system like Work Queue is
the master itself, which is a single point of failure and also a bottleneck
for the dispatch and execution of tasks. Design an alternative system
for distributed execution that is more peer-to-peer and does not have
a single bottleneck. Evaluate the performance compared to Work Queue
on a large number of machines, and show in what ways it is better or worse.
Cloud Head-to-Head Evaluation. Everyone wants to know
which cloud provider is the "best" in terms of performance, scalability,
and cost. To answer this question, compare two cloud providers
(pick from Amazon EC2, Google App Engine, Windows Azure, or IBM Cloud)
head-to-head in a variety of basic operations: time to start a virtual
machine, CPU performance, upload/download speed, storage operations per second,
and so forth.
Build a Cloud Filesystem. Amazon S3 stores blobs in a flat
namespace, but doesn't provide a directory tree like a conventional filesystem.
Solve this problem by building a cloud filesystem library that stores
the directory tree in SimpleDB and the file blobs in S3. Evaluate
the performance and scalability compared to a local filesystem.
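The split between a directory tree and a flat blob store can be sketched in Python as follows. This is an illustration under loud assumptions: TinyCloudFS is a hypothetical class, and its two plain dictionaries stand in for SimpleDB and S3, so no real cloud API is called.

```python
import uuid


class TinyCloudFS:
    """Directory tree in a metadata table, file contents in a flat blob store.

    Both stores are plain dicts standing in for SimpleDB and S3.
    """

    def __init__(self):
        self.metadata = {"/": {"type": "dir", "children": set()}}
        self.blobs = {}

    def _split(self, path):
        """Split '/a/b' into its parent directory '/a' and entry name 'b'."""
        parent, _, name = path.rstrip("/").rpartition("/")
        return (parent or "/"), name

    def _link(self, path, entry):
        """Record an entry and add its name to the parent directory."""
        parent, name = self._split(path)
        if self.metadata.get(parent, {}).get("type") != "dir":
            raise FileNotFoundError(parent)
        self.metadata[parent]["children"].add(name)
        self.metadata[path] = entry

    def mkdir(self, path):
        self._link(path, {"type": "dir", "children": set()})

    def write(self, path, data):
        blob_key = uuid.uuid4().hex  # flat, opaque key in the blob store
        self.blobs[blob_key] = bytes(data)
        self._link(path, {"type": "file", "blob": blob_key})

    def read(self, path):
        return self.blobs[self.metadata[path]["blob"]]

    def listdir(self, path):
        return sorted(self.metadata[path]["children"])


if __name__ == "__main__":
    fs = TinyCloudFS()
    fs.mkdir("/docs")
    fs.write("/docs/notes.txt", b"hello cloud")
    print(fs.listdir("/docs"))
    print(fs.read("/docs/notes.txt"))
```

The sketch ignores deletion, overwriting (an overwritten file leaves its old blob behind), and concurrent clients, which are among the harder parts of a real design.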
Come Up With Your Own. These are only ideas to get you thinking! Come up with your own idea, or modify one of these above.
Friday, October 5th - Turn in a printed one-page project proposal
that describes the project members, the cloud system that you intend
to use, what resources will be necessary to carry it out, and how you
intend to evaluate the performance and/or scalability of the system.
The instructor will follow up with you to make sure that the project
is of appropriate size and difficulty. If multiple groups propose
substantially similar projects, we may ask you to adjust your work slightly.
Week of November 12th - Meet with the instructor to give
a demo on what you have working so far. At this point, you should
have installed (or have access to) the appropriate software and systems,
be able to show them working in some way, and have made an initial
measurement of performance or scalability. We will discuss the plan
for finishing in a timely way, and make any necessary course corrections.
Week of November 26th - Give a 10-15 minute in-class
presentation on your project. The talk should include an overview of
the goal or problem, an overview of the cloud system that you employed,
how your application makes use of the system, and some initial
results on performance and scalability. The work need not be totally complete
at this point, but it should be well along the way.
Each project partner should speak for a portion of the time.
Your talk should be accompanied by about 10 carefully designed and edited slides.
Friday, December 7th, noon - Turn in your final paper and your code.
The paper should give an overview of the goal or the problem,
a detailed description of the structure of your system and the application,
and an evaluation of the correctness and performance of your system.
The paper should include at least one diagram indicating the architecture
of the system and at least one graph which summarizes your performance
evaluation. There is no specific length requirement; the paper should be long enough to explain all of the necessary details. That said, anything less than ten pages is probably too short; anything longer than twenty pages is probably too long. All elements of the paper should be prepared with care
and attention to proper English. I am interested in your writing, not
your formatting, so please stick to standard 12-point Times font,
double-spaced, with one-inch margins. Turn in your paper in PDF format
to your dropbox directory.
All relevant code should also be turned in to your dropbox directory,
including source code, configuration files, scripts, etc.
The code should be complete enough that the grader can build and
run your work in the appropriate environment. If there are
important elements that cannot be turned in as code for whatever
reason (e.g. too big or expensive to download from the cloud) then turn
in links, screenshots, or other similar evidence of the completed work.
Sample Projects from Prior Years
Scaling Up with AWS, Alan Vuong and Katie Quinn
Work Queue and Google App Engine (Movie), Dylan Zaragoza
FuS3FS -- A Cloud Filesystem, Kaijun Feng and Chao Luo
Short Read Alignment, Xinyi Wang and Xuanyi Li
Parallelizing BWA, Christopher Ray
Movie Rendering Service, Samantha Rack
Building a Scalable Dynamic Website, Lucas Barbosa-Parzianello
The Future of Microblogging, Jack Magiera and Jon Richelsen
Machine Learning with MLLib and scikit-learn, Christopher Homa
Matrix Multiplication in Hadoop, Siddharth Saraph
Scaling Kamona, Bruno Braga and Fernando Beletti
Jewel: File Syncing with AWS, Kevin Riehm
Amazon DynamoDB: Scaling and Benchmarking, Celeste Castillo and Ben Kennel