Assignment A4: Distributed Storage using Chirp

In this assignment, you will learn how to make use of distributed storage to construct a web server with multiple redundant backend servers, synchronized by periodic updates. You will need the code from the web server that you developed in the previous project. This assignment has three steps to keep you on track:

Part 1: Warm Up

For this project, you will be making use of the Chirp fileserver. Start at the Chirp homepage, and download and install the software in your home directory. Start a private Chirp server, and make sure that you can access it using both Parrot and the command line tools.

Then, check out the Chirp API Documentation, and follow the instructions titled Writing Your First Program with Chirp. Make sure that you can compile and run a simple program using the Chirp API.

Part 2: Modify the Web Server

Modify your web server to load a list of Chirp servers from a file named servers.txt. When the client requests a file, it should not be loaded from the local filesystem, but from a Chirp server chosen at random from the list. Make use of chirp_reli_open and similar calls to access the data. Start two or three Chirp servers of your own on different machines to test this feature.
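
As a starting point, here is a minimal sketch of loading the server list and choosing an entry at random. The assignment does not specify the layout of servers.txt, so the one-host-per-line format is an assumption, and the names load_servers, pick_server, MAX_SERVERS, and MAX_HOST are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SERVERS 64   /* hypothetical limit on the list size */
#define MAX_HOST    256  /* hypothetical limit on one host name */

/* Read one host name per line from fp into servers[] and return the count.
 * Blank lines are skipped. The one-host-per-line format is an assumption. */
int load_servers(FILE *fp, char servers[][MAX_HOST], int max)
{
    char line[MAX_HOST];
    int count = 0;

    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                         /* skip blank lines */
        strcpy(servers[count], line);         /* fits: both are MAX_HOST bytes */
        count++;
    }
    return count;
}

/* Return the index of a randomly chosen server in the range [0, count). */
int pick_server(int count)
{
    assert(count > 0);
    return rand() % count;
}
```

Call srand once at startup (for example with the current time) so that each run of the web server does not pick servers in the same sequence.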

Next, make your server fault tolerant. If the server chosen at random has crashed or is otherwise unavailable (i.e., a Chirp call returns -1 with errno==ECONNRESET), then the web server should pick another server at random and try again. In other words, the web browser must never know that a failure occurred; your system should simply try another server and keep going. To test this feature, kill off one or two of your Chirp servers, and verify that the web server still works.
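
One possible shape for the retry logic is sketched below. It is not the required design: it shuffles the list and tries each server at most once, which is a slight variation on "pick another at random" that guarantees the loop terminates. The fetch_fn callback type and the fake_fetch stand-in are hypothetical; in your web server the callback would wrap your own Chirp calls and return -1 with errno set when they fail.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TRIES 64  /* hypothetical cap on how many servers are tried */

/* Hypothetical callback: fetch `path` from `host`. Returns 0 on success,
 * or -1 with errno set, mirroring how the Chirp calls report failure. */
typedef int (*fetch_fn)(const char *host, const char *path, void *ctx);

/* Try the servers in random order, each at most once. Returns the index of
 * the server that answered, or -1 if none did (errno holds the last error).
 * Only ECONNRESET triggers a retry; any other error is returned at once. */
int fetch_with_failover(const char *const hosts[], int count,
                        const char *path, fetch_fn fetch, void *ctx)
{
    int order[MAX_TRIES];
    if (count > MAX_TRIES)
        count = MAX_TRIES;
    for (int i = 0; i < count; i++)
        order[i] = i;

    /* Fisher-Yates shuffle so the order of attempts is random. */
    for (int i = count - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }

    for (int i = 0; i < count; i++) {
        if (fetch(hosts[order[i]], path, ctx) == 0)
            return order[i];
        if (errno != ECONNRESET)
            return -1;          /* not a dead server: do not mask the error */
    }
    return -1;
}

/* Stand-in fetch for testing: only the host named by ctx is "alive". */
int fake_fetch(const char *host, const char *path, void *ctx)
{
    (void)path;
    if (strcmp(host, (const char *)ctx) == 0)
        return 0;
    errno = ECONNRESET;
    return -1;
}
```

Retrying only on ECONNRESET follows the failure condition named above; you may find that other error codes also indicate an unreachable server and should be added to the check.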

Part 3: Synchronize Multiple Servers

Of course, it only makes sense to have multiple redundant servers if they actually store the same data! For the previous step, you might have manually copied data from one server to another, but this would quickly become tedious. Instead, you will build a solution that uses one master copy of the data. Whenever you wish to change the website, you simply make the changes on one server, and then copy the changes to the other servers with a tool called chirp_synchronize.

Using the Chirp API, build a tool called chirp_synchronize that is invoked as follows:

    chirp_synchronize serverA dirA serverB dirB

Your tool should examine dirA on serverA, and transfer any files that it finds to dirB on serverB if they are missing or have changed. (For extra credit, you can also make the tool work recursively by synchronizing directories as well.) Here are some pointers in the API to get you started:
  • To read a directory, see chirp_reli_opendir.
  • To check for the presence of a file, use chirp_reli_stat.
  • If that succeeds, then you can look at the chirp_stat structure, particularly cst_mtime or cst_size, to see whether the file has changed.
  • To copy a file, call chirp_reli_open once to open one file for reading, and again to open the other file for writing. Then, use chirp_reli_pread and chirp_reli_pwrite to copy the data.
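
The decision and copy steps above can be sketched without the Chirp library, to show the logic before you wire in the real calls. In this sketch, struct file_info is a hypothetical stand-in for the cst_size and cst_mtime fields of chirp_stat, and the copy loop uses POSIX pread and pwrite on local file descriptors as an analogue of chirp_reli_pread and chirp_reli_pwrite. Check the API documentation for the exact Chirp signatures before adapting it.

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the cst_size and cst_mtime fields of chirp_stat. */
struct file_info {
    int64_t size;
    int64_t mtime;
};

/* Decide whether the source file must be copied. Pass dst == NULL when the
 * stat of the destination failed, meaning the file is missing there. */
int needs_copy(const struct file_info *src, const struct file_info *dst)
{
    if (dst == NULL)
        return 1;                      /* missing on the destination */
    if (src->size != dst->size)
        return 1;                      /* size differs: contents changed */
    return src->mtime > dst->mtime;    /* source modified after last copy */
}

/* Copy every byte from src_fd to dst_fd using explicit offsets.
 * Returns the number of bytes copied, or -1 on error. */
int64_t copy_by_offset(int src_fd, int dst_fd)
{
    char buf[65536];
    off_t offset = 0;

    for (;;) {
        ssize_t nread = pread(src_fd, buf, sizeof buf, offset);
        if (nread < 0)
            return -1;
        if (nread == 0)
            break;                     /* end of file */

        /* A write may be short, so loop until the whole chunk is out. */
        ssize_t done = 0;
        while (done < nread) {
            ssize_t nwritten = pwrite(dst_fd, buf + done,
                                      (size_t)(nread - done), offset + done);
            if (nwritten <= 0)
                return -1;
            done += nwritten;
        }
        offset += nread;
    }
    return (int64_t)offset;
}
```

Comparing modification times across two machines assumes their clocks roughly agree; if they do not, comparing sizes alone or always copying may be the safer choice.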

Handing In

Turn in all of your source files and your Makefile to the dropbox directory:

    /afs/nd.edu/coursefa.08/cse/cse40771.01/dropbox/YOURNAME/a4

Your grade will be weighted as follows:
  • 25% - Web Server Accesses Chirp Correctly
  • 25% - Web Server Tolerates Offline Servers
  • 40% - Synchronize Multiple Servers
  • 10% - Good Coding Style