CSE 40822 / A0

A0 - Warm Up Assignment

For many of the assignments in this class, you will start by logging into the student cluster, which consists of the following machines:
student00.cse.nd.edu
student01.cse.nd.edu
student02.cse.nd.edu
student03.cse.nd.edu
Use ssh to connect to these machines. Accounts should have been created for everyone in the class at the beginning of the semester. If you do not have an account (perhaps you registered late), email the instructor. (Don't wait until the last minute.) If you find one of these machines to be overloaded or unresponsive, you can always log into another one.

For a warm up assignment, you will make some observations about the ND Condor pool, and then learn how to submit simple jobs. To begin, read these two pages:

  • Introduction to Condor at Notre Dame
  • Using Condor at Notre Dame

You may also find it useful to have the complete Condor manual handy, but you don't have to read it from beginning to end:

  • Condor Manual Online

    Part 1: Pool Dimensions

    To view basic info about the machines in the pool:
    condor_status
    
    There is much more information available than is shown in the basic display. To show all information available about every machine:
    condor_status -long
    
    To customize the basic view, try a command like this. Note the similarity to printf format strings.
    condor_status -format "%s\t" Name -format "%d\n" TotalMemory
    
    Now answer the following questions about the Notre Dame Condor pool:
    1. How many machines (slots) are present?
    2. Which machine has the largest/smallest amount of memory?
    3. Which machine has the largest/smallest number of CPUs?
    4. Which machine has the largest/smallest disk space?
    5. Use your favorite graphing program to plot a scatter graph with one point for each CPU slot in the pool, with the CPU speed (Mips) on the X axis and the RAM (TotalMemory) on the Y axis.
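
The -format command shown earlier prints one "name<TAB>value" line per slot, so the largest/smallest questions can be answered by sorting on the second column. The following is a minimal sketch under that assumption. The helper name min_max is hypothetical and is not part of Condor.

```shell
#!/bin/sh
# min_max: read "name<TAB>value" lines on stdin and print the rows with the
# smallest and largest numeric value. (Hypothetical helper, not part of Condor.)
min_max() {
    sort -t "$(printf '\t')" -k2,2n | awk '
        NR == 1 { first = $0 }
                { last = $0 }
        END     { if (NR > 0) { print "min: " first; print "max: " last } }'
}

# Example (needs a working Condor pool, so it is left commented out):
# condor_status -format "%s\t" Name -format "%d\n" TotalMemory | min_max
```

Swapping TotalMemory for another attribute from the condor_status -long listing should let the same helper answer the other questions. A machine with several slots appears once per slot in this output.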

    Part 2: Submit 1000 Jobs

    Submit one Condor job that simply runs the following script and returns the output:
    #!/bin/sh
    uname -a
    date
    sleep 1
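
A submit file for a single run of this script might look roughly like the one written below. This is only a sketch: the file names (script.sh, single.submit, single.out, and so on) are placeholders chosen for illustration, and the ND pool may need additional settings described in "Using Condor at Notre Dame".

```shell
#!/bin/sh
# Write a minimal submit description for ONE job.
# File names below (script.sh, single.submit, single.out, ...) are
# placeholders chosen for this sketch.
write_single_submit() {
    cat > "$1" <<'EOF'
universe   = vanilla
executable = script.sh
output     = single.out
error      = single.err
log        = single.log
queue
EOF
}

write_single_submit single.submit
# Then, on a cluster machine:
#   chmod +x script.sh
#   condor_submit single.submit
#   condor_q            # watch the job until it leaves the queue
```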
    

    Once you are 100 percent sure that one job works correctly, then write a single condor_submit script that submits one thousand jobs like the one above, each writing output to a separate file. Make sure that your submit script generates a user log file as follows:

    log = userlog.log
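
One common way to give every job its own output file is the $(Process) macro, which Condor expands to 0, 1, 2, ... for the jobs queued by a single queue statement. The sketch below writes such a submit file from the shell. The names batch.submit, out.N and err.N are assumptions made for illustration.

```shell
#!/bin/sh
# Write a submit description that queues N copies of script.sh.
# $(Process) is expanded by Condor to 0, 1, ..., N-1, so every job
# gets its own output and error file. File names are placeholders.
write_batch_submit() {
    file=$1
    count=$2
    cat > "$file" <<EOF
universe   = vanilla
executable = script.sh
output     = out.\$(Process)
error      = err.\$(Process)
log        = userlog.log
queue $count
EOF
}

write_batch_submit batch.submit 1000
# condor_submit batch.submit   # run this on a cluster machine
```

Generating the file with a small count first (for example 5) is a cheap way to check the naming scheme before queuing the full batch.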
    
    Once the batch is complete, the file userlog.log will tell you everything that Condor did on your behalf to execute the jobs.
    1. How many different machines did your jobs run on?
    2. How much time elapsed from submission of the first job to completion of the last job?
    3. How long does it take to run the same script 1000 times on one machine? Is it faster, or slower, and why? Hint: time repeat 1000 script.sh
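
The first and third questions can be approached with two small shell helpers, sketched below. Both names are hypothetical. count_hosts assumes that each execute event in the user log contains the text "Job executing on host: <ADDRESS:PORT...>", so look at a few lines of your own userlog.log before trusting it. run_serial is a plain sh stand-in for repeat, which is a csh/tcsh builtin and may not exist in your shell.

```shell
#!/bin/sh
# count_hosts: count distinct execution hosts recorded in a Condor user log.
# ASSUMPTION: each execute event contains "Job executing on host: <ADDRESS...>";
# check a few lines of your own userlog.log before trusting the result.
count_hosts() {
    grep 'Job executing on host:' "$1" |
        sed 's/.*Job executing on host: *<\([^:>]*\).*/\1/' |
        sort -u | wc -l | tr -d ' '
}

# run_serial: run a command N times in a row on the local machine,
# a plain-sh stand-in for the csh/tcsh builtin "repeat".
run_serial() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@"
        i=$((i + 1))
    done
}

# Usage on a cluster machine (commented out: the serial run sleeps 1000 times):
# count_hosts userlog.log
# start=$(date +%s); run_serial 1000 ./script.sh > /dev/null; end=$(date +%s)
# echo "serial run took $((end - start)) seconds"
```

Because count_hosts keeps only the address before the port, several slots on the same machine are counted once.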
    Type your answers neatly, print them out, and bring them to class. This assignment is due at the beginning of class on Friday, January 22.