This tutorial will have you install CCTools into your FutureGrid home directory and will take you through some distributed computation examples using Makeflow.
Log In to the FutureGrid Head Node
In this tutorial, we will use the alamo login node:
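The original login command is not preserved here; a sketch along these lines should work, assuming the alamo login hostname follows FutureGrid's naming (substitute your own username and the hostname given in your FutureGrid account documentation):

```shell
# Replace USERNAME with your FutureGrid account name.
# The hostname below is an assumption; check your account documentation.
ssh USERNAME@alamo.futuregrid.org
```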
Download, Build, and Install CCTools
Navigate to the download page in your browser to review the most recent versions: http://www.nd.edu/~ccl/software/download.shtml
Set Up a Sandbox for this Tutorial and Download a Copy of CCTools 3.5.3
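The original listing is not preserved, so the following is a minimal sketch; the tarball URL and filename are assumptions based on the version number above, so adjust them to match what the download page actually lists:

```shell
# Create a sandbox directory for this tutorial and work inside it.
mkdir ~/cctools-tutorial
cd ~/cctools-tutorial

# Download and unpack the 3.5.3 source release
# (filename assumed; verify it against the download page).
wget http://www.nd.edu/~ccl/software/files/cctools-3.5.3-source.tar.gz
tar xzf cctools-3.5.3-source.tar.gz
```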
Build and Install CCTools
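CCTools uses a standard configure-and-make build. A sketch, assuming the source unpacked into a directory named after the release and installing into `$HOME/cctools` (the prefix is a choice for this tutorial, not a requirement):

```shell
cd cctools-3.5.3-source

# Install into your home directory so no root access is needed.
./configure --prefix=$HOME/cctools
make
make install
```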
Set Environment Variables
You will need to add your CCTools directory to your $PATH:
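Assuming you installed with the `$HOME/cctools` prefix used above:

```shell
# Add the CCTools binaries to your PATH for this session;
# append this line to ~/.bashrc to make the change permanent.
export PATH=$HOME/cctools/bin:$PATH
```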
The Makeflow script should look like:
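The original script is not preserved here. As a minimal sketch of the Make-like syntax Makeflow uses, with `simulation.py`, `input.txt`, `output.txt`, and the file name `example.makeflow` all being hypothetical names, each rule lists its output files, its input files, and the command that produces the outputs from the inputs:

```make
# output.txt is produced from input.txt by running the simulation.
# Makeflow uses these dependencies to schedule and distribute jobs.
output.txt: simulation.py input.txt
	python simulation.py input.txt > output.txt
```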
Running with Local (Multiprocess) Execution
Here we're going to tell Makeflow to dispatch the jobs using regular local processes (no distributed computing!). This is basically the same as regular Unix Make using the -j flag.
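A sketch of the invocation, assuming the script is named `example.makeflow`:

```shell
# -T local dispatches jobs as ordinary local processes,
# which is also Makeflow's default batch type.
makeflow -T local example.makeflow
```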
If everything worked out correctly, you should see:
Running with FutureGrid's Torque
The following code tells Makeflow to dispatch jobs using the Torque batch submission system (qsub, qdel, qstat, etc.).
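A sketch, again assuming the script is named `example.makeflow`:

```shell
# -T torque dispatches each job to the Torque batch system via qsub.
makeflow -T torque example.makeflow
```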
You will get as output:
Well... that's not right. Nothing was run! Makeflow saw that the output files already existed and had nothing to do. We need to clean out the generated output files and logs so Makeflow starts from a clean slate again:
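Makeflow's clean option does exactly this; a sketch, assuming the script name `example.makeflow`:

```shell
# -c removes all files generated by the previous run, plus the logs.
makeflow -c example.makeflow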
We see it deleted the files we generated in the last run:
Now let's try again:
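Re-running the same Torque invocation as before (script name assumed):

```shell
makeflow -T torque example.makeflow
```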
We get the output we expect:
Notice that the output is no different from using local execution. Makeflow is built to be execution-engine agnostic: from the user's perspective, there is no difference between executing the tasks locally or remotely.
In this case, we can confirm that the job was run on another host by looking at the output produced by the simulation:
Here we see that the job ran on node c056.cm.cluster and took 5 seconds to complete.
Running Makeflow with Work Queue
The submission and wait times for the Makeflow tasks in the above case will vary because of the latencies in the underlying batch submission platform (Torque). To avoid long submission and wait times, Makeflow can be run using Work Queue, which excels at handling low-latency, short-turnaround jobs.
Here, we will start Makeflow, which will set up a Work Queue master on an arbitrary port using -p 0. We will also turn on debugging output to see what happens as it runs.
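A sketch of the invocation, assuming the script name `example.makeflow`:

```shell
# -T wq: use Work Queue as the batch type.
# -p 0:  have the master listen on an arbitrary free port.
# -d all: print debugging output for all subsystems.
makeflow -T wq -p 0 -d all example.makeflow
```

The port the master chose appears in the debug output; you will need it to start a worker.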
You should see output like this:
Now, run the work_queue_worker with the port the master is listening on.
When the tasks are finished, the worker should quit due to the 10 second timeout.
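A sketch of the worker invocation; substitute the hostname of the machine running Makeflow and the port number taken from the master's debug output:

```shell
# -t 10 makes the worker exit after 10 seconds with no work to do.
work_queue_worker -t 10 HOSTNAME PORT
```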
Running Makeflow with Work Queue Workers on Torque
The goal of this exercise is to set up Work Queue workers on Torque compute nodes. Here we submit the workers using the torque_submit_workers executable. To learn more about the options and arguments for torque_submit_workers, do:
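Running the command with no workflow to do any work, for example with the help flag (assuming it accepts the conventional -h), prints its usage summary:

```shell
torque_submit_workers -h
```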
In this exercise, we will use the catalog server and a project name so workers can find the master without being provided with the master's hostname and port. We force Makeflow and the Work Queue workers to use the catalog server by specifying the -a option. We then specify a project name for the Makeflow script using the -N option that takes a string as an argument. The workers are then provided with the project name to connect to using the same -N option.
NOTE: Pick your own distinct project name for MYPROJECT.
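Putting the steps above together, a sketch of the two commands, assuming the script name `example.makeflow` and a worker count of 5 chosen for illustration:

```shell
# Start the master: advertise to the catalog server (-a) under a
# project name (-N). Pick your own distinct name for MYPROJECT.
makeflow -T wq -a -N MYPROJECT example.makeflow

# Submit 5 workers to Torque; with -a and -N they look up the
# master through the catalog server by project name, so no
# hostname or port needs to be supplied.
torque_submit_workers -a -N MYPROJECT 5
```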