Makeflow and Work Queue Tutorials

Download and Installation

Download and install the cctools software in your home directory on one of the student machines:
cd $HOME
wget http://ccl.cse.nd.edu/software/files/cctools-5.3.4-source.tar.gz
tar xvzf cctools-5.3.4-source.tar.gz
cd cctools-5.3.4-source
./configure --prefix $HOME/cctools --tcp-low-port 9000
make
make install
cd $HOME
The software is now installed in $HOME/cctools, so you must add its bin directory to your path. If you use csh or tcsh, do this:
setenv PATH ${HOME}/cctools/bin:${PATH}
If you use sh/ksh/zsh/bash instead, then do this:
export PATH=${HOME}/cctools/bin:${PATH}
Now double check that you can run the various commands, like this:
makeflow -v
work_queue_worker -v
work_queue_status
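If one of these commands is not found, the cause is usually the PATH setting. As a minimal sketch (the function name check_tools is hypothetical and not part of cctools), the following sh snippet reports which of the three commands cannot be located on the current PATH:

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one line per missing command; returns 1 if any is missing, else 0.
# Note: check_tools is a hypothetical helper, not something cctools provides.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three commands this tutorial relies on.
check_tools makeflow work_queue_worker work_queue_status \
    || echo "check that \$HOME/cctools/bin is on your PATH"
```

The snippet only checks that the programs can be found. It does not run them, so it says nothing about whether the installation itself is working.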

Makeflow Example

Let's begin by using Makeflow to run a handful of image-processing jobs. First, make and enter a clean directory to work in:
mkdir ~/cctools-tutorial
cd ~/cctools-tutorial
Let's use the example Makeflow from the manual. Copy and paste this into a file called example.makeflow:
CURL=/usr/bin/curl
CONVERT=/usr/bin/convert
URL=http://ccl.cse.nd.edu/images/capitol.jpg

capitol.montage.gif: capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg
    LOCAL $CONVERT -delay 10 -loop 0 capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg capitol.270.jpg capitol.180.jpg capitol.90.jpg capitol.montage.gif

capitol.90.jpg: capitol.jpg $CONVERT
    $CONVERT -swirl 90 capitol.jpg capitol.90.jpg

capitol.180.jpg: capitol.jpg $CONVERT
    $CONVERT -swirl 180 capitol.jpg capitol.180.jpg

capitol.270.jpg: capitol.jpg $CONVERT
    $CONVERT -swirl 270 capitol.jpg capitol.270.jpg

capitol.360.jpg: capitol.jpg $CONVERT
    $CONVERT -swirl 360 capitol.jpg capitol.360.jpg

capitol.jpg: $CURL
    LOCAL $CURL -o capitol.jpg $URL 
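The four swirl rules in this file differ only in the angle, so a larger workflow of the same shape could be generated rather than typed by hand. The following sh sketch prints one rule per angle. The function name swirl_rules is hypothetical, and the command line is indented with a tab, as in traditional Make:

```shell
#!/bin/sh
# Print one Makeflow rule per swirl angle given on the command line.
# Each rule names capitol.jpg and $CONVERT as inputs, like the rules above.
# Note: swirl_rules is a hypothetical helper written for this illustration.
swirl_rules() {
    for angle in "$@"; do
        # Target and dependency line; $CONVERT stays literal for Makeflow to expand.
        printf 'capitol.%s.jpg: capitol.jpg $CONVERT\n' "$angle"
        # Command line, indented with a tab, followed by a blank separator line.
        printf '\t$CONVERT -swirl %s capitol.jpg capitol.%s.jpg\n\n' "$angle" "$angle"
    done
}

swirl_rules 90 180 270 360
```

The output covers only the swirl rules. The variable definitions, the montage rule, and the download rule would still have to be added to make a complete Makeflow file.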

To run it on your local machine:
makeflow example.makeflow
Note that if you run it a second time, nothing will happen, because all of the files are built:
makeflow example.makeflow
makeflow: nothing left to do
Use the -c option to clean everything up before trying it again:
makeflow -c example.makeflow
Use -d all to have makeflow print its actions as they are performed:
makeflow -d all example.makeflow
makeflow -d all -c example.makeflow
If you have access to a Condor pool, then you can direct Makeflow to run your jobs there:
makeflow -d all -T condor example.makeflow
You may find that you cannot run Condor jobs from your home directory, because the Condor daemons do not have the AFS credentials needed to access your files. Instead, work from a directory on local disk:
mkdir /tmp/$USER-makeflow
cp example.makeflow /tmp/$USER-makeflow
cd /tmp/$USER-makeflow
makeflow -T condor example.makeflow
If you have access to a batch system running SGE, then you can direct Makeflow to run your jobs there:
makeflow -d all -T sge example.makeflow
You will also notice that a workflow can run very slowly if you submit each job separately to SGE or Condor, because it typically takes 30 seconds or so to start each batch job. You can use Work Queue to get around this limitation: Makeflow then functions as a master process that quickly dispatches work to remote worker processes.
cd ~/cctools-tutorial
makeflow -c example.makeflow
makeflow -T wq --port 0 example.makeflow 
listening for workers on port XXXX.
...
Now open up another shell and run a single worker process:
work_queue_worker localhost XXXX
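Because port 0 lets the system choose the port, the number has to be read from the message printed at startup. If that output is captured in a script, the port can be pulled out of the line with sed. This is a sketch that assumes the message has the form shown above, and the function name port_from_line is hypothetical:

```shell
#!/bin/sh
# Extract the port number from a line such as
# "listening for workers on port 9123." (message format assumed from the text above).
# Prints nothing if the line contains no "on port <number>".
# Note: port_from_line is a hypothetical helper, not part of cctools.
port_from_line() {
    printf '%s\n' "$1" | sed -n 's/.*on port \([0-9][0-9]*\).*/\1/p'
}

# Example with a made-up port number.
port_from_line "listening for workers on port 9123."
```

The result can then be passed to the worker in place of XXXX, for example as the second argument of work_queue_worker localhost.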
Now, do these in-class exercises:
  • Run the same workflow again, but use a project name instead of a port number to match your workers with your workflow.
  • Now, run a selection of workers in Condor. Once you verify that the workers are running (condor_q) then start makeflow and see what happens.
  • Use the -d all option to work_queue_worker to see what it is doing.

Work Queue Tutorial

Download the example Work Queue application in your favorite language:
  • C: work_queue_example.c
  • Python: work_queue_example.py
  • Perl: work_queue_example.pl
If you are using the C example, compile it like this:
gcc work_queue_example.c -o work_queue_example -I${HOME}/cctools/include/cctools -L${HOME}/cctools/lib -lwork_queue -ldttools -lm
If you are using the Python example, set PYTHONPATH to include the Python modules in cctools:
export PYTHONPATH=${PYTHONPATH}:${HOME}/cctools/lib/python2.6/site-packages
If you are using the Perl example, set PERL5LIB to include the Perl modules in cctools:
export PERL5LIB=${PERL5LIB}:${HOME}/cctools/lib/perl5/site_perl
The example program simply runs gzip on whatever files you give on the command line, so let's try it like this:
./work_queue_example *
The example program listens on the default port of 9123. So, you are likely to get the following error:
couldn't listen on port 9123: Address already in use
Modify the program to use port 0, run it again, and you should see this:
listening on port XXXX...
...
waiting for tasks to complete...
Now, open up a new window and run a single local worker:
work_queue_worker localhost XXXX
Now, do these in-class exercises:
  • Modify the example to use a project name instead of a port number to match your workers with your workflow.
  • Run the example again, using workers that are submitted to Condor (condor_submit_workers -h).
  • Modify the program to run one task at a time, wait for it to complete, then submit the next one.