Makeflow User's Manual

Last Updated June 2009

Makeflow is Copyright (C) 2009 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

Overview

Makeflow is a workflow engine for distributed computing. It accepts a specification of a large amount of work to be performed, and runs it on remote machines in parallel where possible. In addition, Makeflow is fault-tolerant, so you can use it to coordinate very large tasks that may run for days or weeks in the face of failures. Makeflow is designed to be similar to Make, so if you can write a Makefile, then you can write a Makeflow.

You can run a Makeflow on your local machine to test it out. If you have a multi-core machine, then you can run multiple tasks simultaneously. If you have a Condor pool or a Sun Grid Engine batch system, then you can send your jobs there to run. If you don't already have a batch system, Makeflow comes with a system called Work Queue that will let you distribute the load across any collection of machines, large or small.

Makeflow is part of the Cooperative Computing Tools (CCTools). You can download the CCTools from the project web page, follow the installation instructions, and you are ready to go.

The Makeflow Language

The Makeflow language is very similar to Make. A Makeflow script consists of a set of rules. Each rule specifies a set of target files to create, a set of source files needed to create them, and a command that generates the target files from the source files.

Makeflow attempts to generate all of the target files in a script. It examines all of the rules and determines which rules must run before others. Where possible, it runs commands in parallel to reduce the execution time.
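
To make the ordering idea concrete, here is a minimal Python sketch of how a set of rules can be grouped into "waves" of tasks that may run in parallel. It is an illustration only, not Makeflow's actual scheduler: the function name parallel_waves, the file names, and the simplification of one target per rule are all assumptions made for the example.

```python
def parallel_waves(rules):
    """Group targets into waves. Every target in a wave depends only on
    files built in earlier waves or on files that no rule creates."""
    remaining = dict(rules)   # target -> list of source files
    done = set()              # targets that have already been scheduled
    waves = []
    while remaining:
        ready = sorted(
            target for target, sources in remaining.items()
            if all(s in done or s not in rules for s in sources)
        )
        if not ready:
            raise ValueError("circular dependency among: %s" % sorted(remaining))
        waves.append(ready)
        done.update(ready)
        for target in ready:
            del remaining[target]
    return waves

# Hypothetical rules: sim.exe is a pre-existing file that no rule creates.
rules = {
    "input.dat": [],
    "part1.out": ["input.dat", "sim.exe"],
    "part2.out": ["input.dat", "sim.exe"],
    "merged.out": ["part1.out", "part2.out"],
}
waves = parallel_waves(rules)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In this sketch part1.out and part2.out land in the same wave, so they could run at the same time. A real engine would more likely start each task as soon as its own sources are ready rather than wait for a whole wave to finish.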

Here is a Makeflow that uses the convert utility to make an animation. It downloads an image from the web, creates four variations of the image, and then combines them back together into an animation. The first and the last task are marked as LOCAL to force them to run on the controlling machine.

CURL=/usr/bin/curl
CONVERT=/usr/bin/convert
URL=http://www.cse.nd.edu/~ccl/images/capitol.jpg

capitol.montage.gif: capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg
        LOCAL $CONVERT -delay 10 -loop 0 capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg capitol.270.jpg capitol.180.jpg capitol.90.jpg capitol.montage.gif

capitol.90.jpg: capitol.jpg $CONVERT
        $CONVERT -swirl 90 capitol.jpg capitol.90.jpg

capitol.180.jpg: capitol.jpg $CONVERT
        $CONVERT -swirl 180 capitol.jpg capitol.180.jpg

capitol.270.jpg: capitol.jpg $CONVERT
        $CONVERT -swirl 270 capitol.jpg capitol.270.jpg

capitol.360.jpg: capitol.jpg $CONVERT
        $CONVERT -swirl 360 capitol.jpg capitol.360.jpg

capitol.jpg: $CURL
        LOCAL $CURL -o capitol.jpg $URL
Note that Makeflow differs from Make in a few important ways. Read the section titled The Fine Details below to get all of the details.

Running Makeflow

To try out the example above, copy and paste it into a file named example.makeflow. To run it on your local machine:
% makeflow example.makeflow
Note that if you run it a second time, nothing will happen, because all of the files are built:
% makeflow example.makeflow
makeflow: nothing left to do
Use the -c option to clean everything up before trying it again:
% makeflow -c example.makeflow
If you have access to a batch system running SGE, then you can direct Makeflow to run your jobs there:
% makeflow -T sge example.makeflow
Or, if you have a Condor Pool, then you can direct Makeflow to run your jobs there:
% makeflow -T condor example.makeflow
To submit Makeflow as a Condor job that submits more Condor jobs:
% condor_submit_makeflow example.makeflow
You will notice that a workflow can run very slowly if you submit each batch job to SGE or Condor, because it typically takes 30 seconds or so to start each batch job running. To get around this limitation, we provide the Work Queue system. This allows Makeflow to function as a master process that quickly dispatches work to remote worker processes.

To begin, let's assume that you are logged into a machine named barney.nd.edu. Start your Makeflow like this:

% makeflow -T wq example.makeflow
Then, submit 10 worker processes to Condor like this:
% condor_submit_workers barney.nd.edu 9123 10
Submitting job(s)..........
Logging submit event(s)..........
10 job(s) submitted to cluster 298.
Or, submit 10 worker processes to SGE like this:
% sge_submit_workers barney.nd.edu 9123 10
Or, you can start workers manually on any other machine you can log into:
% worker barney.nd.edu 9123
Once the workers begin running, Makeflow will dispatch multiple tasks to each one very quickly. If a worker should fail, Makeflow will retry the work elsewhere, so it is safe to submit many workers to an unreliable system.

When the Makeflow completes, your workers will still be available, so you can either run another Makeflow with the same workers, remove them from the batch system, or wait for them to expire. If you do nothing for 15 minutes, they will automatically exit.

Note that condor_submit_workers and sge_submit_workers are simple shell scripts, so you can edit them directly if you would like to change batch options or other details.

The Fine Details

The Makeflow language is very similar to Make, but it does have a few important differences that you should be aware of.

Get the Dependencies Right

You must be careful to accurately specify all of the files that a rule requires and creates, including any custom executables. This is because Makeflow uses this information to construct the environment for a remote job. For example, suppose that you have written a simulation program called mysim.exe that reads calib.data and then produces an output file. The following rule won't work, because it doesn't inform Makeflow what files are needed to execute the simulation:
# This is an incorrect rule.

output.txt:
        ./mysim.exe -c calib.data -o output.txt
However, the following is correct, because the rule states all of the files needed to run the simulation. Makeflow will use this information to construct a batch job that consists of mysim.exe and calib.data and uses them to produce output.txt:
# This is a correct rule.

output.txt: mysim.exe calib.data
        ./mysim.exe -c calib.data -o output.txt
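
If you generate or review many rules, a rough check like the following Python sketch can help catch the mistake shown above. It is only a heuristic and not part of Makeflow: the function name undeclared_files is hypothetical, and it assumes that any command word containing a dot is a file name.

```python
import shlex

def undeclared_files(targets, sources, command):
    """Return command words that look like files but are not listed
    among the rule's targets or sources."""
    declared = set(targets) | set(sources)
    missing = []
    for word in shlex.split(command):
        if word.startswith("-"):
            continue  # skip option flags such as -c and -o
        name = word[2:] if word.startswith("./") else word
        # Heuristic (an assumption): a word with a dot is a file name.
        if "." in name and name not in declared and name not in missing:
            missing.append(name)
    return missing

command = "./mysim.exe -c calib.data -o output.txt"
bad = undeclared_files(["output.txt"], [], command)
good = undeclared_files(["output.txt"], ["mysim.exe", "calib.data"], command)
print(bad)
print(good)
```

For the incorrect rule the sketch reports mysim.exe and calib.data; for the correct rule it reports nothing.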

No Phony Rules

For a similar reason, you cannot have "phony" rules that don't actually create the specified files. For example, it is common practice to define a clean rule in Make that deletes all derived files. This doesn't make sense in Makeflow, because such a rule does not actually create a file named clean. Instead use the -c option as shown above.

Just Plain Rules

Makeflow does not support all of the syntax that you find in various versions of Make. Each rule must have exactly one command to execute. If you have multiple commands, simply join them together with semicolons. Makeflow allows you to define and use variables, but it does not support pattern rules, wildcards, or special variables like $< or $@. You simply have to write out the rules longhand, or write a script in your favorite language to generate a large Makeflow.
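
As an illustration of generating rules with a script, here is a short Python sketch that writes out the four swirl rules from the example at the top of this manual. The function name swirl_rules is hypothetical, and the sketch assumes that command lines are indented with a tab as in Make. The generated text still relies on CONVERT being defined elsewhere in the Makeflow.

```python
def swirl_rules(image, angles, convert="$CONVERT"):
    """Return Makeflow rule text that swirls `image` by each angle."""
    stem, ext = image.rsplit(".", 1)
    lines = []
    for angle in angles:
        target = "%s.%d.%s" % (stem, angle, ext)
        # Rule header: the target, then every file the command needs.
        lines.append("%s: %s %s" % (target, image, convert))
        # Exactly one command per rule, indented with a tab (assumed).
        lines.append("\t%s -swirl %d %s %s" % (convert, angle, image, target))
        lines.append("")
    return "\n".join(lines)

text = swirl_rules("capitol.jpg", [90, 180, 270, 360])
print(text)
```

Redirect the output of such a script into a file, add the variable definitions and any remaining rules, and run the result with makeflow as usual.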

Local Job Execution

Certain jobs don't make much sense to distribute. For example, if you have a very fast running job that consumes a large amount of data, then it should simply run on the same machine as Makeflow. To force this, simply add the word LOCAL to the beginning of the command line in the rule.

Batch Job Refinement

When executing jobs, Makeflow simply uses the default settings in your batch system. If you need to pass additional options, use the BATCH_OPTIONS variable or the -B option to Makeflow.

When using Condor, this string will be added to each submit file. For example, if you want to add a Requirements line to your Condor submit files, add this to your Makeflow:

BATCH_OPTIONS = Requirements = (Memory>1024)

When using SGE, the string will be added to the qsub options. For example, to specify that jobs should be submitted to the devel queue:

BATCH_OPTIONS = -q devel

Displaying a Makeflow

When run with the -D option, Makeflow will emit a diagram of the Makeflow in the Graphviz DOT format. If you have dot installed, then you can generate an image of your workload like this:
% makeflow -D example.makeflow | dot -T gif > example.gif
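
If you have not seen the DOT format before, the following Python sketch shows roughly what a DOT description of a dependency graph looks like. It is not Makeflow's own output, which may be laid out and labeled differently; the function name rules_to_dot and the graph name workflow are assumptions made for the example.

```python
def rules_to_dot(rules):
    """Render a mapping of target -> source files as a DOT digraph."""
    lines = ["digraph workflow {"]
    for target in sorted(rules):
        for source in rules[target]:
            # One edge per dependency, pointing from source to target.
            lines.append('    "%s" -> "%s";' % (source, target))
    lines.append("}")
    return "\n".join(lines)

dot_text = rules_to_dot({
    "output.txt": ["mysim.exe", "calib.data"],
})
print(dot_text)
```

Text in this form can be piped into dot in the same way as the makeflow -D output above.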

For More Information

For the latest information about Makeflow, please visit our web site and subscribe to our mailing list.