Last edited: May 2015
Prune is Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005- The University of Notre Dame.
All rights reserved.
This software is distributed under the GNU General Public License.
See the file COPYING for details.
Prune is a system for executing and precisely preserving scientific workflows. Collaborators can verify research results and easily extend them at a granularity that makes sense to users, since the granularity was determined by a user.
The following basic commands are fundamental to PRUNE:
PUT [local.txt] AS [prune_name]: Files to be used in the workflow can be added to the preserved namespace.
Ex. PUT mylocalfilename.txt AS data_name_in_prune
EVAL <name(s)> = <expression> (default): Evaluate an expression and assign the results to the specified name(s). Work defined in these statements is not executed until control is passed to PRUNE, for example with the USE and WORK commands.
Ex. sorted_data = sort(input_data)
USE <framework> [<arguments>]: Define how the execution should be performed
Ex. USE wq prune #Start and use a Work Queue master with the name 'prune'
Ex. USE local 1 #Use 1 local execution thread
WORK [FOR <timeout>]: Blocking function that instructs PRUNE to start executing the workflow with whatever resources are available.
Ex. WORK FOR 60 #Block for 60 seconds so PRUNE can work
GET [prune_name] AS [local.txt]: Files generated by the workflow can be put into the user's namespace for use by some external program.
Ex. GET data_name_in_prune AS mylocalfilename.txt
Functions are currently defined as script files, which are PUT into the system. Header comments in the script declare the types of arguments the script expects and the output files that should be preserved.
Ex. #PRUNE_INPUTS File File Text
Ex. #PRUNE_OUTPUT outfile1.txt
Ex. #PRUNE_OUTPUT outfile2.txt
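As a minimal sketch of how these header lines might sit in a complete function script, the snippet below writes a hypothetical script, filter_join.sh, and runs it by hand outside PRUNE. The script name, the sample files a.txt and b.txt, the pattern "ap", and the scratch folder prune_function_demo are all made up for illustration. It assumes that PRUNE passes arguments positionally in the declared order and that a Text argument arrives as a plain string; neither is confirmed here beyond the use of $1 in the sort.sh example.

```shell
# Scratch folder for the demonstration (hypothetical name).
demo=prune_function_demo
mkdir -p "$demo"

# Hypothetical function script: two File inputs, one Text input, two outputs.
cat > "$demo/filter_join.sh" <<'EOF'
#!/bin/bash
#PRUNE_INPUTS File File Text
#PRUNE_OUTPUT outfile1.txt
#PRUNE_OUTPUT outfile2.txt
# Assumption: $1 and $2 are the File inputs, $3 is the Text input.
grep -- "$3" "$1" > outfile1.txt
grep -- "$3" "$2" > outfile2.txt
EOF

# Sample inputs, invented for this sketch.
printf 'apple\nbanana\n' > "$demo/a.txt"
printf 'cherry\napricot\n' > "$demo/b.txt"

# Run the script by hand, the way a File File Text call might look.
(cd "$demo" && bash filter_join.sh a.txt b.txt ap)
```

After the run, outfile1.txt holds the matching line from a.txt and outfile2.txt the matching line from b.txt; these are the two files the header asks PRUNE to preserve.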
The following commands are also available for convenience:
When PRUNE is started using the "prune" executable, an interface becomes available for the user to type commands. Ctrl-C will terminate PRUNE, and upon restarting, any previously running tasks revert to the "Run" status.
The user is initially prompted to specify locations for a database file, data folder, and sandbox folder. This metadata is stored in the working directory as a file named .prune.conf for PRUNE to use later.
Here is a full example of a merge sort in PRUNE:
$ prune
$ PUT sort.sh AS sort
$ PUT merge.sh AS merge
$ PUT nouns.txt AS nouns
$ PUT verbs.txt AS verbs
$ n = sort(nouns)
$ v = sort(verbs)
$ merged_result = merge(n,v)
$ USE wq prune
Work Queue master started on port 9001 with name 'prune'...
$ WORK
...
PRUNE finished work in: 01m16s
$ GET merged_result AS test.txt
Contents of the file "sort.sh":
#!/bin/bash
#PRUNE_INPUTS File
#PRUNE_OUTPUT sorted_data.txt
# Quote the argument so file names containing spaces are handled correctly.
sort "$1" > sorted_data.txt
Contents of the file "merge.sh":
#!/bin/bash
#PRUNE_INPUTS File*
#PRUNE_OUTPUT merged_output.txt
# Merge the already sorted inputs; quoting "$@" keeps each file name intact.
sort -m "$@" > merged_output.txt
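To check the two scripts' logic without starting PRUNE, the same data flow can be reproduced by hand. This is only a sketch: the folder name prune_mergesort_demo, the intermediate file names, and the sample words are invented for illustration, and it does not exercise any PRUNE preservation or scheduling behavior.

```shell
# Scratch folder with small sample inputs (contents invented for this sketch).
demo=prune_mergesort_demo
mkdir -p "$demo"
printf 'dog\napple\ncat\n' > "$demo/nouns.txt"
printf 'run\neat\n' > "$demo/verbs.txt"

(
  cd "$demo" || exit 1
  sort nouns.txt > n.txt                    # corresponds to: n = sort(nouns)
  sort verbs.txt > v.txt                    # corresponds to: v = sort(verbs)
  sort -m n.txt v.txt > merged_result.txt   # corresponds to: merged_result = merge(n,v)
  cat merged_result.txt
)
```

With these sample inputs the merged file lists apple, cat, dog, eat, run, one word per line. Note that sort -m only merges and does not re-sort, so it relies on both inputs already being sorted, which is why the sort step must run first.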