work_queue_pool(1)

NAME

work_queue_pool - submit a pool of Work Queue workers on various batch systems.

SYNOPSIS

work_queue_pool [options] <hostname> <port> <number>

or

work_queue_pool [options] -a <number>

DESCRIPTION

work_queue_pool submits and maintains a number of work_queue_worker(1) processes on various batch systems, such as Condor and SGE. Each work_queue_worker process represents a Work Queue worker. All the Work Queue workers managed by a work_queue_pool process can be pointed to a specific Work Queue master, or be instructed to find their preferred masters through a catalog server.

If the <hostname> and <port> arguments are provided, the workers maintained by the work_queue_pool process will work only for the master running at <hostname>:<port>. If the -a option is given, the <hostname> and <port> arguments are not needed, and the workers will contact a catalog server to find appropriate masters (see the -N option). In either case, the <number> argument specifies the number of workers that work_queue_pool should maintain.
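For instance, assuming a master running on the hypothetical host master.example.org at port 9123 and advertising the project name myproject, the two modes of invocation look like this:
work_queue_pool -T condor master.example.org 9123 5
work_queue_pool -T condor -a -N myproject 5
The first form ties the five workers to that specific master; the second lets them locate any master advertising the project name myproject through the catalog server.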

If a work_queue_worker process managed by the work_queue_pool shuts down (e.g. due to failure or eviction), the work_queue_pool re-submits a new work_queue_worker to the batch system specified by the -T option in order to maintain a constant <number> of work_queue_worker processes.

OPTIONS

Batch Options

-d <subsystem>
Enable debugging for this subsystem.
-S <scratch>
Scratch directory. (default is /tmp/${USER}-workers)
-T <type>
Batch system type: unix, condor, sge, workqueue, xgrid. (default is unix)
-r <count>
Number of attempts to retry a failed worker submission.
-W <path>
Path to the work_queue_worker(1) executable.
-h Show this screen.
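As a sketch, the following hypothetical invocation combines several of these options to submit 20 workers to SGE, retrying failed submissions up to 5 times, with a custom scratch directory and an explicit path to the worker executable (the paths and master address are placeholders):
work_queue_pool -T sge -S /tmp/$USER-pool -r 5 -W /usr/local/bin/work_queue_worker barney.nd.edu 9123 20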

Worker Options

-a Enable auto mode. In this mode the workers query a catalog server for available masters.
-t <time>
Abort workers after this amount of idle time.
-C <catalog>
Set catalog server to <catalog>. Format: HOSTNAME:PORT
-N <project>
Name of a preferred project. A worker can have multiple preferred projects.
-s Run as a shared worker. By default, workers work only for their preferred projects.
-o <file>
Send debugging to this file.
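For example, assuming the -t option is given in seconds, the following starts 10 workers in auto mode that prefer the project myproject, contact the catalog server at catalog.cse.nd.edu:9097, and abort after one hour of idle time (the catalog address and project name are placeholders):
work_queue_pool -T condor -a -C catalog.cse.nd.edu:9097 -N myproject -t 3600 10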

EXIT STATUS

On success, returns zero. On failure, returns non-zero.

EXAMPLES

Example 1

Suppose you have a Work Queue master running on barney.nd.edu and it is listening on port 9123. To start 10 workers on the Condor batch system for your master, you can invoke work_queue_pool like this:
work_queue_pool -T condor barney.nd.edu 9123 10
If you want to start the 10 workers on the SGE batch system instead, you only need to change the -T option:
work_queue_pool -T sge barney.nd.edu 9123 10
If you have access to both Condor and SGE, you can run both of the above commands, and you will then have 20 workers for your master.

Example 2

Suppose you have started a Work Queue master with makeflow(1) like this:
makeflow -T wq -a -N myproject makeflow.script
The -N option given to makeflow specifies the project name for the Work Queue master. The master's information, such as hostname and port, will be reported to a catalog server. The work_queue_pool program can start workers that prefer to work for this master by specifying the same project name on the command line (see the -N option):
work_queue_pool -T condor -N myproject -a 10
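Before starting the workers, you can confirm that the master has registered with the catalog server by running work_queue_status(1), which lists the known masters along with their project names:
work_queue_status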

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2011 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

  • Cooperative Computing Tools Documentation
  • Work Queue User Manual
  • work_queue_worker(1)
  • work_queue_status(1)
  • work_queue_pool(1)
  • condor_submit_workers(1)
  • sge_submit_workers(1)
  • ec2_submit_workers(1)
  • ec2_remove_workers(1)
