work_queue_pool(1)

NAME

work_queue_pool - submit a pool of Work Queue workers on various batch systems.

SYNOPSIS

work_queue_pool [options] <hostname> <port> <number>

or

work_queue_pool [options] -a <num-workers>

DESCRIPTION

work_queue_pool submits and maintains a number of work_queue_worker(1) processes on various batch systems, such as Condor and SGE. Each work_queue_worker process represents a Work Queue worker. All the Work Queue workers managed by a work_queue_pool process can be pointed to a specific Work Queue master, or be instructed to find their preferred masters through a catalog server.

If the <hostname> and <port> arguments are provided, the workers maintained by the work_queue_pool process work only for the master running at <hostname>:<port>. If the -a option is present, the <hostname> and <port> arguments are not needed, and the workers contact a catalog server to find appropriate masters (see the -M option). In either case, the <number> argument specifies the number of workers that work_queue_pool should maintain.

If a work_queue_worker process managed by work_queue_pool shuts down (e.g., because of failure or eviction), work_queue_pool submits a new work_queue_worker to the batch system specified by <type> in order to maintain a constant <number> of work_queue_worker processes.

OPTIONS

Batch Options

-d,--debug <flag>
Enable debugging for this subsystem.
-l,--logfile <logfile>
Log work_queue_pool status to logfile.
-S,--scratch <file>
Scratch directory. (default is /tmp/${USER}-workers)
-T,--batch-type <type>
Batch system type: unix, condor, sge, workqueue, xgrid. (default is unix)
-r,--retry <count>
Number of times to retry if a worker submission fails.
-m,--workers-per-job <count>
Each batch job will start <count> local workers. (default is 1)
-W,--worker-executable <path>
Path to the work_queue_worker(1) executable.
-A,--auto-pool-feature
Enable auto worker pool feature.
-c,--config <config>
Path to the pool configuration file. This option is only effective when the -A option is enabled. (default is work_queue_pool.conf)
-q,--one-shot
Guarantee running workers and quit. The workers terminate after their idle timeouts unless the user explicitly shuts them down.
-h,--help
Show this screen.
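
As a rough illustration of how the batch options combine, the sketch below assembles a command line that submits 10 workers to Condor, retries a failed submission up to 3 times, and logs pool status to a file. The host and port are borrowed from Example 1; the retry count and the log file name pool.log are arbitrary values chosen for illustration. The script only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Illustrative values only: host and port come from Example 1,
# the retry count and log file name are arbitrary assumptions.
MASTER_HOST=barney.nd.edu
MASTER_PORT=9123
NUM_WORKERS=10

# -T selects the batch system, -r sets the number of submission
# retries, and -l names the file that receives pool status.
POOL_CMD="work_queue_pool -T condor -r 3 -l pool.log $MASTER_HOST $MASTER_PORT $NUM_WORKERS"

# Print the assembled command for review instead of executing it.
echo "$POOL_CMD"
```

Once the printed command looks right, it can be pasted into a shell on a machine that has access to the batch system.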

Worker Options

-a,--advertise
Enable auto mode. In this mode the workers ask a catalog server for available masters. (deprecated, implied by -M)
-t,--timeout <time>
Abort after this amount of idle time.
-C,--catalog <catalog>
Set catalog server to <catalog>. Format: HOSTNAME:PORT
-M,--master-name <project>
Name of a preferred project. A worker can have multiple preferred projects.
-N
Same as -M,--master-name. (deprecated)
-o,--debug-file <file>
Send debugging to this file.
-E,--extra-options
Extra options that should be added to the worker.
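
The worker options can be combined in the same way. The sketch below assembles a command line for 10 Condor workers that prefer the project myproject, look it up on a specific catalog server, and abort after a period of idleness. The catalog address catalog.example.org:9097 is a hypothetical placeholder, and the timeout value is assumed to be in seconds, which this page does not state. Because -a is described as implied by -M, it is omitted here; whether a given release accepts this form should be checked with the -h option.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
PROJECT=myproject                   # project name advertised by the master
CATALOG=catalog.example.org:9097    # placeholder catalog server, HOSTNAME:PORT
IDLE_TIMEOUT=900                    # assumed to be seconds (units not documented here)
NUM_WORKERS=10

# -M names the preferred project, -C sets the catalog server,
# and -t sets the idle time after which a worker aborts.
POOL_CMD="work_queue_pool -T condor -M $PROJECT -C $CATALOG -t $IDLE_TIMEOUT $NUM_WORKERS"

# Print the assembled command for review instead of executing it.
echo "$POOL_CMD"
```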

EXIT STATUS

On success, returns zero. On failure, returns non-zero.
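
A calling script can use the exit status to detect a failed invocation. The helper below is a minimal sketch, not part of the Cooperative Computing Tools: run_checked is a hypothetical function name, and it simply runs whatever command it is given, reports a non-zero exit status on standard error, and passes that status back to the caller.

```shell
#!/bin/sh
# run_checked (hypothetical helper): run a command, report a
# non-zero exit status on stderr, and return that status.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status: $*" >&2
    fi
    return "$status"
}

# Usage with the invocation from Example 1 (not executed here):
#   run_checked work_queue_pool -T condor barney.nd.edu 9123 10 || exit 1
```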

EXAMPLES

Example 1

Suppose you have a Work Queue master running on barney.nd.edu and it is listening on port 9123. To start 10 workers on the Condor batch system for your master, you can invoke work_queue_pool like this:
work_queue_pool -T condor barney.nd.edu 9123 10
If you want to start the 10 workers on the SGE batch system instead, you only need to change the -T option:
work_queue_pool -T sge barney.nd.edu 9123 10
If you have access to both the Condor and SGE systems, you can run both of the above commands to get 20 workers for your master.

Example 2

Suppose you have started a Work Queue master with makeflow(1) like this:
makeflow -T wq -a -N myproject makeflow.script
The -N option given to makeflow specifies the project name for the Work Queue master. The master's information, such as hostname and port, will be reported to a catalog server. The work_queue_pool program can start workers that prefer to work for this master by specifying the same project name on the command line (see the -N option):
work_queue_pool -T condor -N myproject -a 10

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2011 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

  • Cooperative Computing Tools Documentation
  • Work Queue User Manual
  • work_queue_worker(1)
  • work_queue_status(1)
  • work_queue_pool(1)
  • condor_submit_workers(1)
  • sge_submit_workers(1)
  • torque_submit_workers(1)
  • ec2_submit_workers(1)
  • ec2_remove_workers(1)

  • CCTools 4.0.1 released on 07/31/2013