work_queue_pool(1)

NAME

work_queue_pool - maintain a pool of Work Queue workers on a batch system.

SYNOPSIS

work_queue_pool -M <project-name> -T <batch-type> [options]

DESCRIPTION

work_queue_pool submits and maintains a number of work_queue_worker(1) processes on various batch systems, such as Condor and SGE. All the workers managed by a work_queue_pool process will be directed to work for a specific master, or any set of masters matching a given project name. work_queue_pool will automatically determine the correct number of workers to have running, based on criteria set on the command line. The decision on how many workers to run is reconsidered once per minute.

By default, work_queue_pool will run as many workers as the indicated masters have tasks ready to run. If there are multiple masters, then enough workers will be started to satisfy their collective needs. For example, if there are two masters with the same project name, each with 10 tasks to run, then work_queue_pool will start a total of 20 workers.

If the number of needed workers increases, work_queue_pool will submit more workers to meet the need. However, it will never run more than the maximum number of workers given by the -W option.

If the need for workers drops, work_queue_pool does not remove them immediately, but waits for them to exit on their own. (This happens when a worker has been idle for a certain time.) A minimum number of workers will be maintained, given by the -w option.

If given the -c option, then work_queue_pool will consider the capacity reported by each master. The capacity is the estimated number of workers that the master thinks it can handle, based on the task execution and data transfer times currently observed at the master. With the -c option on, work_queue_pool will consider the master's capacity to be the maximum number of workers to run.
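The policy described above can be illustrated with a short shell sketch. This is not the work_queue_pool source, only a model of the documented behavior. The function name desired_workers and the TASKS:CAPACITY argument format are hypothetical, and applying the capacity cap to each master before summing is an assumption, since the text does not say how capacities from several masters are combined. It also assumes the minimum does not exceed the maximum.

```shell
#!/bin/sh
# Sketch of the documented sizing policy; names and argument format are hypothetical.
# Usage: desired_workers MIN MAX USE_CAPACITY TASKS:CAPACITY [TASKS:CAPACITY ...]
desired_workers() {
    min=$1
    max=$2
    use_capacity=$3
    shift 3
    total=0
    for master in "$@"; do
        tasks=${master%%:*}      # tasks ready to run at this master
        capacity=${master##*:}   # capacity reported by this master
        need=$tasks
        # With -c, treat the reported capacity as a cap (assumed to apply per master).
        if [ "$use_capacity" -eq 1 ] && [ "$need" -gt "$capacity" ]; then
            need=$capacity
        fi
        total=$((total + need))
    done
    # Never drop below the -w minimum.
    if [ "$total" -lt "$min" ]; then
        total=$min
    fi
    # Never exceed the -W maximum.
    if [ "$total" -gt "$max" ]; then
        total=$max
    fi
    echo "$total"
}

# Two masters with 10 tasks each, defaults -w 5 -W 100, no -c:
desired_workers 5 100 0 10:50 10:50   # prints 20
```

With the third argument set to 1, a single master holding 200 tasks but reporting a capacity of 30 yields 30 workers in this model.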

If work_queue_pool receives a terminating signal, it will attempt to remove all running workers before exiting.

OPTIONS

Batch Options

-M,--master-name <project>
Name of a preferred project. A worker can have multiple preferred projects.
-T,--batch-type <type>
Batch system type: unix, condor, sge, workqueue, xgrid. (default is unix)
-w,--min-workers <workers>
Minimum workers running. (default=5)
-W,--max-workers <workers>
Maximum workers running. (default=100)
-c,--capacity
Use worker capacity reported by masters.
-P,--password <file>
Password file for workers to authenticate to master.
-t,--timeout <time>
Workers abort after this amount of idle time, in seconds.
-E,--extra-options <options>
Extra options that should be added to the worker.
-S,--scratch <file>
Scratch directory. (default is /tmp/${USER}-workers)
-d,--debug <flag>
Enable debugging for this subsystem.
-o,--debug-file <file>
Write debugging output to this file. By default, debugging is sent to stderr (":stderr"). You may specify logs be sent to stdout (":stdout"), to the system syslog (":syslog"), or to the systemd journal (":journal").
-h,--help
Show this screen.

EXIT STATUS

On success, returns zero. On failure, returns non-zero.

EXAMPLES

Suppose you have a Work Queue master with a project name of "barney". To maintain workers for barney, do this:
work_queue_pool -T condor -M barney
To maintain a maximum of 100 workers on an SGE batch system, do this:
work_queue_pool -T sge -M barney -W 100
To start workers according to the master's capacity, such that the workers exit after 5 minutes (300s) of idleness:
work_queue_pool -T condor -M barney -c -t 300
If you want to start workers that match any project that begins with barney, use a regular expression:
work_queue_pool -T condor -M barney.\* -c -t 300

KNOWN BUGS

The capacity measurement currently assumes single-core tasks running on single-core workers, and behaves unexpectedly with multi-core tasks or multi-core workers.

When generating a worker pool for a foreman, specify a minimum number of workers to run at all times (see the -w option). Otherwise, the master will not assign any tasks to the foreman, because it (initially) has no workers.

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2011 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

  • Cooperative Computing Tools Documentation
  • Work Queue User Manual
  • work_queue_worker(1)
  • work_queue_status(1)
  • work_queue_pool(1)
  • condor_submit_workers(1)
  • sge_submit_workers(1)
  • torque_submit_workers(1)
  • ec2_submit_workers(1)
  • ec2_remove_workers(1)

  • CCTools 4.3.0rcpre-parrot released on 09/08/2014