work_queue_stats Struct Reference

Statistics describing a work queue.

#include <work_queue.h>

Data Fields

int workers_connected
 Number of workers currently connected to the master.
int workers_init
 Number of workers connected that have not yet sent their available resources report.
int workers_idle
 Number of workers that are not running a task.
int workers_busy
 Number of workers that are running at least one task.
int workers_able
 Number of workers on which the largest task can run.
int workers_joined
 Total number of worker connections that were established to the master.
int workers_removed
 Total number of worker connections that were released by the master, idled-out, fast-aborted, or lost.
int workers_released
 Total number of worker connections that were asked by the master to disconnect.
int workers_idled_out
 Total number of workers that disconnected for being idle.
int workers_fast_aborted
 Total number of worker connections terminated for being too slow.
int workers_blacklisted
 Total number of workers blacklisted by the master.
int workers_lost
 Total number of worker connections that were unexpectedly lost.
int tasks_waiting
 Number of tasks waiting to be dispatched.
int tasks_on_workers
 Number of tasks currently dispatched to some worker.
int tasks_running
 Number of tasks currently executing at some worker.
int tasks_with_results
 Number of tasks whose results have been retrieved and that are waiting to be returned to the user.
int tasks_submitted
 Total number of tasks submitted to the queue.
int tasks_dispatched
 Total number of tasks dispatched to workers.
int tasks_done
 Total number of tasks completed and returned to user.
int tasks_failed
 Total number of tasks completed and returned to user with result other than WQ_RESULT_SUCCESS.
int tasks_cancelled
 Total number of tasks cancelled.
int tasks_exhausted_attempts
 Total number of task executions that failed due to resource exhaustion.
timestamp_t time_when_started
 Absolute time at which the master started.
timestamp_t time_send
 Total time spent sending tasks to workers (task descriptions and input files).
timestamp_t time_receive
 Total time spent receiving results from workers (output files).
timestamp_t time_send_good
 Total time spent in sending data to workers for tasks with result WQ_RESULT_SUCCESS.
timestamp_t time_receive_good
 Total time spent in receiving data from workers for tasks with result WQ_RESULT_SUCCESS.
timestamp_t time_status_msgs
 Total time spent sending and receiving status messages to and from workers, including workers' standard output, new worker connections, resource updates, etc.
timestamp_t time_internal
 Total time the queue spends in internal processing.
timestamp_t time_polling
 Total time spent blocked waiting for worker communications (i.e., the master is idle, waiting for a worker message).
timestamp_t time_application
 Total time spent outside work_queue_wait.
timestamp_t time_workers_execute
 Total time workers spent executing done tasks.
timestamp_t time_workers_execute_good
 Total time workers spent executing done tasks with result WQ_RESULT_SUCCESS.
timestamp_t time_workers_execute_exhaustion
 Total time workers spent executing tasks that exhausted resources.
int64_t bytes_sent
 Total number of file bytes (not including protocol control msg bytes) sent out to the workers by the master.
int64_t bytes_received
 Total number of file bytes (not including protocol control msg bytes) received from the workers by the master.
double bandwidth
 Average network bandwidth in MB/s observed by the master when transferring to workers.
int capacity_tasks
 The estimated number of tasks that this master can effectively support.
int capacity_cores
 The estimated number of workers' cores that this master can effectively support.
int capacity_memory
 The estimated number of workers' MB of RAM that this master can effectively support.
int capacity_disk
 The estimated number of workers' MB of disk that this master can effectively support.
int64_t total_cores
 Total number of cores aggregated across the connected workers.
int64_t total_memory
 Total memory in MB aggregated across the connected workers.
int64_t total_disk
 Total disk space in MB aggregated across the connected workers.
int64_t committed_cores
 Committed number of cores aggregated across the connected workers.
int64_t committed_memory
 Committed memory in MB aggregated across the connected workers.
int64_t committed_disk
 Committed disk space in MB aggregated across the connected workers.
int64_t max_cores
 The highest number of cores observed among the connected workers.
int64_t max_memory
 The largest memory size in MB observed among the connected workers.
int64_t max_disk
 The largest disk space in MB observed among the connected workers.
int64_t min_cores
 The lowest number of cores observed among the connected workers.
int64_t min_memory
 The smallest memory size in MB observed among the connected workers.
int64_t min_disk
 The smallest disk space in MB observed among the connected workers.
int total_workers_connected
int total_workers_joined
int total_workers_removed
int total_workers_lost
int total_workers_idled_out
int total_workers_fast_aborted
int tasks_complete
int total_tasks_dispatched
int total_tasks_complete
int total_tasks_failed
int total_tasks_cancelled
int total_exhausted_attempts
timestamp_t start_time
timestamp_t total_send_time
timestamp_t total_receive_time
timestamp_t total_good_transfer_time
timestamp_t total_execute_time
timestamp_t total_good_execute_time
timestamp_t total_exhausted_execute_time
int64_t total_bytes_sent
int64_t total_bytes_received
double capacity
double efficiency
double idle_percentage
int64_t total_gpus
int64_t committed_gpus
int64_t max_gpus
int64_t min_gpus
int port
int priority
int workers_ready
int workers_full
int total_worker_slots
int avg_capacity

Detailed Description

Statistics describing a work queue.


Field Documentation

Number of workers currently connected to the master.

Number of workers connected that have not yet sent their available resources report.

Number of workers that are not running a task.

Number of workers that are running at least one task.

Number of workers on which the largest task can run.

Total number of worker connections that were established to the master.

Total number of worker connections that were released by the master, idled-out, fast-aborted, or lost.

Total number of worker connections that were asked by the master to disconnect.

Total number of workers that disconnected for being idle.

Total number of worker connections terminated for being too slow.

(see work_queue_activate_fast_abort)

Total number of workers blacklisted by the master.

(Includes workers_fast_aborted.)

Total number of worker connections that were unexpectedly lost.

(does not include idled-out or fast-aborted)
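The worker removal counters above can be cross-checked against each other. The following is a minimal sketch, not part of the library: `struct worker_counters`, `removal_causes_total` and `removals_consistent` are hypothetical names, and the struct is a stand-in that copies only the relevant fields so that the example compiles without cctools. It assumes, from the description of workers_removed, that the total equals the sum of the four listed causes.

```c
#include <stdbool.h>

/* Hypothetical stand-in holding only the counters used here. With cctools
 * installed, struct work_queue_stats from work_queue.h would be used instead. */
struct worker_counters {
    int workers_joined;
    int workers_removed;
    int workers_released;
    int workers_idled_out;
    int workers_fast_aborted;
    int workers_lost;
};

/* Sum of the four removal causes named in the workers_removed description. */
int removal_causes_total(const struct worker_counters *s)
{
    return s->workers_released + s->workers_idled_out
         + s->workers_fast_aborted + s->workers_lost;
}

/* True when workers_removed matches the sum of its listed causes.
 * This equality is an assumption drawn from the field descriptions,
 * not a documented guarantee. */
bool removals_consistent(const struct worker_counters *s)
{
    return s->workers_removed == removal_causes_total(s);
}
```

In a real program the values would presumably come from a struct filled in by work_queue_get_stats; a mismatch would only indicate that the assumed relationship does not hold for that version of the library.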

Number of tasks waiting to be dispatched.

Number of tasks currently dispatched to some worker.

Number of tasks currently executing at some worker.

Number of tasks whose results have been retrieved and that are waiting to be returned to the user.

Total number of tasks submitted to the queue.

Total number of tasks dispatched to workers.

Total number of tasks completed and returned to user.

(includes tasks_failed)

Total number of tasks completed and returned to user with result other than WQ_RESULT_SUCCESS.

Total number of tasks cancelled.

Total number of task executions that failed due to resource exhaustion.
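Because tasks_done includes tasks_failed, the number of successful tasks has to be derived by subtraction. A minimal sketch follows; `struct task_counters`, `tasks_succeeded` and `task_success_fraction` are hypothetical names, and the struct is a stand-in for the two relevant fields of work_queue_stats.

```c
/* Hypothetical stand-in for the two task counters used here. */
struct task_counters {
    int tasks_done;    /* completed and returned to user (includes failed) */
    int tasks_failed;  /* returned with a result other than WQ_RESULT_SUCCESS */
};

/* Tasks returned to the user with WQ_RESULT_SUCCESS. */
int tasks_succeeded(const struct task_counters *s)
{
    return s->tasks_done - s->tasks_failed;
}

/* Fraction of returned tasks that succeeded; 0.0 when nothing was returned yet. */
double task_success_fraction(const struct task_counters *s)
{
    if (s->tasks_done <= 0) {
        return 0.0;
    }
    return (double) tasks_succeeded(s) / (double) s->tasks_done;
}
```

Returning 0.0 for an empty queue is an arbitrary choice made here to avoid a division by zero; a caller may prefer to treat that case separately.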

Absolute time at which the master started.

Total time spent sending tasks to workers (task descriptions and input files).

Total time spent receiving results from workers (output files).

Total time spent in sending data to workers for tasks with result WQ_RESULT_SUCCESS.

Total time spent in receiving data from workers for tasks with result WQ_RESULT_SUCCESS.

Total time spent sending and receiving status messages to and from workers, including workers' standard output, new worker connections, resource updates, etc.

Total time the queue spends in internal processing.

Total time spent blocked waiting for worker communications (i.e., the master is idle, waiting for a worker message).

Total time spent outside work_queue_wait.

Total time workers spent executing done tasks.

Total time workers spent executing done tasks with result WQ_RESULT_SUCCESS.

Total time workers spent executing tasks that exhausted resources.
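The master-side time fields can be combined to see where the master spends its time. The sketch below is illustrative only: `struct master_times`, `master_time_total` and `master_time_share` are hypothetical names, the typedef assumes that timestamp_t is an unsigned 64-bit count of microseconds, and treating the six categories as a complete breakdown is also an assumption.

```c
#include <stdint.h>

/* Assumption: timestamp_t is an unsigned 64-bit count of microseconds. */
typedef uint64_t timestamp_t;

/* Hypothetical stand-in for the master-side time fields of work_queue_stats. */
struct master_times {
    timestamp_t time_send;
    timestamp_t time_receive;
    timestamp_t time_status_msgs;
    timestamp_t time_internal;
    timestamp_t time_polling;
    timestamp_t time_application;
};

/* Sum of the six master-side time categories. */
timestamp_t master_time_total(const struct master_times *s)
{
    return s->time_send + s->time_receive + s->time_status_msgs
         + s->time_internal + s->time_polling + s->time_application;
}

/* Share of the summed time taken by one category, in the range [0, 1].
 * Returns 0.0 when no time has been recorded. */
double master_time_share(timestamp_t part, const struct master_times *s)
{
    timestamp_t total = master_time_total(s);
    if (total == 0) {
        return 0.0;
    }
    return (double) part / (double) total;
}
```

For example, `master_time_share(s.time_polling, &s)` gives the fraction of the summed time in which the master was idle waiting for workers.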

Total number of file bytes (not including protocol control msg bytes) sent out to the workers by the master.

Total number of file bytes (not including protocol control msg bytes) received from the workers by the master.

Average network bandwidth in MB/s observed by the master when transferring to workers.
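A rough send rate can also be derived from the byte and time counters. This is a sketch under stated assumptions, not the way the bandwidth field itself is computed: `send_rate_mb_per_s` is a hypothetical helper, the time argument is assumed to be in microseconds, and 1 MB is taken as 10^6 bytes.

```c
#include <stdint.h>

/* Approximate send rate in MB/s from a byte count and a duration.
 * Assumptions: time_send_us is in microseconds and 1 MB = 1e6 bytes.
 * Returns 0.0 when there is nothing to measure. */
double send_rate_mb_per_s(int64_t bytes_sent, uint64_t time_send_us)
{
    if (bytes_sent <= 0 || time_send_us == 0) {
        return 0.0;
    }
    double megabytes = (double) bytes_sent / 1e6;
    double seconds = (double) time_send_us / 1e6;
    return megabytes / seconds;
}
```

Since time_send also covers task descriptions while bytes_sent counts only file bytes, the result is at best a lower bound on the file transfer rate.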

The estimated number of tasks that this master can effectively support.

The estimated number of workers' cores that this master can effectively support.

The estimated number of workers' MB of RAM that this master can effectively support.

The estimated number of workers' MB of disk that this master can effectively support.

Total number of cores aggregated across the connected workers.

Total memory in MB aggregated across the connected workers.

Total disk space in MB aggregated across the connected workers.

Committed number of cores aggregated across the connected workers.

Committed memory in MB aggregated across the connected workers.

Committed disk space in MB aggregated across the connected workers.
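The total and committed aggregates together indicate how much of the connected workers' resources is still unassigned. A minimal sketch follows; `struct resource_totals`, `uncommitted` and `uncommitted_cores` are hypothetical names, and the struct is a stand-in for the corresponding fields of work_queue_stats.

```c
#include <stdint.h>

/* Hypothetical stand-in for the total_* and committed_* fields. */
struct resource_totals {
    int64_t total_cores;
    int64_t total_memory;      /* MB */
    int64_t total_disk;        /* MB */
    int64_t committed_cores;
    int64_t committed_memory;  /* MB */
    int64_t committed_disk;    /* MB */
};

/* Amount not yet committed. Clamped at zero in case the committed value
 * exceeds the total (a defensive assumption, not documented behavior). */
int64_t uncommitted(int64_t total, int64_t committed)
{
    return committed >= total ? 0 : total - committed;
}

/* Cores across the connected workers that are not committed to any task. */
int64_t uncommitted_cores(const struct resource_totals *r)
{
    return uncommitted(r->total_cores, r->committed_cores);
}
```

The same helper applies to memory and disk, for example `uncommitted(r->total_memory, r->committed_memory)`.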

The highest number of cores observed among the connected workers.

The largest memory size in MB observed among the connected workers.

The largest disk space in MB observed among the connected workers.

The lowest number of cores observed among the connected workers.

The smallest memory size in MB observed among the connected workers.

The smallest disk space in MB observed among the connected workers.

deprecated fields:

Deprecated:
Use workers_connected instead.
Deprecated:
Use workers_joined instead.
Deprecated:
Use workers_removed instead.
Deprecated:
Use workers_lost instead.
Deprecated:
Use workers_idled_out instead.
Deprecated:
Use workers_fast_aborted instead.
Deprecated:
Use tasks_with_results.
Deprecated:
Use tasks_dispatched instead.
Deprecated:
Use tasks_done instead.
Deprecated:
Use tasks_failed instead.
Deprecated:
Use tasks_cancelled instead.
Deprecated:
Use tasks_exhausted_attempts instead.
Deprecated:
Use time_when_started.
Deprecated:
Use time_send.
Deprecated:
Use time_receive.
Deprecated:
Use time_send_good + time_receive_good.
Deprecated:
Use time_workers_execute.
Deprecated:
Use time_workers_execute_good.
Deprecated:
Use time_workers_execute_exhaustion.
Deprecated:
Use bytes_sent.
Deprecated:
Use bytes_received.
Deprecated:
Use capacity_cores.
Deprecated:
: broken.
Deprecated:
: broken.
Deprecated:
: broken.
Deprecated:
: broken.
Deprecated:
: broken.
Deprecated:
Use work_queue_port. Port of the queue.
Deprecated:
Not used.
Deprecated:
Use workers_idle instead.
Deprecated:
Use workers_busy instead.
Deprecated:
Use tasks_running instead.
Deprecated:
Use capacity_cores instead.

The documentation for this struct was generated from the following file:

Generated on 17 Oct 2016 for cctools by  doxygen 1.6.1