chirp_stream_files(1)

NAME

chirp_stream_files - move data to/from chirp servers in parallel

SYNOPSIS

chirp_stream_files [options] <copy|split|join> <localfile> { <hostname[:port]> <remotefile> }

DESCRIPTION

chirp_stream_files is a tool for moving data between one machine and many machines, with the option to split or join the file along the way. It is useful for constructing scatter-gather applications on top of Chirp.

chirp_stream_files copy duplicates a single file to multiple hosts. The <localfile> argument names a file on the local filesystem. The command opens a connection to each host listed after it and streams the file to all of them simultaneously.

chirp_stream_files split divides an ASCII file up among multiple hosts. The first line of <localfile> is sent to the first host, the second line to the second, and so on, round-robin.
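
The round-robin distribution described above can be pictured with a purely local sketch. It uses awk rather than chirp_stream_files itself, and the file names (mydata, part1 through part3) and the choice of three destinations are assumptions made for illustration. It only mimics which line ends up in which destination, not how the tool streams data to Chirp servers.

```shell
# Local illustration only: mimic round-robin line distribution with awk.
# "mydata" and "part1".."part3" are hypothetical names; n=3 stands in for
# the number of <hostname> <remotefile> pairs given on the command line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small seven-line input file.
printf '%s\n' line1 line2 line3 line4 line5 line6 line7 > mydata

# Line k (1-based) goes to part number ((k - 1) mod n) + 1.
awk -v n=3 '{ out = "part" ((NR - 1) % n + 1); print > out }' mydata

cat part1
```

In this sketch part1 receives lines 1, 4 and 7, part2 receives lines 2 and 5, and part3 receives lines 3 and 6.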

chirp_stream_files join collects multiple remote files into one. The argument <localfile> is opened for writing, and the remote files for reading. The remote files are read line-by-line and assembled round-robin into the local file.
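
The round-robin assembly can likewise be sketched locally. The example below uses the standard paste utility, not chirp_stream_files, and the file names are assumptions made for illustration. It assumes every input has the same number of lines; how chirp_stream_files join treats inputs of unequal length is not specified here, and paste would emit empty lines in that case.

```shell
# Local illustration only: mimic round-robin line assembly with paste.
# "part1".."part3" and "newdata" are hypothetical names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three inputs with two lines each.
printf '%s\n' a1 a2 > part1
printf '%s\n' b1 b2 > part2
printf '%s\n' c1 c2 > part3

# With newline as the delimiter, paste emits one line from each file in turn.
paste -d '\n' part1 part2 part3 > newdata

cat newdata
```

The resulting newdata holds the lines in the order a1, b1, c1, a2, b2, c2.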

In all cases, files are accessed in a streaming manner, making this particularly efficient for processing large files. A local file name of - indicates standard input or standard output, so that the command can be used in a pipeline.

OPTIONS

-a,--auth <flag>
Require this authentication mode.
-b,--block-size <size>
Set transfer buffer size. (default is 1048576 bytes)
-d,--debug <flag>
Enable debugging for this subsystem.
-i,--tickets <files>
Comma-delimited list of tickets to use for authentication.
-t,--timeout <time>
Timeout for failure. (default is 3600s)
-v,--version
Show program version.
-h,--help
Show help text.

EXIT STATUS

On success, returns zero. On failure, returns non-zero.

EXAMPLES

To copy the file mydata to three locations:
% chirp_stream_files copy mydata server1.somewhere.edu /mydata \
                                 server2.somewhere.edu /mydata \
                                 server3.somewhere.edu /mydata

To split the file mydata into subsets at three locations:
% chirp_stream_files split mydata server1.somewhere.edu /part1 \
                                  server2.somewhere.edu /part2 \
                                  server3.somewhere.edu /part3

To join three remote files back into one called newdata:
% chirp_stream_files join newdata server1.somewhere.edu /part1 \
                                  server2.somewhere.edu /part2 \
                                  server3.somewhere.edu /part3

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2011 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

  • Cooperative Computing Tools Documentation
  • Chirp User Manual
  • chirp(1)
  • chirp_status(1)
  • chirp_fuse(1)
  • chirp_get(1)
  • chirp_put(1)
  • chirp_stream_files(1)
  • chirp_distribute(1)
  • chirp_benchmark(1)
  • chirp_server(1)
  • chirp_server_hdfs(1)

  • CCTools 4.1.4rc5 released on 04/10/2014