parrot_run_hdfs(1)

NAME

parrot_run_hdfs - run a program in the Parrot virtual file system with HDFS client setup

SYNOPSIS

parrot_run_hdfs [parrot_options] program [program_options]

DESCRIPTION

parrot_run_hdfs runs an application or a shell inside the Parrot virtual filesystem.

HDFS is the primary distributed filesystem used in the Hadoop project. Parrot supports read and write access to HDFS through the parrot_run_hdfs wrapper, which checks that the appropriate environment variables are defined and then calls parrot_run. See parrot_run(1).

In particular, you must define the following environment variables:

  • JAVA_HOME Location of your Java installation.
  • HADOOP_HOME Location of your Hadoop installation.

Based on these environment variables, parrot_run_hdfs attempts to find the appropriate paths for libjvm.so and libhdfs.so. These paths are stored in the environment variables LIBJVM_PATH and LIBHDFS_PATH, which the HDFS Parrot module uses to load the necessary shared libraries at run time. To avoid the startup overhead of searching for these libraries, you may set the paths manually in your environment before calling parrot_run_hdfs, or you may edit the script directly.
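
As a minimal sketch of setting the paths manually, the fragment below locates each library once and exports the result. It assumes that LIBJVM_PATH and LIBHDFS_PATH each hold the full path of the library file. The fallback directories and the find_lib helper are hypothetical and are not part of parrot_run_hdfs.

```shell
# Hypothetical fallback locations; substitute your own installation paths.
JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
export JAVA_HOME HADOOP_HOME

# find_lib DIR NAME: print the first file called NAME under DIR, if any.
# (Illustrative helper, not provided by Parrot.)
find_lib() {
    [ -d "$1" ] || return 0
    find "$1" -name "$2" 2>/dev/null | head -n 1
}

# Keep any value already set in the environment; otherwise search once.
# Assumption: each variable holds the full path to the library file.
LIBJVM_PATH="${LIBJVM_PATH:-$(find_lib "$JAVA_HOME" libjvm.so)}"
LIBHDFS_PATH="${LIBHDFS_PATH:-$(find_lib "$HADOOP_HOME" libhdfs.so)}"
export LIBJVM_PATH LIBHDFS_PATH
```

Once the two paths are known, they can be written as literal export lines in a shell startup file so that no search is needed at all.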

Note that while Parrot supports read access to HDFS, it provides only write-once support. This is because the current implementations of HDFS do not provide reliable append operations. Likewise, a file can be opened either for reading (O_RDONLY) or for writing (O_WRONLY), but not for both (O_RDWR).
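
One way to work within the write-once restriction is to build the output in a local file and then write it to HDFS in a single pass. The following is a sketch only: publish_once is a hypothetical helper, the paths are placeholders, and refusing to touch an existing destination is a conservative assumption rather than documented Parrot behavior.

```shell
# publish_once SRC DEST: copy a finished local file to DEST in a single pass.
# Illustrative helper, not provided by Parrot. It refuses to touch an
# existing DEST, a conservative choice since appends are not supported.
publish_once() {
    if [ -e "$2" ]; then
        echo "publish_once: $2 already exists, not modifying it" >&2
        return 1
    fi
    cp "$1" "$2"
}

# Intended use inside a shell started by parrot_run_hdfs (placeholder paths):
#   sort input.txt > /tmp/result.txt
#   publish_once /tmp/result.txt /hdfs/server:port/result.txt
```

Staging locally avoids any need to append to, or reopen for update, a file that already lives on HDFS.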

For complete details with examples, see the Parrot User's Manual.

OPTIONS

See parrot_run(1) for option listing.

ENVIRONMENT VARIABLES

  • JAVA_HOME Location of your Java installation.
  • HADOOP_HOME Location of your Hadoop installation.

EXIT STATUS

parrot_run_hdfs returns the exit status of the process that it runs. If parrot_run_hdfs is unable to start the process, it returns a non-zero status.
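
A calling script can act on this status in the usual way. The sketch below uses a hypothetical helper, run_and_report, which is not part of Parrot. Note that a non-zero status alone does not tell the caller whether the program itself failed or could not be started.

```shell
# run_and_report CMD [ARGS...]: run CMD and report a non-zero exit status.
# Illustrative helper, not provided by Parrot.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command exited with status $status: $*" >&2
    fi
    return "$status"
}

# Intended use (server:port and foo are placeholders):
#   run_and_report parrot_run_hdfs cat /hdfs/server:port/foo
```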

EXAMPLES

To access a single remote HDFS file using cat:

    % parrot_run_hdfs cat /hdfs/server:port/foo

You can also run an entire shell inside of Parrot, like this:

    % parrot_run_hdfs bash
    % cd /hdfs/server:port/
    % ls -la
    % cat foo

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2011 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

  • Cooperative Computing Tools Documentation
  • Parrot User Manual
  • parrot_run(1)
  • parrot_run_hdfs(1)
  • parrot_cp(1)
  • parrot_getacl(1)
  • parrot_setacl(1)
  • parrot_mkalloc(1)
  • parrot_lsalloc(1)
  • parrot_locate(1)
  • parrot_timeout(1)
  • parrot_whoami(1)
  • parrot_md5(1)

CCTools 4.1.0 released on 02/24/2014