Recent News in the CCL


Tutorial on Building Scalable Scientific Applications at XSEDE13

We will be offering a tutorial titled Building Scalable Scientific Applications using Makeflow and Work Queue as part of XSEDE 2013 in San Diego on July 22.

Monday, July 22, 2013 - Permalink


Dinesh Rajan Wins Best Talk at CCGrid 2013

Congratulations to CCL graduate student Dinesh Rajan, who won the Best Presentation Award at CCGrid 2013 for his presentation of Case Studies in Designing Elastic Applications!

Tuesday, May 21, 2013 - Permalink


Tutorial on Makeflow and Work Queue at CCGrid 2013

Dinesh Rajan will present a tutorial on Building Elastic Applications with Makeflow and Work Queue as part of CCGrid 2013 in Delft, the Netherlands on May 13th. Come join us and learn how to write applications that scale up to hundreds or thousands of nodes running on clusters, clouds, and grids.

Friday, March 22, 2013 - Permalink


Elastic Apps Paper at CCGrid 2013

Dinesh Rajan will present his paper Case Studies in Designing Elastic Applications at the IEEE International Conference on Clusters, Clouds, and Grids (CCGrid) in Delft, the Netherlands. This work was done in collaboration with Andrew Thrasher and Scott Emrich from the Notre Dame Bioinformatics Lab, and Badi Abdul-Wahid and Jesus Izaguirre from the Laboratory for Computational Life Sciences.

The paper describes our experience in designing three different elastic applications -- E-MAKER, Elastic Replica Exchange, and Folding at Work -- that run on hundreds to thousands of cores using the Work Queue framework. The paper offers six guidelines for designing similar applications:

  1. Abolish shared writes.
  2. Keep your software close and your dependencies closer.
  3. Synchronize two, you make company; synchronize three, you make a crowd.
  4. Make tasks of a feather flock together.
  5. Seek simplicity, and gain power.
  6. Build a model before scaling new heights.

Friday, March 22, 2013 - Permalink


Genome Assembly Paper in IEEE TPDS

A recent article in IEEE Transactions on Parallel and Distributed Systems describes our work in collaboration with the Notre Dame Bioinformatics Laboratory on SAND - The Scalable Assembler at Notre Dame.

In this article, we describe how to refactor the standard Celera genome assembly pipeline into a scalable computation that runs on thousands of distributed cores using Work Queue. By explicitly handling the data dependencies between tasks, we are able to significantly improve runtime over Celera on a standard cluster. In addition, this technique allows the user to break free of the shared filesystem and run on hundreds to thousands of nodes drawn from clusters, clouds, and grids.

Thursday, March 21, 2013 - Permalink


CCTools 3.7.0 Released!

The Cooperative Computing Lab is pleased to announce the release of version 3.7.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

The software may be downloaded here.

This is a minor release which adds numerous features and fixes several bugs:

  • [WorkQueue] It is now possible to specify chunks (pieces) of an input file to be used as input for worker tasks. [Dinesh Rajan]
  • [Chirp] File extended attributes are now supported. [Patrick Donnelly]
  • [Makeflow] New -i switch now outputs pre-execution analysis of Makeflow DAG. [Li Yu]
  • [WorkQueue/Makeflow] Support for submitting tasks to the PBS batch submission platform added. [Dinesh Rajan]
  • [Makeflow] makeflow_log_parser now ignores comments in Makeflow logs. [Andrew Thrasher]
  • [Catalog] New catalog_update which reports information to a catalog server. [Peter Bui, Dinesh Rajan]
  • [WorkQueue] Various minor tweaks made to the API. [Li Yu, Dinesh Rajan]
  • [Catalog/WorkQueue] Support added for querying workers and tasks at run-time. [Douglas Thain]
  • [WorkQueue] Many environment variables removed in favor of option manipulation API. [Li Yu]
  • [Makeflow] Deprecated -t option (capacity tolerance) removed.
  • [WorkQueue] -W (worker status) now has working_dir and current_time fields.
  • [WorkQueue] -T (task status) now reports working_dir, current_time, address_port, submit_to_queue_time, send_input_start_time, execute_cmd_start_time. [Li Yu]
  • [WorkQueue] -Q (queue status) now reports working_dir.
  • [Makeflow] Input file (dependency) renaming supported with new "->" operator. [Michael Albrecht, Ben Tovar]
  • [WorkQueue] work_queue_pool now supports a new -L option to specify a log file. [Li Yu]
  • [WorkQueue] Tasks are now killed using SIGKILL.
  • [WorkQueue] Protocol based keep-alives added to workers. [Dinesh Rajan]

Thanks go to the contributors for many minor features and bug fixes:

  • Michael Albrecht
  • Peter Bui
  • Patrick Donnelly
  • Brian Du Sell
  • Kyle Mulholland
  • Dinesh Rajan
  • Douglas Thain
  • Andrew Thrasher
  • Ben Tovar
  • Li Yu

Please send any feedback to the CCTools discussion mailing list.

Monday, February 18, 2013 - Permalink


CCTools 3.6.2 Released!

The Cooperative Computing Lab is pleased to announce the release of version 3.6.2 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

This is a bug fix release of version 3.6.1. No new features were added.

The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

Changes:

  • [WorkQueue] Corrected memory errors leading to a SEGFAULT. [Li Yu]
  • [Makeflow] Properly interpret escape codes in Makeflow files: \n, \t, etc. [Brian Du Sell]
  • [Parrot] Watchdog now properly honors minimum wait time. [Li Yu]
  • [Parrot] Reports the logical executable name for /proc/self/exe instead of the physical name. [Douglas Thain]
  • [WorkQueue] Race conditions in signal handling for workers were corrected. Tasks now have a unique process group to properly kill all task children on abort. [Dinesh Rajan, Li Yu]
  • [WorkQueue] Corrected incorrect handling of -C option where worker would not use the same catalog server as work_queue_pool. [Li Yu]
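The process-group fix mentioned above can be illustrated with a minimal Python sketch. This is not the Work Queue worker code; it only shows the general POSIX technique of starting a task in its own process group so that the task and all of its children can be killed together. The function names `run_task_in_own_group` and `abort_task` are hypothetical, and the choice of SIGKILL is an assumption made for the example.

```python
import os
import signal
import subprocess
import time


def run_task_in_own_group(command):
    """Start a shell command as the leader of a new session and process group.

    With start_new_session=True the child calls setsid(), so its process
    group id equals its pid and every child it spawns inherits that group.
    """
    return subprocess.Popen(command, shell=True, start_new_session=True)


def abort_task(proc):
    """Kill the task and every descendant still in its process group."""
    try:
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # the whole group has already exited
    proc.wait()  # reap the group leader so it does not linger as a zombie


if __name__ == "__main__":
    # The shell starts two background children; all three share one group.
    task = run_task_in_own_group("sleep 60 & sleep 60 & wait")
    time.sleep(0.5)
    abort_task(task)
    print("task exit status:", task.returncode)
```

Signalling the group rather than the single pid is what reaches the background children; a plain `proc.kill()` would leave them running.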

Thanks go to the contributors for this release: Patrick Donnelly, Brian Du Sell, Dinesh Rajan, Douglas Thain, and Li Yu.

Enjoy!

Monday, February 11, 2013 - Permalink


CCTools 3.6.1 Released!

The Cooperative Computing Lab is pleased to announce the release of version 3.6.1 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

This is a bug fix release of version 3.6.0. No new features were added.

The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

Changes:

  • [Work Queue] Fixes bugs that resulted in a cancelled task becoming a zombie. [Dinesh Rajan]
  • [Makeflow] Various corrections to Makeflow manual and html documentation. [Li Yu]
  • [Makeflow] -I and -O options now correctly output file list to stdout. [Li Yu]
  • [*] Added missing debug flag for ldflags in configure. [Douglas Thain]
  • [Work Queue] Now correctly removes directories during cleanup. [Dinesh Rajan]
  • [Chirp] -b is now documented in the man/-h page. [Patrick Donnelly]
  • [Sand] Fixed a wrong error message. [Peter Bui, Li Yu]
  • [Catalog Server] -T option now properly accepts an argument. [Patrick Donnelly]
  • [*] Fixed a bug where the wrong version of perl was used in configure. [Dinesh Rajan]

Thanks go to the contributors for this release: Dinesh Rajan, Patrick Donnelly, Peter Bui, Li Yu, and Douglas Thain.

Enjoy!

Friday, November 02, 2012 - Permalink


NSF Grant: Data and Software Preservation for Open Science

Mike Hildreth, Professor of Physics, Jarek Nabrzyski, Director of the Center for Research Computing and Concurrent Associate Professor of Computer Science and Engineering, and Douglas Thain, Associate Professor of Computer Science and Engineering, are the lead investigators on a project that will explore solutions to the problems of preserving data, analysis software, and how these relate to results obtained from the analysis of large datasets.

Known as Data and Software Preservation for Open Science (DASPOS), it is focused on High Energy Physics data from the Large Hadron Collider (LHC) and the Fermilab Tevatron. The group will also survey and incorporate the preservation needs of other communities, such as Astrophysics and Bioinformatics, where large datasets and the derived results are becoming the core of emerging science in these disciplines.

The three-year $1.8M program, funded by the National Science Foundation, will include several international workshops and the design of a prototype data and software-preservation architecture that meets the functionality needed by the scientific disciplines. What is learned from building this prototype will inform the design and construction of the global data and software-preservation infrastructure for the LHC, and potentially for other disciplines.

The multi-disciplinary DASPOS team includes particle physicists, computer scientists, and digital librarians from Notre Dame, the University of Chicago, the University of Illinois Urbana-Champaign, the University of Nebraska at Lincoln, New York University, and the University of Washington, Seattle.

Thursday, October 04, 2012 - Permalink


Tutorial on Scalable Programming at Notre Dame

Tutorial: Introduction to Scalable Programming with Makeflow and Work Queue
October 24th, 3-5PM, 303 Cushing Hall

Register here (no fee) to reserve your spot in the class:
http://www.nd.edu/~ccl/software/tutorials/ndtut12

Would you like to learn how to write programs that can scale up to hundreds or thousands of machines?

This tutorial will provide an introduction to writing scalable programs using Makeflow and Work Queue. These tools are used at Notre Dame and around the world to attack large problems in fields such as biology, chemistry, data mining, economics, physics, and more. Using these tools, you will be able to write programs that can scale up to hundreds or thousands of machines drawn from clusters, clouds, and grids.

This tutorial is appropriate for new graduate students, undergraduate researchers, and research staff involved in computing in any department on campus. Some familiarity with Unix and the ability to program in Python, Perl, or C is required.

The class will consist of half lecture and half hands-on instruction in a computer-equipped classroom. The instructors are Dinesh Rajan and Michael Albrecht, developers of the software who are PhD students in the CSE department.

For questions about the tutorial, contact Dinesh Rajan, dpandiar AT nd.edu.

Wednesday, October 03, 2012 - Permalink


CCTools 3.6.0 Released!

The Cooperative Computing Lab is pleased to announce the release of version 3.6.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

This is a minor release which adds numerous features and fixes several bugs:

  • [WQ] Added API for logging functionality. [Christopher Bauschka]
  • [WQ] Python bindings have more complete access to the API available from C. Documentation has also been improved. [Dinesh Rajan]
  • [WQ] No longer manually redirects stdin/stdout/stderr by editing the user-provided shell string; it now sets file descriptors directly. User redirections are no longer overridden. [Patrick Donnelly]
  • [WQ, Makeflow] The torque batch submission system is now supported. [Michael Albrecht, Douglas Thain]
  • [Parrot] Now supports extended attributes. [Patrick Donnelly]
  • [Makeflow] Now supports garbage collection of intermediate files. [Peter Bui]
  • [Makeflow] Now supports lexical scoping of Makeflow variables. [Peter Bui]
  • [Makeflow] New MAKEFLOW keyword for recursive Makeflows. [Peter Bui]
  • [WQ] Bindings for WQ now support SWIG versions >= 1.3.29. [Peter Bui]
  • [Parrot] iRODS now supports putfile/getfile operations for much faster file copies. [Douglas Thain]
  • [Parrot] Now includes watchdog support that runs alongside Parrot. [Douglas Thain, Brian Bockelman]
  • [*] CCTools programs now print version information with the -v option. Version information is also included in debug output with the `-d debug' flag. [Patrick Donnelly]
  • [WQ] work_queue_status output has been cosmetically improved. [Douglas Thain]
  • [WQ] New $WORK_QUEUE_SANDBOX environment variable. [Dinesh Rajan]
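The redirection change in the list above (setting file descriptors directly instead of rewriting the user's shell string) can be sketched in Python. This is only an illustration of the general idea, not the Work Queue implementation; the helper name `run_with_captured_output` and its parameters are hypothetical.

```python
import subprocess


def run_with_captured_output(command, log_path, cwd=None):
    """Run a shell command with stdout/stderr attached to a log file.

    The command string is passed through untouched. Output is captured by
    handing the child an open file descriptor, not by appending
    "> file 2>&1" to the string, so any redirection the user wrote inside
    the command still takes effect for that part of the command.
    """
    with open(log_path, "wb") as log:
        return subprocess.call(
            command,
            shell=True,
            cwd=cwd,
            stdin=subprocess.DEVNULL,   # the task gets no interactive input
            stdout=log,                 # default destination for stdout
            stderr=subprocess.STDOUT,   # stderr follows stdout into the log
        )
```

For example, running `echo captured; echo kept > user.txt` through this helper puts `captured` in the log file while `kept` still goes to `user.txt`, because the user's own redirection is applied by the shell after the defaults are set.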

Thanks go to the contributors for many minor features and bug fixes:

  • Michael Albrecht
  • Christopher Bauschka
  • Brian Bockelman
  • Dan Bradley
  • Peter Bui
  • Iheanyi Ekechukwu
  • Patrick Donnelly
  • Dinesh Rajan
  • Douglas Thain

Please send any feedback to the CCTools discussion mailing list. Enjoy!

Wednesday, September 19, 2012 - Permalink


Papers at e-Science Conference

Members of the CCL will present two papers and two posters at the upcoming IEEE Conference on e-Science in Chicago:

Tuesday, September 18, 2012 - Permalink


Lecture and Tutorial: Univ. of Arizona

We are doing a guest lecture and tutorial titled Building Scalable Data Intensive Applications with Makeflow and Work Queue at the University of Arizona as part of the Applied CI Concepts class on September 11 and 13, 2012.

Thursday, September 06, 2012 - Permalink


Tutorial at Cloud Summer School

We will be offering a tutorial titled Building Scalable Data Intensive Applications on the Cloud with Makeflow and Work Queue as part of the Science Cloud Summer School hosted by Indiana University and joined by universities around the country.

Tuesday, July 31, 2012 - Permalink


Talk at ICE Workshop

Prof. Thain gave a talk titled Computational Abstractions: Strategies for Scaling Up Applications at the Initiative for Computational Economics at the University of Chicago.

Friday, July 27, 2012 - Permalink


CCTools 3.5.2 Released

The Cooperative Computing Lab is pleased to announce the release of version 3.5.2 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

This is a bug fix release of version 3.5.1. A shell script executable has been added for Torque worker compatibility.

The software may be downloaded here.

Changes:

  • [WQ] Improved some debug messages. [Dinesh Rajan]
  • [AP] Fixed minor bug for dealing with comparison commands that produce no output. [Douglas Thain]
  • [WQ] Fixed a bug where the stdout buffer was not reset at the beginning of every task. [Douglas Thain]
  • [WQ] Documented -C option for work_queue_status. [Patrick Donnelly]
  • [WQ] Fixed a bug where pool configurations with an absolute path would result in a segfault. [Patrick Donnelly]
  • [Parrot] Fixed a bug where Parrot mistakenly thought it correctly wrote to memory using /proc/pid/mem. [Patrick Donnelly]
  • [*] Fixed a bug on OSX where non-blocking connects would result in an infinite loop. [Douglas Thain]
  • [*] Support for SWIG 1.3.29 added. [Peter Bui]
  • [*] Support has been added for workers using Torque. [Michael Albrecht, Douglas Thain]
  • [Makeflow] Fixed option parsing. [Patrick Donnelly]

Tuesday, July 24, 2012 - Permalink


CCTools 3.5.1 Released

The Cooperative Computing Lab is pleased to announce the release of version 3.5.1 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

This is a bug fix release of version 3.5.0. No new features were added.

The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

Changes:

  • Fixed a file descriptor leak in WorkQueue. [Joe Fetsch, Dinesh Rajan, Patrick Donnelly]
  • Better detection of fast memory access in Linux kernels >= 3.0. [Michael Hanke, Douglas Thain]

Thursday, June 28, 2012 - Permalink


CCTools 3.5.0 Released

The Cooperative Computing Lab is pleased to announce the release of version 3.5.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

The software may be downloaded at http://www.nd.edu/~ccl/software/download.

This is a minor release which adds numerous features and fixes several bugs:

  • Batch Job Hadoop module for submitting Hadoop jobs supports Hadoop 0.21.0. [Michael Albrecht, Patrick Donnelly]
  • Improvements to WorkQueue for worker accounting and management. WorkQueue now does master capacity estimation and automatic worker removal. [Li Yu]
  • WorkQueue now supports cancelling of a submitted task. [Dinesh Rajan]
  • WorkQueue workers can now report task execution time. [Li Yu]
  • Improved Batch Job local execution to fork and exec instead of fork and system.
  • Swig detection improved. [Peter Bui]
  • Improved and made consistent time formats for catalog server. [Li Yu]
  • Various corrections to Parrot's directory handling. [Douglas Thain]
  • Corrected numerous memory leaks and software bugs using Valgrind/cppcheck. [Peter Bui]
  • WorkQueue workers now check for low disk space. [Dinesh Rajan]
  • Parrot now supports writable memory mapped files. [Douglas Thain]
  • WorkQueue MOAB support improved. [Peter Bui, Michael Albrecht]
  • WorkQueue now has prototype support for work_queue_pool resource management of multiple masters. work_queue_pool is now capable of automatically requesting resources from the underlying batch system as needed by the masters subject to a constraint file. [Li Yu]
  • WorkQueue now supports FIFO and LIFO task dispatch to workers. [Dinesh Rajan]
  • WorkQueue now has work_queue_version to differentiate versions of the library. [Peter Bui]
  • Chirp client status output is now properly sent to stderr. [Patrick Donnelly]
  • WorkQueue taskid assignment moved to submit from create. Submit now returns this unique id. [Dinesh Rajan]
  • Makeflow/WorkQueue/Chirp now support selecting an arbitrary port in a range using environment variables TCP_LOW_PORT and TCP_HIGH_PORT. [Patrick Donnelly]
  • Improved debug output for non-blocking tcp connections. [Li Yu]
  • WorkQueue task status is now appropriately set to complete when tasks are moved to complete list. [Dinesh Rajan]
  • Parrot now supports iRODS version 3.1. [Douglas Thain]
  • Parrot now allows an identity-boxed process to write to a world-writable file. (such as /dev/null) [Douglas Thain]
  • WorkQueue workers now have a tunable exponential backoff for reconnecting to masters. [Dinesh Rajan]
  • Updated WorkQueue documentation and examples. [Dinesh Rajan]
  • Various improvements to WorkQueue Python binding. [Peter Bui, Dinesh Rajan]
  • Numerous API/code improvements made to WorkQueue. [Li Yu, Dinesh Rajan, Douglas Thain]
  • Various compatibility improvements for building CCTools. [Douglas Thain, Patrick Donnelly]

Thanks go to the contributors of other minor/bug fix corrections: Michael Albrecht, Roger Barthelson, Dan Bradley, Peter Bui, Rory Carmichael, Patrick Donnelly, Michael Hanke, Dinesh Rajan, Nathan Regola, Douglas Thain, and Li Yu.
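As a rough illustration of the port-range item above, the following Python sketch binds a listening socket to the first free port between TCP_LOW_PORT and TCP_HIGH_PORT. Only the two environment variable names come from the release notes. The function name `bind_in_range`, the fallback range, and the first-free-port search order are assumptions made for the example and may differ from what CCTools does.

```python
import os
import socket


def bind_in_range(host="", default_low=1024, default_high=32767):
    """Bind a listening TCP socket to the first free port in a range.

    The range is read from TCP_LOW_PORT and TCP_HIGH_PORT; the defaults
    used when they are unset are hypothetical values for this sketch.
    Returns the socket and the port that was chosen.
    """
    low = int(os.environ.get("TCP_LOW_PORT", default_low))
    high = int(os.environ.get("TCP_HIGH_PORT", default_high))
    if low > high:
        raise ValueError("TCP_LOW_PORT must not exceed TCP_HIGH_PORT")

    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port is taken or not permitted; try the next one
            continue
        sock.listen(5)
        return sock, port

    raise OSError("no free port between %d and %d" % (low, high))
```

A restricted range like this is typically useful when a firewall only admits traffic on a known block of ports.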

Enjoy!

Monday, June 11, 2012 - Permalink


Ph.D. Defense: Peter Bui

Congratulations to Dr. Peter Bui, who successfully defended his dissertation titled "A Compiler Toolchain for Distributed Data Intensive Scientific Workflows"!

Thursday, June 07, 2012 - Permalink


Ph.D. Defense: Hoang Bui

Congratulations to Dr. Hoang Bui, who successfully defended his dissertation titled A Rich Metadata Filesystem for Scientific Data!

Thursday, May 24, 2012 - Permalink


Makeflow Paper at SWEET

Michael Albrecht will present our paper Makeflow: A Portable Abstraction for Data Intensive Computing on Clusters, Clouds, and Grids at the workshop on Scalable Workflow Enactment Engines and Technologies (SWEET), held with the SIGMOD conference.

This paper gives an overview of the Makeflow workflow engine, and presents a technique for evaluating the performance of workflows across multiple execution systems including Condor, SGE, Hadoop, and Work Queue. The software is currently available for download, and used by a growing open source community.

Thursday, May 17, 2012 - Permalink


Chirp Paper at CCGrid

Patrick Donnelly is presenting his most recent paper, Fine-Grained Access Control in the Chirp Distributed File System, at the IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing in Ottawa this week.

This paper describes a lightweight authentication technique that allows jobs submitted to a batch system to access only the exact files that they need from a shared file server. The capability is integrated into the Chirp distributed filesystem which is currently available for download.

Thursday, May 17, 2012 - Permalink


CCTools 3.4.3 Released

The Cooperative Computing Lab is pleased to announce the release of version 3.4.3 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

This is a bug fix release of version 3.4.2. No new features were added.

The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

Changes:

  • Fixed WQ Master contacting catalog server too frequently. [Li Yu]
  • Fixed bug in split_fasta that would omit the first id in each fasta split. [Rory Carmichael]
  • Fixed local execution timeouts for batch job. [Li Yu]
  • Fixed an issue with getdents on large directories in Parrot. [Douglas Thain]
  • Fixed an issue where a worker could use up all the disk space on a machine when accepting an incoming file. WQ workers now ensure that a configurable amount of space is available before accepting the file. [Dinesh Rajan]
  • Improved debug output for removed workers in WQ. [Li Yu]
  • Fixed an issue in Parrot with opening the current or parent directory without the O_DIRECTORY flag. [Douglas Thain, Dan Bradley]
  • Fixed a memory leak for Condor Batch Job. [Peter Bui]
  • Fixed a memory leak in WQ. [Peter Bui]
  • Added support for writing memory mapped files. [Douglas Thain]
  • Various fixes to configure and Make. [Douglas Thain, Peter Bui, Michael Hanke]
  • WQ Python binding corrected to have catalog False by default to match C API. [Peter Bui]
  • Added escapes from the trimming stages of Celera when SAND is selected as the overlapper. These now match the places where Celera escapes for the UMD overlapper. [Andrew Thrasher]

Thanks go to the contributors for this release: Dinesh Rajan, Patrick Donnelly, Peter Bui, Li Yu, Douglas Thain, Andrew Thrasher, Dan Bradley, and Michael Hanke.
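The disk-space fix described in this release can be pictured with a short Python sketch: before accepting an incoming file, check that a reserve of free space would remain afterwards. This is an illustration under assumptions, not the worker's actual logic. The function name `has_room_for` and the 100 MB default reserve are hypothetical.

```python
import shutil


def has_room_for(incoming_bytes, directory=".", reserve_bytes=100 * 1024 * 1024):
    """Return True if accepting a file of incoming_bytes into directory
    would still leave at least reserve_bytes of free space.

    reserve_bytes stands in for the "configurable amount of space"; its
    default here is an arbitrary value chosen for the example.
    """
    free = shutil.disk_usage(directory).free
    return free - incoming_bytes >= reserve_bytes
```

A receiver would call this with the announced file size and refuse the transfer when it returns False, rather than discovering the full disk halfway through the write.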

    Enjoy!

    Monday, April 30, 2012 - Permalink


    CCL Workshop June 11-12

    The first annual CCL workshop will be held June 11-12th, 2012 on the campus of the University of Notre Dame. The theme of this year's workshop is "Scalable Software for Scientific Computing". This workshop is an opportunity to learn more about scalable software from the CCL and other related projects, see how others are applying it to advance their research, and to provide some input into the direction of our research and software development. The workshop will be of interest to both students and faculty involved in either the scientific objectives or the software tools of scalable science. An initial list of speakers is available, but we also invite proposals for research talks (20 minutes) or short reports (5 minutes) highlighting recent accomplishments. For more information: http://www.nd.edu/~ccl/workshop/2012

    Wednesday, April 11, 2012 - Permalink


    CCTools 3.4.2 Released

    The Cooperative Computing Lab is pleased to announce the release of version 3.4.2 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.

    The software may be downloaded here:
    http://www.cse.nd.edu/~ccl/software/download

    This is a minor release which fixes several bugs and adds minor features:

  • WorkQueue now does string interpolation on task input files. Currently, only the $OS and $ARCH variables are replaced. [Dinesh Rajan]
  • WorkQueue has a new work_queue_name API to get the project name. [Peter Bui]
  • Parrot now supports the faccessat system call. [Douglas Thain]
  • Support added for catalog server, project name and shared workers to the SAND master executables. [Andrew Thrasher]
  • New daemon option for chirp_server and catalog_server. [Patrick Donnelly]
  • New FUSE mount option support for chirp_fuse. [Patrick Donnelly]
  • New support for CVMFS filesystem in Parrot. [Dan Bradley, Douglas Thain]
  • Added two new versions of Elastic replica exchange under apps/. The replica_exchange_protomol_nobarrier is an optimized (faster) version of the original implementation (now renamed replica_exchange_protomol_barrier along with some modifications). The documentation (.m4) has been updated accordingly. [Dinesh Rajan]
  • Buffer overflow corrected in some regular expressions. [Peter Bui and Li Yu]
  • WorkQueue now gracefully handles more than 1024 workers by moving from select to poll. [Peter Bui and Li Yu]
  • Example Bioinformatics Makeflow applications for BLAST. [Rory Carmichael]
  • SAND Support for Celera 6.x and 7.x. [Andrew Thrasher]
  • Corrected an issue where Makeflow would crash with very long commands. [Patrick Donnelly]

    Thanks go to the contributors of other minor/bug fix corrections: Dinesh Rajan, Patrick Donnelly, Peter Bui, Li Yu, Douglas Thain, Andrew Thrasher, Csaba Kos, and Dan Bradley.
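The move from select to poll noted in this release matters because select() works on fixed-size descriptor sets (commonly limited to 1024 descriptors), while poll() takes a list of descriptors of any length. The Python sketch below shows the poll-based pattern in general terms; it is not the Work Queue master loop, and the function name `wait_for_readable` is hypothetical. It relies on `select.poll`, which is available on Unix-like systems but not on Windows.

```python
import select
import socket


def wait_for_readable(sockets, timeout_ms=1000):
    """Return the sockets that have data ready to read, using poll().

    poll() is handed an explicit list of registered descriptors, so the
    number of connections is not capped by the size of an fd_set the way
    it is with select().
    """
    poller = select.poll()
    by_fd = {}
    for sock in sockets:
        poller.register(sock, select.POLLIN)
        by_fd[sock.fileno()] = sock

    ready = []
    for fd, event in poller.poll(timeout_ms):
        if event & select.POLLIN:
            ready.append(by_fd[fd])
    return ready
```

A master process would call this repeatedly with its full list of worker connections and then read from each socket that is returned.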

    Tuesday, February 21, 2012 - Permalink


    Talk: CS Problems in Distributed Computing

    Prof. Thain gave a talk titled Unsolved Computer Science Problems in Distributed Computing at Grid Computing: The Next Decade in Zakopane, Poland.

    Tuesday, January 10, 2012 - Permalink


    CCTools 3.4.1 Released

    We are pleased to announce the release of version 3.4.1 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software.

    The software may be downloaded here:
    http://www.cse.nd.edu/~ccl/software/download

    This is a minor release which fixes several bugs:

  • Parrot-S3 support has been updated to the current Amazon address.
  • Chirp-FUSE (and other chirp tools) now correctly support Chirp ticket authentication.
  • Work Queue now permits renaming of directories during transfer.
  • Some missing header files are now installed correctly.
  • Manuals have been updated to avoid triggering a bug in gnu m4-1.4.6.

    Thanks to Michael Albrecht, Patrick Donnelly, Dinesh Pandiar, and Peter Bui, who contributed to this release.

    Monday, November 14, 2011 - Permalink


    Scientific Workflow Management Course

    Michael Albrecht will be teaching CSE 60145: Scientific Workflow Management in the spring of 2012.

    The goal of this course is to cover the tools and techniques necessary to manage large-scale scientific workflows, with an emphasis on the systems available for use at Notre Dame through the Cooperative Computing Lab and the Center for Research Computing. Students will be introduced to the difficulties involved in managing large datasets and complex workflows, as well as the methods frequently used to ameliorate them. This course is designed for graduate students from any college or discipline who deal with large and/or complex workflows (we currently work with fields ranging from computer vision to molecular biology to economics).

    Thursday, November 10, 2011 - Permalink


    Paper at PyHPC Workshop

    Peter Bui will be presenting Work Queue + Python: A Framework For Scalable Scientific Ensemble Applications at the Workshop on Python for High Performance and Scientific Computing at Supercomputing 2011.

    This paper describes how we have combined Python with the Work Queue framework to construct several scalable applications in collaboration with the Laboratory for Computational Life Sciences at Notre Dame. The software described here is also incorporated into the latest release of the CCTools software.

    Monday, November 07, 2011 - Permalink


    Talk at UAB

    Prof. Thain gave a talk titled High Throughput Scientific Computing with Condor: Computer Science Challenges in Large Scale Parallelism at the University of Alabama at Birmingham on October 27th, 2011.

    Thursday, October 27, 2011 - Permalink


    CCTools 3.4.0 Released

    We are pleased to announce the release of version 3.4.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software.

    The software may be downloaded here:
    http://www.cse.nd.edu/~ccl/software/download

    New features and improvements:

  • New comprehensive documentation and man pages for all commands.
  • New support for python and perl bindings to the Work Queue system.
  • New support for running work queue applications directly on EC2.
  • New example application of replica exchange using Work Queue.
  • Parrot-XRootD support is now statically compiled in the binary release.
  • Improved scalability of ticket-based authentication in the Chirp server.
  • Improved support for HDFS as storage driver in the Chirp server.
  • Improved Chirp server architecture uses multiple processes for robustness.
  • Improved build system now handles portability across multiple operating systems (linux, solaris, macos, freebsd, cygwin) and architectures (x86, x86_64, ia64, ppc).

    Bug fixes:

  • Fixed bug in Chirp chdir() that was seen as a No such directory error when using FUSE.
  • Fixed bug in Chirp tickets triggered by variable output from openssl.
  • Fixed bug in Parrot relating to poll(), which would result in long timeouts when using python or mpich.
  • Fixed bug in Parrot relating to tc{get/set}pgrp(), which would result in no prompt displayed in interactive root.
  • Fixed bug in the catalog server that would result in a crash when under heavy load.

    Many members of the CCL team contributed to this release:

  • Michael Albrecht contributed the MPI Work Queue implementation, and generic support for batch systems with a qsub-like interface.
  • Peter Bui contributed the SWIG Perl and Python Work Queue bindings, the Starch tool, found many bugs throughout the code, and generally wrangled the build system.
  • Patrick Donnelly contributed the ticket authentication system and the Chirp-HDFS support.
  • Dinesh Pandiar contributed replica_exchange_protomol, ec2_{submit/remove}_workers, and multiple improvements to Work Queue.
  • Li Yu contributed to the Work Queue implementation.
  • The entire CCL team worked to complete the documentation.

    And we also thank:

  • Nabil Ghodbane for assisting with parrot_run and ROOT.
  • Vanessa Hamar for assisting with parrot_run and mpich.
  • Rodney Walker for assisting with the chirp_server.
  • RJ Nowling and Badi Abdul-Wahid for assisting with the Work Queue system.
  • Andrew Thrasher for assisting with the perl bindings to Work Queue.

    Sunday, October 23, 2011 - Permalink


    Paper at CloudCom 2011

    Dinesh Pandiar wrote a paper titled Converting A High Performance Application to an Elastic Cloud Application, which was accepted to the IEEE CloudCom conference, which will be in Greece in November 2011.

    This paper describes some of our recent work in converting traditional high-performance message passing (MPI) applications into a more flexible form for cloud computing. MPI is great on dedicated clusters, but isn't designed to handle failures or wide performance variations. By recasting this molecular dynamics application into our Work Queue framework, we are able to break out of traditional clusters and run codes on hundreds of cores spanning our local Condor pool, and cloud service providers such as EC2 and Azure.

    This work was started by Anthony Canino, one of our REU students from summer 2010, and done in close collaboration with Badi Abdul-Wahid and Jesus Izaguirre, who are experts in the field of molecular dynamics. Another graduate student from the CCL, Li Yu, will travel to present the paper on their behalf.

    Friday, October 21, 2011 - Permalink


    CCTools 3.3.4 Released

    We are pleased to announce the release of version 3.3.4 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software.

    The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download

    This release addresses several bugs. Users of 3.3.3 are advised to upgrade.

  • Makeflow: Fixed a race condition that would occasionally result in a crash after running a local process.
  • Chirp: Modified Unix authentication to better tolerate the use of NFS filesystems.
  • Chirp: Fixed the behavior of the access() system call, and modified the FUSE client to accommodate older servers.
  • All-Pairs: Improved initial runtime estimation.
  • Several minor fixes to accommodate oddities in RHEL 6.

    Thanks to the following people who contributed to this release: Patrick Donnelly, Peter Bui, Li Yu, and Pengqui Cheng.

    Monday, August 08, 2011 - Permalink


    CCTools 3.3.3 Released

    We are pleased to announce the release of version 3.3.3 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software. The software may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download

    This release includes the following:

  • Many enhancements to the Chirp-HDFS driver and the Chirp-FUSE module to hide several limitations of HDFS and make the system more easily deployed out of the box.
  • Added three new batch system drivers for Makeflow: Hadoop, Moab, and Work Queue with a shared filesystem.
  • Improved support for XrootD. The XrootD libraries are now entirely statically linked to Parrot and included in our binary distributions on Linux.
  • Added a timeout and retry to Unix filesystem authentication in the Chirp server, to accommodate propagation delays when used with NFS filesystems.
  • Improved configuration scripts to accommodate a greater variety of Linux distributions.
  • Updated Parrot to handle a variety of new system calls in RHEL 6.
  • Many minor bug fixes and improvements.

    Many thanks to the following people who contributed to this release: Patrick Donnelly, Michael Albrecht, Peter Bui, and Dinesh Rajan.
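
The timeout-and-retry idea mentioned in this release for Unix authentication over NFS can be illustrated with a small polling loop. This is only a sketch of the general technique, not the Chirp server's code: it assumes the check amounts to waiting for a file to become visible, and the function name wait_for_file and its parameters are hypothetical.

```python
import os
import time


def wait_for_file(path, timeout=5.0, interval=0.1):
    """Poll until `path` is visible or `timeout` seconds pass.

    Hypothetical helper: retries the existence check instead of failing
    on the first miss, so a delayed (e.g. NFS-propagated) file still counts.
    Returns True if the file appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # back off briefly before retrying
```

A caller would pick the timeout to cover the longest propagation delay it is willing to tolerate; a short interval keeps the common case fast without spinning.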

    Wednesday, July 13, 2011 - Permalink


    Posters at CCA-11

    Three graduate students from the CCL -- Dinesh Rajan, Peter Sempolinski, and Li Yu -- presented their ongoing work at the Cloud Computing and Applications workshop.

    Wednesday, April 13, 2011 - Permalink


    CCTools 3.3.0 Released

    We are pleased to announce the release of version 3.3.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software. The software may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This release includes the following:

  • Overall source packaging has been cleaned up to be more compatible with packaging standards for Linux distributions. This release has been updated to build on Windows with Cygwin. To avoid conflicts with several existing program names in Linux, the following programs have been renamed: parrot is now parrot_run, and worker is now work_queue_worker.
  • Parrot: Added support for the xrootd protocol. Improved support for the HDFS protocol, which can now be detected at run-time. Fixed bugs relating to multi-threaded programs and large numbers of processes.
  • Chirp: Improved support for HDFS as a backend storage device, which can now be detected at runtime. Chirp is now available as a protocol driver for the ROOT I/O system used in the high energy physics community.
  • Makeflow: Fixed a bug related to change detection on directories.
  • SAND: Fixed integer-size bug that would result in inconsistent results on platforms other than 64-bit Linux.
  • Work Queue: The work_queue_worker now identifies itself to a master by sharing a simple project name. The project name can also be used to locate masters via the catalog, allowing masters to migrate across your cluster, cloud, or grid. work_queue_status lists the vital statistics of all Work Queue programs that report themselves to the catalog server. work_queue_pool can maintain a constant pool of workers in your cluster, cloud, or grid. The Python module is now available for writing Work Queue programs in Python.

    Thanks to the following people who contributed to this release: Michael Albrecht, Brian Bockelman, Peter Bui, Patrick Donnelly, Dinesh Rajan, Derek Weitzel, and Li Yu.

    Tuesday, April 12, 2011 - Permalink


    Talk at IDGA Cloud Computing

    Prof. Thain gave a talk titled Models and Frameworks for Data Intensive Cloud Computing at the IDGA Cloud Computing Summit in Washington DC.

    Wednesday, February 09, 2011 - Permalink


    Scalable Assembler Released

    We are pleased to announce the release of version 3.2.0 of SAND -- the Scalable Assembler at Notre Dame.

    SAND replaces the early stages of the Celera Assembler with scalable versions that can run on collections of commodity computers. By harnessing clusters, clouds, grids, or just random machines in your office, many bioinformatics tasks can be accelerated many times over. SAND is an open source project, and we invite the community to use and improve the software.

    This release features the following improvements:

  • Improved integration with Celera 5.4; simply use the provided sand_runCA script to start the workflow.
  • A significant performance increase in both the filtering and alignment stages.

    More information about SAND is available here:
    http://www.cse.nd.edu/~ccl/software/sand

    Thanks to Scott Emrich, Andrew Thrasher, Li Yu, Christopher Moretti, and Michael Olson, who all made major contributions to the development of SAND.

    Wednesday, January 12, 2011 - Permalink


    CCTools 3.1.2 Released

    We are pleased to announce the release of version 3.1.2 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software. The software may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This release fixes a number of minor bugs:

  • Work Queue: Sending of files larger than 2GB is now supported. Tasks that fail to produce the expected output files are now returned to the caller rather than retried.
  • Parrot: An application attempting a non-blocking network connection will now properly block instead of busy-waiting.
  • Chirp Server: Fixed bug relating to construction of directory names in calls such as chirp_localpath.
  • All-Pairs: Fixed bug that would result in crash at startup.

    Thanks to Michael Albrecht, Peter Bui, Patrick Donnelly, Dinesh Rajan, and Li Yu for their contributions to this release.

    Wednesday, November 10, 2010 - Permalink


    Paper at WORKS Workshop

    Andrew Thrasher will present Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch at the Fifth Workshop on Workflows in Support of Large Scale Science held with Supercomputing 2010 in New Orleans.

    Wednesday, November 03, 2010 - Permalink


    Papers at CloudCom

    Two graduate students from the CCL will be presenting their work at the IEEE CloudCom conference in Indianapolis.
  • Peter Sempolinski will present A Comparison and Critique of Eucalyptus, OpenNebula and Nimbus, which discusses some of his experience in installing and managing each of these software systems.
  • Patrick Donnelly will present his work titled Attaching Cloud Storage to a Campus Grid Using Parrot, Chirp, and Hadoop.

    Monday, October 25, 2010 - Permalink


    Papers at HPDC Workshops

    Three graduate students from the CCL presented papers at workshops co-located with High Performance Distributed Computing this summer in Chicago:

  • Rory Carmichael presented a paper on the design and implementation of Biocompute, our bioinformatics web portal at the workshop on Emerging Methods in the Computational Life Sciences.
  • Peter Bui made the first presentation about Weaver, a high level workflow language, at the workshop on Challenges of Large Applications in Distributed Environments.
  • Hoang Bui presented his latest work on ROARS, our general purpose archival system for scientific data, at the workshop on Data Intensive Distributed Computing.
  • Hoang also presented a paper titled Toward Long-Term Data Quality in a Large Scale Biometrics Experiment at the workshop on Managing Data Quality in Collaborative Sciences.

    Monday, October 25, 2010 - Permalink


    CCTools 3.1.1 Released

    We are pleased to announce the release of version 3.1.1 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, All-Pairs, and other software. The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download.shtml

    New features in this release include:

  • Makeflow: Fixed bugs relating to batch submission to Condor on MacOS, and to non-standard SGE installations. Improved support for whole directory transfer, and translation of file names when necessary.
  • Work Queue: Added support for workers to discover masters associated with a project via the catalog. Improved system throughput when running at full capacity. Added a detailed log file and new API entry points.
  • All-Pairs: Added support for execution-time sampling, concurrency control, and integration with sequence alignment functions. Documentation improved.
  • Chirp: Improved support for HDFS, including reporting server capacity, more robust initialization, and authentication pass-through.
  • An improved test suite for all components.
  • Lots of minor improvements throughout.

    Thanks to all who contributed to this release: Michael Albrecht, Peter Bui, Anthony Canino, Patrick Donnelly, and Li Yu.

    And, thanks to Donald Barre, Colin Dewey, and Rodney Walker for bug reports.

    Monday, October 18, 2010 - Permalink


    NSF Grant on Cloud Computing

    We have received a "Computing in the Cloud" grant from the National Science Foundation to study the possibilities of running large scale applications on the Windows Azure platform. This grant will support the porting of the Cooperative Computing Tools to Windows Azure, and the development of scalable bioinformatics and molecular dynamics codes on that platform.

    Wednesday, August 25, 2010 - Permalink


    CCTools 3.1.0 Released

    We are pleased to announce the release of version 3.1.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, and other software. The software may be downloaded here: http://www.cse.nd.edu/~ccl/software/download.shtml

    New features in this release include:

  • Makeflow and Work Queue: Multiple performance and scalability improvements, including reduced CPU consumption, better management of output files, and more.
  • Parrot: New support for Amazon S3 as a remote filesystem. Multiple bug fixes and portability updates for Linux.
  • Chirp Server: New support for HDFS as a backend filesystem. This allows you to combine the capacity and scalability of HDFS with the wide-area security and accessibility provided by Chirp.
  • SAND Assembler: Improved performance, storage management, and integration with the Celera assembler.
  • Lots of little bug fixes and improvements.

    Thanks to all who contributed to this release: Michael Albrecht, Peter Bui, Anthony Canino, Patrick Donnelly, Markiyan Samborskyy, Andrew Thrasher, Rodney Walker, and Li Yu.

    Tuesday, July 20, 2010 - Permalink


    Tutorial on Makeflow and Work Queue

    Presented by Li Yu and Peter Bui

    Tuesday, June 29th, 2010, 1-3PM, room 177 Fitzpatrick Hall

    Makeflow and Work Queue are frameworks that make it easy to construct applications that can scale up from a single CPU to hundreds or thousands of cores. These fault tolerant frameworks allow you to harness idle computers around your lab as well as large scale computing clusters. At Notre Dame, they have been used to solve large problems in biometrics, data mining, economics, and genomics, and they can be applied to many other fields.

    This class will consist of one half lecture and one half hands-on tutorial with Makeflow and Work Queue. After completing the class, students will be able to write simple applications that run on tens to hundreds of cores. Comfort with the Linux command-line interface is required.

    To reserve a seat in the class, please email lyu2@nd.edu and indicate your name and home department.

    For more information, see:

  • http://www.cse.nd.edu/~ccl/software/makeflow
  • http://www.cse.nd.edu/~ccl/software/workqueue

    Tuesday, June 08, 2010 - Permalink


    Posters at CI-Days Workshop

    The Center for Research Computing at Notre Dame recently hosted an NSF sponsored "Cyberinfrastructure Days" workshop. Students from the CCL presented a variety of posters demonstrating large scale systems enabling data intensive scientific discovery. Click on the images below to see more.

    Scaling Up Bioinformatics Applications with Makeflow
    Li Yu
    Genome Assembly with 1024-Core Alignment on the Notre Dame Campus Grid
    Christopher Moretti
    The Parallel Shell
    Michael Albrecht
    ROARS: A Scalable Repository for Data Intensive Scientific Computing
    Hoang Bui
    Weaver: Simple Distributed Scientific Programming
    Peter Bui

    Saturday, May 01, 2010 - Permalink


    Ph.D. Defense: Christopher Moretti

    Congratulations to Dr. Christopher Moretti, who successfully defended his dissertation titled Abstractions for Scientific Computing on Campus Grids!

    Friday, April 30, 2010 - Permalink


    Talks at Condor Week

    Graduate students Li Yu and Peter Bui each gave talks at Condor Week in Madison, WI: Scaling Up Scientific Workflows with Makeflow, and Weaving Abstractions into Workflows.

    Friday, April 16, 2010 - Permalink


    CCTools 3.0.0 Released

    We are pleased to announce the release of version 3.0.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, SAND, and other software. The software may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    New features in this release include:

  • The Scalable Assembler at Notre Dame (SAND) is a set of portable genome assembly modules that work with the Celera assembler to dramatically improve runtimes. We have previously presented SAND in research publications.
  • The All-Pairs and Wavefront abstractions now have both multicore and Work Queue based implementations, and are fully documented and ready for use.
  • Makeflow has a number of improvements to parsing, maximum file sizes, dependency checking, and other usability issues.
  • Parrot now works around a critical bug found in Linux 2.6.22, which is widely used in Debian 5.
  • Many other small bug fixes and improvements.

    Thanks to all who contributed to this release: Michael Albrecht, Peter Bui, Matthew Farallee, Chris Moretti, Kevin Partington, and Li Yu.

    Thursday, March 18, 2010 - Permalink


    Condor Log Analyzer Updated

    The Condor Log Analyzer is a web service that provides feedback on large Condor workloads. It has recently been updated to support a wider variety of log files, and allows users to browse previous results that have been made public.

    Thursday, February 11, 2010 - Permalink


    Job Openings Updated

    See the jobs page for new openings for undergraduate researchers as well as postdocs and professionals.

    Friday, January 15, 2010 - Permalink


    CCTools 2.6.0 Released

    We are pleased to announce the release of version 2.6.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, Work Queue, and other software. The software may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    New features in this release include:

  • Support for accessing the Hadoop Distributed File System (HDFS) within Parrot. This allows you to submit arbitrary Unix jobs to your cluster, cloud, or grid, and allow them to call home and access your HDFS installation. No special privileges or kernel support is needed at the execution site.
  • A number of new flags and features in Makeflow and Work Queue, which increase scalability and reliability for large workloads.
  • An improved configuration and build system which should be portable across a wider array of platforms, with more precise linking capabilities.

    Thanks to Peter Bui, Christopher Moretti, Li Yu, and Michael Albrecht, who contributed to this release.

    Monday, December 07, 2009 - Permalink


    Two Teaching Fellowships

    Two graduate students in the CCL have received competitive teaching fellowships for the coming year:
  • Chris Moretti will be one of three fellows teaching Introduction to Engineering Systems, the foundation course for all students in the college.
  • Peter Bui received a GAANN Teaching Fellowship, and is currently teaching the Programming Challenges course.

    Tuesday, November 17, 2009 - Permalink


    Genome Assembly at MTAGS 2009

    Christopher Moretti and Michael Olson will present their most recent work on Scalable Genome Assembly at the MTAGS Workshop held at Supercomputing 2009.

    Their (unnamed) scalable assembler allows the end user to plug in a variety of custom algorithms for the computationally intensive phase of sequence alignment, using the Work Queue software to manage a workforce of hundreds of computers harnessed via Condor.

    Our largest run so far used over 1000 nodes at three different institutions, reducing the time to perform alignments from over nine days to less than one hour.

    Friday, October 30, 2009 - Permalink


    CCTools 2.5.5 Released

    We are pleased to announce release 2.5.5 of the Cooperative Computing Tools, including Parrot, Chirp, Work Queue, Makeflow, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This release includes a number of bug fixes, particularly relating to handling of symbolic links in Parrot and Chirp. Thanks to Yushu Yao, Francesco Prelz, Peter Bui, Hoang Bui, Michael Albrecht, and Christopher Moretti for their contributions to this release.

    Wednesday, October 21, 2009 - Permalink


    Energy Management at IEEE Grid

    Recent graduate Michael Lammie presented his work on managing energy in multicore clusters at the IEEE Grid conference in Banff, Canada. His paper, titled Scheduling Grid Workloads on Multicore Clusters to Minimize Energy and Maximize Performance, describes how to reduce the energy consumed by a large multicore cluster through the careful application of node scaling, voltage scaling, and job assignment. This work was done in collaboration with Paul Brenner in the Notre Dame Center for Research Computing.

    Tuesday, October 20, 2009 - Permalink


    Ph.D. Defense: Kyle Wheeler

    Congratulations to Dr. Kyle Wheeler, who successfully defended his dissertation titled Exploiting Shared Memory Topology with QThreads for Portable Parallel Performance. Dr. Wheeler will shortly take a postdoctoral position at Sandia National Labs.

    Monday, September 28, 2009 - Permalink


    Talk at Clemson University

    Prof. Thain gave a guest lecture at Clemson University titled Scaling up Data Intensive Science to Campus Grids.

    Saturday, September 26, 2009 - Permalink


    Talk at GeoClouds Workshop

    Prof. Thain gave the opening talk, Science in the Clouds: History, Challenges, and Opportunities, at the GeoClouds Workshop in Indianapolis.

    Thursday, September 17, 2009 - Permalink


    NSF Grant to Support Open Source Engineering

    A team of researchers at the University of Notre Dame has received a $1.4M grant from the National Science Foundation titled Open Sourcing the Design of Civil Infrastructure. This project will create a virtual organization and online collaborative facility that will enable new ways of designing and evaluating civil infrastructure by applying concepts from the open source software community. The faculty team leading the project consists of civil engineers Dr. Tracy Kijewski-Correa and Dr. Ahsan Kareem, computer scientists Dr. Greg Madey and Dr. Douglas Thain, and social scientist David Hachen.

    Monday, September 14, 2009 - Permalink


    NSF Grant to Build Collaborative Storage

    We have received a Collaborative Research Infrastructure grant from the National Science Foundation to build a wide area testbed for data intensive computing. The Distributed Research Testbed will establish interconnected computing nodes at the Universities of Chicago, Florida, Hawaii, Notre Dame, and Mississippi. This testbed will provide an infrastructure for creating and evaluating new mechanisms for cloud and grid computing.

    Monday, September 14, 2009 - Permalink


    Talk at HEC-FSIO

    Prof. Thain gave a talk titled "Getting Beyond the Filesystem" at the NSF/DOE High End Computing File Systems and I/O Workshop in Washington, DC.

    Wednesday, August 12, 2009 - Permalink


    Ph.D. Defense: Jeffrey Hemmes

    Congratulations to Dr. Jeffrey Hemmes, who successfully defended his dissertation titled Improving Data Availability in Mobile Applications Through Enhanced Cooperative Localization. Dr. Hemmes will return to a faculty position at the Air Force Institute of Technology.

    Friday, July 31, 2009 - Permalink


    MAJ Hemmes Returns Home

    Major Jeffrey Hemmes, USAF, recently returned to the United States from duty in Iraq. He is currently a PhD candidate in the CCL, and will assume teaching duties at the Air Force Institute of Technology in the fall. Welcome home, Jeff!

    Monday, July 06, 2009 - Permalink


    CCTools 2.5.3 Released

    We are pleased to announce release 2.5.3 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This release includes two new pieces of software: the Work Queue library and the Makeflow workflow engine.

    Friday, July 03, 2009 - Permalink


    CCTools 2.5.2 Released

    We are pleased to announce release 2.5.2 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This version includes a number of small fixes to bugs in Parrot that occur when running 32-bit executables on a 64-bit machine.

    Friday, June 26, 2009 - Permalink


    Talks at HPDC 2009

    Li Yu presented our work on Harnessing Parallelism in Multicore Clusters with the All-Pairs and Wavefront Abstractions at HPDC 2009 in Munich. Prof. Thain gave the keynote talk at the associated LSAP workshop, Scaling up Data Intensive Applications to Campus Grids.

    Thursday, June 11, 2009 - Permalink


    Grid Heating Wins Green IT Award

    Paul Brenner's paper, Grid Heating Clusters: Transforming Cooling Constraints into Thermal Benefits, won a "Green IT Award" from the Uptime Institute. Read more about grid heating here.

    Wednesday, June 10, 2009 - Permalink


    CCTools 2.5.0 Released

    We are pleased to announce release 2.5.0 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This version includes a technical preview of our recently published work on abstractions for distributed computing, as well as a number of minor bug and portability fixes. Thanks to Peter Bui, Rashid Mehdi, Chris Moretti, Francesco Prelz, and Li Yu for their contributions.

    Friday, June 05, 2009 - Permalink


    BXGrid Article in JCC

    Our article on the Biometrics Research Grid, Experience with BXGrid: A Data Repository and Computing Grid for Biometrics Research, has been accepted to the Journal of Cluster Computing in a special issue on e-Science topics.

    Monday, June 01, 2009 - Permalink


    Parrot Flies on the LHC Computing Grid

    In a paper presented at CHEP 2009, a group of physicists describes how Parrot is used to distribute a large software package hosted at Fermilab in the United States to thousands of CPUs harnessed via the LHC Computing Grid across Europe.

    Friday, May 15, 2009 - Permalink


    Honors Defense: Patrick Braga-Henebry

    Patrick Braga-Henebry successfully defended his B.S. honors thesis titled "Biocompute: Providing a Distributed Computing Model for Searching Genome Datasets." The Biocompute facility that Patrick constructed is used to carry out data intensive genome queries, parallelized across a 64-core cluster. Congratulations, Patrick!

    Thursday, May 07, 2009 - Permalink


    Presentations at Condor Week 2009

    Chris Moretti and Hoang Bui gave presentations at Condor Week in Madison. Chris presented Abstractions for Data Intensive Computing on Condor and Hoang presented BXGrid: A Data Repository and Computing Grid for Biometrics Research.

    Wednesday, April 22, 2009 - Permalink


    Multicore Abstractions at HPDC 2009

    A paper by Li Yu and Christopher Moretti on our newest developments in distributed computing with abstractions has been accepted to HPDC 2009 in Munich. Harnessing Parallelism in Multicore Clusters with the All-Pairs and Wavefront Abstractions describes how to extend two abstractions to distributed systems that consist of multicore computers. With our collaborators Dr. Scott Emrich at Notre Dame and Dr. Kenneth Judd at Stanford, we demonstrate applications of the Wavefront abstraction to problems in bioinformatics and genomics.
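
For readers unfamiliar with the two abstractions, the following is a minimal single-machine sketch, not the CCL implementation. It assumes All-Pairs means computing F(a, b) for every pair drawn from two sets, and Wavefront means filling a matrix where each cell depends on its upper, left, and diagonal neighbors. The function names, argument order, and use of a thread pool are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def all_pairs(set_a, set_b, func, workers=4):
    """Return a matrix m with m[i][j] = func(set_a[i], set_b[j]).

    Every cell is independent, so the calls are handed to a thread pool.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = [pool.map(lambda b, a=a: func(a, b), set_b) for a in set_a]
        return [list(row) for row in rows]


def wavefront(first_row, first_col, func):
    """Fill m where m[i][j] = func(m[i-1][j], m[i][j-1], m[i-1][j-1]).

    first_row and first_col supply the boundary; first_row[0] is assumed
    to equal first_col[0]. Cells on the same anti-diagonal do not depend
    on each other, which is where a parallel version would find its work.
    """
    n_rows, n_cols = len(first_col), len(first_row)
    m = [[None] * n_cols for _ in range(n_rows)]
    m[0] = list(first_row)
    for i in range(n_rows):
        m[i][0] = first_col[i]
    for d in range(2, n_rows + n_cols - 1):  # d = i + j along one diagonal
        for i in range(max(1, d - n_cols + 1), min(n_rows, d)):
            j = d - i
            m[i][j] = func(m[i - 1][j], m[i][j - 1], m[i - 1][j - 1])
    return m
```

The wavefront loop here runs sequentially; it walks the matrix diagonal by diagonal only to make the dependency structure visible.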

    Wednesday, April 01, 2009 - Permalink


    Article on All-Pairs in TPDS

    Our most recent article on All-Pairs has been accepted to the IEEE Transactions on Parallel and Distributed Systems. This article presents new developments in data distribution, output management using really large matrices (60k by 60k elements), and a record breaking biometrics experiment.

    Wednesday, April 01, 2009 - Permalink


    Chirp on the Blue Gene/P at Supercomputing

    In a recent paper at IEEE/ACM Supercomputing, researchers at Argonne National Lab deployed our Chirp filesystem on hundreds of intermediate nodes to support applications running on tens of thousands of processors.

    Tuesday, February 17, 2009 - Permalink


    BXGrid Featured in ISGTW

    Our work on the Biometrics Research Grid (BXGrid) was the feature story in this week's issue of International Science Grid This Week.

    Thursday, February 12, 2009 - Permalink


    BXGrid at IEEE e-Science 2008

    At the IEEE e-Science conference held in Indianapolis in December 2008, Hoang Bui presented this poster on BXGrid, the Biometrics Research Grid. Prof. Thain gave a talk on Using Small Abstractions to Program Large Distributed Systems (and multicore computers) at the Workshop on Distributed Programming Abstractions.

    Monday, January 05, 2009 - Permalink


    CCL in the Indiana Diagrid

    Our 600-CPU Condor pool at Notre Dame forms a small part of the Indiana statewide DiaGrid, which exploits about twenty thousand CPUs all managed by the Condor distributed computing software. Here is more information about Condor at Notre Dame.

    Monday, January 05, 2009 - Permalink


    CCTools Release 2.4.6

    We are pleased to announce release 2.4.6 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    This version rolls up a number of minor bug fixes in Parrot and Chirp. Thanks to Ismail Ataturk, Karen Hollingsworth, Nathan Regola, and Yushu Yao for their contributions.

    Friday, October 31, 2008 - Permalink


    Abstractions at CCA08

    Prof. Thain gave a talk titled "Programming Distributed Systems with High Level Abstractions" at the Cloud Computing and Applications Workshop held at the University of Chicago on October 23.

    Sunday, October 26, 2008 - Permalink


    ENAVis at LISA 2008

    Qi Liao will present a paper on ENAVis, a dynamic visualization of user, program, and network data collected by the Lockdown enterprise system management tool.

    Thursday, October 23, 2008 - Permalink


    Abstractions for Data Mining at ICDM

    Chris Moretti and Karsten Steinhauser recently had a paper, Scaling Up Classifiers to Cloud Computers, accepted at the International Conference on Data Mining in Pisa, Italy. This paper describes a high level abstraction for running standard data mining algorithms on systems of hundreds of CPUs.

    Wednesday, October 22, 2008 - Permalink


    CCTools 2.4.4 Released

    We are pleased to announce release 2.4.4 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here:

    http://www.cse.nd.edu/~ccl/software/download.shtml

    A recent version of the patched Red Hat kernel installed by up2date inhibits access to the special file /proc/X/mem, which caused previous versions of Parrot to stop functioning with the error "Permission denied."

    This release works around that bug in the kernel.

    Monday, August 25, 2008 - Permalink


    Troubleshooting at Grid 2008

    David Cieslak will present a paper titled Troubleshooting Thousands of Jobs on Production Computing Grids Using Data Mining Techniques at Grid 2008 in Japan. This work demonstrates techniques for drawing conclusions such as "Your jobs fail on Linux 2.4 with less than 16GB RAM" from complex workloads of thousands of jobs.
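
To give a feel for how such a conclusion might be drawn, here is a deliberately simple stand-in: a per-attribute failure-rate tally over job records. It is not the technique from the paper, and the record layout, the failure_rates function, and the sample data are all hypothetical.

```python
from collections import defaultdict


def failure_rates(jobs, min_support=2):
    """Map (attribute, value) -> fraction of jobs with that value that failed.

    Hypothetical record layout: each job is a dict with a "machine" dict of
    attributes and an "ok" flag. Pairs seen fewer than `min_support` times
    are dropped so that a single unlucky job does not look like a pattern.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job in jobs:
        for attr, value in job["machine"].items():
            totals[(attr, value)] += 1
            if not job["ok"]:
                failures[(attr, value)] += 1
    return {
        key: failures[key] / totals[key]
        for key in totals
        if totals[key] >= min_support
    }


# Made-up sample workload for illustration only.
jobs = [
    {"machine": {"kernel": "2.4", "arch": "x86"}, "ok": False},
    {"machine": {"kernel": "2.4", "arch": "x86_64"}, "ok": False},
    {"machine": {"kernel": "2.6", "arch": "x86"}, "ok": True},
    {"machine": {"kernel": "2.6", "arch": "x86_64"}, "ok": True},
]
rates = failure_rates(jobs)
# Attribute values for which every observed job failed.
suspects = sorted(key for key, rate in rates.items() if rate == 1.0)
```

In this toy workload only the kernel version separates failures from successes, while the architecture does not; real workloads would need a method that handles combinations of attributes and noisy outcomes.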

    Wednesday, June 25, 2008 - Permalink


    Datalab at HPDC 2008

    Brandon Rich presented a poster on DataLab: Active Storage for Data Driven Scientific Computing at High Performance Distributed Computing in Boston. DataLab is a system for performing large data parallel workloads on hundreds of active storage nodes, using distributed transaction concepts to make the system robust.

    Wednesday, June 25, 2008 - Permalink


    CCTools 2.4.3 Released

    We are pleased to announce release 2.4.3 of the Cooperative Computing Tools, including Parrot, Chirp, and other tools which may be downloaded here.

    Major items in this release:

    1. New documentation for the Chirp APIs
    2. Improvements to the Chirp server:
      • New support for streaming I/O and server-side FIFOs
      • New support for complex path names using RFC 2396 encoding.
      • New P right that allows a user only to put new files.
    3. Improvements to the Chirp clients:
      • Strided I/O routines for array access.
      • Client interfaces for streaming I/O.
      • Improvement to the build flags that make it easier for other programs to compile against Chirp.
    4. Miscellaneous bug fixes:
      • Fixed case where server improperly returns "permission denied" instead of "file not found".
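
As an illustration of what strided I/O for array access means, the sketch below reads evenly spaced blocks from a file-like object, which is the access pattern needed to pull one column out of a row-major matrix. It is a local stand-in, not the Chirp client API: the strided_read function and its parameter names are assumptions made for this example.

```python
import io
import struct


def strided_read(f, offset, block_len, stride, count):
    """Read `count` blocks of `block_len` bytes whose starts are `stride` bytes apart.

    Hypothetical helper operating on any seekable file-like object.
    Stops early if the data runs out before `count` blocks are read.
    """
    chunks = []
    for k in range(count):
        f.seek(offset + k * stride)
        chunk = f.read(block_len)
        if len(chunk) < block_len:
            break
        chunks.append(chunk)
    return b"".join(chunks)


# A 3x4 row-major matrix of little-endian 32-bit ints holding 0..11.
rows, cols, item = 3, 4, 4
matrix = io.BytesIO(struct.pack("<12i", *range(12)))

# Column 1 starts one item in, and each row is cols * item bytes long.
column = strided_read(matrix, offset=1 * item, block_len=item,
                      stride=cols * item, count=rows)
column_values = list(struct.unpack("<3i", column))  # [1, 5, 9]
```

Done locally, this costs one seek per block; the appeal of a strided routine on a remote server is that the whole pattern can be described in a single request.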

    Wednesday, June 11, 2008 - Permalink


    NSF Summer REU Grant

    The CCL has received a grant from the National Science Foundation which will support two undergraduates in summer 2008 to participate in the construction of a novel repository for biometric data.

    Researchers at Notre Dame have collected tens of thousands of biometric images and videos, and are designing new algorithms for identifying and matching people based upon these measurements. Evaluating such algorithms is very computation and data intensive. A large scale study of a new matching algorithm could take many CPU years to complete. To attack these problems in a reasonable amount of time, we must enlist hundreds of CPUs to work on different portions of the problem. While we have demonstrated the practicality of this idea with some custom programming, the overall system is not (yet) easy to use for end researchers. To solve this problem, the participants in this program will construct a well-organized repository of biometric data, connect it to our campus distributed computing system, and create an interface that makes it easy to specify and execute large biometric jobs.

  • CCL REU Program
  • Distributed Computing for Biometrics

    Sunday, June 01, 2008 - Permalink


    CCL to Participate in Google/IBM Cluster Pilot

    In the 2008-2009 school year, junior and senior students in the CSE department will have the opportunity to learn techniques for large scale computing on clusters used by large internet service providers. Google and IBM have announced the 2008 Academic Cluster Initiative, which will provide a 1000-node machine for use by university students around the country. Students in Professor Douglas Thain's distributed systems and operating systems classes will learn how to write large data intensive programs using frameworks such as Map-Reduce on this cluster.

    Tuesday, May 20, 2008 - Permalink


    Papers at IPDPS 2008

    At IPDPS 2008 in Miami, Chris Moretti presented All-Pairs: An Abstraction for Data Intensive Cloud Computing, and Kyle Wheeler presented QThreads: An API for Programming with Millions of Lightweight Threads.

    Tuesday, April 01, 2008 - Permalink


    Parrot Flies at Fermilab

    Parrot and the GROW filesystem are in production use at Fermi National Accelerator Laboratory. The CDF experiment exploits the Open Science Grid to run a large number of Monte Carlo simulations. Because the simulation code is highly complex and not practical to install at all sites, Parrot and the GROW filesystem are used to access the software on demand from Fermilab. More details here.

    Tuesday, January 01, 2008 - Permalink