Topics in Bioinformatics: Introduction to Perl and BioPerl
Fall Semester 2007
This seven-week course module is an introduction to bioinformatics programming and scripting using the Perl language and the BioPerl toolkit. A list of topics to be covered is given below. Perl is an open-source computer programming language. BioPerl is a toolkit of Perl modules useful in building bioinformatics solutions in Perl. BioPerl can be used to parse sequence data retrieved from local and remote databases, to transform the formats of sequence data and files, to manipulate individual sequences, to search for patterns in sequences, to assist with creating and manipulating sequence alignments, and to search for genes, transposons, and other structures in genomic data.
This course assumes no prior programming knowledge, although basic computer skills are expected. Weekly programming assignments will be given. Students are strongly encouraged to bring a laptop to class so that they can apply the material as it is introduced.
Class Meeting Times/Location:
322 Jordan Hall
MW 3:00 - 4:15, October 29 to December 10, 2007
Instructor:
Greg Madey, gmadey@nd.edu, (574)631-8752, 350 Fitzpatrick Hall
Office Hours:
By appointment (and whenever my office door is open!)
Teaching Assistant:
Matt Van Antwerp, mvanantw at cse.nd.edu, (574)631-7596, 206 Cushing
Textbook:
Beginning Perl for Bioinformatics
by James Tisdall
Paperback: 400 pages
Publisher: O'Reilly Media, Inc.; 1st edition (October 15, 2001)
Language: English
ISBN-10: 0596000804
ISBN-13: 978-0596000806
Course Goals:
In this course, students will learn about the following topics:
Getting started with Perl
a Accessing and installing Perl and BioPerl
b Running Perl programs
c Editors
d Finding help
e Using modules, like BioPerl
a The Programming process
b Algorithms
a Variables
b Arrays
c Files
a Flow control
b String operators
c Writing files
a Scoping
b Arguments
c Command line arguments
d Passing data to subroutines
e Modules and Libraries
f Debugging
a Hashes
b Translating DNA into Proteins
c Working with the FASTA Format
d Reading frames
a Restriction Maps
b Restriction Enzyme Data
a Working with GenBank data
b Analyzing DNA
c Working with BLAST output
d BioPerl Modules
Grading:
Programming assignments 40%
Class participation 40%
Final project 20%