    cd $HOME
    wget http://ccl.cse.nd.edu/software/files/cctools-7.0.4-source.tar.gz
    tar xvzf cctools-7.0.4-source.tar.gz
    cd cctools-7.0.4-source
    ./configure --prefix $HOME/cctools --tcp-low-port 9000
    make
    make install
    cd $HOME

The software is now installed in $HOME/cctools, so you must set your path appropriately:
    setenv PATH ${PATH}:${HOME}/cctools/bin

If you use bash instead, then do this:
    export PATH=${PATH}:${HOME}/cctools/bin

Now double check that you can run the various commands, like this:
    makeflow -v
    work_queue_worker -v
    work_queue_status

To complete the assignment, you will need to become familiar with the manuals and other materials online. I recommend you read the Work Queue manual next and try running an example program.
If our goal is to assemble all of these reads into a complete genome, our first step would be to compare each sequence to every other one, to see which ones are similar, or overlap. This is known as alignment. The result of alignment is a line-up between the letters in each string, and an overall score indicating the quality of the alignment.
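To make the idea of an alignment score concrete, here is a minimal sketch of local alignment scoring by dynamic programming (the Smith-Waterman approach). The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration and are not necessarily the ones used by any particular tool. It returns only the best score, not the line-up itself.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    The scoring values are illustrative assumptions, not a standard.
    """
    # prev is the previous row of the dynamic-programming table.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev[j - 1] + step
            # Open or extend a gap in either string.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # A local alignment may restart anywhere, so never go below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Keeping only two rows of the table is enough when just the score is needed; recovering the actual alignment would require storing the whole table and tracing back from the best cell.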
Download and unpack the program swaligntool.tar.gz, which compares two DNA strings on the command line using the Smith-Waterman algorithm. Note that the tool consists of a Python main program (swaligntool) and a directory containing a Python library (swalign). Python will find the library if it is in the current working directory.
Use it like this:
    ./swaligntool GCTCAGCCATCTACTACAAATCGGT TCTACTACAAATCGGGTCAACGATCT
    Query: cmdline (26 nt)
    Ref  : cmdline (25 nt)

    Query:  0 TCTACTACAAATCGGGT 17
              ||||||||||||| |||
    Ref  :  9 TCTACTACAAATC-GGT 25

    Score: 31
    Matches: 16 (94.1%)
    Mismatches: 1
    CIGAR: 13M1I3M

The tool will also read sequences out of files, so if you wanted to compare a single sequence (say, in file 1.fasta) to all other sequences, you could do this:
    head -n2 agambiae.small.fasta > 1.fasta
    ./swaligntool 1.fasta agambiae.small.fasta
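Comparing every sequence to every other one means enumerating all pairs of records in the file. The following is a rough sketch of how that could look in Python, assuming the usual FASTA layout in which a line starting with ">" names a sequence and the following lines hold its letters. The helper names read_fasta and all_pairs are hypothetical and are not part of swaligntool.

```python
from itertools import combinations


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header ends the previous record, if there was one.
            if name is not None:
                yield name, "".join(parts)
            fields = line[1:].split()
            name, parts = (fields[0] if fields else ""), []
        else:
            parts.append(line)
    # Emit the final record.
    if name is not None:
        yield name, "".join(parts)


def all_pairs(records):
    """Return every unordered pair of distinct records."""
    return list(combinations(records, 2))
```

Since an open file is an iterable of lines, read_fasta can be applied directly to a file object for agambiae.small.fasta. Note that n sequences produce n(n-1)/2 pairs, so the number of comparisons grows quickly with the size of the input.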
    ./compareit agambiae.small.fasta
    Listening on port 9785...
    Top Ten Matches:
    1: sequence 1101555423543 matches 1101897423223 with a score of 807
    2: sequence 1101555423223 matches 1101555423223 with a score of 643
    ...
    10: sequence 1101557298657 matches 1101555400923 with a score of 35
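Producing a ranked list like the one above amounts to keeping the highest-scoring results as they come in. Below is one possible sketch using Python's heapq module, assuming each result is a (score, name_a, name_b) tuple. The function names and the tuple layout are hypothetical choices for illustration, not a required design.

```python
import heapq


def top_matches(results, n=10):
    """Return the n highest-scoring (score, name_a, name_b) tuples."""
    return heapq.nlargest(n, results, key=lambda r: r[0])


def format_matches(matches):
    """Render matches in the same style as the sample output."""
    lines = ["Top Ten Matches:"]
    for rank, (score, a, b) in enumerate(matches, start=1):
        lines.append(f"{rank}: sequence {a} matches {b} with a score of {score}")
    return "\n".join(lines)
```

Using heapq.nlargest avoids sorting the full list of results, which can matter when the number of comparisons is large.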
    /afs/nd.edu/courses/cse/cse40822.01/dropbox/YOURNAME/a2

Turn in the following: