Bioperl Scripts of cDNA analysis tools (CLAT) [Biology] — Tank @ 7:48 pm May 25, 2007
Several weeks ago I wrote some Perl and BioPerl scripts to analyze a large number of cDNA sequences or ESTs. I named the scripts CLAT, which is short for cDNA library analysis tools. I have found them useful, so I would like to share them here.
Sequencing a cDNA library usually leaves you with thousands of sequences. CLAT first uses the EMBOSS vectorstrip program to automatically remove the vector sequence that your cloning vector carries. It then runs the NCBI blastall program to BLAST the sequences against the NCBI databases and extracts the information of interest from the results into a well-organized table (an OpenOffice.org spreadsheet or a Microsoft Office Excel sheet) that facilitates downstream analysis. CLAT also sets up your own BLAST database from the library, against which each sequence of the library is then BLASTed. Similar sequences are clustered together in one FASTA-format file that can easily be analyzed with Clustal or other phylogenetic analysis programs.
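To make the clustering step concrete, here is a minimal sketch of the idea. It is not the CLAT code: it is written in Python rather than Perl, and it assumes that the library-vs-library BLAST was run with tabular output (blastall -m 8, 12 tab-separated columns with the e-value in column 11). The function names and the e-value cutoff of 1e-10 are my own illustrative choices. Sequences that hit each other below the cutoff end up in the same group (single-linkage clustering).

```python
from collections import defaultdict


def parse_tabular_hits(lines, max_evalue=1e-10):
    """Yield (query, subject) pairs from 12-column tabular BLAST output.

    Self-hits and hits with an e-value above max_evalue are skipped.
    The cutoff is an illustrative assumption, not a CLAT setting.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular hit line
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query != subject and evalue <= max_evalue:
            yield query, subject


def cluster_ids(seq_ids, pairs):
    """Group sequence IDs that are linked by at least one chain of hits."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    return sorted(sorted(g) for g in groups.values())


# Made-up hit lines, only to show the input format.
hits = [
    "est1\test2\t98.5\t400\t6\t0\t1\t400\t1\t400\t1e-150\t700",
    "est2\test3\t95.0\t300\t15\t0\t1\t300\t10\t309\t1e-90\t450",
    "est4\test1\t80.0\t40\t8\t0\t1\t40\t1\t40\t0.5\t30",
    "est1\test1\t100.0\t400\t0\t0\t1\t400\t1\t400\t0.0\t800",
]
clusters = cluster_ids(["est1", "est2", "est3", "est4"], parse_tabular_hits(hits))
print(clusters)
```

Each group in clusters could then be written to its own FASTA file and passed to Clustal. Note that single-linkage grouping can chain loosely related sequences together, so the cutoff matters.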
CLAT includes the following scripts:
To download the scripts, [click here http://tanklao.blogsome.com/2007/05/25/bioperl-script-of-cdna-analysis-tools-clat/].
0autoclat0.2 alone can do all the work, but it sometimes runs into unexpected problems, so I separated the work into 6 parts. The scripts numbered 1-6 are standalone, and together they do the same job as 0autoclat0.2.
Make sure that you have installed Perl and BioPerl on your computer, as well as the EMBOSS programs and the NCBI BLAST programs.
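A quick way to check these prerequisites before running anything is sketched below. It is a Python sketch and not part of CLAT. It assumes the tools are on your PATH under the names perl, vectorstrip and blastall, and it tests for BioPerl by asking Perl to load Bio::SeqIO, one of the BioPerl modules. The function names are hypothetical.

```python
import shutil
import subprocess

# Executables named in this post; adjust if yours are installed under other names.
REQUIRED_TOOLS = ["perl", "vectorstrip", "blastall"]


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from the list that cannot be found on PATH."""
    return [t for t in tools if which(t) is None]


def has_bioperl():
    """Return True if Perl can load Bio::SeqIO (used here as a BioPerl probe)."""
    if shutil.which("perl") is None:
        return False
    result = subprocess.run(
        ["perl", "-MBio::SeqIO", "-e", "1"],
        capture_output=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing programs:", ", ".join(missing))
    if not has_bioperl():
        print("BioPerl does not seem to be installed (could not load Bio::SeqIO).")
    if not missing and has_bioperl():
        print("All prerequisites found.")
```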
These scripts have only been tested under Ubuntu Linux. If you use another OS, you may need to make a few small changes.
I would like to invite you to visit my blog at [Tanklao's Blog http://tanklao.blogsome.com]
Tags: Bioperl; cDNA analysis; ESTs; Microsoft Office Excel sheet; local Blast; Bioperl scripts Examples; Ubuntu Linux.