Blat Scripts

From genomewiki
Revision as of 21:00, 15 November 2006 by Hartera (talk | contribs)

Here is a collection of Blat-related Perl scripts that perform functions frequently requested on the genome mailing list. If you find a problem with these scripts, please notify me using the e-mail user link in the side menu bar at: User:Hartera

BlatBot.pl: This script takes a file of FASTA-format sequences as input and submits them to the web-based Blat on the UCSC Genome Browser web site. It obeys the site rules on the number and frequency of hits: program-driven use of the Genome Browser software is limited to a maximum of one hit every 15 seconds. The script submits sequences to Blat in batches of 25 at a time.
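
The batching and throttling described above can be sketched as follows. This is a minimal Python illustration, not the code of BlatBot.pl itself: the function names (`read_fasta`, `batches`, `submit_all`) are hypothetical, and the `submit` callback stands in for whatever actually sends a batch to the web-based Blat. Only the batch size of 25 and the 15-second interval come from the description above.

```python
import time

BATCH_SIZE = 25        # sequences per submission, as described above
MIN_INTERVAL_S = 15.0  # at most one hit every 15 seconds, as described above


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records, size=BATCH_SIZE):
    """Group records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def submit_all(records, submit, sleep=time.sleep, clock=time.monotonic):
    """Call submit(batch) per batch, starting calls at least MIN_INTERVAL_S apart.

    `submit` is a hypothetical callback; `sleep` and `clock` are injectable
    so the pacing can be checked without actually waiting.
    """
    results = []
    last_start = None
    for batch in batches(records):
        if last_start is not None:
            wait = MIN_INTERVAL_S - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(submit(batch))
    return results
```

With 60 input sequences this would make three submissions (25, 25 and 10 sequences), pausing before the second and third.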

The script usage is:

usage: BlatBot.pl <organism> <db> <searchType> <sortOrder> <input FASTA> <outputType> <output file>

       Specify organism using the common name with first letter capitalized.
       e.g. Human, Mouse, Rat etc.
       Db is database or assembly name e.g. hg17, mm5, rn3 etc.
       searchType can be BLATGuess, DNA, RNA, transDNA or transRNA
       sortOrder can be query,score; query,start; chrom,score;
       chrom,start; score.
       outputType can be pslNoHeader, psl or hyperlink.
       blats will be run in groups of 25 sequences, all
       output going to the specified output file.
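
The allowed values listed in the usage message lend themselves to a simple check before anything is submitted. The sketch below is a Python illustration under that assumption. The `check_args` function is hypothetical, and whether BlatBot.pl validates its arguments this way is not stated here.

```python
# Allowed values copied from the usage message above.
SEARCH_TYPES = {"BLATGuess", "DNA", "RNA", "transDNA", "transRNA"}
SORT_ORDERS = {"query,score", "query,start", "chrom,score", "chrom,start", "score"}
OUTPUT_TYPES = {"pslNoHeader", "psl", "hyperlink"}


def check_args(args):
    """Return a list of problems found in the seven BlatBot-style arguments.

    `args` is expected in usage order:
    organism, db, searchType, sortOrder, input FASTA, outputType, output file.
    An empty list means no problem was detected.
    """
    if len(args) != 7:
        return ["expected 7 arguments, got %d" % len(args)]
    organism, db, search_type, sort_order, fasta, output_type, out_file = args
    problems = []
    if not organism[:1].isupper():
        problems.append("organism should start with a capital letter")
    if search_type not in SEARCH_TYPES:
        problems.append("unknown searchType: %s" % search_type)
    if sort_order not in SORT_ORDERS:
        problems.append("unknown sortOrder: %s" % sort_order)
    if output_type not in OUTPUT_TYPES:
        problems.append("unknown outputType: %s" % output_type)
    return problems
```

A sort order containing a semicolon, such as `query,score;`, would be reported as unknown, since the semicolons in the usage message only separate the alternatives.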


parseBlatOutput.pl: This script parses the HTML output from the BlatBot.pl script and produces either psl output or hyperlinks, depending on the BlatBot output type.

The script usage is:

usage: parseBlat.pl <output type> <html output> [other html outputs...]

       output type is psl or hyperlink
       <html output> - file with html returned from blat request
       [other html outputs...] - more html file results
       output is to stdout
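
To illustrate the psl case, here is a minimal Python sketch of pulling psl rows out of a saved HTML page. It rests on two assumptions that should be checked against real output: that the psl text sits inside a `<PRE>` block in the returned HTML, and that alignment rows can be recognized by having the 21 whitespace-separated fields of the psl format with a numeric first field. The `psl_lines` function is hypothetical and is not taken from parseBlatOutput.pl.

```python
import html
import re

PSL_COLUMNS = 21  # number of fields in a psl alignment row


def psl_lines(html_text):
    """Return tab-joined psl rows found inside <pre> blocks of html_text.

    Assumption: header lines and other page text do not have exactly
    21 fields with a leading integer, so they are skipped.
    """
    rows = []
    blocks = re.findall(r"<pre>(.*?)</pre>", html_text,
                        flags=re.IGNORECASE | re.DOTALL)
    for block in blocks:
        # Drop any markup nested in the block, then decode entities.
        text = html.unescape(re.sub(r"<[^>]+>", "", block))
        for line in text.splitlines():
            fields = line.split()
            if len(fields) == PSL_COLUMNS and fields[0].isdigit():
                rows.append("\t".join(fields))
    return rows
```

Applied to each HTML file named on the command line in turn, with the resulting rows printed, this would mimic the "output is to stdout" behaviour described above.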