Automation

Why Automate?

You've seen one genome assembly, you've seen 'em all -- hardly! But there are some very predictable, repetitive things that developers need to do every time we build a genome annotation database on a new genome assembly. It is in our best interest to automate these steps when possible for these reasons:

it saves time
it reduces copy-paste and didn't-see-that-error-message errors
it helps to enforce naming conventions, which helps us use each other's data
it can produce detailed and accurate documentation of the data
it keeps our eyes from glazing over

Of course, nothing is free. When something goes wrong in an automated process, we must work our way back from a usually cryptic error message, through an additional layer of code, to the source of the problem. (Or if it's GenBank automation, bug MarkD. ;) But the hope is that developers will spend more of their time on tasks that require critical thinking and less on boring, repetitive ones.

The 5/30/06 genecats meeting was devoted to discussion and planning of build automation; Hiram transcribed the whiteboard notes from the meeting in High Throughput Genome Builds.

Automation Scripting Infrastructure

The automation scripts are written in Perl: it is interpreted and has nice support for regexes, hashes, etc.

HgAutomate.pm
HgRemoteScript.pm
HgStepManager.pm

doTemplate.pl

Existing Automation Scripts

makeGenomeDb.pl
doRepeatMasker.pl
makeDownloads.pl
doSameSpeciesLiftOver.pl
doBlastzChainNet.pl
doHgNearBlastp.pl
makePushQSql.pl

MarkD's GenBank scripts...

Automation Wish List

Repeat library generation (WindowMasker?)
Brian's chained protein alignments
CpG islands
multiz
phastCons
meta-automation of all blastz's, multiz, phastCons?
meta-automation of all scripts that we always run?

Automation Troubleshooting

fileserver/machines out of sync
cluster job dies
cluster job hangs
ssh hangs
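
Hangs are the hardest of these to notice, because nothing actually fails. The following is a minimal sketch of one way to bound a remote command with a timeout so that a hung ssh turns into an explicit error. It is written in Python rather than Perl and is not taken from HgAutomate.pm or any of the scripts listed above; the function name run_with_timeout, the host name and the timeout values are all hypothetical.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and return (status, output).

    status is "ok" (exit code 0), "failed" (non-zero exit code) or
    "timeout" (the process was killed after timeout_s seconds).
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "timeout", ""
    status = "ok" if done.returncode == 0 else "failed"
    return status, done.stdout + done.stderr


# Hypothetical command line; "somehost" is a placeholder, not a real machine.
# BatchMode=yes makes ssh fail rather than wait for a password prompt, and
# ConnectTimeout bounds the connection attempt itself.
EXAMPLE_SSH_CMD = [
    "ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
    "somehost", "uptime",
]
```

Calling run_with_timeout(EXAMPLE_SSH_CMD, 60) would report "timeout" instead of blocking the whole build if the remote side never answers. The same idea applies to polling a cluster job: give every wait an upper bound and treat exceeding it as an error worth a clear message.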
