Programmatic access to the Genome Browser

The UCSC API for retrieving and uploading data is not REST-driven; it revolves around client-side C tools that convert to and from binary files.

Here are some common tasks that scripts can solve with calls to the UCSC Genome Browser, assuming that you know the standard Unix command-line tools:

Get the chromosome sequence for a range

Download the tool twoBitToFa from http://hgdownload.cse.ucsc.edu/admin/exe/, e.g. with: curl http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/twoBitToFa > twoBitToFa; chmod a+x twoBitToFa
To get the DNA sequence from e.g. the human genome hg19, run a command like: ./twoBitToFa http://hgdownload.cse.ucsc.edu/gbdb/hg19/hg19.2bit stdout -seq=chr21 -start=1 -end=10000 (drop the ./ if the tool is on your PATH). Note that -start is zero-based and -end is non-inclusive. You can replace stdout with a filename of your choice.
For best performance, download the 2bit file for your genome from http://hgdownload.cse.ucsc.edu/gbdb/<databaseId> to local disk and pass the local path to twoBitToFa instead of the URL.
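
The steps above can be combined into a small shell helper. This is only a sketch: the function names, the TWOBITTOFA override variable, and the assumption that every database keeps its file at gbdb/<databaseId>/<databaseId>.2bit (generalized from the hg19 example) are illustrative and not part of any UCSC tool.

```shell
# Sketch: cache the 2bit file locally, then extract ranges from the local copy.

# Build the 2bit download URL for a database.
# ASSUMPTION: the hg19 path layout holds for other databases as well.
twobit_url() {
    local db="$1"
    printf 'http://hgdownload.cse.ucsc.edu/gbdb/%s/%s.2bit\n' "$db" "$db"
}

# Print the FASTA sequence for a range, downloading the 2bit file only once.
# TWOBITTOFA (hypothetical variable) may point at the tool; default ./twoBitToFa.
get_range() {
    local db="$1" chrom="$2" start="$3" end="$4"
    local twobit="$db.2bit"
    if [ ! -s "$twobit" ]; then
        # Download to a temporary name so an interrupted transfer is not reused.
        curl -sS -f -o "$twobit.part" "$(twobit_url "$db")" || return 1
        mv "$twobit.part" "$twobit" || return 1
    fi
    "${TWOBITTOFA:-./twoBitToFa}" "$twobit" stdout \
        -seq="$chrom" -start="$start" -end="$end"
}
```

After sourcing these functions, a call such as get_range hg19 chr21 1 10000 > chr21_part.fa fetches hg19.2bit on first use and reads from the local file afterwards.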

Get the "wiggle" (x-y-plot) graph data for a chromosome range

Download bigWigToWig from http://hgdownload.cse.ucsc.edu/admin/exe/ as shown above.
Run a command like: ./bigWigToWig http://hgdownload.cse.ucsc.edu/gbdb/hg19/bbi/wgEncodeBroadHistoneK562Cbx2Sig.bigWig -chrom=chr21 -start=0 -end=10000000 stdout. As before, you can replace stdout with a filename of your choice.
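
When several ranges are needed, the same command can be wrapped in a loop. The following is a minimal sketch; the function names, the BIGWIGTOWIG override variable, and the output file naming are assumptions made for illustration.

```shell
# Sketch: extract wiggle data for one or many ranges from a bigWig file or URL.

# Print wiggle data for a single range to stdout.
# BIGWIGTOWIG (hypothetical variable) may point at the tool; default ./bigWigToWig.
wig_for_range() {
    local source="$1" chrom="$2" start="$3" end="$4"
    "${BIGWIGTOWIG:-./bigWigToWig}" "$source" \
        -chrom="$chrom" -start="$start" -end="$end" stdout
}

# Read "chrom start end" lines from stdin and write one .wig file per region,
# named <chrom>_<start>_<end>.wig in the current directory.
wig_for_regions() {
    local source="$1" chrom start end rest
    while read -r chrom start end rest; do
        [ -n "$chrom" ] || continue   # skip empty lines
        # Redirect stdin so the tool cannot consume the region list.
        wig_for_range "$source" "$chrom" "$start" "$end" \
            < /dev/null > "${chrom}_${start}_${end}.wig"
    done
}
```

For example, wig_for_regions http://hgdownload.cse.ucsc.edu/gbdb/hg19/bbi/wgEncodeBroadHistoneK562Cbx2Sig.bigWig < regions.txt would process every region listed in a (hypothetical) whitespace-separated file regions.txt.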

Download data stored in a database table

Use Tools - Table Browser - "Describe schema" to browse the database schema. All fields have a human-readable description, and the links to other tables are shown.
Then query the public MySQL server from the command line, naming the database that holds the table (here hg19), e.g.: mysql --no-defaults -h genome-mysql.cse.ucsc.edu -u genome -A -D hg19 -e 'select * from pubsBingBlat' -NB > out.txt
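
For repeated queries, the connection options can be kept in one place. This sketch wraps the command above in a function; the name ucsc_sql and the MYSQL override variable are hypothetical, and hg19 is assumed to be the database that contains pubsBingBlat.

```shell
# Sketch: run one SQL statement against the public UCSC MySQL server.
# Output is tab-separated without a header line (-N -B).
# MYSQL (hypothetical variable) may name an alternative client binary.
ucsc_sql() {
    local db="$1" query="$2"
    # --no-defaults must stay the first option so local config files are ignored.
    "${MYSQL:-mysql}" --no-defaults -h genome-mysql.cse.ucsc.edu -u genome -A -NB \
        -D "$db" -e "$query"
}
```

A call such as ucsc_sql hg19 'describe pubsBingBlat' lists the columns of a table, and adding a limit clause, as in ucsc_sql hg19 'select * from pubsBingBlat limit 10' > out.txt, keeps test downloads small.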

Get a copy of the current Genome Browser image from a script

Use curl http://genome.ucsc.edu/cgi-bin/hgRenderTracks > test.png. hgRenderTracks understands the same parameters and options as the main hgTracks CGI, e.g. <internalTrackName>=pack.
To get the internal name of a track, mouse over the track and look at your web browser's status line, or go to the track configuration page and look for the value of the variable called "g" in the current URL.
To hide the default tracks when you use hgRenderTracks, make sure that the first track parameter is hideTracks=1.
For example, to download the image for a chromosomal location showing only the RefSeq transcripts and publications tracks, both in "pack" mode, use this command: curl 'http://genome.ucsc.edu/cgi-bin/hgRenderTracks?position=chr17:41570860-41650551&hideTracks=1&refGene=pack&pubs=pack' > temp.png
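
The URL in the example above can also be assembled by a script, which makes sure that hideTracks=1 always precedes the track settings. The helper names below are hypothetical; only the URL layout is taken from the example.

```shell
# Sketch: build an hgRenderTracks URL that shows only the listed tracks.
# Usage: render_url <position> <internalTrackName>=<mode> ...
render_url() {
    local position="$1"
    shift
    # hideTracks=1 comes before any track setting so the default tracks are hidden.
    local url="http://genome.ucsc.edu/cgi-bin/hgRenderTracks?position=$position&hideTracks=1"
    local setting
    for setting in "$@"; do
        url="$url&$setting"
    done
    printf '%s\n' "$url"
}

# Download the rendered image to a file.
# Usage: render_png <outfile> <position> <internalTrackName>=<mode> ...
render_png() {
    local out="$1"
    shift
    curl -sS -f -o "$out" "$(render_url "$@")"
}
```

With these definitions, render_png temp.png chr17:41570860-41650551 refGene=pack pubs=pack requests the same image as the curl command above.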

Upload a custom track and link to the genome browser with the track loaded

Create a custom track file as documented at http://genome.ucsc.edu/goldenpath/help/customTrack.html, e.g.: printf 'track name="TestTrack" description="TestTrack with links on features" url="http://www.google.com/$$"\nchr1 1 1000 testIdForUrl' > temp.bed
Upload your file with a command like the following, which prints a string to stdout that the next steps refer to as $HGSID: curl -s -F db=hg19 -F 'hgct_customText=@temp.bed' http://genome.ucsc.edu/cgi-bin/hgCustom | grep -o 'hgsid=[0-9]*_[a-zA-Z0-9]*' | uniq | sed -e 's/hgsid=//'
You can link to a fresh Genome Browser session with only this track loaded with http://genome.ucsc.edu/cgi-bin/hgTracks?hgsid=$HGSID&position=chr1:1-1000
You can load more tracks into this session by adding the parameter hgsid=$HGSID to all future curl calls.
You can download the image of this session with hgRenderTracks, as described above, by supplying the $HGSID value, like this: curl "http://genome.ucsc.edu/cgi-bin/hgRenderTracks?hgsid=$HGSID" > test.png
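
The upload and the follow-up links can be tied together in a script that stores the session ID in a variable. This is a sketch under assumptions: the function names are hypothetical, and head -n 1 is used here instead of uniq so that exactly one ID is returned even if the page contains several.

```shell
# Sketch: upload a custom track, keep the session ID, and build follow-up URLs.

# Read an hgCustom response on stdin and print the first session ID found.
extract_hgsid() {
    grep -o 'hgsid=[0-9]*_[a-zA-Z0-9]*' | head -n 1 | sed -e 's/hgsid=//'
}

# Upload a custom track file for a database and print the session ID.
upload_track() {
    local db="$1" file="$2"
    curl -s -F "db=$db" -F "hgct_customText=@$file" \
        http://genome.ucsc.edu/cgi-bin/hgCustom | extract_hgsid
}

# Print a browser link for the session at a given position.
session_url() {
    local hgsid="$1" position="$2"
    printf 'http://genome.ucsc.edu/cgi-bin/hgTracks?hgsid=%s&position=%s\n' \
        "$hgsid" "$position"
}
```

A script would then run HGSID=$(upload_track hg19 temp.bed), print the link with session_url "$HGSID" chr1:1-1000, and fetch the image with curl "http://genome.ucsc.edu/cgi-bin/hgRenderTracks?hgsid=$HGSID" > test.png.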
