Table Browser URL

From genomewiki
Revision as of 18:42, 13 November 2006 by Kuhn (talk | contribs)

How to create a command line script to fetch data from the table browser.

Please take note of the following notice found on the home page of the UCSC Genome Browser website:

Program-driven use of this software is limited to a maximum of
one hit every 15 seconds and no more than 5,000 hits per day.

With that limitation in mind, consider the following procedure.

The trick is to use the table browser in the normal manner until it gives an example of the type of output desired.

Then use cartDump to obtain the CGI variables that the table browser used to produce that output. Copy those variables into a command line, and add the two special URL variables:

'submit=submit&hgta_doTopSubmit=1'

to trick hgTables into thinking it just got a submit button press.
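
The assembly step can be sketched as a small shell function. This is only an illustration, not part of hgTables: the function name build_hgtables_url is hypothetical, and the name=value pairs passed to it are assumed to have been copied by hand from the cartDump output. Values are passed through as-is, without URL-encoding.

```shell
#!/bin/sh
# Hypothetical helper: join name=value pairs (copied from cartDump) into
# an hgTables URL and append the two variables that simulate a submit.
build_hgtables_url() {
    base='http://genome.ucsc.edu/cgi-bin/hgTables'
    query=''
    for pair in "$@"; do
        query="${query}${pair}&"
    done
    printf '%s?%s%s\n' "$base" "$query" 'submit=submit&hgta_doTopSubmit=1'
}

# Print the URL for a genscan GFF query over one region of hg18.
build_hgtables_url 'db=hg18' 'hgta_group=genes' 'hgta_track=genscan' \
    'hgta_table=genscan' 'hgta_regionType=range' \
    'position=chrX:151073054-151383976' \
    'hgta_outputType=gff' 'outGff=1' 'hgta_compressType=none'
```

The printed URL can then be handed to wget or a similar tool.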

With this process, you can get hgTables to produce any of its outputs with a URL fetch as in the examples here. It gets tricky if there are filters or intersections involved.

However, for extensive use of this kind, it is usually more convenient and efficient to download the MySQL table data from hgdownload and use the kent source tree tools to manipulate and analyze the data locally.
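
As a rough sketch of the download route, the function below builds the URL of a table dump file. It rests on assumptions not stated on this page: that the host is hgdownload.soe.ucsc.edu, and that each assembly's table dumps live under goldenPath/<db>/database/ as <table>.txt.gz (data) and <table>.sql (schema). The function name table_dump_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the assumed hgdownload URL for a table file.
#   $1 = database (e.g. hg18), $2 = table (e.g. genscan),
#   $3 = file suffix: txt.gz for the data, sql for the schema.
table_dump_url() {
    printf 'http://hgdownload.soe.ucsc.edu/goldenPath/%s/database/%s.%s\n' \
        "$1" "$2" "$3"
}

# Show the two URLs for the hg18 genscan table.
table_dump_url hg18 genscan txt.gz
table_dump_url hg18 genscan sql
```

Each printed URL could be fetched with wget; the data file is a gzip-compressed, tab-separated dump that can be loaded into a local MySQL database or read directly.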

Here is an example of fetching genscan genes within a specified position:

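#!/bin/sh
# Fetch genscan gene predictions for one region of hg18 as GFF.
POSITION="chrX:151073054-151383976"
# The quoted fragments below are joined into a single URL argument;
# the final two variables simulate pressing the submit button.
wget --progress=dot \
'http://genome.ucsc.edu/cgi-bin/hgTables?db=hg18&hgta_compressType=none&'\
'hgta_group=genes&hgta_outputType=gff&outGff=1&hgta_regionType=range&'\
'hgta_table=genscan&hgta_track=genscan&org=Human&position='"${POSITION}"\
'&submit=submit&hgta_doTopSubmit=1' \
    -O "genscan.${POSITION}.gtf"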
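
When several regions are needed, the requests have to be spaced out to respect the usage notice quoted near the top of this page. The following is a minimal sketch of one way to do that. The function names fetch_regions and fetch_genscan and the variables DELAY and MAX_HITS are hypothetical, and the cap only counts hits within a single run of the script, not across a whole day.

```shell
#!/bin/sh
# Limits taken from the usage notice; both can be overridden by the caller.
DELAY="${DELAY:-15}"          # seconds to wait between hits
MAX_HITS="${MAX_HITS:-5000}"  # maximum hits allowed in one run

# Hypothetical helper: fetch genscan genes for one position ($1) as GFF,
# using the same URL variables as the single-region script.
fetch_genscan() {
    wget --progress=dot \
'http://genome.ucsc.edu/cgi-bin/hgTables?db=hg18&hgta_compressType=none&'\
'hgta_group=genes&hgta_outputType=gff&outGff=1&hgta_regionType=range&'\
'hgta_table=genscan&hgta_track=genscan&org=Human&position='"$1"\
'&submit=submit&hgta_doTopSubmit=1' \
        -O "genscan.$1.gtf"
}

# Hypothetical helper: call the fetch command ($1) once per remaining
# argument, sleeping DELAY seconds between calls and stopping at MAX_HITS.
fetch_regions() {
    fetch_cmd="$1"; shift
    hits=0
    for pos in "$@"; do
        if [ "$hits" -ge "$MAX_HITS" ]; then
            echo "cap of $MAX_HITS hits reached; stopping" >&2
            return 1
        fi
        # Sleep before every request except the first.
        [ "$hits" -eq 0 ] || sleep "$DELAY"
        "$fetch_cmd" "$pos" || echo "fetch failed for $pos" >&2
        hits=$((hits + 1))
    done
}
```

A call such as `fetch_regions fetch_genscan chrX:151073054-151383976 chrX:152000000-152100000` (the second region is made up for illustration) would write one .gtf file per region. Both limits are enforced because a delay of 15 seconds alone would still allow 5,760 hits in 24 hours, which exceeds the daily maximum.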