Using hgWiggle without a database
hgWiggle used on local files
The hgWiggle command extracts the compressed data values from a "wiggle" type of data track in the genome browser. It is often useful to run this command locally, without a database. The following example shows how to use hgWiggle with local files only.
Download files from hgdownload
The ".wig" files used here are actually the database table dumps available from the hgdownload system. Fetch the files you need from hgdownload. For example, for the gc5Base track on the Stickleback assembly (gasAcu1):
Fetch the ".wig" file from the database dump:
ftp://hgdownload.cse.ucsc.edu/goldenPath/gasAcu1/database/gc5Base.txt.gz
And you need the compressed data values in the ".wib" file from the gbdb filesystem files:
ftp://hgdownload.cse.ucsc.edu/gbdb/gasAcu1/wib/gc5Base.wib
Place these files together in the same directory. The compressed gc5Base.txt.gz file is the so-called ".wig" file. Uncompress it and give it a ".wig" name with a symbolic link:
$ gunzip gc5Base.txt.gz
$ ln -s gc5Base.txt gc5Base.wig
The resulting files appear as:
$ ls -ogrt gc5Base*
lrwxrwxrwx 1       11 May 25 09:19 gc5Base.wig -> gc5Base.txt
-rw-rw-r-- 1  9869820 May 25 09:36 gc5Base.txt
-rw-rw-r-- 1 90820835 May 25 09:37 gc5Base.wib
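The download and setup steps above can be collected into a small script. This is only a sketch: the URLs are the two given above, while the variable names and the `fetch_track` helper are hypothetical, and it assumes a `wget` build that still supports FTP.

```shell
#!/bin/sh
# Sketch: fetch and prepare the local files that hgWiggle needs.
# DB and TRACK come from the example above; the variable names and
# fetch_track are illustrative, not part of any UCSC tool.
DB=gasAcu1
TRACK=gc5Base
BASE=ftp://hgdownload.cse.ucsc.edu

WIG_URL="$BASE/goldenPath/$DB/database/$TRACK.txt.gz"   # table dump (the ".wig")
WIB_URL="$BASE/gbdb/$DB/wib/$TRACK.wib"                 # compressed data values

fetch_track() {
    # Download both files into the current directory, uncompress the
    # table dump, and give it a .wig name through a symbolic link.
    wget "$WIG_URL" "$WIB_URL" &&
    gunzip "$TRACK.txt.gz" &&
    ln -s "$TRACK.txt" "$TRACK.wig"
}

# Show what would be fetched; nothing is downloaded until fetch_track is called.
echo "$WIG_URL"
echo "$WIB_URL"
```

Calling `fetch_track` in an empty working directory should leave the three files shown in the listing above. The .wib file is about 90 MB, so the download may take a while.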
The hgWiggle command
Then run hgWiggle. For example, to get statistics on chrI:
$ hgWiggle -chr=chrI -doStats gc5Base
looking for: gc5Base.wig
# from file, Table: gc5Base
# Chrom Data  Data     # Data  Data Bases    Minimum Maximum Range Mean    Variance Standard
#       start end      values  span covered                                         deviation
chrI    1     28185910 5512103 5    27560515 0       100     100   44.4915 533.509  23.0978
To get statistics on a set of genomic regions, create a BED file containing the regions (chrom, chromStart, chromEnd), and supply this to hgWiggle, using the -bedFile option.
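As a minimal sketch of the BED approach, the following writes a two-region BED file and passes it to hgWiggle. The file name `regions.bed` and the coordinates are hypothetical, and the option is written as `-bedFile=<file>`, assuming it follows the same `-option=value` form as `-chr=chrI` above.

```shell
#!/bin/sh
# Hypothetical regions on chrI. BED columns are tab-separated:
# chrom, chromStart (0-based), chromEnd (exclusive).
printf 'chrI\t0\t1000000\nchrI\t5000000\t6000000\n' > regions.bed

# Run only where hgWiggle is installed and the local files from the
# steps above are present in the current directory.
if command -v hgWiggle >/dev/null 2>&1 && [ -e gc5Base.wig ]; then
    hgWiggle -bedFile=regions.bed -doStats gc5Base
fi
```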
What is special about this process
The database dump file differs slightly from an actual ".wig" file. It has an extra "bin" column at the beginning, which the hgWiggle command ignores. The "file" column of the dump holds the fully qualified path /gbdb/gasAcu1/wib/gc5Base.wib. The hgWiggle command ignores the directory part of this path and looks for the gc5Base.wib file in the current directory.
Multiple .wib files
Some older assembly databases have per-chromosome .wib files in the gbdb wib directory. In this case, download the .wib file for each chromosome of interest. The process described here works in the same manner.
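To find out which .wib files a given chromosome needs, one option is to read the names out of the table dump itself. The sketch below assumes the usual wiggle table layout, in which the chromosome is column 2 and the "file" column is column 9 of the tab-separated dump (counting the leading "bin" column). The function name `wib_files_for` is hypothetical.

```shell
#!/bin/sh
# List the .wib file names that the rows for one chromosome refer to.
# Assumption: column 2 = chrom, column 9 = "file" in the table dump.
wib_files_for() {
    # $1 = uncompressed table dump (e.g. gc5Base.txt), $2 = chromosome name
    awk -F'\t' -v chrom="$2" '
        $2 == chrom { n = split($9, part, "/"); print part[n] }
    ' "$1" | sort -u
}

# Example call, run only if the dump from the steps above is present.
if [ -e gc5Base.txt ]; then
    wib_files_for gc5Base.txt chrI
fi
```

Only the base name is printed, since hgWiggle looks for the .wib file in the current directory rather than under /gbdb.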