Using hgWiggle without a database
Latest revision as of 07:27, 1 September 2018
hgWiggle used on local files
The hgWiggle command extracts the compressed data values from a "wiggle" type data track in the genome browser. It is often useful to run this command locally, without a database. The following example explains how to use hgWiggle on local files only.
If you have internet access, you can instead use the UCSC public database server to dramatically speed up these types of queries. In that case you only need to download the .wib files. The instructions below note where this alternative applies.
Download files from hgdownload
If you want to use the UCSC public MySQL server, you only need to download the .wib files. You do not need to download the database .txt.gz files.
The ".wig" files to use for this are actually the database table dumps available from the hgdownload system. Fetch the files you need to use from hgdownload. For example, the gc5Base track on the Stickleback organism:
Fetch the ".wig" file from the database dump:
rsync -aP rsync://hgdownload.soe.ucsc.edu/goldenPath/gasAcu1/database/gc5Base.txt.gz .
You also need the compressed data values, which are in the ".wib" file on the gbdb filesystem:
rsync -aP rsync://hgdownload.soe.ucsc.edu/gbdb/gasAcu1/wib/gc5Base.wib .
Place these files together in the same directory. The gc5Base.txt.gz file, once uncompressed, is the so-called ".wig" file. Uncompress it and give it a ".wig" name with a symbolic link:
gunzip gc5Base.txt.gz
ln -s gc5Base.txt gc5Base.wig
The resulting files appear as:
$ ls -ogrt gc5Base*
lrwxrwxrwx 1       11 May 25 09:19 gc5Base.wig -> gc5Base.txt
-rw-rw-r-- 1  9869820 May 25 09:36 gc5Base.txt
-rw-rw-r-- 1 90820835 May 25 09:37 gc5Base.wib
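The uncompress-and-link step can be wrapped in a small shell function so that it is repeatable for other tracks. This is only a sketch: the function name make_wig_link is hypothetical, and it assumes the dump and the .wib file share the track name, as they do for gc5Base.

```shell
# Hypothetical helper: expose a table dump as TRACK.wig next to TRACK.wib.
# Usage: make_wig_link TRACK   (run in the directory holding the downloads)
make_wig_link() {
    track=$1
    # Uncompress the dump if the compressed file is present.
    if [ -f "$track.txt.gz" ]; then
        gunzip "$track.txt.gz" || return 1
    fi
    if [ ! -f "$track.txt" ]; then
        echo "missing $track.txt" >&2
        return 1
    fi
    # Create (or refresh) the relative symlink TRACK.wig -> TRACK.txt.
    ln -sf "$track.txt" "$track.wig" || return 1
    # Warn, but do not fail, when the data file has not been fetched yet.
    [ -f "$track.wib" ] || echo "warning: $track.wib not found" >&2
    return 0
}
```

For the example above, the call would be: make_wig_link gc5Base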
The hgWiggle command
Then run hgWiggle on the local files. For example, to get statistics on chrI:
$ hgWiggle -chr=chrI -doStats gc5Base
looking for: gc5Base.wig
#       from file, Table: gc5Base
# Chrom Data   Data      # Data   Data  Bases     Minimum  Maximum  Range  Mean     Variance  Standard
#       start  end       values   span  covered                                               deviation
chrI    1      28185910  5512103  5     27560515  0        100      100    44.4915  533.509   23.0978
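When only one figure from the statistics is needed, the output can be filtered with awk. The sketch below assumes the column order shown in the example output above, where the mean is the tenth field. The function name stats_mean is hypothetical.

```shell
# Hypothetical filter: read "hgWiggle -doStats" output on stdin and print
# the chromosome and the mean, one pair per line.
# Assumes 12 whitespace-separated fields per data line, mean in field 10.
# Comment lines (starting with "#") and other short lines are skipped.
stats_mean() {
    awk '!/^#/ && NF == 12 { print $1, $10 }'
}
```

A possible use is: hgWiggle -chr=chrI -doStats gc5Base | stats_mean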
Using the UCSC public MySQL server
To operate the hgWiggle command using the public MySQL server, place the following four lines into a special file in your home directory named .hg.conf and set its permissions to 600: chmod 600 .hg.conf
db.host=genome-mysql.soe.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
The password is literally the word "password"; it is not a secret.
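One way to create this file without touching an existing configuration is sketched below. The function name write_hg_conf is hypothetical. It writes the four settings listed above to the path it is given, refuses to overwrite an existing file, and sets the permissions to 600.

```shell
# Hypothetical helper: write the public-server settings to the given path.
# Usage: write_hg_conf PATH
write_hg_conf() {
    conf=$1
    # Never clobber a configuration that is already there.
    if [ -e "$conf" ]; then
        echo "$conf already exists; not overwriting" >&2
        return 1
    fi
    cat > "$conf" <<'EOF' || return 1
db.host=genome-mysql.soe.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
EOF
    # Restrict the file to its owner, as described above.
    chmod 600 "$conf"
}
```

For the home directory the call would be: write_hg_conf "$HOME/.hg.conf"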
With this file in place, and the .wib file present in the directory you want to work in, use the hgWiggle command with the -db argument:
hgWiggle -db=gasAcu1 -chr=chrI -doStats gc5Base
What is special about this process
The database dump file is slightly different from an actual ".wig" file. It has an extra "bin" column at the beginning, which the hgWiggle command ignores. The "file" column of the dump holds a fully qualified file name, /gbdb/gasAcu1/wib/gc5Base.wib. The hgWiggle command ignores the directory part of this name and finds the gc5Base.wib file in the current directory.
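To see these two columns in a dump, the first record can be printed with awk. This is a sketch under an assumption about the table layout: with the leading bin column, the fields are taken to be bin, chrom, chromStart, chromEnd, name, span, count, offset, file, and so on, which makes the file path the ninth tab-separated field. The function name show_wig_record is hypothetical.

```shell
# Hypothetical helper: print the bin, chrom and file columns of the first
# record in a table dump.
# Assumption: tab-separated fields, bin in field 1, chrom in field 2,
# file path in field 9.
# Usage: show_wig_record DUMP
show_wig_record() {
    awk -F '\t' 'NR == 1 { print "bin=" $1; print "chrom=" $2; print "file=" $9 }' "$1"
}
```

For the example above the call would be: show_wig_record gc5Base.txt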
Using hgWiggle options
To get statistics on a set of genomic regions, create a BED file containing the regions (chrom, chromStart, chromEnd), and supply this to hgWiggle, using the -bedFile option.
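A BED file for this purpose needs only the three tab-separated columns named above. The sketch below writes a small one. The two chrM intervals are hypothetical placeholders, not regions of any particular interest, and the function name write_example_bed is likewise hypothetical.

```shell
# Hypothetical helper: write a minimal three-column, tab-separated BED file
# (chrom, chromStart, chromEnd) to the given path.
# Usage: write_example_bed PATH
write_example_bed() {
    printf 'chrM\t0\t1000\n'    >  "$1" || return 1
    printf 'chrM\t5000\t6000\n' >> "$1"
}
```

After write_example_bed regions.bed, the file is passed to hgWiggle as -bedFile=regions.bed.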
To the .hg.conf file located in your home directory add the following lines:
gbdbLoc1=/path/to/relevant/wib/file
gbdbLoc2=http://hgdownload.soe.ucsc.edu/gbdb/
This requires creating a directory for the file path under the directory you point to. For example, to work with the phastCons46way track in the hg19 database, take the following steps:
1. Add a line to .hg.conf to point to where you are working
gbdbLoc1=/home/usrName/work/
2. Obtain the wib file
rsync -aP rsync://hgdownload.soe.ucsc.edu/gbdb/hg19/multiz46way/phastCons46way.wib .
3. In the directory you pointed to, create the subdirectory for the wib file
mkdir -p hg19/multiz46way/
4. Move the file to that location.
mv phastCons46way.wib hg19/multiz46way/
5. You can now perform operations like
hgWiggle -db=hg19 -chr=chrM phastCons46way
hgWiggle -db=hg19 -bedFile=bedFile phastCons46way
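Steps 3 and 4 can be combined into one small function, sketched here. The function name stage_wib and its argument order are hypothetical. It assumes the local directory should mirror the database and subdirectory names used on hgdownload, as in the steps above.

```shell
# Hypothetical helper: move a downloaded .wib file into ROOT/DB/SUBDIR,
# creating the directories as needed, and print the final location.
# Usage: stage_wib ROOT DB SUBDIR FILE
stage_wib() {
    root=$1 db=$2 subdir=$3 file=$4
    # Drop one trailing slash from ROOT so the printed path has no "//".
    dest="${root%/}/$db/$subdir"
    mkdir -p "$dest" || return 1
    mv "$file" "$dest/" || return 1
    echo "$dest/$(basename "$file")"
}
```

For the example above the call would be: stage_wib /home/usrName/work hg19 multiz46way phastCons46way.wib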
Multiple .wib files
Some older assembly databases have per-chromosome .wib files in the gbdb wib directory. In this case, download each of those files for your chromosome of interest. The process described here will work in the same manner.
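To find out which .wib files a chromosome needs, the file column of the table dump can be consulted. The sketch below rests on the same layout assumption as before: a dump with a leading bin column, chrom in the second tab-separated field and the file path in the ninth. The function name wib_files_for_chrom is hypothetical.

```shell
# Hypothetical helper: list the distinct .wib file names that a table dump
# refers to for one chromosome (directory part of the path removed).
# Assumption: chrom is field 2 and the file path is field 9.
# Usage: wib_files_for_chrom DUMP CHROM
wib_files_for_chrom() {
    awk -F '\t' -v chrom="$2" \
        '$2 == chrom { n = split($9, parts, "/"); print parts[n] }' "$1" | sort -u
}
```

Each name it prints can then be fetched with the same rsync form used earlier.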