Download All Genomes

Latest revision as of 07:33, 1 September 2018

Sometimes one wants to run BLAT or BLAST against all genomes. In that case, the genome sequences first have to be downloaded to the local machine.

The easiest way is the following bash shell command:

# List the active assemblies on the public UCSC MySQL server, then fetch each 2bit file.
mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -N \
      -e "select name from dbDb where active=1;" hgcentral | while read -r D
do
    rsync -a --progress \
        "rsync://hgdownload.soe.ucsc.edu/gbdb/${D}/${D}.2bit" "./${D}.2bit"
done

For some assemblies the 2bit file is in the nib subdirectory instead. For those, the following rsync command inside the loop is more appropriate:

    rsync -a --progress \
        "rsync://hgdownload.soe.ucsc.edu/gbdb/${D}/nib/${D}.2bit" "./${D}.2bit"
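The two locations can be tried in turn from a single helper. What follows is a minimal sketch rather than a tested recipe: the function name fetch_2bit is hypothetical, and it assumes that rsync exits with a nonzero status when the requested file does not exist on the server.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of any UCSC tool): try the top-level
# 2bit path first, then fall back to the nib subdirectory.
fetch_2bit() {
    local db="$1"
    local base="rsync://hgdownload.soe.ucsc.edu/gbdb/${db}"
    local url
    for url in "${base}/${db}.2bit" "${base}/nib/${db}.2bit"; do
        # Assumption: rsync returns nonzero if the remote file is missing.
        if rsync -a --progress "$url" "./${db}.2bit"; then
            return 0
        fi
    done
    echo "no 2bit file found for ${db}" >&2
    return 1
}
```

Inside the download loop, the rsync call would then be replaced by fetch_2bit "$D".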


The following solution is a bit more flexible, but in most cases unnecessarily long:

This script downloads the most current version of every genome that can be rsynced from hgdownload. It only downloads 2bit files. It is written in Python and uses the rsync program. Use the -f parameter to override the selection of genomes, and -h for help. Downloaded files go into the current directory.

File:RetrUcscGenomes.txt