Genbank updates: Difference between revisions

From Genecats
(New page: To disable genbank updates to an assembly: In addition to removing the assembly name from the hgwbeta.dbs and rr.dbs files, the files here need to be removed: /cluster/data/genbank/data...)
 
(Removed instructions to delete from /cluster/data/genbank/data/ftp (per Braney). Made the instructions generic for enabling and disabling.)
Line 1:
- To disable genbank updates to an assembly:
+ To enable/disable genbank updates to an assembly for hgwbeta and the RR:

- In addition to removing the assembly name from the hgwbeta.dbs and rr.dbs files, the files here need to be removed:
+     -> in the source tree, add/remove the assembly name from these files:
- /cluster/data/genbank/data/ftp/
-
- Otherwise the genbank files will keep reappearing.
-
- So, to disable a genome:
-
-     -> on hgwdev remove the assembly name from these files:
             ~/kent/src/hg/makeDb/genbank/etc/hgwbeta.dbs
             ~/kent/src/hg/makeDb/genbank/etc/rr.dbs
-        and commit.  Please remove the name, don't comment it out, git keeps the file edit history.
+        and commit.  (Don't comment out names; remove them.  Git keeps the file edit history.)
       -> ssh hgwbeta
       -> cd ~/kent/src/hg/makeDb/genbank
       -> git pull
       -> make etc-update-rr
-     -> cd to /cluster/data/genbank/data/ftp/${assembly}
+
-     -> remove /cluster/data/genbank/data/ftp/${assembly}
+ If you are disabling updates and you also want to remove the downloads files that are updated by the GenBank process (in the bigZips and multiz* directories on hgdownload), the genbank person will need to remove this assembly from the /cluster/data/genbank/data/ftp directory.  The genbank person can also drop the files from hgdownload.
-     -> remove files on hgdownload by sending a push request to drop this directory and its contents:
-           /usr/local/apache/htdocs/goldenPath/${assembly}

  ==Some extra notes about Genbank tables==

- The current list of Genbank tables (curated by Mark Diekhans) is located at hgwdev:/cluster/data/genbank/etc/genbank.tbls (also located at hgwbeta:/genbank/etc/genbank.tbls). All tables in the list up to 'gbLoaded' must exist; those after 'gbLoaded' are optional. To get a list of those tables included in a database (using hg18 as an example), do:
+ The current list of Genbank tables is located at hgwdev:/cluster/data/genbank/etc/genbank.tbls (also located at hgwbeta:/genbank/etc/genbank.tbls). All tables in the list up to 'gbLoaded' must exist; those after 'gbLoaded' are optional. To get a list of those tables included in a database (using hg18 as an example), do:

     hgsql -N -e 'SHOW TABLES' hg18 | egrep -f /cluster/data/genbank/etc/genbank.tbls  (hgwdev)

Revision as of 22:33, 2 July 2012

To enable/disable genbank updates to an assembly for hgwbeta and the RR:

   -> in the source tree, add/remove the assembly name from these files: 
          ~/kent/src/hg/makeDb/genbank/etc/hgwbeta.dbs
          ~/kent/src/hg/makeDb/genbank/etc/rr.dbs
      and commit.  (Don't comment out names; remove them.  Git keeps the file edit history.)
   -> ssh hgwbeta
   -> cd ~/kent/src/hg/makeDb/genbank
   -> git pull
   -> make etc-update-rr
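
The file edits in the first step above can be scripted. The following is only a minimal sketch, not part of the kent tree: `remove_assembly` and `add_assembly` are hypothetical helper names, and the code assumes each .dbs file lists one assembly name per line with no trailing whitespace or comments (this page does not describe the file format). It also assumes that appending a new name at the end of the file is acceptable.

```shell
# Hypothetical helpers; assume one assembly name per line in each .dbs file.

# Remove every line that is exactly the assembly name from the file.
remove_assembly() (
    assembly="$1"; file="$2"
    tmp="$(mktemp)" || return 1
    rc=0
    # -x: whole-line match, -F: literal string, -v: keep non-matching lines.
    grep -vxF -- "$assembly" "$file" > "$tmp" || rc=$?
    # grep exits 1 when it keeps no lines (file becomes empty); >1 is a real error.
    if [ "$rc" -gt 1 ]; then rm -f "$tmp"; return "$rc"; fi
    cat "$tmp" > "$file"
    rm -f "$tmp"
)

# Append the assembly name unless an identical line is already present.
add_assembly() (
    assembly="$1"; file="$2"
    grep -qxF -- "$assembly" "$file" || printf '%s\n' "$assembly" >> "$file"
)

# Example use (hg18 is only an illustration), followed by the steps above:
#   remove_assembly hg18 ~/kent/src/hg/makeDb/genbank/etc/hgwbeta.dbs
#   remove_assembly hg18 ~/kent/src/hg/makeDb/genbank/etc/rr.dbs
#   then commit, ssh hgwbeta, git pull, and make etc-update-rr as listed.
```

Because the match is on the whole line, removing hg1 leaves hg18 and hg19 untouched. Review the result with git diff before committing.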

If you are disabling updates and also want to remove the download files that the GenBank process updates (in the bigZips and multiz* directories on hgdownload), the genbank person will need to remove this assembly from the /cluster/data/genbank/data/ftp directory. The genbank person can also drop the files from hgdownload.

Some extra notes about Genbank tables

The current list of Genbank tables is located at hgwdev:/cluster/data/genbank/etc/genbank.tbls (also located at hgwbeta:/genbank/etc/genbank.tbls). All tables in the list up to 'gbLoaded' must exist; those after 'gbLoaded' are optional. To list which of those tables exist in a database (using hg18 as an example), run:

 hgsql -N -e 'SHOW TABLES' hg18 | egrep -f /cluster/data/genbank/etc/genbank.tbls  # on hgwdev
 hgsql -N -e 'SHOW TABLES' hg18 | egrep -f /genbank/etc/genbank.tbls  # on hgwbeta

The two tables 'gbCdnaInfo' and 'gbStatus' are the main tables and should contain all entries for a database.
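
The rule that every table listed up to 'gbLoaded' must exist can be checked mechanically. The sketch below is a hedged illustration, not an existing tool: `missing_required` is a hypothetical name, and the code assumes genbank.tbls holds one egrep pattern per line (consistent with its use with egrep -f above), that blank lines and lines starting with '#' can be skipped, and that the 'gbLoaded' line itself counts as required.

```shell
# Hypothetical helper. Prints each required pattern that matches no table name.
#   $1: pattern file (e.g. genbank.tbls), one egrep pattern per line (assumed)
#   $2: file with one table name per line (e.g. saved SHOW TABLES output)
missing_required() (
    tbls="$1"; tables="$2"
    # sed prints lines up to and including the first one containing
    # 'gbLoaded', then quits, so optional tables are never checked.
    sed '/gbLoaded/q' "$tbls" | while IFS= read -r pat; do
        # Skip blank lines and comment lines (an assumption about the format).
        case "$pat" in ''|'#'*) continue ;; esac
        grep -Eq -- "$pat" "$tables" || printf '%s\n' "$pat"
    done
)

# Example use on hgwdev (hg18 is only an illustration):
#   hgsql -N -e 'SHOW TABLES' hg18 > /tmp/hg18.tables
#   missing_required /cluster/data/genbank/etc/genbank.tbls /tmp/hg18.tables
```

No output means every required pattern matched at least one table. Any pattern printed identifies a required table that appears to be absent from the database.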