Genbank updates
To disable genbank updates to an assembly:
Removing the assembly name from the hgwbeta.dbs and rr.dbs files is not enough. The assembly's files here must also be removed:
/cluster/data/genbank/data/ftp/
Otherwise the genbank files will keep reappearing.
So, to disable a genome:
-> on hgwdev remove the assembly name from these files:
~/kent/src/hg/makeDb/genbank/etc/hgwbeta.dbs
~/kent/src/hg/makeDb/genbank/etc/rr.dbs
and commit. Please remove the name rather than commenting it out; git keeps the file's edit history.
-> ssh hgwbeta
-> cd ~/kent/src/hg/makeDb/genbank
-> git pull
-> make etc-update-rr
-> cd to /cluster/data/genbank/data/ftp/${assembly}
-> remove /cluster/data/genbank/data/ftp/${assembly}
-> remove files on hgdownload by sending a push request to drop this directory and its contents:
/usr/local/apache/htdocs/goldenPath/${assembly}
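The first step above (dropping the assembly name from the two .dbs files) could be sketched as a small shell helper. This is only an illustration, not part of the genbank tooling: the function name remove_assembly is hypothetical, and it assumes each .dbs file lists one assembly name per line. It deletes the matching line outright instead of commenting it out, and leaves every other line untouched.

```shell
#!/bin/sh
# Hypothetical helper (not part of the genbank tooling): delete one
# assembly name from a .dbs list file.
# Assumption: the file lists one assembly name per line.
remove_assembly() {
    assembly=$1
    dbs_file=$2
    tmp=$(mktemp) || return 1
    # -F: treat the name literally, -x: match whole lines only,
    # -v: keep every line that does NOT match.
    grep -v -x -F -- "$assembly" "$dbs_file" > "$tmp"
    status=$?
    # grep exits 1 when no lines are left to print; only >1 is an error.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi
    # Write back in place so the file keeps its permissions.
    cat "$tmp" > "$dbs_file"
    rm -f "$tmp"
}
```

Under those assumptions it would be run once per file from ~/kent/src/hg/makeDb/genbank/etc on hgwdev, for example remove_assembly hg18 hgwbeta.dbs and remove_assembly hg18 rr.dbs, followed by reviewing the diff and committing as described above.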
Some extra notes about Genbank tables
The current list of Genbank tables (curated by Mark Diekhans) is located at hgwdev:/cluster/data/genbank/etc/genbank.tbls (also located at hgwbeta:/genbank/etc/genbank.tbls). All tables in the list up to 'gbLoaded' must exist; those after 'gbLoaded' are optional. To get a list of those tables included in a database (using hg18 as an example), do:
On hgwdev:
hgsql -N -e 'SHOW TABLES' hg18 | egrep -f /cluster/data/genbank/etc/genbank.tbls
On hgwbeta:
hgsql -N -e 'SHOW TABLES' hg18 | egrep -f /genbank/etc/genbank.tbls
The two tables 'gbCdnaInfo' and 'gbStatus' are the main tables and should contain all entries for a database.
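The required/optional split described above can also be checked in the other direction, by listing required patterns that no table in the database matches. The following is a minimal sketch, not an existing tool: the function name missing_required is hypothetical, and it assumes genbank.tbls holds one egrep pattern per line and that "up to 'gbLoaded'" includes the gbLoaded line itself.

```shell
#!/bin/sh
# Hypothetical helper: print every required pattern from a
# genbank.tbls-style file that matches none of the given table names.
# Assumptions: one egrep pattern per line; every line up to and
# including the first one mentioning gbLoaded is required.
missing_required() {
    tbls_file=$1     # pattern list, e.g. a copy of genbank.tbls
    tables_file=$2   # one table name per line, e.g. saved SHOW TABLES output
    # sed prints lines until the first gbLoaded line, then quits,
    # so the optional patterns after it are never examined.
    sed '/gbLoaded/q' "$tbls_file" | while IFS= read -r pattern; do
        [ -n "$pattern" ] || continue      # skip blank lines
        if ! grep -E -q -- "$pattern" "$tables_file"; then
            printf '%s\n' "$pattern"
        fi
    done
}
```

A possible use on hgwdev would be to save the table list with hgsql -N -e 'SHOW TABLES' hg18 > hg18.tables and then run missing_required /cluster/data/genbank/etc/genbank.tbls hg18.tables. Empty output would mean every required pattern matched at least one table.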