New Assembly Release Process Details: Difference between revisions

(adding step to check blat servers and .2bit files)
(changing file path for downloads on hgwdev to be htdocs-hgdownload)
Revision as of 21:50, 21 May 2010

Go to the Push Checklist

Stage and test on hgwbeta

  1. If you are releasing an update assembly, check to see if chromosome sizes have changed significantly.
  2. a) Output chromosome sizes from the old and new assemblies into two files
    hgwdev > hgsql -Ne "select chrom, size from chromInfo" oldDbName > oldChromSizes
    hgwdev > hgsql -Ne "select chrom, size from chromInfo" newDbName > newChromSizes
    b) Compare them side-by-side and see if the sizes have changed significantly
    hgwdev > sdiff -s oldChromSizes newChromSizes
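    As a rough complement to sdiff, the comparison can be narrowed to chromosomes whose size changed by more than some percentage. This is only a sketch: the function name chrom_size_changes and the 5% default are assumptions, not part of the existing QA scripts, and chromosomes that were dropped from the new assembly are not reported.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Lists chromosomes whose size differs by more than PERCENT between two
# "chrom<TAB>size" files such as oldChromSizes and newChromSizes.
# Usage: chrom_size_changes oldChromSizes newChromSizes [percent]
chrom_size_changes() {
    local old="$1" new="$2" pct="${3:-5}"
    awk -v pct="$pct" '
        NR == FNR { oldSize[$1] = $2; next }            # first file: remember old sizes
        !($1 in oldSize) { print $1, "new", $2; next }  # chrom only in the new assembly
        {
            diff = $2 - oldSize[$1]
            if (diff < 0) diff = -diff
            # report when |new - old| exceeds pct percent of the old size
            if (oldSize[$1] > 0 && diff * 100 > pct * oldSize[$1])
                print $1, oldSize[$1], $2
        }
    ' "$old" "$new"
}
```

    Each output line is "chrom oldSize newSize", or "chrom new size" for a chromosome that only exists in the new assembly.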
  3. Rsync database from hgwdev to hgwbeta
  4. (pruning the unneeded tables trackDb_$user and hgFindSpec_$user)
    a) Create the database on hgwbeta
    hgwbeta > hgsql
    mysql > SHOW DATABASES; (you will need to pick a random database from this list in order to create a new database)
    mysql > exit
    hgwbeta > hgsql randomDatabaseName
    mysql > CREATE DATABASE dbName;
    b) Create a list of tables to create inside the database from the tables listed in the push queue for the new assembly. There should be a table called <new_db> in the qapushq database on hgwbeta which can be used to get all of the tables at once:
    hgwbeta > hgsql -Ne "SELECT tbls FROM dbName WHERE dbs='dbName'" qapushq > tableList
    To convert spaces to newlines for the tableList:
    awk '{ for (i=1;i<=NF;i++) print $i }' infileName > outfileName
    The following tables will need to be removed from the tableList:
    hgFindSpec, trackDb, tableDescriptions
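    The conversion and the removal of the three tables can be combined in one pass. A minimal sketch, assuming the function name filter_table_list (hypothetical) and assuming that user-specific copies such as trackDb_$user and hgFindSpec_$user should be dropped as well:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Reads a whitespace-separated table list and prints one table per line,
# leaving out hgFindSpec, trackDb and tableDescriptions.
# Assumption: trackDb_<user> and hgFindSpec_<user> copies are dropped too.
# Usage: filter_table_list infileName > tableList
filter_table_list() {
    tr -s ' \t' '\n' < "$1" |
        grep -v -x -E '(hgFindSpec|trackDb)(_.*)?|tableDescriptions' |
        grep -v '^$'
}
```

    Chain/Net tables that must be left out (see below) still have to be removed by hand.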
    If your assembly has Chain/Nets to or from an assembly that is *not* on the RR (and not in the pushQ as another new assembly), you do not need to QA those Chain/Nets or push them to the RR. Remove those 3 tables (chainName, netName, and chainNameLink) from your tableList and drop those three rows from your sub-pushQ completely. To remove a row from the pushQ, go to the chain/net track entry, click lock, and then click the delete button.
    c) Push tables to hgwbeta
    hgwdev > bigPush.csh dbName tableList
    bigPush.csh reports the size of the push at the end, which should be similar to the original size on hgwdev. You can check the original sizes in the main pushQ by putting a "*" in the tables field, selecting hgwdev from "Current Location:", and then clicking the "show sizes" button.
  5. Update hgcentralbeta: dbDb, blatServers, genomeClade, gdbPdb (for KnownGenes), liftOverChain
  6. a) Create (or update) hgcentralbeta.dbDb METADATA (among other things, this adds the default position for the assembly to the gateway page on beta, so you can verify that you have done this correctly).
    Check to make sure your row doesn't already exist in hgcentralbeta:
    mysql> select * from dbDb where name = 'dbName'\G (make sure you do this from hgwbeta)
    Check to make sure the row exists on hgcentraltest (make sure to exit from mysql and go to hgwdev):
    hgwdev> hgsql -N -e "select * from dbDb where name = 'dbName'" hgcentraltest

    Note: Check to make sure that the assembly date is correct in the description column. One way to do this is to compare it to the previous assembly.

    hgwdev> hgsql -N -e "select name, description from dbDb where organism='OrganismName'; " hgcentraltest

    If the date needs to be changed, contact the developer to verify that the date is correct before changing it. The database will need to be changed along with associated documents. To change the gateway dropdown here is an example:

    update dbDb set columnName=replace(columnName,'string that is wrong','string that is right') where name = 'dbName';
    update dbDb set description=replace(description,'June 2007','March 2009') where name = 'calJac3';


    If the above looks correct (should just be for the database specified), then redirect it to a file:

    hgwdev> hgsql -N -e "select * from dbDb where name = 'calJac3'" hgcentraltest > hgcentraltest.dbDb

    Check the newly created file:

    hgwdev> cat hgcentraltest.dbDb

    Load onto hgcentralbeta:

    hgwdev> hgsql -h mysqlbeta -e "LOAD DATA LOCAL INFILE 'hgcentraltest.dbDb' INTO TABLE dbDb" hgcentralbeta

    Check from hgwdev to see if hgcentralbeta has been updated with the new row:

    hgwdev> hgsql -h mysqlbeta -e "select * from dbDb where name = 'calJac3'" hgcentralbeta
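    Before running LOAD DATA, it may be worth confirming that the dump file holds exactly one row and that it is the row for your database. A minimal sketch, assuming the function name check_single_row (hypothetical) and assuming that the database name is the first tab-separated column of the hgsql -N output:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Succeeds only if FILE contains exactly one row and the first
# tab-separated column of that row equals DB.
# Usage: check_single_row hgcentraltest.dbDb calJac3 && echo "safe to load"
check_single_row() {
    local file="$1" db="$2"
    awk -F'\t' -v db="$db" '
        $1 == db { match_count++ }
        END { exit !(NR == 1 && match_count == 1) }
    ' "$file"
}
```

    The same check could be applied to other single-row dumps; it is not suitable for blatServers or liftOverChain, which are expected to have several rows per database.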

    b) Request a blat server and create 2 lines in hgcentraltest.blatServers and hgcentralbeta.blatServers METADATA

    This may have already been done by the developer. Check whether there are lines in hgcentraltest.blatServers for the new assembly. If so, try using blat on hgwdev. If blat is already running successfully, copy the pertinent lines from hgcentraltest.blatServers to hgcentralbeta.blatServers.

    hgsql -Ne "SELECT * FROM blatServers WHERE db = 'danRer6'" hgcentraltest > blat.dev

    check if on beta, then load if not:

    hgsql -h mysqlbeta -Ne "SELECT * FROM blatServers WHERE db = 'danRer6'" hgcentralbeta
    hgsql -h mysqlbeta -e "LOAD DATA LOCAL INFILE 'blat.dev' INTO TABLE blatServers" hgcentralbeta

    If blat servers are not already running, you will need to send an email to cluster-admin asking for a blatServer to be assigned to the new assembly. They will give you the name of the blatServer and the port numbers for the isTrans and canPcr. Then you can add two new lines to the blatServers table for this information on both the hgcentraltest database (on hgwdev) and the hgcentralbeta database (on hgwbeta). If this is a '2' or more assembly (e.g. droVir2), you will want to leave the entries for the previous assemblies in the table (e.g. droVir1). Do not just update the '1' entry to a '2' entry (that is, there will still be a blat server for the '1' entry until it is archived).

    Remember to test BLAT and isPCR for the new assembly!

    c) Also move data for hgcentralbeta/genomeClade (select * from genomeClade where genome='dbName';), hgcentralbeta/gdbPdb (for Known Genes), etc. as needed using same methods.

    If this is not the first assembly for an organism, genomeClade will already be fine.

    d) Pushing LiftOvers: Do not push liftOvers to assemblies that have not been released yet. Do push liftOver files (the *over.chain* files) to released assemblies. To find the other-organism liftover files, cd to /gbdb on hgwdev and use this command: ls -d */liftOver/*<new_db>* . Make sure that you're not moving in liftOver lines to assemblies that aren't on the RR - check the push Q. Also make sure that there aren't lines in liftOverChain that should be in the pushQ but aren't (e.g. lift ups from old orgs, etc). Email the sponsor and ask them to add them to the pushQ if necessary.

    hgsql -Ne "SELECT * FROM liftOverChain WHERE fromDb = 'danRer6' OR toDb = 'danRer6'" \
    hgcentraltest > chain.dev

    check beta, load if not present and recheck:

    hgsql -h mysqlbeta -Ne "SELECT * FROM liftOverChain WHERE fromDb = 'danRer6' OR toDb = 'danRer6'" hgcentralbeta
    hgsql -h mysqlbeta -e "LOAD DATA LOCAL INFILE 'chain.dev' INTO TABLE liftOverChain" hgcentralbeta

    NOTE: you can use the script checkMetaData.csh to make sure that all of the metadata is the same on hgwdev and on hgwbeta. Run this script in a temporary folder because it creates several files.

    NOTE: Do not change the value for defaultDb (leave it set to the previous assembly for this organism) until you are ready for the final push to the RR. (But if this is the first assembly for an organism, you will need the defaultDb entry in order for the assembly to appear on hgwbeta.)

  7. Push all of /gbdb/$db including html/description.html
  8. Send email to push-request@soe.ucsc.edu and ask the pushers to
    1. add the new assembly to the mirror exclude list (to the gbdb and mysql rsync download targets) at hgdownload:/opt/csw/etc/rsyncd.conf and
    2. Extract all of the gbdb files from the pushQ for your org and those for the other orgs as well.
    3. ask for a push of the list of /gbdb files above from hgwdev to hgnfs1
    Remind the pushers that items that are symlinked on hgwdev should become real files on hgnfs1. To see how big these files are (so you can check to see if the mirrors should be warned) use this command in the hgwdev:/gbdb directory: du -hscL `ls -d */liftOver/*CalJac1*` .
  9. Generate trackDb in strict mode
  10. Remake the trackDb with strict to clean up the stuff from hgwdev. Will likely need to be done again as track descriptions are updated.
    hgwbeta> cd kent/src/hg/makeDb/trackDb
    hgwbeta> make strict DBS=<new_db>
  11. The pesky image file
  12. The image file that appears on the gateway page should reside in the browser CVS tree in:
    ~/browser/images/<image_name>
    and a copy should exist at:
    hgwdev:/usr/local/apache/htdocs/images/<image_name>
    If there is a previous assembly, it is possible that it is using the same image on the gateway page. Check on hgwbeta to see if the image is missing. If it isn't, you don't need to ask for the image to be pushed. To get the image to appear on hgwbeta and the RR, ask for a push of the file from hgwdev:/usr/local/apache/htdocs/images/<image_name> to hgwbeta and the RR. It's a good idea to ask for the push of the image to the RR during the staging process, as you will inevitably forget to push it when it's time to release the assembly. If there are any other images for this assembly (for instance, the phylo image that goes with the Conservation track), you can push them, too.
  13. GenBank updates
  14. a) To turn on GenBank updates on hgwbeta (do this before 4:30 p.m., when the daily updates start):
    add new assembly to list here:
    hgwdev:~/kent/src/hg/makeDb/genbank/etc/hgwbeta.dbs
    and commit the change
    ssh hgwbeta
    cd /genbank/etc
    cvs up -dP
    Note: the new assembly should already be listed in the files align.dbs and hgwdev.dbs. If it is not, check with Mark Diekhans.
    b) To see whether updates have run (at least a day after the *.dbs files were updated), check the update times of the table 'gbLoaded':
    hgwdev> updateTimes.csh <new_db> gbLoaded
    The update times will be out of sync between machines, but not by more than 24 hours or so if updates are running. The gbLoaded table will be updated regardless of whether changes to other GenBank tables were picked up. More genbank update instructions are available here.
  15. Check the .nib/.2bit files
    1. File exists on /gbdb/dbName and is of reasonable size (not "0")
    2. Data is real, not a symlink for hgnfs1 (hgwbeta, Round Robin). (OK if hgwdev is a symlink)
    3. If nib, should be one per chrom in nib subdirectory
    4. If 2bit, should be one file and not in any subdirectory
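    The first two checks can be scripted. A minimal sketch, assuming the function name check_seq_file (hypothetical); pass "yes" as the second argument on machines where the file must be real data rather than a symlink:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Checks that a .2bit or .nib file exists, is not empty and, optionally,
# is a real file rather than a symlink.
# Usage: check_seq_file /gbdb/dbName/dbName.2bit [yes|no]
check_seq_file() {
    local file="$1" require_real="${2:-no}"
    if [ ! -s "$file" ]; then              # -s: exists and size > 0
        echo "missing or empty: $file"
        return 1
    fi
    if [ "$require_real" = "yes" ] && [ -L "$file" ]; then
        echo "symlink, expected real data: $file"
        return 1
    fi
    echo "ok: $file"
}
```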
  16. Check default position and default tracks
  17. From the gateway page, press 'Click here to reset' (your browser). Go back to your assembly, then press 'submit'. You will be taken to the default position for your assembly. Make sure that the resulting area is scientifically interesting and aesthetically pleasing! You can edit the default location here: hgcentralbeta.dbDb.defaultPos and the default tracks here: /kent/src/hg/makeDb/trackDb/<new_db>/trackDb.ra.
  18. Review orderKey values
  19. Check to see if your new genome is in the right place evolutionarily. Find a map or get help if unclear about proper location.
    hgwbeta>hgsql hgcentralbeta
    mysql>select name, orderKey from dbDb order by orderKey;
  20. Check all sample queries on hgGateway page
  21. From the gateway page, check all of the sample queries listed in the assembly details. Remember to refresh cart after each one! (note from brooke: Is a cart refresh really necessary? If not, let's take this instruction out...)
  22. Run joinerCheck
  23. a) Check that common keys between tables are in sync:
    hgwbeta> cd ~/kent/src/hg/makeDb/schema
    hgwbeta> joinerCheck -database=<new_db> -keys all.joiner
    (joinerCheck can be run from beta or dev). Additional options:
    • for whole DB, single key: -identifier="actual_key"
    • for single track: run Bob's script from hgwdev: runjoiner.csh <database>
    b) Check that all tables in this database are mentioned in all.joiner:
      hgwbeta> cd ~/kent/src/hg/makeDb/schema
      hgwbeta>joinerCheck -database=<new_db> -tableCoverage all.joiner
      If not all of the tables are listed, email the developer and ask them to add those tables to the tablesIgnored $dbName. According to Hiram, it is probably OK for us to edit all.joiner ourselves.
    • Check indices (from pushQ)
      You can either do it the easy way: using pushQ, click on "show sizes button" or the sql way:
      mysql> show index from <table_name>;
      note: sizes are included in output
    • Verify makedoc (from pushQ)
      Find the make file for your target dataset and check inside that the tables listed in section “Tables” (in PushQ content, beware that may not be complete!) are included. Remember to update your cvstree before you start anything! Things that probably won't be in the makedoc explicitly are the supporting tables, genbank tables, assembly, and gap. According to Hiram: gc5Base is created by makeGenomeDb.pl and therefore is not in the make doc and nestedRepeats is created by the RepeatMasker script and is therefore not in the make doc.
      hgwdev> /cluster/home/<uid>/checkout/kent/src/hg/makeDb/doc/<new_db.txt>
      If everything is there, be sure to click on “Y” in pushQ next to MakeDoc Verified.
    • Run featureBits
      (Editorial questions from Brooke: Is this really necessary at this stage? Don't we run featureBits a lot as part of individual track testing?)
      • Run code alone on some tables and then between tables. Expected results would be that gap does not have much overlap with annotation tables like genscan.
      hgwdev> featureBits <database> <table> <table>
      
    • Check to make sure that none of the table names have underscores(_).
      There are some older tables that have underscores (e.g. all_est) -- these are OK. What is definitely *not* OK is for split tables (tables that start with chr) to have more than one underscore in their name. The MySQL "%" (percent) wildcard matches any number of characters, and "_" (underscore) matches a single character. To find table names that include underscores, the underscore symbol must be escaped with a backslash, like this (the second query would find table names with two underscores):
            	mysql> show tables like "%\_%";
            	+--------------------------+
            	| Tables_in_calJac1 (%\_%) |
            	+--------------------------+
            	| all_est                  |
            	| all_mrna                 |
            	+--------------------------+
            	2 rows in set (0.00 sec)
      
      mysql> show tables like "%\_%\_%";
      Empty set (0.00 sec)
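      The same rule can be applied outside MySQL to a plain list of table names (for example the tableList file built earlier). A minimal sketch, assuming the function name split_tables_with_extra_underscores (hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Prints split tables (names starting with "chr") that contain more than
# one underscore. Exit status is 1 when no such table is found.
# Usage: split_tables_with_extra_underscores tableList
split_tables_with_extra_underscores() {
    grep -E '^chr[^_]*_.*_' "$1"
}
```

      Older tables such as all_est are not reported because they do not start with chr.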
    • Push net and chain table in other organisms that point to this new one (if any). This will involve these tables:
      otherOrg.(chrN_)chainYourOrg
      otherOrg.(chrN_)chainLinkYourOrg
      otherOrg.netYourOrg
      To do this you can go to the pushQ entries for each track. Create a file listing the 3 tables and use bigPush.csh. Verify that the assembly is on the RR before pushing the chain/net tables to it. To do this, use the script getAssemblies.csh with gbCdnaInfo as the table name.
      ie. hgwdev >bigPush.csh dbName tableListYourOrg
      After pushing the tables you will need to make beta in trackDB on hgwbeta. (do not push the trackDb or hgFindSpec for these yet. Before releasing to the RR you will make name changes to trackDb.chainNet.ra)
    • Make sure that there is a liftOver file from the previous assembly to this assembly. This is the number one request after a new release. These files are located here:
      /gbdb/[from database]/liftOver/[from database]To[to database].over.chain.gz
    • If the new assembly is an update to the human, mouse, rat, zebrafish, D. melanogaster, C. elegans, or S. cerevisiae genomes, make sure that the appropriate *blastTab tables to this assembly are built.
    • Review all tracks in the sub pushQ as usual!
    • Check that all of the MySQL tables are in good repair:
      hgwbeta> sudo dbCheck.sh $db
      This will do a myisamchk on all tables (files) in that $db and repair any that need repairing (noted in the output by the words "REPAIR needed").
    • Check that the .2bit files in /gbdb/<db>/ and /usr/local/apache/htdocs-hgdownload/goldenPath/<db>/bigZips/ and on the blat server (/scratch/<db>) are the same using md5sum. Get the blat server from hgcentral and ssh into the machine:
      hgwdev> ssh qateam@blat#.cse.ucsc.edu
      This will let you on to the blat machine after which you can look in /scratch/<db> to see the .2bit file. If it is not the same as the other .2bit files ask the pushers to restart the assembly and to pull the newest .2bit file from /gbdb.
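      For the copies that are visible from one machine, the md5sum comparison can be wrapped in a small function. A minimal sketch, assuming GNU md5sum and the function name same_md5 (hypothetical); the copy on the blat server still has to be checksummed on that machine and compared by eye:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Succeeds when all given files have the same md5 checksum.
# Returns 2 if any file is not readable, 1 if the checksums differ.
# Usage: same_md5 /gbdb/<db>/<db>.2bit \
#            /usr/local/apache/htdocs-hgdownload/goldenPath/<db>/bigZips/<db>.2bit
same_md5() {
    local f distinct
    for f in "$@"; do
        [ -r "$f" ] || { echo "cannot read: $f" >&2; return 2; }
    done
    # count the distinct checksums across all files
    distinct=$(md5sum "$@" | awk '{ print $1 }' | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}
```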
    • Review the gateway page last (description.html) after Donna has had a chance to edit it.

      Push Data to Round Robin from hgwbeta

      • Make sure nothing needs to be repushed from hgwdev to hgwbeta (you can use hgwdev> updateTimesDb.csh to compare table update times between hgwdev and hgwbeta)
      • If you are going to repush any genbank tables (see list of genbank tables here), you must push ALL genbank tables together (not just some)
      • Send a warning email to genome-mirror (mirror site managers) 24 hours before you release, to let them know you are about to dump a bunch of data on them. To find out how much data there is:
            Size of entire assembly database:
            hgwbeta> cd /tmp
            hgwbeta> dbSnoop -unsplit $db $db.dbSnoop
            hgwbeta> head $db.dbSnoop
      
            Size of entire assembly gbdb:
            hgwbeta> cd /gbdb
            hgwbeta> du -hsc $db
      
      • Adjust the release log: Compile a list of the tracks being released on this assembly and paste it into the release log box of the main pushQ entry for the initial release of the assembly. You can fetch the list from the assembly pushQ (Note that for the genbank tracks you will need to get the names manually.):
                 ssh hgwbeta
                 hgsql -Ne "SELECT track from $db" qapushq > releaselog
      
      • Request rsync of entire database from push-request
      • rsync /gbdb again as necessary. Remind the pushers to remove this assembly from the mirror's "exclude" list (so that the mirror sites can now rsync the /gbdb for this assembly). Make sure the mirrors know that the assembly will now be removed from the "exclude" list. (The htmlPath field in hgcentral.dbDb points to /gbdb/$db/html/description.html to make the Gateway page text.)
      • Push chains/nets for other species: if there are chains and nets to other species, make sure that the tables in the other databases are pushed, along with the trackDbs.

      Update hgcentral

      Copy entries from hgcentralbeta (on hgwbeta) to hgcentral (on genome-centdb). You can use hgwdev> checkMetaData.csh to compare the metadata tables on any two machines (e.g. checkMetaData.csh <db_name> hgwbeta hgw1). This script will produce several files. You can then edit the file for each table (remove the column header line) then load them into the hgcentral database.

      • You can log into genome-centdb and view the hgcentral database like so:
                  hgwdev> hgsql -h genome-centdb
                  mysql> USE hgcentral;
      
      • Or, you can load the edited output file from the checkMetaData.csh script directly into hgcentral by doing the following:
        hgwdev> hgsql -h genome-centdb -e 'LOAD DATA LOCAL INFILE "'dbDb.$db.common'" 
                     INTO TABLE dbDb' hgcentral
         
      
      • dbDb
        • with active column set to 0 (don't set active = 1 until you are ready for the assembly to go live on the RR).
        • hgNearOk = 1 is still OK for older assembly of same organism.
        • edit orderKey if necessary to reflect the order as listed for this assembly on hgcentralbeta. Note: the orderKey information may be overridden by some of the CGIs, so it is not always apparent that the orderKey needs to be changed. One good place to check the order in the Browser is the drop-down menus on the PCR page (hgPcr).
        • blatServers
        • genomeClade (this only needs to be edited if this is a “1” assembly – first assembly for this organism)
        • make gdbPdb entry to point to proteins database, if not default (for Known Genes)
        • set liftOverChain. Do not change hgcentral.liftOverChain for assemblies on the RR going to the new db until the new db is active on the RR. Prepare a file to load into hgcentral immediately after setting active = 1.


      Enable Assembly on Round Robin

      • Test the assembly tracks, BLAT, PCR, etc. by forcing db=$org and position= into the hgTracks URL (e.g. view an older assembly, then edit the URL so that you are actually viewing your new assembly).
      • When you know that everything is working, set the assembly to active:
            hgwdev> hgsql -h genome-centdb
            mysql> USE hgcentral;
            mysql> UPDATE dbDb SET active = 1 WHERE name = "$db";
      
      • defaultDb (set your assembly as the default assembly for this organism). You can do this for hgcentraltest and hgcentralbeta now too.

      Push Downloads from hgwdev to hgdownload

      • These don't need to be pushed to hgwbeta or Round Robin – just straight from hgwdev to hgdownload.
      • Note that they are being pushed from hgwdev: /usr/local/apache/htdocs-hgdownload/goldenPath/$db/bigZips to hgdownload: /usr/local/apache/htdocs/goldenPath/$db/bigZips. Be sure to specify this in your push request
      • Make sure that the permissions for these two directories are group protein writable (at least chmod 664). The developer who created this assembly will probably be the owner of the directory and the files in it; you may need to ask him/her to change the permissions. Ask the pushers to be sure to keep the permissions as they are when they push the files (especially making sure that they are group protein writable).
            hgwdev> /usr/local/apache/htdocs-hgdownload/goldenPath/$db/bigZips
            hgwdev> /usr/local/apache/htdocs-hgdownload/goldenPath/$db/database
      

      The easiest way to ensure that the directory is group protein writable is to ask for a push of an empty directory with the appropriate permissions before you fill it with stuff. Another possibility is if you want to push a directory with stuff it helps if they can push the whole directory and all of its contents and not just certain items in the directory. For example, if you ask to push:

            	/usr/local/apache/htdocs-hgdownload/goldenPath/foo/foobar.gz
            	/usr/local/apache/htdocs-hgdownload/goldenPath/foo/foobaz.gz
            	/usr/local/apache/htdocs-hgdownload/goldenPath/foo/barbaz.gz
      

      where /usr/local/apache/htdocs-hgdownload/goldenPath/foo is a new directory and contains only those three files, then you should instead just ask to push the directory. If you tell them to push /usr/local/apache/htdocs-hgdownload/goldenPath/foo/*, they strip off the /* for this very reason.
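      Before writing the push request, one way to spot permission problems is to list everything under a download directory that is not group writable. A minimal sketch, assuming the function name list_not_group_writable (hypothetical); it checks only the group write bit, not group ownership:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing kent/QA script).
# Prints every file and directory under DIR (including DIR itself)
# that does not have the group write bit set.
# Usage: list_not_group_writable /usr/local/apache/htdocs-hgdownload/goldenPath/$db/bigZips
list_not_group_writable() {
    # -perm -020 is true when the group write bit (020) is set
    find "$1" ! -perm -020 -print
}
```

      Empty output means every entry is group writable; anything listed needs a chmod by its owner.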

      • Before requesting this push, check the md5sum: /usr/local/apache/htdocs-hgdownload/goldenPath/$db/bigZips/md5sums.txt (for each file – double-check download against size).
      • /usr/local/apache/htdocs-hgdownload/goldenPath/$db/"*"

      Check that we have READMEs at top level (optional), and for bigZips, chromosomes, liftOvers and comparatives (multiz, phastCons, vsXXX). And read them!

      (.../$db/database will be empty except for README.txt -- Admins fill this on Round Robin (not on hgwdev or hgwbeta) by autodump after push. Send request to cluster-admin.)

      In bigZips, look for upstream*.zip and check that they unzip into the same number of records. Note that some of the files mentioned in the README are generated by the Genbank process.
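Two of the bigZips checks above lend themselves to a quick script. This sketch makes two assumptions that the text does not confirm: that the checksum file is in the standard "HASH  FILENAME" format written by md5sum, and that a "record" in an upstream file is a FASTA entry (a line starting with ">"). The function names are hypothetical.

```shell
# Verify every file listed in a checksum file inside a directory.
# The list name defaults to md5sums.txt; pass a second argument to override.
# Exit status is nonzero if any file is missing or does not match.
verify_md5sums() (
    cd "$1" && md5sum -c "${2:-md5sums.txt}"
)

# Count FASTA records (lines starting with ">") read from standard input.
# grep exits nonzero on zero matches, so "|| true" keeps the status clean.
count_fasta_records() {
    grep -c '^>' || true
}
```

A possible use: run verify_md5sums on the bigZips directory, then pipe each archive through the counter, e.g. unzip -p upstream1000.zip | count_fasta_records (the file name here is only illustrative), and compare the counts across the upstream files.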

      Look for liftOver files in other assemblies of the same organism. For example, from /usr/local/apache/htdocs-hgdownload/goldenPath/, try: find . -name "*ToHg17*" OR try: ls */liftOver/*ToRn4.over.chain.gz. Push cross-species liftOver files in the /liftOver directory (do NOT push inside vsXXX directories). Notify Donna that new liftOver files are ready to be linked in docs. When ready, push hgwdev:/usr/local/apache/htdocs-hgdownload/goldenPath/$db/* to hgdownload. Also push md5sum.txt files for liftOvers. They may need to be edited (at least temporarily) to include only the files on hgdownload.

           Push pairwise alignments (vsXXX) for:
           - all the chain/net tracks in an assembly
           - all the species featured in the conservation track
      
           Also push pairwise alignments in other assembly databases to the new assembly. To find them, 
           from /usr/local/apache/htdocs-hgdownload/goldenPath, try: ls */vsXXX
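The liftOver and vsXXX searches described above both depend on the database name with its first letter capitalized (rn4 becomes Rn4). A minimal sketch of the two searches follows; the function names are hypothetical, and -maxdepth is a GNU/BSD find extension rather than strict POSIX.

```shell
# Capitalize the first letter of a database name: rn4 -> Rn4.
cap_db() {
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s\n' "$first" "$rest"
}

# List liftOver chain files in other assemblies that map TO the given database.
# $1 = goldenPath root, $2 = database name (e.g. rn4)
find_liftovers_to() {
    find "$1" -path "*/liftOver/*To$(cap_db "$2").over.chain.gz" -print
}

# List pairwise-alignment directories in other assemblies for the given database.
find_pairwise_vs() {
    find "$1" -maxdepth 2 -type d -name "vs$(cap_db "$2")" -print
}
```

For example, find_liftovers_to /usr/local/apache/htdocs-hgdownload/goldenPath rn4 is equivalent to the ls */liftOver/*ToRn4.over.chain.gz command given in the text.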
         
      
      • Request autodump -- manually now, and ongoing for later. Ask the pushers to dump the mysql tables from the RR to .txt.gz and .sql files on hgdownload:/usr/local/apache/htdocs/goldenPath/$db/database, and to start the autodump for this database so that the files will be updated with RR tables.
      • Add a symlink in hgwdev> /usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes and request a push to hgdownload. This is for ftp users (when they click on the organism's name, they will go to the newest assembly download files).
        hgwdev/usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes> rm Drosophila_ananassae
        hgwdev/usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes> ln -s ../droAna2 Drosophila_ananassae
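The rm and ln -s pair above can be wrapped so that the old link is removed only when the new target actually exists. This is a sketch under that one added assumption; the function name and argument order are hypothetical.

```shell
# Repoint an organism symlink in currentGenomes at a new assembly directory.
# $1 = currentGenomes directory, $2 = link name, $3 = new database name
# Runs in a subshell so the caller's working directory is unchanged.
update_current_genome() (
    cd "$1" || exit 1
    # Refuse to touch the existing link if the new target is missing.
    if [ ! -d "../$3" ]; then
        echo "no such assembly directory: ../$3" >&2
        exit 1
    fi
    rm -f "$2" && ln -s "../$3" "$2"
)
```

With the example from the text, the call would be update_current_genome /usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes Drosophila_ananassae droAna2.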
      

      Push Static Content from hgwdev to hgwbeta and Round Robin

      Pages to update:

      • /usr/local/apache/htdocs/indexNews.html
      • /usr/local/apache/htdocs/goldenPath/newsarch.html
      • /usr/local/apache/htdocs/goldenPath/credits.html
      • /usr/local/apache/htdocs/FAQ/FAQreleases.html
      • /usr/local/apache/htdocs/downloads.html
      • /gbdb/$db/html/description.html

      For more information go to: http://genomewiki.ucsc.edu/index.php/Static_content_for_new_assemblies

      If the assembly introduces new types of tables, also update goldenPath/gbdDescriptions.html and goldenPath/help/hgTracksHelp.html.

      Edit Other Organisms

      • Now push the trackDb/hgFindSpec as needed for the OtherOrgs that have chains and nets pointing to YourOrg so that these tracks will be turned on. Be sure to run compareTrackDbAll.csh and compareHgFindSpec.csh. Resolve issues as usual.
      • Drop the old chains and nets that these are replacing, if any. After you drop the chains/nets, be sure to send an email to genome-mirror letting them know what has been dropped (so that they can drop from their mirror sites).

      Announce the Release

      • Send an email to Donna letting her know that the assembly is released and working on the RR. She will send announcements to:
        • genome mailing list (genome@soe.ucsc.edu)
        • genecats mailing list (genecats@soe.ucsc.edu)
        • OR, edit the documentation yourself and send announcements as above. See the "Push Static Content" section.
      • Notify cluster-admin that the new assembly is available and needs to be released to genome-mysql. Permissions should be made for users "genome" and "genomep". The admins also need to update the mysql.db table permissions. (Jorge says we can ask them to follow the instructions in their wiki for "Mirror_Server".)
      • If this is a new species, then send an email to Branwyn (bwagman@ucsc.edu) so she can determine whether or not she wants to announce it on the CBSE website.


      Maintenance

      • Make sure Genbank daily updates are running on Round Robin. You can do this by viewing the dates on the download files (they should be more recent than the ones you pushed with your release).
      • The day after you press “done!” in the main push queue for your assembly, the Release Log on the website will be updated with the information about the new release (from whatever you entered into the Release Log field of the main push queue). The day after your release, be sure to check the Release Log on the website to make sure that the entry is present and that it reads correctly.
      • Check the downloads against the md5sum size. You can download each one and then run
            hgwdev> md5sum <filename>
      
      • Note: you may need to push a fresh all.joiner to the RR for hgTables.
      • Check that genome-mysql is working. From hgwdev:
            mysql -h genome-mysql -A -u genome
      
      • Check to see that these files have been made by the genbank automatic download building process on hgdownload:
                    htdocs/goldenPath/$db/bigZips/
            		est.fa.gz
            		mrna.fa.gz
            		refMrna.fa.gz
            		xenoMrna.fa.gz
            		refGene.fa.gz
            		xenoRefGene.fa.gz
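The file list above can be checked in one pass. This sketch assumes the bigZips directory is readable from wherever the script runs (for example, a local mirror or a mounted copy), which the text does not state; the function name is hypothetical.

```shell
# Report which of the Genbank-generated files are missing or empty.
# $1 = path to the bigZips directory
# Returns 0 when all six files are present and non-empty, 1 otherwise.
check_genbank_files() {
    status=0
    for f in est.fa.gz mrna.fa.gz refMrna.fa.gz xenoMrna.fa.gz \
             refGene.fa.gz xenoRefGene.fa.gz; do
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```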