Assembly QA Part 4 RR Steps

This page is currently a draft in progress. For now, use Releasing an assembly instead.


Navigation Menu

Home Page
Assembly QA Part 1: DEV Steps
Assembly QA Part 2: Track Steps
Assembly QA Part 3: BETA Steps
Assembly QA Part 4: RR Steps
Assembly QA Part 5: Post Release Steps

RR: Populate spreadsheet steps

  • We need to create a checklist for your RR steps.
  • You can add a new tab to track RR steps, or you can pick up where you left off on the same tab as your "dev" and "beta" steps.
To populate the "wiki link" for each step, add this formula to cell A2 (or the row after your last "beta" step) in your new "track checklist" spreadsheet and drag the formula down. If you start on a row other than 2, change B2 in the formula to match that row:
A2 (or the row after your last "beta" step)
=HYPERLINK("http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_4_RR_Steps#"&SUBSTITUTE(B2," ", "_"),"link")
IMPORTANT: Drag the formula for "A" down the spreadsheet to populate the other rows.
To populate the "track checklist steps," add this formula to cell B2 (or the row after your last "beta" step) in your new "track checklist" spreadsheet. Do NOT drag the formula down.
B2
=IMPORTXML("http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_4_RR_Steps#", "/html/body/div/div/div/div/div/h4/span/span")

This formula will populate all the rows below it with the wiki section titles. You do not need to drag this formula down.
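
To spot-check a generated link outside the spreadsheet, the same substitution can be reproduced on the command line. This is only a sketch: `wiki_link` and `wiki_base` are hypothetical names, and it assumes, like the formula, that replacing spaces with underscores is enough to form the anchor (section titles with other special characters may be encoded differently by the wiki).

```shell
# Hypothetical helper: mirrors the spreadsheet's SUBSTITUTE(B2," ","_") step.
wiki_base="http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_4_RR_Steps"

wiki_link() {
    # $1 = section title exactly as it appears on the wiki page
    printf '%s#%s\n' "$wiki_base" "$(printf '%s' "$1" | tr ' ' '_')"
}

wiki_link "RR: Push Downloads"
# prints http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_4_RR_Steps#RR:_Push_Downloads
```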


RR: Check that tables don't need to be re-pushed

You can use

hgwdev > updateTimesDb.sh -d $db

to compare table update times between hgwdev and hgwbeta. Everything but hgFindSpec, history, tableDescriptions, trackDb and the genbank tables should have the same update times.

To see all of the tables in the assembly that are related to GenBank, run:

hgwdev > hgsql -Ne 'show tables' $db | egrep -f /cluster/data/genbank/etc/genbank.tbls
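
The same pattern file can be used in reverse to list the tables whose update times are expected to match. The sketch below is an illustration rather than an existing script: `tables_expected_in_sync` is a hypothetical name, and it assumes the pattern file holds one regular expression per line with no blank lines, and that the four excluded tables appear under exactly those names.

```shell
# Hypothetical helper: list tables whose update times should match between
# hgwdev and hgwbeta, by dropping GenBank tables and the four known exceptions.
tables_expected_in_sync() {
    # $1 = file with one table name per line (e.g. saved "show tables" output)
    # $2 = file of regular expressions matching GenBank table names
    grep -v -E -f "$2" "$1" |
        grep -v -x -E 'hgFindSpec|history|tableDescriptions|trackDb'
}

# Possible usage on hgwdev (not run here):
#   hgsql -Ne 'show tables' $db > tables.txt
#   tables_expected_in_sync tables.txt /cluster/data/genbank/etc/genbank.tbls
```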

RR: Push Request: rsync complete database

  • Request an rsync of the entire database from mysqlbeta to mysqlrr/euro/asia.
  • Request drop of trackDb_public and hgFindSpec_public from mysqlrr/euro/asia
  • Request push of trackDb and friends
  • See an example push request

RR: Check for Ensembl tracks

Review the Ensembl QA wiki for special procedures related to Ensembl tracks; you may need to push tables in the hgFixed database. Skip this step if your assembly does not have Ensembl tracks.

RR: Push Downloads

If there are files associated with the track that are to be pushed to hgdownload, check to see that there is a README that makes sense and the files have an md5sum.txt file that goes with them and is correct. Check the file itself to make sure it is not corrupted and that it contains what is expected. If it is a gzipped file, you can do "zcat file.gz | head" and "zcat file.gz | tail" to look at it. Looking at the last part of the file can sometimes catch corruption that can't be seen by only looking at the first part.
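
The checksum and corruption checks can be scripted along these lines. This is a minimal sketch, not part of the official procedure: `check_download_dir` is a hypothetical name, it assumes GNU `md5sum` and that md5sum.txt sits in the same directory as the files it lists, and it uses `gzip -t` as an additional integrity test alongside the zcat head/tail inspection described above. It does not replace reading the README or looking at the file contents.

```shell
# Hypothetical helper: verify checksums and gzip integrity for one directory.
check_download_dir() {
    dir=$1
    (
        cd "$dir" || exit 1
        # Compare every file listed in md5sum.txt with its recorded checksum.
        md5sum -c md5sum.txt || exit 1
        # gzip -t reads each archive to the end, so a truncated file is caught.
        for f in *.gz; do
            [ -e "$f" ] || continue   # no .gz files in this directory
            gzip -t "$f" || exit 1
        done
    )
}

# Possible usage on hgwdev (not run here):
#   check_download_dir /usr/local/apache/htdocs-hgdownload/goldenPath/$db
```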

Data files destined for hgdownload are organized on hgwdev at:

/usr/local/apache/htdocs-hgdownload/goldenPath/*

and can be viewed in a browser from http://hgdownload-test.cse.ucsc.edu/downloads.html. Non-data files (such as downloads.html) are in the "hgdownload" git repository. See the Static_Page_Protocol for instructions on checking out that repository. Push requests for downloads should look something like this:

Please push files from here on hgwdev:
    /usr/local/apache/htdocs-hgdownload/goldenPath/$db/file
To here on hgdownload:
    /usr/local/apache/htdocs/goldenPath/$db/file
(in the path, "htdocs-hgdownload" should become "htdocs") 
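
To avoid typos when writing out both paths, the destination can be derived from the source path. The sketch below assumes the only difference between the two paths is the "htdocs-hgdownload" to "htdocs" substitution noted above; `hgdownload_dest` and `print_push_request` are hypothetical helper names.

```shell
# Hypothetical helper: map an hgwdev staging path to its hgdownload path.
hgdownload_dest() {
    # $1 = path on hgwdev under htdocs-hgdownload
    printf '%s\n' "$1" | sed 's#/htdocs-hgdownload/#/htdocs/#'
}

# Hypothetical helper: print the push request text for one source path.
print_push_request() {
    src=$1
    printf 'Please push files from here on hgwdev:\n    %s\n' "$src"
    printf 'To here on hgdownload:\n    %s\n' "$(hgdownload_dest "$src")"
}

print_push_request "/usr/local/apache/htdocs-hgdownload/goldenPath/manPen1/file"
```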

RR: Push Request: start dump/autodump

  • Database rsync should be complete before doing this.
  • Request start of dump/autodump for your assembly on rr/euro/asia.

See this example push request.

Notes:

genome-mysql syncs with hgdownload every night, so once you have requested the autodump, genome-mysql will automatically pick up anything new that night. If the autodump was completed at least 1 day ago, your new assembly should be available on genome-mysql, and a push request is not needed. If you do not want to wait 1 day for the nightly sync, you can request that the admins "make the $db database available on genome-mysql."

In the past, we would also request: "Links and permissions should be made for user "genome" and "genomep". (Jorge says to follow the instructions in the wiki page for "Mirror_Server".)" This information is no longer needed in the push request.

RR: Review copyHgcentral steps

You can copy items from hgcentraltest to hgcentral with the copyHgcentral script. For the usage statement, run:

hgwdev > copyHgcentral -h
  • The copyHgcentral script must be run in test mode first.
  • Test mode will show you the state of hgcentraltest, hgcentralbeta and hgcentral.
  • Once test mode has been run and reviewed, run execute mode to copy from hgcentralbeta to hgcentral.
  • Note that test mode generates output files which must be manually deleted afterward. Be sure to run copyHgcentral in hive or your home directory, not in a directory where temp files do not belong.
  • Note that copyHgcentral can be run for "all" (blatServers, dbDb, defaultDb, genomeClade):
hgwdev > copyHgcentral test $db all beta rr

RR: copyHgcentral test $db blatServers beta rr

Generates files, run in hive:

hgwdev > copyHgcentral test $db blatServers beta rr
hgwdev > copyHgcentral execute $db blatServers beta rr

You can also check on mysql:

hgsql -h genome-centdb

use hgcentral;

select * from blatServers where db='manPen1';

RR: copyHgcentral test $db dbDb beta rr

Generates files, run in hive:

hgwdev > copyHgcentral test $db dbDb beta rr
hgwdev > copyHgcentral execute $db dbDb beta rr

You can also check on mysql:

hgsql -h genome-centdb

use hgcentral;

select * from dbDb where name='manPen1' \G

RR: copyHgcentral test $db defaultDb beta rr

Generates files, run in hive:

hgwdev > copyHgcentral test $db defaultDb beta rr
hgwdev > copyHgcentral execute $db defaultDb beta rr

You can also check on mysql:

hgsql -h genome-centdb

use hgcentral;

select * from defaultDb where name="manPen1" limit 1;

RR: copyHgcentral test $db genomeClade beta rr

NOTE: This table probably will not need to be updated. It contains records like this:

mysql> select * from genomeClade order by rand() limit 5;
+-----------------+------------+----------+
| genome          | clade      | priority |
+-----------------+------------+----------+
| GRCh38.p2       | haplotypes |      134 |
| C. japonica     | worm       |       70 |
| Atlantic cod    | vertebrate |      125 |
| D. melanogaster | insect     |       10 |
| D. persimilis   | insect     |       55 |
+-----------------+------------+----------+

Generates files, run in hive:

hgwdev > copyHgcentral test $db genomeClade beta rr
hgwdev > copyHgcentral execute $db genomeClade beta rr

RR: copyHgcentral: liftOverChain (manual move)

liftOverChain is not copied with the copyHgcentral script; it needs to be copied manually.

  • Only copy lines from liftOverChain on hgcentralbeta to hgcentral if there are liftOver files listed in the pushQ and if the assemblies they go to/from exist on the RR.
  • Check for lines in liftOverChain that should be in the pushQ, but aren't (e.g., the liftOver from a previous assembly).
  • Add lines related to your assembly, any previous versions of your organism, and any other organisms that are associated with liftOver files and your assembly.
  • More details on the Chain and Net QA wiki page.
 hgsql -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentralbeta > chain.dev 

Check public mysql, load if not present and recheck:

hgsql -h genome-centdb -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentral 

Example: hgsql -h genome-centdb -Ne "SELECT * FROM liftOverChain WHERE fromDb = 'manPen1' OR toDb = 'manPen1'" hgcentral

hgsql -h genome-centdb -e "LOAD DATA LOCAL INFILE 'chain.dev' INTO TABLE liftOverChain" hgcentral

RR: checkMetaData

After completing copyHgcentral steps, run checkMetaData.csh $db

  • This checks that all of the metadata is the same on hgcentraltest, hgcentralbeta, and hgcentral.
  • Run this script in a temporary folder or hive; it creates some comparison files that can be deleted after the check.


RR: Check all tracks on the RR

Check your track on the RR. Check that searches work (if not, you probably need to push the hgFindSpec_public table). Also check that all default tracks still display. If you filled in the "Release Log URL" field in the push queue, check the next day to be sure that the link from the release log works as expected.

RR: Turn on GenBank updates

  • Once your assembly is listed in align.dbs, turn on GenBank updates on the rr before 4:30 p.m.
  • Add the new assembly to ~/kent/src/hg/makeDb/genbank/etc/rr.dbs in alphabetical order.
  • Be sure to save, git add, git commit, and git push the file.

RR: GenBank updates: make libs & run make

After committing the change, make sure your libs are up to date:

cd ~/kent/src ; make libs

then go ahead and run the make:

cd ~/kent/src/hg/makeDb/genbank/ 
git pull 
make install-rr install-server


RR: GenBank updates: check Genbank update times

To see whether updates have run (at least a day after the *.dbs files were updated), check the update times of the table 'gbLoaded'.

hgwdev > updateTimes.csh $db gbLoaded verbose

For example (you'll see updates for dev/beta/rr/euro/asia):

updateTimes.csh manPen1 gbLoaded verbose

ADD EXAMPLE OUTPUT HERE

The update times will be out of sync between machines, but not by more than 24 hours or so if updates are running. The gbLoaded table will be updated regardless of whether changes to other GenBank tables were picked up. More genbank update instructions are available at Genbank updates.

The etc-update-server part of the make will cause the downloads mentioned below in the "Verify downloads" section to be created.


RR: hgCentral.dbDb, set 'active=1'

  • Go to hgcentral and see what the 'active' field is set to (0 = not visible, 1 = visible in the gateway assembly version drop-down), then set it to 1:
hgwdev > hgsql -h genome-centdb hgcentral
mysql > UPDATE dbDb SET active = 1 WHERE name = "$db";

RR: Edit downloads.html

Check to see if the downloads files ought to have a link from downloads.html. If so, add the link and push downloads.html (after the files are already pushed!). NOTE: If you are pushing ENCODE tracks, when using a second/third/fourth version of the data there is often a "releaseLatest" directory that has the latest files. Be sure that you are not pushing the entire releaseLatest directory, only the files from there. Be sure to add a helpful sentence in your push-request to tip off the admin about this unusual push.

RR: Push chain/nets for other assemblies

  • At the start of the RR steps, you asked for an rsync of your database from beta>rr/euro/asia, so you have already pushed the chain/nets within your database. Now we need to do this for any other databases (aka, other assemblies that your assembly has chain/net alignments to).
  • Push trackDb & friends for any other databases
  • Push chain/chainLink/net tables for each database.
  • See this example push request.

RR: Check chain/nets tracks for other assemblies

QA chain/nets for the other assemblies.


RR: Write Google Groups announcement

See announcement example.

RR: Write release log text in Redmine & close

🔵 Done with RR steps? Go to Assembly QA Part 5: Post Release Steps