Assembly QA Part 3 BETA Steps
Revision as of 23:05, 17 April 2017
This page is currently a draft in progress. For now, use Releasing an assembly instead.
NOTE TO SELF ADD LINKS TO ALL CHAIN-NET STEPS http://genomewiki.ucsc.edu/genecats/index.php/Chains_and_Nets_QA
Tracks: Populate spreadsheet steps
- We need to create a checklist for your beta steps.
- You can add a new tab to track beta steps, or you can pick up where you left off on the same tab as your "dev" steps.
- To populate the "wiki link" for each step, add this formula to cell A2 (or the row after your last "dev" step) in your new "track checklist" spreadsheet and drag the formula down:
- A2 (or the row after your last "dev" step)
=HYPERLINK("http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_3_BETA_Steps#"&SUBSTITUTE(B2," ", "_"),"link")
- IMPORTANT: Drag the formula for "A" down the spreadsheet to populate the other rows.
- To populate the "track checklist steps," add this formula to cell B2 (or the row after your last "dev" step) in your new "track checklist" spreadsheet. Do NOT drag the formula down.
- B2
=IMPORTXML("http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_3_BETA_Steps#", "/html/body/div/div/div/div/div/h4/span/span")
This formula will populate all the rows below it with the wiki section titles. You do not need to drag this formula down.
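To see what the "wiki link" formula produces for a given row, the same substitution can be sketched in the shell. This only mirrors the SUBSTITUTE step in the formula above (spaces become underscores); the helper name `make_wiki_link` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the HYPERLINK/SUBSTITUTE formula: append a section title to the
# page URL as an anchor, with spaces replaced by underscores.
base="http://genomewiki.ucsc.edu/genecats/index.php/Assembly_QA_Part_3_BETA_Steps#"

make_wiki_link() {   # hypothetical helper name, not an existing script
    local title="$1"
    printf '%s%s\n' "$base" "${title// /_}"
}

make_wiki_link "Beta: Make clean table list"
```

If a link in column A does not jump to the right section, comparing it against the output of this sketch for the title in column B is a quick way to spot a stray space or typo.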
Beta: Check for Ensembl tracks
Before pushing Ensembl tracks to beta, review the Ensembl wiki page and the Ensembl_QA script.
Beta: Make clean table list
Step 1. Verify table list location
In Redmine for your assembly, look at the "Table List" field. The engineer should have provided a path to redmine.$db.table.list, e.g.:
/hive/data/genomes/manPen1/redmine5515/redmine.manPen1.table.list
Your file contents should be something like this:
head redmine.manPen1.table.list
hg38.chainManPen1
hg38.chainManPen1Link
hg38.netManPen1
manPen1.augustusGene
Step 2. Copy table list to your hive dir
From hive, copy the table list to your assembly dir:
cp /hive/data/genomes/manPen1/redmine5515/redmine.manPen1.table.list .
Step 3. Make a clean file containing only the table names
cut -d'.' -f2 redmine.manPen1.table.list > cleanTableList
Beta: Remove certain tables from cleanTableList
Remove tables from the list that start with trackDb or hgFindSpec:
sed -i.bak '/trackDb/d' cleanTableList
sed -i.bak '/hgFindSpec/d' cleanTableList
You'll be using this cleanTableList when doing your push.
Beta: Remove 'seq' and 'extFile' tables
Follow the steps above to remove seq or extFile tables from cleanTableList. Do not push seq or extFile tables from dev to beta.
You must use the copyExtSeqRows.csh script to move only the rows needed.
More information can be found here.
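The clean-up steps above can be sketched as a single pipeline. This is an illustration under stated assumptions, not a replacement for the documented commands: the first four sample rows come from the example on this page, while the trackDb, hgFindSpec, seq and extFile rows are made up to show the filtering. Unlike `sed '/trackDb/d'`, which drops any line containing the string, this sketch only drops names that start with trackDb or hgFindSpec, and names that are exactly seq or extFile.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing in the real assembly dir is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Sample table list. The last four rows are invented for illustration.
cat > redmine.manPen1.table.list <<'EOF'
hg38.chainManPen1
hg38.chainManPen1Link
hg38.netManPen1
manPen1.augustusGene
manPen1.trackDb
manPen1.hgFindSpec
manPen1.seq
manPen1.extFile
EOF

# Keep only the table name, then drop tables that must not be pushed.
cut -d'.' -f2 redmine.manPen1.table.list \
  | grep -Ev '^(trackDb|hgFindSpec)' \
  | grep -Evx '(seq|extFile)' \
  > cleanTableList

cat cleanTableList
```

Reviewing the resulting cleanTableList by eye before pushing is still worthwhile, since the list drives every push in the next step.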
Beta: Push all tables to beta
Push all tables (EXCEPT seq, extFile, hgFindSpec and trackDb tables) from hgwdev to hgwbeta:
This command will push one table from dev to beta for your database/assembly:
sudo mypush $db $table mysqlbeta
Or, push them all in a loop:
for table in $(cat cleanTableList); do sudo mypush $db $table mysqlbeta; done
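Before running the loop for real, it can help to print the commands it would run and check them against the table list. The following is a dry-run sketch only: it echoes the `mypush` command lines and never executes them. The helper name `print_push_cmds` and the two demo table names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Dry run: print one push command per table instead of running it.
db="manPen1"   # example assembly name used elsewhere on this page

print_push_cmds() {   # hypothetical helper, not an existing script
    local list="$1"
    local table
    while IFS= read -r table; do
        [ -n "$table" ] || continue   # skip blank lines
        echo "sudo mypush $db $table mysqlbeta"
    done < "$list"
}

# Demo input standing in for cleanTableList.
demo_list=$(mktemp)
printf '%s\n' augustusGene chainManPen1 > "$demo_list"
print_push_cmds "$demo_list"
```

Reading the file line by line also avoids surprises from word splitting if the list ever contains unexpected whitespace.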
Beta: Do a 'make beta' in trackDb for your assembly
Do make beta on hgwdev in kent/src/hg/makeDb/trackDb like so:
make beta DBS=$db
Example to make beta on more than one db at a time:
make beta DBS="$db1 $db2 $db3 $db4 etc"
Running multiple dbs in parallel to save time
Multiple assemblies can be run in parallel by using the make -j option (as of 2/10/17, thanks to Mark Diekhans). Updating all dev dbs used to take about 50 minutes; it now takes about 5 minutes at 16 in parallel. While Mark has safely run 16 dbs at a time on dev, it is recommended to run only 8 or fewer at a time on beta or the RR. Use make -j N beta and make -j N public, where N is the number of parallel processes (for example, make -j 16 alpha runs 16 at a time).
- For example, if you do:
make -j 8 alpha
- it updates everything, 8 at a time. If you do:
make -j 2 DBS="hg19 hg38 mm10 felCat5"
- it updates those 4 databases, 2 at a time.
- Note: the 'make in parallel' process creates and removes temporary files:
- The tmp dirs are found with:
kent/src/inc/portable.h:
char *getTempDir(void);
/* get temporary directory to use for programs.  This first checks TMPDIR environment
 * variable, then /data/tmp, /scratch/tmp, /var/tmp, /tmp.  Return is static and
 * only set of first call */
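To predict where those temporary files will land on a given machine, the search order in the comment above can be sketched in the shell. This is not the kent implementation: the function name `pick_tmp_dir` is made up, and the "first directory that exists" test for the fallback paths is an assumption about how the list is checked.

```shell
#!/usr/bin/env bash
# Sketch of the documented order: TMPDIR first, then the fixed fallbacks.
pick_tmp_dir() {   # hypothetical name, not part of the kent tree
    if [ -n "${TMPDIR:-}" ]; then
        printf '%s\n' "$TMPDIR"
        return 0
    fi
    local d
    for d in /data/tmp /scratch/tmp /var/tmp /tmp; do
        if [ -d "$d" ]; then   # assumption: the first existing directory wins
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

pick_tmp_dir
```

Setting TMPDIR before a parallel make is therefore the simplest way to steer the temporary files to a directory with enough space.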
Examples:
make beta -j 4 DBS="dm6 ce11 sacCer3 droEre1 droSec1 droSim1 droYak2 droAna2 dp3 droMoj2 droVir2 droGri1 droPer1"
make public -j 4 DBS="dm6 ce11 sacCer3 droEre1 droSec1 droSim1 droYak2 droAna2 dp3 droMoj2 droVir2 droGri1 droPer1"
Beta: Do a 'make beta' in trackDb for chain/net organisms
If your assembly has alignments to other organisms, such as chain/net alignments to hg38 or to the previous assembly version of your organism, be sure to also do a 'make beta' for those assemblies.
Beta: Do a 'make public' in trackDb for your assembly
Make your track public by using the "make public" command on hgwdev while in the trackDb directory (src/hg/makeDb/trackDb):
[user@hgwdev trackDb]$ make public DBS=$db
Example to make public on more than one db at a time:
make public DBS="$db1 $db2 $db3 $db4 etc"
Your track should now be visible on the hgwbeta-public server.
If your track is not visible, you may want to check that your track has the correct release tag. Also see [Three State TrackDb] for more information.
Beta: Do a 'make public' in trackDb for chain/net organisms
If your assembly has alignments to other organisms, such as chain/net alignments to hg38 or to the previous assembly version of your organism, be sure to also do a 'make public' for those assemblies.
Beta: Check release tags: compare dev and beta tracks side-by-side
Compare your tracks by bringing up a dev and a beta browser window side-by-side. If some beta tracks can't be seen, you may need to edit the release tag. See this page for more information: http://genomewiki.ucsc.edu/index.php/ThreeStateTrackDb
Beta: Review copyHgcentral steps
Beta: 1. copyHgcentral test $db blatServers dev beta
Beta: 2. copyHgcentral test $db dbDb dev beta
Beta: 3. copyHgcentral test $db defaultDb dev beta
Beta: 4. copyHgcentral test $db genomeClade dev beta
Beta: 5. copyHgcentral: liftOverChain (manual move)
liftOverChain is not copied by the copyHgcentral script; it needs to be copied manually.
- Only copy lines from liftOverChain on hgcentraltest to hgcentralbeta if there are liftOver files listed in the pushQ and if the assemblies they go to/from exist on the RR.
- Check for lines in liftOverChain that should be in the pushQ, but aren't (e.g., the liftOver from a previous assembly).
- Email the developer and ask them to add them to the pushQ if necessary.
hgsql -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentraltest > chain.dev
Check beta, load if not present and recheck:
hgsql -h mysqlbeta -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentralbeta
hgsql -h mysqlbeta -e "LOAD DATA LOCAL INFILE 'chain.dev' INTO TABLE liftOverChain" hgcentralbeta
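The "load if not present" check can be made mechanical by saving the beta query output to a file as well and listing the dev rows that beta lacks. The sketch below assumes a file named chain.beta holding that output, and uses made-up, trimmed rows (real liftOverChain rows have more columns); the paths in the sample data are illustrative only.

```shell
#!/usr/bin/env bash
# Compare dev and beta liftOverChain dumps and list rows missing from beta.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder rows, invented for this example.
printf 'manPen1\thg38\t/gbdb/manPen1/liftOver/manPen1ToHg38.over.chain.gz\n'  > chain.dev
printf 'hg38\tmanPen1\t/gbdb/hg38/liftOver/hg38ToManPen1.over.chain.gz\n'    >> chain.dev
printf 'hg38\tmanPen1\t/gbdb/hg38/liftOver/hg38ToManPen1.over.chain.gz\n'     > chain.beta

# comm needs sorted input; -23 keeps lines found only in the first file.
sort chain.dev  > chain.dev.sorted
sort chain.beta > chain.beta.sorted
comm -23 chain.dev.sorted chain.beta.sorted > chain.missing

cat chain.missing
```

An empty chain.missing would suggest beta already has every dev row. If it is not empty, loading only those rows (after confirming they belong in the pushQ) may avoid inserting duplicates of rows beta already has.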
Beta: checkMetaData
After you have completed the steps above, use the script checkMetaData.csh to make sure that all of the metadata is the same on hgwdev and on hgwbeta. Run this script in a temporary folder; it creates some comparison files that can be deleted after the check.
Beta: joinerCheck common keys
Check that common keys between tables are in sync:
hgwdev > cd ~/kent/src/hg/makeDb/schema
hgwdev > joinerCheck -database=$db -keys all.joiner
If there are errors related to genbank identifiers, it is likely because of the genbank load process, and not an issue with your database. Run joinerCheck once the tables are on beta to confirm:
hgwdev > HGDB_CONF=~/.hg.conf.beta joinerCheck -keys -identifier=$identifier all.joiner
Beta: joinerCheck table times
Check table update times:
hgwdev > joinerCheck -database=$db -times all.joiner
Beta: joinerCheck tableCoverage
Check that all tables in this database are mentioned/referenced in all.joiner:
hgwdev > joinerCheck -database=$db -tableCoverage all.joiner
If not all of the tables are listed, email the developer and ask them to add those tables to the tablesIgnored $db. According to Hiram, it is probably OK for us to edit all.joiner ourselves.
END OF SECTIONS
Request a push of any listed supporting files in /gbdb from hgwdev to hgnfs1 and check on hgwbeta. Note that hgwbeta and the RR share the files on hgnfs1, so once these files are in place, there is not another push required when the track is released to the RR. Be sure to send a push request to have the gbdb files pushed to hgdownload in advance of the usual Sunday sync, if this is necessary for your track.
If there are images associated with any track description pages, be sure to run a make beta from within kent/src/hg/htdocs/ to get the images to beta.
Tracks: Remove release tag for big*/vcf track types
Once you verify that the track looks good on hgwbeta, remove the release tag from trackDb.ra.
🔵 Done with BETA steps? Go to Assembly QA Part 4: RR Steps