New track checklist

From Genecats

This is the checklist for QA to follow when releasing Genome Browser tracks. General rule: stop testing when you come to the first non-trivial error, and bounce the track to the track sponsor for fixes. Check out Jairo's Intro to the QA Release Process.

Assign the Redmine ticket to yourself and set the status to Reviewing. For more info on the Redmine workflow see this wiki: http://redmine.soe.ucsc.edu/projects/genomebrowser/wiki/Track

There is now a beta-stage, git-tracked tool, trackQaHelper, that turns all of the steps below into a shorter, semi-automated, linear process. It is most effective for non-grouped tracks.

 ~/genecats/qa/testTools/trackQaHelper

Note: If this is QA for a GENCODE or knownGene track, follow the expedited process outlined here: https://genomewiki.ucsc.edu/genecats/index.php?title=GENCODEqa


Background familiarity

If you haven't worked with a track like this one, and it already exists on the RR for another assembly (use 'getAssemblies.csh tablename' on hgwdev to check), use it a bit on the RR to become familiar with it. For complex tracks, look for genomewiki pages on it, and/or look through the push queue for previous problems encountered on similar tracks. Familiarize yourself with the tables and files needed for this track.

Makedoc

Look for an entry for your track in src/hg/makeDb/doc/$db.txt or for hg38, src/hg/makeDb/doc/$db/$db.txt. If there is no mention of this track, make a request to the track sponsor. Basic tracks that are loaded automatically with a new assembly may only have a tiny reference in the makedoc. UCSC Genes has its own document.

All Ensembl tables are generated by the Ensembl update procedure which is maintained in doc/makeEnsembl.txt without mention of explicit tracks or tables.

In the makeDb/doc/ directory there are some ENCODE specific folders and .txt files to look at, for example encodeAwgHg19.txt that has the Analysis Working Group based uniform processing makedoc information.

  • The following tables don't need to be referenced in the makedoc: chromInfo, gap, gc5BaseBw, gold, grp, hgFindSpec, history, trackDb, grcIncidentDb

Release Log and URL

The "Release Log Text" field in the Redmine ticket should generally contain the shortLabel of the track, or a short description of what is being released if it does not make sense to put a shortLabel here. The entry in this field will be added to the auto-generated release log page. If it makes sense to link to the hgTrackUi page for this track, make an entry in the "Release Log URL" field like so:

 ../cgi-bin/hgTrackUi?db=[DATABASE]&g=[TRACK]

An actual example:

 ../cgi-bin/hgTrackUi?db=hg19&g=affyU133

Note that this requires selecting a database, so it is the convention to leave a Release Log URL empty when there are multiple databases.

An entry is not always required, for example, for tracks in a new assembly.

The BIG QA Script: qaGbTracks

You can use qaGbTracks to run many of the qa scripts in one go on hgwdev, including:

 checks for underscores in table names
 checks for the existence of table descriptions
 checks shortLabel and longLabel length
 positionalTblCheck
 checkTableCoords
 genePredCheck
 pslCheck
 featureBits
 (a version of) countPerChrom

Example usage:

qaGbTracks papHam1 gc5Base output.gc5

Label lengths

Check that the shortLabel is 20 characters or less and that the longLabel is 80 characters or less. The shortLabel is visible in the main hgTracks display if you turn the track to dense. If it is 20 characters or less, it won't be cut off in this part of the display. The longLabel is visible in hgTracks, and it can also be copied from the configuration page (hgTrackUi).
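
The length check above can be sketched in a few lines of Python. This is only an illustration, not one of the QA scripts: the function name check_label_lengths and the example stanza are hypothetical, and the limits are the 20 and 80 characters stated above.

```python
# Limits taken from the checklist text above.
SHORT_LABEL_MAX = 20
LONG_LABEL_MAX = 80


def check_label_lengths(stanza):
    """Return a list of label-length problems found in one trackDb.ra stanza."""
    problems = []
    for line in stanza.splitlines():
        # Each setting is "key value"; split only on the first run of whitespace.
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "shortLabel" and len(value) > SHORT_LABEL_MAX:
            problems.append(
                f"shortLabel is {len(value)} characters (max {SHORT_LABEL_MAX})")
        elif key == "longLabel" and len(value) > LONG_LABEL_MAX:
            problems.append(
                f"longLabel is {len(value)} characters (max {LONG_LABEL_MAX})")
    return problems


# Hypothetical stanza, for illustration only.
example = """track exampleTrack
shortLabel Example Track
longLabel An example long label that describes the hypothetical track
type bed 4
"""
print(check_label_lengths(example))  # prints [] because both labels fit
```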

Tables are sorted and internally consistent

Run any of these that apply to your tables:

genePredCheck - Checks that tables in the genePred format are valid. (Tables of this type are usually in the Genes and Gene Prediction group. Usually only the primary table will need this check. Some examples: knownGene, ensGene, refGene.)

pslCheck - Checks that psl tables are valid. (PSL tables show up in nearly any track where alignments are used. Some examples: mrna, est, refSeqAli.)

checkTableCoords - Checks that the genomic coordinates in positional tables are legal (e.g., coordinates are not off the end of a chromosome). If the table passes there will be no output. (Almost every table will need this check, unless it has no chromosome positions in it.)

positionalTblCheck - Checks to see that the positional table is ordered by chrom and chromStart. A positional table being out of order can cause a huge slowdown in display speed. If the table passes there will be no output. If the table does not pass there will be an error message like:

table hg18.snp129 not sorted starting at row 4867: chr1:387005

Alert the track sponsor if there is an error. They may determine that the items are sorted well enough to be released. (As long as the items are almost all in order, performance will not be affected.) Also, note that Genbank tables are not expected to be in order after updates have run. (Almost every table will need this check.)
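
To make the last two checks concrete, here is a minimal Python sketch of what "legal coordinates" and "ordered by chrom and chromStart" could mean for a list of rows. It is not the code of checkTableCoords or positionalTblCheck. The function names are hypothetical, and the sort rule (rows grouped by chrom, chromStart never decreasing within a chrom) is an assumption about what the real tool enforces.

```python
def first_unsorted_row(rows):
    """rows: list of (chrom, chromStart, chromEnd) tuples.

    Return the 0-based index of the first row that breaks the ordering,
    or None if the rows are in order.
    Assumption: "sorted" means rows are grouped by chrom and chromStart
    never decreases within a chrom group.
    """
    seen = set()
    prev_chrom, prev_start = None, None
    for i, (chrom, start, _end) in enumerate(rows):
        if chrom != prev_chrom:
            if chrom in seen:
                return i  # this chrom already appeared earlier in the table
            seen.add(chrom)
        elif start < prev_start:
            return i  # chromStart went backwards within the same chrom
        prev_chrom, prev_start = chrom, start
    return None


def illegal_coords(rows, chrom_sizes):
    """Return the rows whose coordinates do not fit on their chromosome."""
    bad = []
    for chrom, start, end in rows:
        size = chrom_sizes.get(chrom)
        if size is None or start < 0 or start > end or end > size:
            bad.append((chrom, start, end))
    return bad


# Hypothetical rows and chromosome sizes, for illustration only.
rows = [("chr1", 100, 200), ("chr1", 150, 300), ("chr1", 120, 180), ("chr2", 5, 10)]
print(first_unsorted_row(rows))                           # prints 2
print(illegal_coords(rows, {"chr1": 250, "chr2": 1000}))  # prints [('chr1', 150, 300)]
```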

FeatureBits and Gaps

Run featureBits, or use the runBits.csh script to run featureBits. runBits.csh checks for coverage and overlap with gap, and also checks for unbridged gaps.

 runBits.csh $db $table

Alert the track sponsor if there are suspicious unbridged gaps. If the previous assembly also has this track, compare featureBits between the current assembly and the previous assembly -- if there are big differences between the old and new tracks, alert the track sponsor.

If there is a similar track that this one can be compared to, use either "featureBits -enrichment" or getYield.csh to compare the tracks. Alert the track sponsor if the difference seems unreasonable.

getYield.csh can be used to see how well a new track captures the footprint of existing tracks, such as refGene or xenoRefGene, e.g.,

 getYield.csh hg19 ensGene refGene

output includes:

 yield       = 93.8% (intersection / refGene)
 enrichment  = 24.7x ((intersection / ensGene) / (refGene / genome))

This output shows that 93.8% of refGene is present in ensGene and that, compared to the refGene footprint on the genome, ensGene is 24.7x enriched for refGenes. (Enrichment is the amount of table1 that covers table2 vs. the amount of table1 that covers the genome. It's how much denser table1 is in table2 than it is genome-wide.)

"featureBits -enrichment" does almost the exact same thing as getYield.csh, except the coverage amount is the number of bases in the intersection of the two tables divided by the first table instead of the second table.
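
The arithmetic behind the yield and enrichment lines can be worked through in a short Python sketch. The function names and the base counts are hypothetical; only the two formulas come from the getYield.csh output quoted above, and the second function reflects the description of "featureBits -enrichment" given here rather than the tool's actual code.

```python
def yield_and_enrichment(intersection, new_bases, ref_bases, genome_bases):
    """Apply the two formulas printed by getYield.csh.

    intersection: bases covered by both tables
    new_bases:    bases covered by the new table (ensGene in the example)
    ref_bases:    bases covered by the reference table (refGene in the example)
    genome_bases: size of the genome
    """
    yield_fraction = intersection / ref_bases
    enrichment = (intersection / new_bases) / (ref_bases / genome_bases)
    return yield_fraction, enrichment


def coverage_of_first_table(intersection, first_table_bases):
    """Coverage as described for "featureBits -enrichment": divide by the first table."""
    return intersection / first_table_bases


# Hypothetical base counts, chosen only to make the arithmetic easy to follow.
y, e = yield_and_enrichment(intersection=64, new_bases=256,
                            ref_bases=128, genome_bases=1024)
print(f"yield = {y:.1%}, enrichment = {e:.1f}x")  # prints yield = 50.0%, enrichment = 2.0x
print(coverage_of_first_table(64, 256))           # prints 0.25
```

With these numbers, half of the reference footprint is captured (yield), and the new table is twice as dense inside the reference footprint as it would be if spread evenly over the genome (enrichment).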

If your track is a bigBed based track, featureBits will not work on it directly. First turn your bigBed into a bed file; then, if necessary, account for tabs vs. spaces; and lastly, if working with a bed12(+), split your bed into exons before running it through featureBits:

$ bigBedToBed bed.bb out.bed
$ # if necessary:
$ # sed -i 's/ /_/g' out.bed
$ # split into exons (must be bed12 or less, change cut command if you have a bed4+ or 8+, etc)
$ cut --complement -f13- out.bed | bedToExons stdin out.exons.bed
$ featureBits [db] -countGaps out.exons.bed [gap]

If your track is a composite or super track of many bigBeds, you can run the above steps in a loop like so:

$ for file in ../*.bed; do printf "%s\n" "$file"; sed 's/ /_/g' $file | \
cut --complement -f13- | bedToExons stdin $file.exons.bed; \
featureBits hg38 -countGaps $file.exons.bed; \
featureBits hg38 -countGaps $file.exons.bed gap; done
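
To show what "split your bed into exons" means for a bed12 line, here is a small Python sketch. It is not the bedToExons implementation: the function name is hypothetical, and the output layout (one 6-field row per block) is an assumption, so the real tool's output may differ.

```python
def bed12_to_exons(line):
    """Split one tab-separated BED12 line into one row per block (exon).

    Block starts in BED12 are relative to chromStart, so each exon spans
    chromStart + blockStart to chromStart + blockStart + blockSize.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, chrom_start = fields[0], int(fields[1])
    name, score, strand = fields[3], fields[4], fields[5]
    block_count = int(fields[9])
    # The comma-separated lists usually end with a trailing comma.
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")][:block_count]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")][:block_count]
    return [
        (chrom, chrom_start + start, chrom_start + start + size, name, score, strand)
        for start, size in zip(starts, sizes)
    ]


# Hypothetical two-exon item, for illustration only.
example_line = "chr1\t1000\t5000\tgeneA\t0\t+\t1200\t4800\t0\t2\t500,1000,\t0,3000,"
for exon in bed12_to_exons(example_line):
    print(exon)
```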

Chromosome coverage

Check the count of items on each chromosome. Bigger chromosomes (the biggest is usually chr1) should have more items. Look for chromosomes that have suspiciously few or no items on them. Note: this script must be run on dev:

 countPerChrom.csh $db $table

or as a histogram:

countPerChrom.csh $db $table histogram
Example: 
countPerChrom.csh hg19 refGene histogram

You can also view two tables' counts side by side with the pr command. Make your terminal window relatively wide, and then run:

pr -mt -w 120 <(countPerChrom.csh $db $table1 histogram) <(countPerChrom.csh $db $table2 histogram)

For example, to compare the counts of the crisprRanges and flyBaseGene tables on drosophila:

$ pr -mt -w 155 <(countPerChrom.csh dm6 flyBaseGene histogram) <(countPerChrom.csh dm6 crisprRanges histogram)
									      
  M									      	M
  X xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx				      	X xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Y									      	Y
 2L xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx				       2L xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If a "regular" chrom (not random or haplotype) has no data, a note like this should appear in place of the track drop-down on hgTracks:

 [No data-chr21]

(This is controlled by the "chromosomes" setting in trackDb.ra. The chromosomes line in trackDb.ra specifies the chromosomes that DO have data. This line is the same as the "restrictList" field in the trackDb table.)
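
The per-chromosome check in this section can also be sketched in Python. This is an illustration, not the countPerChrom.csh script: the function names are hypothetical, and treating any name that contains an underscore as a non-"regular" chromosome (random, haplotype, and so on) is an assumption.

```python
from collections import Counter


def count_per_chrom(rows):
    """Count items per chromosome; each row starts with the chrom name."""
    return Counter(chrom for chrom, *_ in rows)


def chroms_without_items(counts, chrom_names):
    """Return "regular" chromosomes that have no items at all.

    Assumption: names containing "_" (e.g. chr1_random) are not regular.
    """
    return [c for c in chrom_names if "_" not in c and counts.get(c, 0) == 0]


# Hypothetical rows and chromosome list, for illustration only.
rows = [("chr1", 0, 10), ("chr1", 20, 30), ("chr2", 5, 9), ("chr1_random", 1, 2)]
counts = count_per_chrom(rows)
print(counts["chr1"], counts["chr2"])  # prints 2 1
print(chroms_without_items(counts, ["chr1", "chr2", "chr21", "chr21_random"]))  # prints ['chr21']
```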

OR use these alternative ways to see chromosome coverage:

  • Import the table into Genome Graphs.
  • Select the table in the Table Browser, hit the "describe table schema" button, and click the "values" link for the chrom field.
    • If the table is very large, there may not be an "info" column to display the "values" link.

joinerCheck

The runJoiner.csh script is a shortcut for this section, but beware that wildcards in tablesIgnored sections are not recognized, and if the joinerCheck problem described below occurs, you need to run joinerCheck directly. The noTimes option is necessary to prevent false positive errors about the old genes build process.

runJoiner.csh $db $table noTimes

Note: these joiner identifiers do not apply to big* files. If your track is based in a big file, you can ignore this step.

The document kent/src/hg/makeDb/schema/joiner.doc describes what all.joiner is. In a nutshell, all.joiner is a file that describes joinable fields in the UCSC Genome Databases. Identifiers are engineer-named columns from different tables that hold largely (95+%) the same data. With each sandbox, alpha, or beta build (but not make alpha DBS=), all.joiner uses the relationships defined through its identifiers to link tables. You can see the results in the Table Browser when you click the describe schema button and look at the "Connected Tables and Joining Fields" section for tables that have all.joiner definitions. The tool you want to focus on is joinerCheck, which is used to check that the rules in all.joiner are being followed.

Look for your table names in src/hg/makeDb/schema/all.joiner and find the identifiers associated with those tables. Then, for each identifier, run joinerCheck like so:

 joinerCheck -keys -identifier=$identifier -database=$db all.joiner

Note that the above is run from all.joiner's home directory, kent/src/hg/makeDb/schema/; otherwise the location of all.joiner needs to be spelled out on the command line. If there are errors or the table is not mentioned in all.joiner, notify the track sponsor. An entry in the tablesIgnored section of all.joiner is sufficient if there are no table relationships to check.

Be aware of a known problem with joinerCheck. Basically, if you get output that looks like this:

 Checking keys on database hg18

that is NOT followed by lines like this:

 anoCar1.blastHg18KG.qName - hits 45332 of 45332 ok

then the rule didn't really run, and you need to remove the -database parameter.
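
The symptom described above can be detected mechanically. The following Python sketch is only an illustration: the function name is hypothetical, and the line patterns are taken from the two sample lines shown above rather than from the joinerCheck source.

```python
import re

# Matches result lines such as "... - hits 45332 of 45332 ok".
HITS_LINE = re.compile(r"- hits \d+ of \d+")


def joiner_rule_ran(output):
    """Return True if the output has the header line and at least one result line."""
    lines = output.splitlines()
    saw_header = any(
        line.strip().startswith("Checking keys on database") for line in lines)
    saw_hits = any(HITS_LINE.search(line) for line in lines)
    return saw_header and saw_hits


good = "Checking keys on database hg18\n anoCar1.blastHg18KG.qName - hits 45332 of 45332 ok\n"
bad = "Checking keys on database hg18\n"
print(joiner_rule_ran(good), joiner_rule_ran(bad))  # prints True False
```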

Comparison

Compare your track to a similar track if possible. Check that the features in your track are more or less in the same position as those in similar tracks. Look for a lot of items that have a chromStart or chromEnd that is one position different from existing tracks. This could indicate an off-by-one error in the new track. Also use getYield.csh and/or featureBits to compare tracks (discussed above in the FeatureBits and Gaps section).

Searching

If the track item names are relatively unique, check to see if search works by pasting an item name in the position/search box. (For example, to check if search is enabled for the "Common SNPs" track, choose a SNP from the track, say "rs17885219", paste it in the box, and hit "jump". If search is enabled, you will either be taken directly to the position of the item, and it will be highlighted in the display, or you will get a list of all of the tracks that contain the item, and your track should be included.) One way to find out if your track should be searchable is to use the assembly $db and table name $table in the following command:

hgsql -Ne "select searchName,searchTable,searchMethod,termRegex from hgFindSpec where searchTable like '%$table%';" $db

If you get an error when searching, or if you get a list of tracks but your track isn't included in the list, search is not enabled.

If the track item names are relatively similar (e.g., the items in RNA Genes, and TFBS) we don't want to enable search, as it would return too many matches. If search isn't enabled and you think it should be, make that request to the track sponsor.

Finally, if search is enabled, make sure that all of the item names in the track can be found. Do this by checking this page for your assembly and table:

http://genecats.soe.ucsc.edu/qa/test-results/checkHgFindSpec/hgwdevOutput

If your assembly and table appear in this list, there is a problem with searches for some identifiers in the track. You'll see an error message like this one:

 Error: mm9.jaxQtl.name value "Idd21.1" doesn't match termRegex "^[a-z0-9-]+$" for search jaxQtl

This means that searching for the item named "Idd21.1" in the jaxQtl (MGI QTL) track on mm9 (mouse) doesn't work, because the name does not match the termRegex of the search. Alert the track sponsor.
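
A quick way to see why a name fails is to test it against the termRegex yourself. The Python sketch below uses the name and regex from the error message above. The function name is hypothetical, and Python's re module is only a stand-in: the real check may use a different regex dialect or different case handling.

```python
import re


def names_failing_regex(names, term_regex):
    """Return the item names that do not match the given termRegex.

    Assumption: plain, case-sensitive Python matching is close enough to
    the real check for simple patterns like the one in the example.
    """
    pattern = re.compile(term_regex)
    return [name for name in names if pattern.search(name) is None]


# "Idd21.1" contains a "." that the character class [a-z0-9-] does not allow.
print(names_failing_regex(["Idd21.1", "idd21"], "^[a-z0-9-]+$"))  # prints ['Idd21.1']
```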

Track description

Read the track description and edit for clarity, spelling, and grammar. Be sure our conventions are followed.

Ensure references are in the correct format and in alphabetical order (by first author listed). Links to journal articles should go directly to the journal rather than PubMed if the journal article is open access (i.e., doesn't require a subscription). For articles that are not open access, links can go either to the journal or to PubMed, and they should go to the abstract, not the full text. To make life easier use the getTrackReferences program, where you feed the script a PMID and it outputs the html text. Example usage to get three references: getTrackReferences 24972169 26780180 26322839

Ensure quotes, ampersands, and less-than and greater-than signs are represented with their HTML entity names.
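
As a reminder of which entity names are meant, Python's standard html.escape function converts exactly these characters. This is only a sketch for checking a piece of plain text; it should not be run over a whole description page, since that would also escape the page's real HTML tags. The sample text is hypothetical.

```python
import html

# Hypothetical sentence from a track description, for illustration only.
text = 'p < 0.05 & "significant"'
print(html.escape(text))  # prints p &lt; 0.05 &amp; &quot;significant&quot;
```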

Make sure that any email addresses given on the details page have been through Hiram's sanitizer (encodeEmail.pl). It turns the address into an encrypted HREF "mailto:" address that makes it harder for spammers to use.

Ensure there is a Data Access Section

See this RM ticket for history and some historical example text. All new tracks should have a Data Access section with a link to our existing Track Data Access FAQ (the FAQ link helps capture new enhancements like the JSON API). Note that *wuhCor1* tracks are exempt according to Max 6/26/20 #25644.

The Data Access section should include links to the Table Browser, Data Integrator, REST API, and Variant Annotation Integrator (if applicable). See the following wiki page for examples.

All details: 1 data point

Choose a representative data point for the track. Check all details for this data point, including all links. Make sure information from the table is displaying correctly (e.g., if a color is used in the table, make sure that color appears for the item.)

For links that are hard-coded to a particular server, there are some tricks that are used to make them testable on hgwdev and hgwbeta. See the Static_Page_JS_Protocol page for more details.

One method of obtaining an item's data to check is to click the "View table schema" link and use the Sample Rows as entry coordinates to navigate to and review the data.

Another option is to use a MySQL query to pull the information from the table for an item you may be looking at and want to learn more about. Here is an example query followed by a general method:

hgsql -Ne 'select * from gold where frag like "%AMGL01129756.1%";' oviAri3
hgsql -Ne 'select * from (tableName -displayed in schema) where (fieldName -find appropriate field from schema) like "%(yourSearchTerm -from tableName.fieldName)%";' $db

Performance and Display

The full chromosome view (chr1) should display within 20 seconds. If it does not, use either of the statements "maxWindowCoverage <size>" or "maxWindowToDraw <size>". The first switches the track to coverage mode, which can be 10-100 times faster; the second switches off drawing entirely. The two can be combined. A track with a very large number of features (e.g., dbSNP) is not useful to draw at all at high zoom levels; it will just show as dark, even in dense mode. You can use coverage mode in most cases, except for extremely dense tracks where even coverage mode is slow.

It's good to not only check the chromosome view, but go to an exon, then do a 10x zoomout a few times. All zoom levels should be fast. For example, if you set "maxWindowCoverage 1000000" then the chromosome level will be fast, but the track can still be very slow at a window size of 999999. In this and most other cases, the maxWindowCoverage value should probably be lower.

You can use the 'measureTiming' cart variable to get accurate load times. To enable this timing option, add &measureTiming=0 to the end of your hgTracks URL. If you did it correctly, your URL should look like this: http://hgwdev.gi.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr21%3A33296501-33297001&measureTiming=0. The total time it took to load the page will be next to the label 'Overall total time:' somewhere on the page. You can deactivate this timing option by either doing a cartReset or adding 'measureTiming=', with no value specified to your hgTracks URL.
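
If you test many positions, the timing URL can be built programmatically. The Python sketch below reproduces the example URL above with the standard urllib.parse module. The function name timing_url is hypothetical.

```python
from urllib.parse import urlencode


def timing_url(base, db, position, enable=True):
    """Build an hgTracks URL that turns the measureTiming cart variable on or off.

    Per the text above, measureTiming=0 enables timing and an empty value
    (measureTiming=) disables it.
    """
    params = {
        "db": db,
        "position": position,
        "measureTiming": "0" if enable else "",
    }
    return base + "?" + urlencode(params)


url = timing_url("http://hgwdev.gi.ucsc.edu/cgi-bin/hgTracks",
                 "hg19", "chr21:33296501-33297001")
print(url)
```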

Set a track that is located directly below your track in the display to full display mode. Then, with your track also in full display mode, make sure that the items in the track below it still map correctly. Sometimes there can be an off-by-one error that is caused by your track. If this is happening, you should not push your track. This is only likely to be an issue with new track types.

Hit the Reverse button and ensure your track displays correctly.

In early 2015, we added a feature that displays the exon number on mouseover/hover. This is on by default for a few different track types. If your track displays the exon number on mouseover, be sure that it makes sense for it to do so. If it doesn't, add "exonNumbers off" to the trackDb stanza for your track.

Check Data in Tools

Be sure to check that the track works in the various tools. This is especially important when the track involves a new data type:

Check Data in API

If appropriate (not VCF or bigGenePred), check that the data works on the API. The API can be queried on dev using the following url:

https://api-test.gi.ucsc.edu/

Refer to the help page for examples and commands (http://genome-asia.ucsc.edu/goldenPath/help/api.html), but generally you will want to make sure that your track shows up when queried, for example looking at all tracks in bosTau9:

https://api-test.gi.ucsc.edu/list/tracks?genome=bosTau9

And if the track is in one of the supported formats, see that data can be properly extracted:

https://api-test.gi.ucsc.edu/getData/track?genome=hg38;track=genscan;chrom=chr1

Track Settings (hgTrackUi)

Ensure that track settings work as expected.

You can adjust the track settings in one of two ways: by clicking on the track name or the mini-button to the left of the track (in hgTracks), or by right-clicking the track and selecting "Configure <track name>". When using the right-click menu to adjust the track's settings, you can immediately view your changes by using the "Apply" button.

Request (filter activated) label if appropriate

If the track under review has filtering options, please request that engineers add the (filter activated) label option (tell engineers to see 29792e6f4b5ef for examples of where MarkD added this to long labels for mRNA and EST tracks if filters are configured). Example Dev session where mRNA and ESTs display "(filter activated)" as the tracks are filtered to include tissue: brain.

Table Descriptions

Hit the "view table schema" button (on hgTrackUi, hgc, or hgTables) and make sure there is a description column present with descriptions of the table fields. If a track has more than one table, be sure to check for table descriptions on each of them. The description column uses the 'tableDescriptions' table to display this information.

The descriptions in the 'tableDescriptions' table are built from autoSql (or .as files), and are added to that table via the script buildTableDescriptions.pl (kent/src/test/buildTableDescriptions.pl). This script first looks for .as files that match a database table by name or trackDb type. If it doesn't find a matching .as that way, it can match .as with table by comparing the set of fields in a .as versus the set of fields in the database table. This means that if one .as has the same fields as several database tables, the .as name doesn't matter -- the script will still match them up. The 'tableDescriptions' table is built nightly on hgwdev and hgwbeta, and must be pushed to the RR if it contains descriptions for a new type of table (things like psl will already be out there). A cron job sends a push request email to the admins to push the tableDescriptions tables to the RR once a week.

Having as few .as files as possible is a good thing because duplicated content is harder to maintain, and hg/lib is already enormous so it's good to reduce the number of new files. To help reduce the number of files added to hg/lib, try to do the following when QA-ing a new track:

  • Check tables that are similar to your track's tables to see if there are any where all of the fields are identical to yours
  • Check kent/src/hg/lib/ for any .as files that have the same name as yours, or may have been created for your track's tables
  • If there are new .as files for your track's tables that are unnecessary, notify the track sponsor that these should be removed

More information on the tableDescriptions table and autoSql can be found here:


Be sure to check the following: If there is another table like the one you are reviewing that has a different schema, be sure that the track type is also different (i.e., don't use the same track type name for tables with two different schemas). For example, see these two tables in the hg19 database:

mysql> select tableName, type from trackDb where tableName like "wgEncodeRegTfbsClusteredV%";
+----------------------------+--------------+
| tableName                  | type         |
+----------------------------+--------------+
| wgEncodeRegTfbsClusteredV3 | factorSource |
| wgEncodeRegTfbsClusteredV2 | factorSource |
+----------------------------+--------------+
2 rows in set (0.04 sec)

And note that although they both use type = factorSource, the schemas are different. This is not OK.

mysql> desc wgEncodeRegTfbsClusteredV3;
+------------+----------------------+------+-----+---------+-------+
| Field      | Type                 | Null | Key | Default | Extra |
+------------+----------------------+------+-----+---------+-------+
| bin        | smallint(5) unsigned | NO   |     | NULL    |       |
| chrom      | varchar(255)         | NO   | MUL | NULL    |       |
| chromStart | int(10) unsigned     | NO   |     | NULL    |       |
| chromEnd   | int(10) unsigned     | NO   |     | NULL    |       |
| name       | varchar(255)         | NO   | MUL | NULL    |       |
| score      | int(10) unsigned     | NO   |     | NULL    |       |
| expCount   | int(10) unsigned     | NO   |     | NULL    |       |
| expNums    | longblob             | NO   |     | NULL    |       |
| expScores  | longblob             | NO   |     | NULL    |       |
+------------+----------------------+------+-----+---------+-------+
9 rows in set (0.00 sec)

mysql> desc wgEncodeRegTfbsClusteredV2;
+-------------+----------------------+------+-----+---------+-------+
| Field       | Type                 | Null | Key | Default | Extra |
+-------------+----------------------+------+-----+---------+-------+
| bin         | smallint(5) unsigned | NO   |     | NULL    |       |
| chrom       | varchar(255)         | NO   | MUL | NULL    |       |
| chromStart  | int(10) unsigned     | NO   |     | NULL    |       |
| chromEnd    | int(10) unsigned     | NO   |     | NULL    |       |
| name        | varchar(255)         | NO   | MUL | NULL    |       |
| score       | int(10) unsigned     | NO   |     | NULL    |       |
| strand      | char(1)              | NO   |     | NULL    |       |
| thickStart  | int(10) unsigned     | NO   |     | NULL    |       |
| thickEnd    | int(10) unsigned     | NO   |     | NULL    |       |
| reserved    | int(10) unsigned     | NO   |     | NULL    |       |
| blockCount  | int(10) unsigned     | NO   |     | NULL    |       |
| blockSizes  | longblob             | NO   |     | NULL    |       |
| chromStarts | longblob             | NO   |     | NULL    |       |
| expCount    | int(10) unsigned     | NO   |     | NULL    |       |
| expIds      | longblob             | NO   |     | NULL    |       |
| expScores   | longblob             | NO   |     | NULL    |       |
+-------------+----------------------+------+-----+---------+-------+
16 rows in set (0.00 sec)
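
The comparison above (same type, different schema) can be expressed as a small Python check. This is a sketch only: the function name and the input layout are hypothetical, and the field lists are copied from the two "desc" outputs above.

```python
def same_type_different_schema(tables):
    """tables maps tableName -> (trackDb type, list of field names).

    Return pairs of tables that share a type but do not share a schema.
    """
    names = sorted(tables)
    clashes = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            type_a, fields_a = tables[first]
            type_b, fields_b = tables[second]
            if type_a == type_b and list(fields_a) != list(fields_b):
                clashes.append((first, second))
    return clashes


tables = {
    "wgEncodeRegTfbsClusteredV3": ("factorSource", [
        "bin", "chrom", "chromStart", "chromEnd", "name", "score",
        "expCount", "expNums", "expScores"]),
    "wgEncodeRegTfbsClusteredV2": ("factorSource", [
        "bin", "chrom", "chromStart", "chromEnd", "name", "score", "strand",
        "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
        "chromStarts", "expCount", "expIds", "expScores"]),
}
print(same_type_different_schema(tables))
```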

Is this a default track?

If it is a default track, then it needs to be added to the GBiB browserbox script and to hgMirror. Check that the engineer has done this.

For reference here is the git commit where Chris Lee added gtexV8:

> https://genecats.gi.ucsc.edu/git-reports-history/v418/branch/user/chmalee/index.html

Should there be related tracks?

Consider whether the track is related to any other tracks, or should otherwise be promoted alongside another track using the related tracks feature: https://redmine.soe.ucsc.edu/issues/27550

Track type-specific QA

Certain track types, such as SNP or Conservation tracks, have additional QA steps that are specific to that track type.

Here is a list of track type-specific QA wiki pages:

If the track you're QA-ing is one of these track types, look over the wiki page and ensure you've carried out these additional QA steps.

Downloads

All tables are automatically populated onto hgDownloads through a SQL Table Dump every Sunday. They go into the "database" directory for that assembly.

If there are (big*) files associated with the track to be pushed to hgdownload, check that there is a README that makes sense and that the files have an accompanying md5sum.txt file that is correct. Check the file itself to make sure it is not corrupted and that it contains what is expected. If it is a gzipped file, you can do "zcat file.gz | head" and "zcat file.gz | tail" to look at it. Looking at the last part of the file can sometimes catch corruption that can't be seen by only looking at the first part.
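
Checking that md5sum.txt is correct can be done with the standard "md5sum -c md5sum.txt" command, or with a short Python sketch like the one below. The function names are hypothetical, and the sketch assumes the usual md5sum layout of one "checksum  filename" pair per line, with the files in the same directory as md5sum.txt.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sum_file(md5sum_path):
    """Return the file names that are missing or whose checksum does not match."""
    directory = os.path.dirname(os.path.abspath(md5sum_path))
    failures = []
    with open(md5sum_path) as handle:
        for line in handle:
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            expected = parts[0].lower()
            name = parts[1].strip().lstrip("*")  # "*" marks binary mode in md5sum output
            path = os.path.join(directory, name)
            if not os.path.isfile(path) or md5_of_file(path) != expected:
                failures.append(name)
    return failures
```

An empty list from verify_md5sum_file means every listed file exists and matches its checksum.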

Data files destined for hgdownload are organized on hgwdev at:

/usr/local/apache/htdocs-hgdownload/goldenPath/*

and can be viewed in a browser from http://hgdownload-test.soe.ucsc.edu/downloads.html. Non-data files (such as downloads.html) are in the "hgdownload" git repository. See the Static_Page_Protocol for instructions on checking out that repository. Push requests for downloads should look something like this:

Please push files from here on hgwdev:
    /usr/local/apache/htdocs-hgdownload/goldenPath/$db/file
To here on hgdownload:
    /usr/local/apache/htdocs/goldenPath/$db/file
(in the path, "htdocs-hgdownload" should become "htdocs") 
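
The path rule in the template above ("htdocs-hgdownload" becomes "htdocs") is easy to get wrong by hand when listing many files. Here is a minimal Python sketch of the mapping. The function name and the example file path are hypothetical.

```python
DEV_PREFIX = "/usr/local/apache/htdocs-hgdownload/"
DOWNLOAD_PREFIX = "/usr/local/apache/htdocs/"


def hgdownload_destination(dev_path):
    """Map a path on hgwdev to the matching destination path on hgdownload."""
    if not dev_path.startswith(DEV_PREFIX):
        raise ValueError(f"not under {DEV_PREFIX}: {dev_path}")
    return DOWNLOAD_PREFIX + dev_path[len(DEV_PREFIX):]


# Hypothetical file, for illustration only.
print(hgdownload_destination(
    "/usr/local/apache/htdocs-hgdownload/goldenPath/hg38/example/file.bb"))
```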

Finally, check to see if the downloads files ought to have a link from downloads.html. If so, add the link and push downloads.html (after the files are already pushed!). NOTE: If you are pushing ENCODE tracks, when using a second/third/fourth version of the data there is often a "releaseLatest" directory that has the latest files. Be sure that you are not pushing the entire releaseLatest directory, only the files from there. Be sure to add a helpful sentence in your push-request to tip off the admin about this unusual push. See related page Push-Request_Etiquette.

Archive

Check to see if the track (both previous and current) should be archived. See RM for more: https://redmine.soe.ucsc.edu/issues/21825#note-56

Push to hgwbeta

If it's a bigBed based track, push to beta with the gbdbPush script:

 sudo gbdbPush

If it's a table based track, push all tables (EXCEPT seq and extFile tables, see note below) from hgwdev to hgwbeta:

 sudo mypush $db $table hgwbeta

Alternatively, you can use the 'bigPush.sh' script to push multiple tables for one or more assemblies:

 bigPush.sh $db $tableListFile
 OR
 bigPush.sh $dbList $tableListFile
  • If bigPush doesn't work for any reason, replicate it like so:
 ~$ cat trioTables | xargs -I % sh -c 'sudo mypush hg38 % hgwbeta'
  • Note: you can create a table list from the tableList.txt provided in the /hive/data/genomes/ directory for your organism, using the following sed command (the pattern is quoted and the dot escaped so that only the literal "dm6." prefix is removed):
 sed 's/dm6\.//g' dm6.124way.tableList.txt > dm6final

You might need to change release tags on trackDb.ra to make your track appear on beta and public if it has some tag limiting release. The symptoms might be a broken TrackUi page or incomplete Table Browser tables. Proper release on all 3 servers should look like the following and be git added, committed, and pushed before making.

 include joinedRmskComposite.ra

Next, make beta on hgwdev in kent/src/hg/makeDb/trackDb like so:

 make beta DBS=$db

or

 make beta DBS="$db1 $db2 $db3 $db4"




Running multiple dbs in parallel to save time

Multiple assemblies can be run in parallel by using the make -j option (as of 2/10/17, thanks to Mark Diekhans). Updating all dev dbs used to take about 50 minutes, and now it can take about 5 minutes (at 16 in parallel). While Mark has safely run 16 dbs at a time on dev, it is recommended to run only 8 or fewer at a time on beta or the RR. Use make -j N beta and make -j N public, where N is the number of parallel processes to run (for example, make -j 16 alpha runs 16).

For example, if you do:
  make -j 8 alpha
it updates everything, 8 at a time. If you do:
  make -j 2 DBS="hg19 hg38 mm10 felCat5"
it updates those 4 databases, 2 at a time.
Note: the 'make in parallel' process creates and removes temporary files. The temporary directory is chosen by getTempDir, declared in:
 kent/src/inc/portable.h:
   char *getTempDir(void);
   /* get temporary directory to use for programs.  This first checks TMPDIR environment
    * variable, then /data/tmp, /scratch/tmp, /var/tmp, /tmp.  Return is static and
    * only set of first call */

Examples:

 make beta -j 4 DBS="dm6 ce11 sacCer3 droEre1 droSec1 droSim1 droYak2 droAna2 dp3 droMoj2 droVir2 droGri1 droPer1"
 make public -j 4 DBS="dm6 ce11 sacCer3 droEre1 droSec1 droSim1 droYak2 droAna2 dp3 droMoj2 droVir2 droGri1 droPer1"



The QA team can push directly to hgwbeta's /gbdb location with the gbdbPush script.

 sudo /cluster/bin/scripts/gbdbPush

The script gbdbPush will ask QAers for a list of files to push (just like the list you send to cluster-admin in push e-mails, so be sure you get the path right). You enter the list, one file at a time, and end by entering a "."

Be sure to send a push request to have the gbdb files pushed to hgdownload in advance of the usual Sunday sync, if this is necessary for your track. If there are images associated with any track description pages, be sure to run a make beta from within kent/src/hg/htdocs/ to get the images to beta.


Note for Ensembl gene tracks:

  • Before pushing to beta, use the Ensembl_QA script.

Special case for 'seq' and 'extFile' and 'trackVersion' tables:

  • Do not push seq or extFile tables from dev to beta. You must use the copyExtSeqRows.csh script to move only the rows needed. More information can be found in this section on the Conservation track QA page about extFile and seq tables.
  • The hgFixed.trackVersion table needs special attention; see the note in the Ensembl_QA page.

Notes on existing tracks:

  • If this is an update to an existing track, you may want to hold off on this step so that you can compare old and new tracks on hgwdev and hgwbeta.
  • Open the track on hgwbeta before staging it to make sure that the update won't cause a cart clash for users currently looking at the track (as evidenced by a completely blank screen, for instance). If you need to do a cartReset to get the track to show up correctly, something is wrong.

Remove release tag for big*/vcf track types

Once you verify that the track looks good on hgwbeta, remove the release tag from trackDb.ra.
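
To find the stanzas that still carry a release tag, something like the following could be used. It is a sketch that assumes the usual .ra layout of a "track NAME" line followed by "setting value" lines; list_release_tags is a hypothetical name.

```shell
# Sketch: list each stanza that still carries a release tag.
# Assumes the usual .ra layout of "track NAME" followed by
# "setting value" lines; reads the files given as arguments, or stdin.
list_release_tags() {
    awk '
        $1 == "track"   { name = $2 }
        $1 == "release" { print name "\t" $2 }
    ' "$@"
}

# Example: list_release_tags trackDb.ra
```

Each output line is a track name and its release value separated by a tab; a track that no longer appears has had its tag removed.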

Make Public

Announce to Browser QA that you are going to do a make public.

Make your track public by using the "make public" command on hgwdev while in the trackDb directory (src/hg/makeDb/trackDb):

   [user@hgwdev trackDb]$ make public DBS=$db

Your track should now be visible on the hgwbeta-public server.

If your track is not visible, you may want to check that your track has the correct release tag. Also see [Three State TrackDb] for more information.

Pennant Icons

Add a "New" or "Updated" pennantIcon in trackDb for your track. The icon links to the newsarch anchor; if you have no newsarch item, you still need to supply a URL if you want the release-date text to appear. These go out with the TrackDb and Friends Push. Here are copy-paste-edit examples:

pennantIcon New red ../goldenPath/newsarch.html#110320 "Released Nov. 3, 2020"
pennantIcon Updated red ../goldenPath/newsarch.html#042920 "Updated Apr. 29, 2020"

If you want to do two icons, use a semicolon like so:

pennantIcon p12 black http://genome.ucsc.edu/blog/patches/ "Includes annotations on GRCh38.p12 patch sequences"; Updated red ../goldenPath/newsarch.html#052120 "Updated May. 21, 2020"
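
The single-icon lines follow a regular pattern, so they can be generated from a date. The helper below is a hypothetical sketch: the MMDDYY anchor and the "Mon. D, YYYY" text are inferred from the examples above, and the matching anchor still has to exist in newsarch.html.

```shell
# Hypothetical helper: build a pennantIcon line in the format shown above
# from a label (New or Updated) and an ISO date (YYYY-MM-DD).
# The anchor is MMDDYY and the text is "Mon. D, YYYY", both inferred
# from the examples; the newsarch.html anchor must exist separately.
pennant_line() {
    label=$1
    year=${2%%-*}
    rest=${2#*-}
    month=${rest%%-*}
    day=${rest#*-}
    case $month in
        01) mon=Jan ;; 02) mon=Feb ;; 03) mon=Mar ;; 04) mon=Apr ;;
        05) mon=May ;; 06) mon=Jun ;; 07) mon=Jul ;; 08) mon=Aug ;;
        09) mon=Sep ;; 10) mon=Oct ;; 11) mon=Nov ;; 12) mon=Dec ;;
        *) echo "bad month in date: $2" >&2; return 1 ;;
    esac
    case $label in
        New) verb=Released ;;
        Updated) verb=Updated ;;
        *) echo "label must be New or Updated" >&2; return 1 ;;
    esac
    # ${year#??} drops the century; ${day#0} drops a leading zero.
    printf 'pennantIcon %s red ../goldenPath/newsarch.html#%s%s%s "%s %s. %s, %s"\n' \
        "$label" "$month" "$day" "${year#??}" "$verb" "$mon" "${day#0}" "$year"
}

pennant_line New 2020-11-03
```

Calling it with New and 2020-11-03 reproduces the first example line above.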

Push Requests (Data via email; TrackDb via QA script)

NOTE: If you are just pushing gbdb files, you can push them from hgwbeta to the RR by sshing into hgwbeta as qateam and running:
sudo /root/gbdbPush

The steps below are kept for historical reasons, or if you need to push tables.

When the track is ready to be released, you must send a push request for the gbdb data, then run the trackDbPush script for the assembly or assemblies.

Send a "Push Gbdb files" email to push-request <at> soe <dot> ucsc <dot> edu (and cc the developer) to push:

  • data tables / gbdb files from hgwbeta to mysqlrr.
  • associated downloads files
  • associated static docs, including images.

THEN, you can push trackDb directly to the RR with the trackDbPush script.

trackDbPush

The script will ask you to enter your username, so the logs will track who did what. It will then ask for a list of DBs, which you type in one per line and end with a '.'

OUTDATED BELOW. We used to ask admins to do a trackDb push-request:

Push trackDb and Friends for that assembly. See related page Push-Request_Etiquette.

This doesn't happen very often, but if the track already exists on the RR you may need to selectively push the tables/trackDb. Refer to Replacing old tables with new ones for more information. If this is a repush of existing data that was found to be problematic, add a note to repush.html. The file to edit is in the genecats tree, at genecats/qa/repush.html.

If the push involves pushing a database, please ask the admins to include genome-euro (their scripts are not expecting databases), and check on genome-euro after the final release.

Validate on the RR

Check your track on the RR. Check that searches work (if not, you probably need to push the hgFindSpec table). Also, check that all default tracks still display. If it has not been completed for you, fill in the "Release Log Text" field in the Redmine ticket. Check the next day to be sure that the entry shows up in the release log and, if you filled in the "Release Log URL" field in the Redmine ticket, that the link works as expected.

Archive Tracks if Tagged

Some tracks will be tagged to be archived. A script called archiveTracks.csh (similar to bigPush.sh) will take the name of a table or a file (or a list of tables and a list of files) to trigger an archive of the data. More to come, but the general gist will be to run this command for those tracks.

Announce on indexNews, newsArch, Genome-announce, FB, Twitter

Edit the two html pages, commit, make, validate on Beta, push request, validate on RR.

vim ~/kent/src/hg/htdocs/indexNews.html
vim ~/kent/src/hg/htdocs/goldenPath/newsarch.html

If you want to add an image to newsarch.html, put it in htdocs/images as a PNG and make sure it is under 2.2Mb; otherwise it will cause Git problems (large image files are also unnecessary and load slowly on slower connections).

  • If you send an image in an email to genome-announce use smaller images: 600x400 keeping size around 140KB (think screenshot). Use the JPEG format.
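
The newsarch.html image limit can be checked before committing. This is a minimal sketch with a hypothetical function name; it assumes "2.2Mb" means 2.2 * 1024 * 1024 bytes and only looks at the file extension, not the actual image format.

```shell
# Sketch: check an image against the newsarch.html guidance above.
# Assumption: "2.2Mb" is read as 2.2 * 1024 * 1024 bytes (2306867).
# Only the file extension is checked, not the image contents.
check_news_image() {
    file=$1
    limit=2306867
    case $file in
        *.png|*.PNG) ;;
        *) echo "not a .png file: $file" >&2; return 1 ;;
    esac
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 1
    fi
    size=$(wc -c < "$file")
    size=$((size))   # normalize any padding that wc may add
    if [ "$size" -ge "$limit" ]; then
        echo "too large ($size bytes): $file" >&2
        return 1
    fi
    echo "ok ($size bytes): $file"
}

# Example: check_news_image ~/kent/src/hg/htdocs/images/myTrack.png
```

The file name in the example is illustrative only.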

See announcement example. Be sure to get it proofread.

Also post a FB and Twitter update. See info here. Be sure that credits.html has proper information and, if space permits in the tweet, include tags like @SoAndSosLab. MarkD notes, "While most users don't care [about credits.html/source attribution], they are very important for the people developing the data, as this helps them demonstrate they have been doing good stuff with their grant money."

If you want to use a saved session for your announcement, Dan recommends you don't use a personal account and instead use the "view" account. A short and clear URL is nice. Here's where you find the password:

ssh qateam@hgwdev
grep -A 1 "View" googleHangoutStepsPassword