New track checklist

Revision as of 01:20, 5 October 2012

This is the checklist for QA to follow when releasing Genome Browser tracks. General rule: stop testing when you come to the first non-trivial error, and bounce the track to the B queue until the track sponsor fixes it.

Claiming a track: pushQ and redmine

Once you've decided to QA a track you will need to put your name down as the "reviewer" in the related pushQ entry:

http://hgwbeta.cse.ucsc.edu/cgi-bin/qaPushQ

Note that you will need to check and/or "mark as checked" the following pushQ items:

  • sizes: click the "show sizes" button to check that the files and tables associated with the track have sizes that make sense
  • makedoc: mark this as "yes" once you've checked the makedoc (see below)
  • joiner: mark this as "yes" once you've run joinerCheck (see below) or "X" if this track is ignored
  • index: we no longer check this

Also assign the redmine ticket to yourself and set the status to "reviewing". For more info on the redmine workflow see this wiki: http://redmine.soe.ucsc.edu/projects/genomebrowser/wiki/Track

Background familiarity

If you haven't worked with a track like this one, and it already exists on the RR for another assembly (use 'getAssemblies.csh tablename' on hgwdev to check), use it a bit on the RR to become familiar with it. For complex tracks, look for genomewiki pages on it, and/or look through the push queue for previous problems encountered on similar tracks. Familiarize yourself with the tables and files needed for this track.

Makedoc

Look for an entry for your track in src/hg/makeDb/doc/$db.txt. If there is no mention of this track, make a request to the track sponsor. Basic tracks that are loaded automatically with a new assembly may only have a tiny reference in the makedoc. UCSC Genes has its own document.

All Ensembl tables are generated by the Ensembl update procedure which is maintained in doc/makeEnsembl.txt without mention of explicit tracks or tables.

  • The following tables don't need to be referenced in the makedoc: chromInfo, gap, gc5BaseBw, gold, grp, hgFindSpec, history, trackDb

Release Log and URL

The "Release Log" field in the push queue should generally contain the shortLabel of the track, or a short description of what is being released if it does not make sense to put a shortLabel here. The entry in this field will be added to the auto-generated release log page. If it makes sense to link to the hgTrackUi page for this track, make an entry in the "Release Log URL" field like so:

 ../../cgi-bin/hgTrackUi?db=[DATABASE]&g=[TRACK]

An actual example:

 ../../cgi-bin/hgTrackUi?db=hg19&g=affyU133
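
The URL pattern above can be filled in mechanically. Here is a minimal Python sketch; the function name release_log_url is made up for illustration and is not part of the push queue or any existing QA script.

```python
from urllib.parse import urlencode


def release_log_url(db: str, track: str) -> str:
    """Build the relative hgTrackUi link for the "Release Log URL" field.

    Hypothetical helper: it only fills in the [DATABASE] and [TRACK]
    placeholders of the pattern given in this checklist.
    """
    # urlencode escapes any character that is not safe in a query string.
    return "../../cgi-bin/hgTrackUi?" + urlencode({"db": db, "g": track})


print(release_log_url("hg19", "affyU133"))
```

For the example above this reproduces the hg19/affyU133 link exactly.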

The BIG QA Script: qaGbTracks

You can use qaGbTracks to run many of the QA scripts in one go, including:

 checks for underscores in table names
 checks for the existence of table descriptions
 checks shortLabel and longLabel length
 positionalTblCheck
 checkTblCoords
 genePredCheck
 pslCheck
 featureBits
 (a version of) countPerChrom

Tables are sorted and internally consistent

Run any of these that apply to your tables:

genePredCheck - Checks that tables in the genePred format are valid. (Tables of this type are usually in the Genes and Gene Prediction group. Usually only the primary table will need this check. Some examples: knownGene, ensGene, refGene.)

pslCheck - Checks that psl tables are valid. (PSL tables show up in nearly any track where alignments are used. Some examples: mrna, est, refSeqAli.)

checkTableCoords - Checks that the genomic coordinates in positional tables are legal (e.g., coordinates are not off the end of a chromosome). If the table passes there will be no output. (Almost every table will need this check, unless it has no chromosome positions in it.)

positionalTblCheck - Checks to see that a positional table is ordered by chrom and chromStart. A positional table being out of order can cause a huge slowdown in display speed. If the table passes there will be no output. If the table does not pass there will be an error message like:

table hg18.snp129 not sorted starting at row 4867: chr1:387005

Alert the track sponsor if there is an error. He/she may determine that the items are sorted enough to be released. (As long as the items are almost all in order, it will not affect performance.) Also note that Genbank tables are not expected to be in order after updates have run. (Almost every table will need this check.)
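
To make the last two checks concrete, here is a rough Python sketch of what "legal coordinates" and "ordered by chrom and chromStart" can mean for rows already read out of a table. It is not the source of checkTableCoords or positionalTblCheck, and the exact rules are assumptions: rows for a chrom are expected to be contiguous, chromStart non-decreasing within a chrom, and coordinates within 0..chromosome size.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, int, int]  # (chrom, chromStart, chromEnd)


def first_unsorted_row(rows: Iterable[Row]) -> Optional[Tuple[int, str, int]]:
    """Return (row index, chrom, chromStart) of the first out-of-order row.

    Assumed rule: all rows of a chrom are contiguous, and chromStart never
    decreases within a chrom. Returns None if the rows pass.
    """
    seen = set()
    prev_chrom, prev_start = None, -1
    for i, (chrom, start, _end) in enumerate(rows):
        if chrom != prev_chrom:
            if chrom in seen:  # chrom shows up again after another chrom
                return (i, chrom, start)
            seen.add(chrom)
        elif start < prev_start:  # chromStart went backwards
            return (i, chrom, start)
        prev_chrom, prev_start = chrom, start
    return None


def illegal_coords(rows: Iterable[Row], chrom_sizes: Dict[str, int]) -> List[Row]:
    """Return rows on an unknown chrom or with coordinates outside the chrom."""
    bad = []
    for row in rows:
        chrom, start, end = row
        size = chrom_sizes.get(chrom)
        if size is None or start < 0 or end > size or start > end:
            bad.append(row)
    return bad
```

Like the real tools, both helpers report nothing (None or an empty list) when the table passes.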

joinerCheck

The document kent/src/hg/makeDb/schema/joiner.doc describes what all.joiner is. joinerCheck is the tool that is used to check that the rules in all.joiner are being followed.

Look for your table names in src/hg/makeDb/schema/all.joiner and find the identifiers associated with those tables. Then, for each identifier, run joinerCheck like so:

 joinerCheck -keys -identifier=$identifier -database=$db all.joiner

If there are errors or the table is not mentioned in all.joiner, notify the track sponsor. An entry in the tablesIgnored section of all.joiner is sufficient if there are no table relationships to check.

Be aware of this problem with joinerCheck. Basically, if you get output that looks like this:

 Checking keys on database hg18

that is NOT followed by lines like this:

 anoCar1.blastHg18KG.qName - hits 45332 of 45332 ok

then the rule didn't really run, and you need to remove the -database parameter.
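
If you save the joinerCheck output to a file, the "did the rule really run" test can be scripted. This is a small Python sketch that relies only on the two line shapes quoted above; the function name and the regular expression are illustrative assumptions, not part of joinerCheck or runJoiner.csh.

```python
import re

# Matches result lines such as "... - hits 45332 of 45332 ok".
HITS_LINE = re.compile(r"\bhits \d+ of \d+\b")


def rule_actually_ran(output: str) -> bool:
    """True if a "Checking keys on database" line is followed by a hits line."""
    seen_header = False
    for line in output.splitlines():
        if line.strip().startswith("Checking keys on database"):
            seen_header = True
        elif seen_header and HITS_LINE.search(line):
            return True
    return False
```

If this returns False for your output, rerun joinerCheck without the -database parameter as described above.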

You can also run joinerCheck with the -times flag:

 joinerCheck -times -database=$db all.joiner

Look for any errors that are relevant to your track.

The runJoiner.csh script is a shortcut for all of the above, but beware that wildcards in tablesIgnored sections are not recognized, and if the problem above occurs, then you need to run joinerCheck directly.

Chromosome coverage

Check the count of items on each chromosome. Bigger chromosomes (the biggest is usually chr1) should have more items. Look for chromosomes that have suspiciously few or no items on them. Note: this script must be run on dev:

 countPerChrom.csh $db $table

or as a histogram:

 countPerChrom.csh $db $table histogram

Example:

 countPerChrom.csh hg19 refGene histogram

If a "regular" chrom (not random or haplotype) has no data, a note like this should appear in place of the track drop-down on hgTracks:

 [No data-chr21]

(This is controlled by the "chromosomes" setting in trackDb.ra. The chromosomes line in trackDb.ra specifies the chromosomes that DO have data. This line is the same as the "restrictList" field in the trackDb table.)
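
The same per-chromosome check can be sketched in Python for a list of chrom values pulled from a table. This is not what countPerChrom.csh does internally. It also assumes that random and haplotype chromosome names contain an underscore (e.g., chr1_random), which is a naming convention and not something this page states.

```python
from collections import Counter
from typing import Iterable, List


def count_per_chrom(chroms: Iterable[str]) -> Counter:
    """Count table items per chromosome name."""
    return Counter(chroms)


def regular_chroms_without_items(counts: Counter, all_chroms: Iterable[str]) -> List[str]:
    """List "regular" chroms that have no items at all.

    Assumption: names containing "_" are random/haplotype chroms and are
    skipped, since an empty one of those is not suspicious.
    """
    return [c for c in all_chroms if "_" not in c and counts.get(c, 0) == 0]


counts = count_per_chrom(["chr1", "chr1", "chr2"])
print(regular_chroms_without_items(counts, ["chr1", "chr2", "chr21", "chr1_random"]))
```

Any chromosome this reports should either show the [No data-...] note on hgTracks or be raised with the track sponsor.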

OR use these alternative ways to see chromosome coverage:

  • Import the table into Genome Graphs.
  • Select the table in the Table Browser, hit the "describe table schema" button, and click the "values" link for the chrom field.

Comparison

Compare your track to a similar track if possible. Look to see that the features in your track are more or less in the same position as similar tracks. Look for a lot of items that have a chromStart or chromEnd that is one position different from existing tracks. This could indicate an off-by-one error in the new track. Also use getYield.csh and/or featureBits to compare tracks (discussed next).
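
One way to look for the off-by-one pattern is sketched below in Python. It assumes both tracks have been reduced to (chrom, chromStart, chromEnd) tuples. It flags items in the new track whose start or end is exactly one base away from a start or end in the existing track, with no exact match. The function is illustrative only and is not an existing QA script.

```python
from typing import Iterable, List, Tuple

Item = Tuple[str, int, int]  # (chrom, chromStart, chromEnd)


def off_by_one_items(new_items: Iterable[Item], ref_items: Iterable[Item]) -> List[Item]:
    """Return new-track items that sit one base off a reference boundary."""
    ref_items = list(ref_items)
    ref_starts = {(chrom, start) for chrom, start, _end in ref_items}
    ref_ends = {(chrom, end) for chrom, _start, end in ref_items}
    flagged = []
    for chrom, start, end in new_items:
        start_exact = (chrom, start) in ref_starts
        end_exact = (chrom, end) in ref_ends
        start_near = any((chrom, start + d) in ref_starts for d in (-1, 1))
        end_near = any((chrom, end + d) in ref_ends for d in (-1, 1))
        # Near a reference boundary, but not exactly on one.
        if (start_near and not start_exact) or (end_near and not end_exact):
            flagged.append((chrom, start, end))
    return flagged
```

A handful of flagged items is expected by chance in dense tracks; a large share of the track being flagged is what suggests a systematic off-by-one error.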

FeatureBits and Gaps

Run featureBits, or use the runBits.csh script to run featureBits. runBits.csh checks for coverage and overlap with gap, and also checks for unbridged gaps.

 runBits.csh $db $table

Alert the track sponsor if there are unbridged gaps and this is a track created at UCSC. Put featureBits results in the push queue. If the previous assembly also has this track, compare featureBits between the current assembly and the previous assembly -- if there are big differences between the old and new tracks, alert the track sponsor.

If there is a similar track that this one can be compared to, use either "featureBits -enrichment" or getYield.csh to compare the tracks. Alert the track sponsor if the difference seems unreasonable.

getYield.csh can be used to see how well a new track captures the footprint of existing tracks, such as refGene or xenoRefGene, e.g.,

 getYield.csh hg19 ensGene refGene

output includes:

 yield       = 93.8% (intersection / refGene)
 enrichment  = 24.7x ((intersection / ensGene) / (refGene / genome))

This shows that 93.8% of refGene is present in ensGene and that, compared to the refGene footprint on the genome, ensGene is 24.7x enriched for refGenes. (Enrichment is the amount of table1 that covers table2 vs. the amount of table1 that covers the genome. It's how much denser table1 is in table2 than it is genome-wide.)
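
The two formulas in the getYield.csh output are plain ratios of base counts, so they are easy to recompute by hand. The Python sketch below does this from four numbers: bases in the intersection, in table1, in table2, and in the genome. The function and the sample numbers are made up for illustration. The third value is the first-table coverage that this section attributes to "featureBits -enrichment".

```python
def yield_and_enrichment(intersection: int, table1: int, table2: int, genome: int):
    """Recompute the ratios printed by getYield.csh from base counts.

    table1 is the new track (ensGene in the example) and table2 is the
    track it is compared against (refGene in the example).
    """
    # yield = intersection / table2
    yield_fraction = intersection / table2
    # enrichment = (intersection / table1) / (table2 / genome)
    enrichment = (intersection / table1) / (table2 / genome)
    # featureBits -enrichment divides by the first table instead.
    first_table_coverage = intersection / table1
    return yield_fraction, enrichment, first_table_coverage


# Hypothetical base counts, not real hg19 numbers.
y, e, c = yield_and_enrichment(intersection=40, table1=100, table2=50, genome=1000)
print(f"yield = {y:.1%}, enrichment = {e:.1f}x, first-table coverage = {c:.1%}")
```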

"featureBits -enrichment" does almost the exact same thing as getYield.csh, except the coverage amount is the number of bases in the intersection of the two tables divided by the first table instead of the second table.

Searching

If the track item names are relatively unique, check to see if search works by pasting an item name in the position/search box. (For example, to check if search is enabled for the "Common SNPs" track, choose a SNP from the track, say "rs17885219," paste it in the box, and hit "jump.") If search is enabled, you will either be taken directly to the position of the item, and it will be highlighted in the display, or you will get a list of all of the tracks that contain the item (and your track should be included). If you get an error, or if you get a list of tracks but your track isn't included in the list, search is not enabled.

If the track item names are relatively similar (e.g., the items in RNA Genes and TFBS) we don't want to enable search, as it would return too many matches. If search isn't enabled and you think it should be, make that request to the track sponsor.

Finally, if search is enabled, make sure that all of the item names in the track can be found. Do this by checking this page for your assembly and table:

http://genecats.cse.ucsc.edu/qa/test-results/checkHgFindSpec/hgwdevOutput

If your assembly and table appear in this list, there is a problem with searches for some identifiers in the track. You'll see an error message like this one:

 Error: mm9.jaxQtl.name value "Idd21.1" doesn't match termRegex "^[a-z0-9-]+$" for search jaxQtl

This means that the item with the name "Idd21.1" in the jaxQtl (MGI QTL) track on mm9 (mouse) doesn't work. Alert the track sponsor.
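
To see why a name fails, you can test it against the termRegex yourself. The Python sketch below is only an illustration of the regex check, not the checkHgFindSpec code. Whether the real check ignores case is not stated on this page, so the helper takes that as an option; the period in "Idd21.1" fails the pattern either way.

```python
import re
from typing import Iterable, List


def names_failing_regex(names: Iterable[str], term_regex: str,
                        ignore_case: bool = False) -> List[str]:
    """Return the item names that do not match the given termRegex."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(term_regex, flags)
    return [name for name in names if pattern.search(name) is None]


# termRegex and item name taken from the error message above.
print(names_failing_regex(["Idd21.1"], r"^[a-z0-9-]+$"))
print(names_failing_regex(["Idd21.1"], r"^[a-z0-9-]+$", ignore_case=True))
```

Both calls report "Idd21.1" as failing, so the fix has to come from the track sponsor (usually a change to the termRegex).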

Track description

Read the track description and edit for clarity, spelling, and grammar. Be sure our conventions are followed.

Ensure references are in the correct format and in alphabetical order (by first author listed). Links to journal articles should go directly to the journal rather than PubMed if the journal article is open access (i.e., doesn't require a subscription). For articles that are not open access, links can go either to the journal or to PubMed, and they should go to the abstract, not the full text.

Ensure quotes, ampersands, and less than and greater than signs are represented with their HTML entity names (&amp;quot;, &amp;amp;, &amp;lt;, and &amp;gt;).
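
If you are unsure what the escaped form of a piece of literal text should look like, Python's standard html module will show you. This is just a convenience sketch. It applies to literal text only, not to the HTML tags in the description page, and it is not part of any existing QA script.

```python
import html


def escape_description_text(text: str) -> str:
    """Escape &, <, > and quotes in literal text for use in an HTML page."""
    return html.escape(text, quote=True)


print(escape_description_text('p < 0.05 & score > 10 for "high" items'))
# prints: p &lt; 0.05 &amp; score &gt; 10 for &quot;high&quot; items
```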

Make sure that any email addresses given on the details page have been through Hiram's sanitizer (encodeEmail.pl). It turns the address into an encrypted HREF "mailto:" address that makes it harder for spammers to use.

All details: 1 data point

Choose a representative data point for the track. Check all details for this data point, including all links. Make sure information from the table is displaying correctly (e.g., if a color is used in the table, make sure that color appears for the item.)

For links that are hard-coded to a particular server, there are some tricks that are used to make them testable on hgwdev and hgwbeta. See the Static_Page_JS_Protocol page for more details.

Performance and Display

For tracks displayed by default, the full chromosome view (chr1) should display within 20 seconds. For tracks which are not displayed by default, the full chromosome view should display within a minute.

Turn a track that is located physically below your track in the display to full display mode. With your track also in full display mode, make sure that the items in the track below it are still mapping correctly. Sometimes there can be an off-by-one error which is caused by your track. If this is happening, you should not push your track.

Hit the Reverse button and ensure your track displays correctly.

Track Settings (hgTrackUi)

Click on the track name or the mini-button to the left of the track (in hgTracks) to get to the track settings page. Make sure that the track settings work as expected.

Table Descriptions

Hit the "view table schema" button (on hgTrackUi, hgc, or hgTables) and make sure there is a description column present with descriptions of the table fields. If a track has more than one table, be sure to check for table descriptions on each of them. The description column uses the 'tableDescriptions' table to display this information. This table is built nightly on hgwdev and hgwbeta, and must be pushed to the RR if it contains descriptions for a new type of table (things like psl will already be out there).

Background on the tableDescriptions table is discussed in [AutoSql].

A cron job sends a push request email to the admins to push the tableDescriptions tables to the RR once a week.

Label lengths

Check that the shortLabel is 17 characters or less and that the longLabel is 80 characters or less. The shortLabel is visible in the main hgTracks display if you turn the track to dense. If it is 17 characters or less, it won't be cut off in this part of the display. The longLabel is visible in hgTracks, and it can also be copied from the configuration page (hgTrackUi).
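
The two limits are simple to check in code. A minimal Python sketch follows; the function name is made up here, and qaGbTracks already performs this check for you.

```python
from typing import List

SHORT_LABEL_MAX = 17  # limit stated in this checklist
LONG_LABEL_MAX = 80   # limit stated in this checklist


def label_problems(short_label: str, long_label: str) -> List[str]:
    """Return a message for each label that exceeds its length limit."""
    problems = []
    if len(short_label) > SHORT_LABEL_MAX:
        problems.append(
            f"shortLabel is {len(short_label)} characters (max {SHORT_LABEL_MAX})")
    if len(long_label) > LONG_LABEL_MAX:
        problems.append(
            f"longLabel is {len(long_label)} characters (max {LONG_LABEL_MAX})")
    return problems


# The longLabel here is a placeholder, not a real track label.
print(label_problems("Common SNPs", "Placeholder long label for illustration"))
```

An empty list means both labels fit.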

Downloads

If there are files associated with the track that are to be pushed to hgdownload, check to see that there is a README that makes sense and the files have an md5sum.txt file that goes with them and is correct. Check the file itself to make sure it is not corrupted and that it contains what is expected. If it is a gzipped file, you can do "zcat file.gz | head" and "zcat file.gz | tail" to look at it. Looking at the last part of the file can sometimes catch corruption that can't be seen by only looking at the first part.
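
The md5sum and corruption checks can also be scripted. The Python sketch below assumes that md5sum.txt uses the usual "checksum  filename" layout, one file per line, and that the listed files sit in the same directory as md5sum.txt; both are assumptions, not something this page specifies. Reading a gzipped file through to the end is the scripted equivalent of looking at its tail.

```python
import gzip
import hashlib
import zlib
from pathlib import Path
from typing import List


def md5_of(path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sum_file(md5sum_path) -> List[str]:
    """Return names listed in md5sum.txt that are missing or do not match."""
    md5sum_path = Path(md5sum_path)
    mismatches = []
    for line in md5sum_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(None, 1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with "*"
        target = md5sum_path.parent / name
        if not target.is_file() or md5_of(target) != expected.lower():
            mismatches.append(name)
    return mismatches


def gzip_is_readable(path) -> bool:
    """Decompress the whole file; truncation or corruption shows up as an error."""
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False
```

You still need to look at the contents by eye to confirm the file holds what is expected; these helpers only catch checksum mismatches and unreadable archives.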

Data files destined for hgdownload are organized on hgwdev at:

/usr/local/apache/htdocs-hgdownload/goldenPath/*

and can be viewed in a browser from http://hgdownload-test.cse.ucsc.edu/downloads.html. Non-data files (such as downloads.html) are in the "hgdownload" git repository. See the Static_Page_Protocol for instructions on checking out that repository. Push requests for downloads should look something like this:

Please push files from here on hgwdev:
    /usr/local/apache/htdocs-hgdownload/goldenPath/$db/file
To here on hgdownload:
    /usr/local/apache/htdocs/goldenPath/$db/file
(in the path, "htdocs-hgdownload" should become "htdocs") 
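
When a push request covers many files, the destination path can be derived from the hgwdev path with the substitution described above. A small Python sketch, with a made-up function name:

```python
def hgdownload_destination(dev_path: str) -> str:
    """Map an hgwdev staging path to the matching path on hgdownload.

    Only the "htdocs-hgdownload" directory name changes, to "htdocs".
    """
    marker = "/htdocs-hgdownload/"
    if marker not in dev_path:
        raise ValueError(f"not an htdocs-hgdownload path: {dev_path}")
    return dev_path.replace(marker, "/htdocs/", 1)


print(hgdownload_destination(
    "/usr/local/apache/htdocs-hgdownload/goldenPath/hg19/file"))
```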

Finally, check to see if the downloads files ought to have a link from downloads.html. If so, add the link and push downloads.html (after the files are already pushed!).

Push to hgwbeta

Note for Ensembl gene tracks:

  • Before pushing to beta, use the Ensembl_QA script.

Special case for 'seq' and 'extFile' tables:

  • Do not push seq or extFile tables from dev to beta. You must use the copyExtSeqRows.csh script to move only the rows needed.

Notes on existing tracks:

  • If this is an update to an existing track, you may want to hold off on this step so that you can compare old and new tracks on hgwdev and hgwbeta.
  • Open the track on hgwbeta before staging it to make sure that the update won't cause a cart clash for users currently looking at the track (as evidenced by a completely blank screen, for instance). If you need to do a cartReset to get the track to show up correctly, something is wrong.

Push all tables (EXCEPT seq and extFile tables, see note above) from hgwdev to hgwbeta. There are two ways:

 sudo mypush $db $table mysqlbeta

or, for a list of tables:

 bigPush.csh $db $tableListFile

Then run "make beta" on hgwbeta in kent/src/hg/makeDb/trackDb like so:

 make beta DBS=$db

To make beta on more than one db at a time:

 make beta DBS='$db1 $db2 $db3 $db4 etc'

Request a push of any listed supporting files in /gbdb from hgwdev to hgnfs1. Note that hgwbeta and the RR share the files on hgnfs1, so once these files are in place, there is not another push required when the track is released to the RR.

Make Public

Make your track public by using the "make public" command on hgwbeta while in the trackDb directory (src/hg/makeDb/trackDb):

   [user@hgwbeta trackDb]$ make public DBS=$db

Your track should now be visible on the hgwbeta-public server.

Also see [Three State TrackDb] for more information.

Push Request

When the track is ready to be released, ask admins (push-request at soe) for a push of the tables (including trackDb_public and hgFindSpec_public, if needed) from mysqlbeta to mysqlrr. Push any associated downloads. Push any associated static docs. Notify (cc:) the track sponsor on the push request.

If the track already exists on the RR you may need to selectively push the tables/trackDb. Refer to Replacing old tables with new ones for more information.

If this is a repush of existing data that was found to be problematic, add a note to repush.html. The file to edit is in the genecats tree, at genecats/qa/repush.html. (This doesn't happen very often.)

Validate on the RR

Check your track on the RR. Check that searches work (if not, you probably need to push the hgFindSpec_public table). Also check that all default tracks still display. If you filled in the "Release Log URL" field in the push queue, check the next day to be sure that the link from the release log works as expected.