New track checklist

From genomewiki
Revision as of 22:30, 16 July 2010 by Marygoldman (talk | contribs) (adding link to good page as well as Checking Length Labels)

This is the checklist for QA to follow when checking tracks for release. General rule: stop testing at the first non-trivial error, and mark the track on-hold in the queue until the track sponsor fixes it.


Background familiarity

If you haven't worked with a track like this one, and it already exists on the RR for another assembly, use it a bit on the RR to become familiar with it. For complex tracks, look for genomewiki pages on it, and/or look through the push queue for previous problems encountered on similar tracks. Familiarize yourself with the tables and files needed for this track.

Push to hgwbeta

(If this is an update to an existing track, you may want to hold off on this step so that you can compare old and new tracks on hgwdev and hgwbeta.)

Push all tables from hgwdev to hgwbeta. There are two ways:

 sudo mypush $db $table mysqlbeta

or, for many tables:

 bigPush.csh $db $tableList
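When only a handful of tables are involved, it can help to see exactly which mypush commands would be run before running any of them. The following is a minimal dry-run sketch, not a replacement for bigPush.csh. The function name print_push_cmds is hypothetical, and the list file is assumed to hold one table name per line.

```shell
# Print one mypush command per table named in a list file.
# Nothing is executed; review the output, then run the lines by hand.
# Assumption: $2 is a plain text file with one table name per line.
print_push_cmds() (
    db=$1
    tableList=$2
    while IFS= read -r table; do
        # Skip blank lines in the list file.
        [ -n "$table" ] || continue
        printf 'sudo mypush %s %s mysqlbeta\n' "$db" "$table"
    done < "$tableList"
)
```

Usage: print_push_cmds $db $tableList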

Run make beta on hgwbeta in kent/src/hg/makeDb/trackDb like so:

 make beta DBS=$db

Request a push of all supporting files in /gbdb from hgwdev to hgnfs1. Note that both hgwbeta and the RR share the files on hgnfs1.

Makedocs

Look for an entry for your track in src/hg/makeDb/doc/$db.txt. If there is no mention of this track, make a request to the track sponsor. Basic tracks that are loaded automatically with a new assembly may only have a tiny reference in the makedoc. UCSC Genes has its own document.

Release Log and URL

The "release log" field in the qaPushQ should generally contain the shortLabel of the track. The entry in this field will be added to the auto-generated release log page. If it makes sense to link to the hgTrackUi page for this track, make an entry in the releaseLogUrl field like so:

 ../.../ (need to fill this out)

Index

Check the indices for the primary table using the MySQL command "show index from ". Expect to find chrom/chromStart and chrom/chromEnd. Press "show sizes" in the push queue and look at the indexes. Make sure no tables are missing and all return valid values.
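The raw output of the show index command has one row per indexed column, which can be hard to scan. As a sketch, the helper below condenses that output into one line per index. The name summarize_indexes is hypothetical. It assumes tab-separated rows with no header line, in the standard MySQL column order (Key_name in column 3, Column_name in column 5), with the rows of each index listed in column order.

```shell
# Condense tab-separated "show index" rows into one line per index,
# e.g. "chrom: chrom,chromStart".
# Assumption: input has no header row and uses the standard MySQL layout,
# where column 3 is Key_name and column 5 is Column_name.
summarize_indexes() {
    awk -F'\t' '
        {
            key = $3
            if (!(key in cols)) {
                order[++n] = key      # remember first-seen order of indexes
                cols[key] = $5
            } else {
                cols[key] = cols[key] "," $5
            }
        }
        END {
            for (i = 1; i <= n; i++) print order[i] ": " cols[order[i]]
        }
    '
}

# Possible usage, assuming hgsql passes -N and -e through to mysql:
#   hgsql -N -e "show index from $table" $db | summarize_indexes
```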

Check that tables are sorted and internally consistent

  • checkTableCoords

For positional tables, checks that the genomic coordinates are legal (e.g., coordinates are not off the end of a chromosome).

  • positionalTblCheck

For any positional table, checks to see that the table is ordered by chrom and chromStart. The utility positionalTblCheck does this. A positional table being out of order can cause a huge slowdown in display speed. If the table passes there will be no output. If the table does not pass there will be an error message like:

 table hg18.snp129 not sorted starting at row 4867: chr1:387005

Note that it is not important that the items be sorted exactly in order - as long as the items are almost all in order, it will not affect performance. Genbank tables are not expected to be in order after updates have run.

  • genePredCheck/pslCheck

For gene prediction tracks (with tables in the genePred format), run genePredCheck (run the command with no arguments for instructions). For psl tables, run pslCheck.
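To make concrete what "ordered by chrom and chromStart" means in the positionalTblCheck item above, here is a simplified illustration. It is not the actual positionalTblCheck logic: it only flags the first row whose chromStart is smaller than that of the previous row on the same chrom. The name first_unsorted_row and the two-column tab-separated input (chrom, chromStart) are assumptions made for this sketch.

```shell
# Report the first row whose chromStart goes backwards within a run of
# rows on the same chrom. Prints nothing when no such row is found.
# Assumption: input is tab-separated "chrom<TAB>chromStart", in table order.
first_unsorted_row() {
    awk -F'\t' '
        NR > 1 && $1 == prevChrom && $2 + 0 < prevStart {
            print "not sorted starting at row " NR ": " $1 ":" $2
            exit
        }
        { prevChrom = $1; prevStart = $2 + 0 }
    '
}

# Possible usage, assuming hgsql passes -N and -e through to mysql:
#   hgsql -N -e "select chrom, chromStart from $table" $db | first_unsorted_row
```

Since a few out-of-order items do not affect performance, a single report from this sketch is a prompt to look closer, not a reason to hold the track.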

joinerCheck

Look in src/hg/makeDb/schema/all.joiner for references to your tables and grab the identifiers associated with those tables. Then, for each identifier, run joinerCheck like so:

 joinerCheck -keys -identifier=$identifier -database=$db all.joiner

If there are errors or the table is not mentioned in all.joiner (it could be in tablesIgnored), notify the track sponsor.
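When a track touches several identifiers, the same joinerCheck command has to be repeated for each one. A small dry-run sketch, using the hypothetical function name print_joiner_checks, prints the command for every identifier given on the command line so the list can be reviewed and then run from the directory that holds all.joiner.

```shell
# Print one "joinerCheck -keys" command per identifier.
# Nothing is executed; the output is meant to be reviewed and run by hand.
# Arguments: database name, then one or more identifiers from all.joiner.
print_joiner_checks() (
    db=$1
    shift
    for identifier in "$@"; do
        printf 'joinerCheck -keys -identifier=%s -database=%s all.joiner\n' \
            "$identifier" "$db"
    done
)
```

Usage: print_joiner_checks $db $identifier1 $identifier2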

You can also run joinerCheck with the -times flag:

 joinerCheck -times -database=$db all.joiner

Look for any errors that are relevant to your track.

Chromosome coverage

Some tracks don't have data on all chromosomes. To check which ones do have data:

 countPerChrom.csh $db $table

Look for chromosomes with no data that should have data. If a "regular" chrom (not random or haplotype) has no data (and this is expected), a note like this should appear in place of the track drop-down:

 [No data-chr21]

This is controlled by the restrictList setting in trackDb. In trackDb.ra, the chromosomes line specifies the contents that later become restrictList after doing a "make beta" on beta, e.g. chromosomes chr2, chr4, chr5. (Need to check that this is right.)

EASY GRAPHICAL WAYS TO CHECK COVERAGE: (1) Import the table into Genome Graphs. (2) To see a count plus a histogram of chrom coverage, select the table in the Table Browser, hit the "describe table schema" button, and click the "values" link for the chrom field.
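For a non-graphical check, the list of "regular" chroms with no data can be computed from two plain lists of chrom names. The sketch below is one way to do that. The function name chroms_without_data is hypothetical, and treating any name containing _random or _hap as non-regular is an assumption about the assembly's naming scheme.

```shell
# List regular chroms that have no rows in the table.
# $1: file with all chrom names for the assembly, one per line
# $2: file with the chrom names that do have data, one per line
# Assumption: random and haplotype chroms contain "_random" or "_hap".
chroms_without_data() (
    allChroms=$1
    withData=$2
    awk '
        # First file on the command line: remember chroms that have data.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Second file: print regular chroms that were never seen.
        NF && $1 !~ /_random|_hap/ && !($1 in seen) { print $1 }
    ' "$withData" "$allChroms"
)

# Possible way to build the two lists, assuming hgsql passes -N and -e
# through to mysql and that the assembly has a chromInfo table:
#   hgsql -N -e "select chrom from chromInfo" $db > all.txt
#   hgsql -N -e "select distinct chrom from $table" $db > have.txt
#   chroms_without_data all.txt have.txt
```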

FeatureBits and Gaps

Run featureBits, or use the runBits.csh script to run featureBits. runBits.csh checks for coverage and overlap with gap, and also checks for unbridged gaps. Alert the track sponsor if there are unbridged gaps and this is a track created at UCSC. Put featureBits results in the push queue. If the previous assembly also has this track, compare featureBits between the current assembly and the previous assembly; if there are big differences between the old and new tracks, alert the track sponsor.
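Comparing two featureBits results by eye is easy to get wrong, so here is a sketch that pulls the coverage percentage out of each result line and prints the difference. The function names bits_percent and compare_bits are hypothetical. The sketch assumes each result is a single line of the form "N bases of M (P%) in intersection"; check that against the actual output before relying on it. What counts as a "big" difference is left to the reader.

```shell
# Extract the percentage from a featureBits-style result line on stdin.
# Assumption: the line looks like "N bases of M (1.146%) in intersection".
bits_percent() {
    sed -n 's/.*(\([0-9.]*\)%).*/\1/p'
}

# Print old and new percentages and their difference (new minus old).
# $1: result line for the previous assembly; $2: line for the current one.
compare_bits() (
    old=$(printf '%s\n' "$1" | bits_percent)
    new=$(printf '%s\n' "$2" | bits_percent)
    awk -v o="$old" -v n="$new" \
        'BEGIN { printf "old=%s%% new=%s%% diff=%.3f\n", o, n, n - o }'
)
```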

Searching

If the track item names are relatively unique, perform a few searches. If search isn't enabled, make that request to the track sponsor. This isn't required for the push. If the track item names are relatively similar (e.g. RNA Genes, TFBS) we don't want to enable search, as it would return too many matches.

The checkHgFindSpec tool with the -checkTermRegex option will check that all of the names in a table that is searched match the regular expression specified in trackDb.ra. There is not an option to run it on a single table, so just run it on the whole database and look for output specific to your table. For instance:

 checkHgFindSpec -checkTermRegex hg19

If this gives errors pertinent to your track, request that the track sponsor edit the search entry in trackDb.ra.

Track description

Read the track description and edit for clarity, spelling, and grammar. Ensure references are in the correct format. Ensure quotes and ampersands are in the right html format. (need a link to html formatting page here.)

Make sure that any email addresses given on the details page have been through Hiram's sanitizer (encodeEmail.pl). It turns the address into an encrypted HREF "mailto:" address that makes it harder for spammers to use.

All details: 1 data point

Choose a representative data point for the track from the mysql table. Check all details for this data point, including all links. Make sure information from the table is displaying correctly (e.g., if a color is used in the table, make sure that color appears for the item.)

Gene annotations only

Choose a gene that is on the positive strand and has an exonCount of 1. Check that the protein sequence displayed by the details page matches the sequence displayed by the Base Position track.
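Finding such a gene by browsing can be slow. As a sketch, the filter below prints the first gene name that is on the positive strand with an exonCount of 1. The function name pick_simple_gene is hypothetical, and the input is assumed to be tab-separated rows of name, strand and exonCount taken from a genePred-format table.

```shell
# Print the name of the first gene on the + strand with exactly one exon.
# Assumption: input rows are "name<TAB>strand<TAB>exonCount".
pick_simple_gene() {
    awk -F'\t' '$2 == "+" && $3 == 1 { print $1; exit }'
}

# Possible usage, assuming hgsql passes -N and -e through to mysql:
#   hgsql -N -e "select name, strand, exonCount from $table" $db | pick_simple_gene
```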

Comparison

Compare your track to a similar track if possible. Check that the features in your track are not all off by one relative to the existing track, and that features are more or less in the same position.

Default Position

Re-evaluate the default position. Does the new track appear as a default? Should it? Location (dbDb.defaultPos) & track priorities (trackDb.ra) should be optimized to make available annotation aesthetically pleasing and scientifically interesting.

Performance and Display

For tracks displayed by default, the full chromosome view (chr1) should display within 20 seconds. For tracks which are not displayed by default, the full chromosome view should display within a minute. Use JKSQL_PROF=on to measure SQL time (reported in Apache error log). -- need to check this.

Set a track that is located directly below your track in the display to full display mode. Make sure that when your track is also in full display mode, the items in the track below it still map correctly. Sometimes your track can cause an off-by-one error there. If this happens, do not push your track.

Hit the Reverse button and ensure your track displays correctly.

Track Settings (hgTrackUi)

Click on the track name or the mini-button to the left of the track (in hgTracks) to get to the track settings page. Make sure that the track settings work as expected.

Table Descriptions

Hit the "view table schema" button (on hgTrackUi, hgc, or hgTables) and make sure there is a description column present with descriptions of the table fields. If a track has more than one table, be sure to check for table descriptions on each of them. The description column uses the 'tableDescriptions' table to display this information. This table is built nightly on hgwdev and hgwbeta, and must be pushed to the RR if it contains descriptions for a new type of table (things like psl will already be out there).

Background on the tableDescriptions table is here.

Brooke pushes the tableDescriptions tables once a week.

Label lengths

Check that the shortLabel is less than 17 characters and that the long label is less than 80 characters.
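Counting characters by eye is error-prone, so the label check can be sketched as a small function. The name check_labels is hypothetical, and the two limits are taken directly from the sentence above (fewer than 17 and fewer than 80 characters).

```shell
# Check label lengths against the limits stated in this checklist.
# Prints a message for each label that is too long and returns non-zero
# if either check fails. Prints nothing when both labels pass.
check_labels() (
    shortLabel=$1
    longLabel=$2
    status=0
    if [ "${#shortLabel}" -ge 17 ]; then
        echo "shortLabel has ${#shortLabel} characters; expected fewer than 17"
        status=1
    fi
    if [ "${#longLabel}" -ge 80 ]; then
        echo "longLabel has ${#longLabel} characters; expected fewer than 80"
        status=1
    fi
    exit "$status"
)
```

Usage: check_labels "$shortLabel" "$longLabel"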

Push Request

When the track is ready to be released, ask admins (push-request at soe) for a push of the tables (including trackDb_public and hgFindSpec_public, if needed) from mysqlbeta to mysqlrr. Push any associated downloads. Notify (cc:) the track sponsor on the push request.

If the track already exists on the RR you may need to selectively push the tables/trackDb. Click here for more information

Validate on the RR

Check your track on the RR. Check that searches work. Also check that all default tracks still display.