HgTablesTest details

From Genecats
Revision as of 23:37, 27 June 2016 by Galt (talk | contribs)

hgTablesTest - what it is actually testing.

For each org/db/group/track/table, it does the following:

 - For each org/db, it takes a 5MB test region from the middle of the first chrom in the chromInfo table.
 - Can filter with -org= and -db=, or limit the number checked with -orgs=N and -dbs=N
 - Can filter by specifying a single -group= -track= -table=
 - Can filter by specifying the number to check, -groups=N -tracks=N -tables=N
   Defaults to all groups, the first 4 tracks, and the first 2 tables.
   Since always testing just the first 4 tracks does not give much coverage
   of the rest of the system, I have recently added the ability to
   shuffle the track and table lists.
   -seed=N - fixes the shuffle seed, for reproducibility and debugging.
   -noShuffle - do not shuffle the track and table lists.
 - Recursively selects the track/table in the hgTables drop-downs.
 - Checks all pages fetched by the robot with the htmlCheck library.
 - Presses the schema button to bring up the schema page.
   Because the schema page includes the track description at the bottom,
   this ends up checking the html description, which is located under makeDb/trackDb/
   and gets built into the trackDb.html field.  The check stops at the first error,
   so testing and fixing actually goes faster if you run the htmlCheck utility
   directly on the .html files under makeDb/trackDb/.
   You can quickly find out whether the fix worked and whether there are any other errors,
   without waiting for a whole other build cycle of 3 weeks.
 - Presses the summary/statistics button.
 - testAllFields - chooses "all fields from selected table", "get output".
   Counts the rows returned and keeps the count as expectedCount for further steps.
 - testOneField - chooses "Select Fields from primary and related tables", "get output".
   It automatically checks the first field found and submits, then compares the rows returned to expectedCount.
 - If no BED output is available, this is a signal that it cannot limit the output to the 5MB test region,
   which means that the entire table will be scanned.  The table is skipped if it has over 500K rows;
   the row count is checked in the database.
 - If BED output is available (and output is limited to 5MB test region),
   it then proceeds to test these:
   - testOutSequence - chooses "sequence", "get output", fetches, compares output rowCount to expectedCount.
   - testOutBed - chooses "BED", "get output", fetches, compares output rowCount to expectedCount.
   - testOutHyperlink - chooses "hyperlinks", "get output", fetches, compares output <A> tags count to expectedCount.
   - testOutGff - chooses "GTF", "get output", fetches. No other checking. (Internally, the code calls everything GFF, not GTF.)
   - testOutCustomTrack - chooses "custom track", "get output", "CT in Table Browser". Checks that group "user" now exists.
      Note that because of previous tests, many custom tracks may already exist, invalidating the check.
   The "CDS FASTA from multiple alignments" output type is NOT tested.

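The region selection and the shuffled track/table selection described above can be sketched roughly as follows. This is only an illustration in Python, not the hgTablesTest source: the function names middle_region and pick are hypothetical, centering the window exactly and clamping it for chroms shorter than 5MB are assumptions, and the real tool's random number generator will not produce the same ordering for a given seed.

```python
import random

REGION_SIZE = 5_000_000  # 5MB test region, as described in the list above


def middle_region(chrom_size, region_size=REGION_SIZE):
    """Return (start, end) of a window centered on the chrom.

    Assumption: a chrom shorter than the window is used whole.
    """
    if chrom_size <= region_size:
        return 0, chrom_size
    start = (chrom_size - region_size) // 2
    return start, start + region_size


def pick(items, limit, seed=None, shuffle=True):
    """Keep the first `limit` items, optionally after a seeded shuffle.

    shuffle=False mirrors the idea of -noShuffle; a fixed seed mirrors -seed=N.
    """
    chosen = list(items)  # copy, so the caller's list is left untouched
    if shuffle:
        random.Random(seed).shuffle(chosen)
    return chosen[:limit]
```

With a fixed seed, pick returns the same subset on every run, which is what makes a failing track/table combination reproducible.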
What it is NOT testing:

identifiers (names/accessions)
filter
intersection
correlation

However, at the end it does test joining uniProt.taxon, comparing the number of rows returned to the table size fetched from the database. If you are connected to the hgwdev database while hitting an hgwbeta URL, the two copies of the table can differ, and hgTablesTest complains about it. You can address this by setting the environment variable HGDB_PROF=someprofile, where someprofile is defined in your .hg.conf file and points to the database you are testing against, which in this case would be mysqlbeta.
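The row-count comparison behind this complaint can be sketched as below. This is a minimal Python illustration under stated assumptions, not the tool's actual code: count_data_rows and check_count are hypothetical names, and treating lines that start with "#" as header lines rather than data rows is an assumption about the output format.

```python
def count_data_rows(text):
    """Count non-empty lines that do not start with '#'.

    Assumption: '#' lines are headers/comments, not data rows.
    """
    return sum(
        1 for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    )


def check_count(step, got, expected):
    """Return None when the counts agree, else a one-line complaint."""
    if got == expected:
        return None
    return f"{step}: got {got} rows, expected {expected}"
```

If the expected count comes from one database and the fetched rows come from a CGI backed by another, check_count reports a mismatch even though neither side is broken, which is the situation HGDB_PROF is meant to avoid.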