HgTablesTest details

From Genecats

Revision as of 20:31, 7 August 2017

hgTablesTest - what it is actually testing.

For each org/db/group/track/table, hgTablesTest does the following:

 - For each org/db, it gets a 5MB test region from the middle of the first chrom in the chromInfo table.
 - Can filter by -org= -db=, or specify the number to check with -orgs=N -dbs=N
 - Can filter by specifying a single -group= -track= -table=
 - Can filter by specifying the number to check, -groups=N -tracks=N -tables=N
   Defaults to all groups, the first 4 tracks, and the first 2 tables.
   Since always testing just the first 4 tracks does not give much coverage
   of the rest of the system, I recently added the ability to
   shuffle the track and table lists.
   -seed=N - sets the shuffle seed, for reproducibility and debugging.
   -noShuffle - do not shuffle the track and table lists.
 - Recursively selects the track/table in the hgTables drop-downs.
 - Checks with the htmlCheck library all pages fetched by the robot.
 - Presses the schema button to bring up the schema page.
   Because the schema page includes the track description at the bottom,
   this step also checks the html description, which lives under makeDb/trackDb/
   and gets built into the trackDb.html field.  The check stops at the
   first error, so testing and fixing goes faster if you run the htmlCheck
   utility directly on the .html files under makeDb/trackDb/.
   You can quickly find out whether the fix worked and whether there are any other errors,
   without waiting for another 3-week build cycle.
 - Presses the summary/statistics button.
 - testAllFields - chooses "all fields from selected table", "get output".
   Counts the rows returned and keeps as expectedCount for further steps.
 - testOneField - chooses "Select Fields from primary and related tables", "get output"
   It automatically checks the first field found and submits. It compares the rows returned to the expected count.
 - If no BED output is available, this is a signal that it cannot limit the output to the 5MB test position,
   which means that the entire table will be scanned.  The table is skipped if over 500K rows,
   which it checks in the database.
 - If BED output is available (and output is limited to 5MB test region),
   it then proceeds to test these:
   - testOutSequence - chooses "sequence", "get output", fetches, compares output rowCount to expectedCount.
   - testOutBed - chooses "BED", "get output", fetches, compares output rowCount to expectedCount.
   - testOutHyperlink - chooses "hyperlinks", "get output", fetches, compares output <A> tags count to expectedCount.
   - testOutGff - chooses "GTF", "get output", fetches. No other checking. (The code internally calls everything GFF, not GTF.)
   - testOutCustomTrack -- chooses "custom track", "get output", "CT in Table Browser". Checks that group "user" now exists.
      Note that because of previous tests, many custom tracks may now exist, invalidating the check.
   "CDS FASTA from multiple alignments" output type is NOT tested.

What it is NOT testing:

identifiers (names/accessions)
filter
intersection
correlation

At the end, however, it runs these special tests once on the uniProt db:

joining
filter
identifier

HGDB_PROF (or HGDB_CONF)

It joins uniProt.taxon and compares the number of rows returned to the table size, which is fetched from the database. If you are connected to the hgwdev db while hitting an hgwbeta URL, the table can differ between the two, and hgTablesTest complains about it. You can address this by setting the environment variable HGDB_PROF=someprofile, where someprofile is defined in your .hg.conf file and points to the database you are testing against, which would be mysqlbeta. Alternatively, you can point HGDB_CONF to .hg.conf.beta, which points db.* to mysqlbeta.
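
If the test is launched from a script, the environment can be prepared before the process starts. The sketch below only builds the environment dictionary; the helper name beta_env and the default path ~/.hg.conf.beta are assumptions for illustration, and the command used to launch hgTablesTest is left to the caller.

```python
import os


def beta_env(conf_path="~/.hg.conf.beta", profile=None, base_env=None):
    """Return a copy of the environment set up for testing against beta.

    conf_path: hg.conf file whose db.* settings point at mysqlbeta
               (hypothetical default location).
    profile:   if given, set HGDB_PROF to this profile name instead of
               overriding the config file with HGDB_CONF.
    """
    env = dict(os.environ if base_env is None else base_env)
    if profile is not None:
        env["HGDB_PROF"] = profile
    else:
        env["HGDB_CONF"] = os.path.expanduser(conf_path)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, so the setting applies only to the test process and not to the rest of the shell session.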

Errors you can ignore

Ex error1

allFields n/a hg38 rep chainSelf chainSelfLink carefulAlloc: Allocated too much memory - more than 500,000,000 bytes (734,348,198)

This error just means the track has too many items in the test region for the Table Browser to access. In this instance, the track is the self-alignment track, and the region is full of repeats, near the centromere, so the track has a lot of items there.

Ex error2

summaryStats Mouse mm10 rna intronEst est Error near line 169 of hgwbeta.cse.ucsc.edu/cgi-bin/hgTables:<li>Can\x27t\x20start\x20query\x3A\x3CBR\x3Eselect\x20tStart\x2CtEnd\x2CqName\x LI outside of any of DIR MENU OL UL
This is a known bug with the est table on mm10 where somebody forgot about split-chrom tables. The table 'mm10.est' doesn't exist, since it is split across each chromosome, so the real table names are chr1_est etc.
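
As a small illustration of the naming described above, a split table has one physical table per chromosome, named by prefixing the chromosome to the table name. The helper name below is hypothetical.

```python
def split_table_names(table, chroms):
    """Return per-chromosome table names, e.g. chr1_est for table 'est'."""
    return [f"{chrom}_{table}" for chrom in chroms]


names = split_table_names("est", ["chr1", "chr2", "chrX"])
```

A query against the bare name (mm10.est) fails because only the per-chromosome tables exist.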

Ex error3

Error near line 163 of hgwbeta.cse.ucsc.edu/cgi-bin/hgTables: </BLOCKQUOTE></TD><TD><TT>varchar(255)</TT></TD> <TD><A HREF="/cgi-bin/hgTables </BLOCKQUOTE> without preceding <BLOCKQUOTE>

This error is actually a data bug: the stray "</BLOCKQUOTE>" is in the intron column of the tRNAs table.


Ex error4