Running joinerCheck for all databases

From genomewiki
Revision as of 20:19, 1 April 2010 by Ann (talk | contribs) (moving this from the static qa page docs)

Overview

The joinerCheck program is run bi-monthly on all active databases on the hgwbeta machine. Errors that are found may be newly introduced or pre-existing. Corrections to data and/or all.joiner rules that impact browser users are given priority; corrections to known errors are made as time permits.

  • When/Who: Run bi-monthly by a member of QA
  • Why: Catch global rule problems with all.joiner or new data problems within assemblies
  • Where: hgwbeta:/hive/groups/qa/joinerCheck
  • Remember: Before starting, be sure to update your source tree so that you have the latest version of all.joiner

Run joinerCheck -keys

  • Create new directory at: /hive/groups/qa/joinerCheck/keys/YYYYMM
  • In the past it has been useful to run against all DBs in three batches, one batch per day on three consecutive days: (1) human, (2) mouse and rat, and (3) the rest.
  • The goal is to have each DB run without conflicting with genbank updates. Never run overnight on a Sunday (EST updates). In the past, running Mon-Tue-Wed has been successful, starting in the morning after the previous night's genbank update has finished (~8am) and ending the batch before the next night's genbank updates start (~12pm)
  • Execute Keys program hgwbeta:/hive/groups/qa/joinerCheck/keys/YYYYMM> nice nohup joinerBatchKeys.csh [hg | mm_rn | other | extra] &
  • When the script finishes, it sends an email to whoever ran it.
  • When complete, execute Error program> getJoinerKeyErrors.pl /hive/groups/qa/joinerCheck/keys/YYYYMM
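
The directory naming and scheduling advice above can be sketched as a pair of small shell helpers. This is a minimal sketch, not part of the QA scripts: the function names run_dir and ok_to_start are hypothetical, the base path defaults to the location named on this page, and the rule "no Sunday starts, no starts before 8am" is an assumption distilled from the hints above.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are illustrative assumptions.

# Print the run directory for a given check type (keys, times, rules).
# $1 = check type
# $2 = YYYYMM stamp (defaults to the current month)
# $3 = base directory (defaults to the path named on this page)
run_dir() {
    kind=$1
    stamp=${2:-$(date +%Y%m)}
    base=${3:-/hive/groups/qa/joinerCheck}
    printf '%s/%s/%s\n' "$base" "$kind" "$stamp"
}

# Succeed only if a batch may be started now.
# $1 = ISO weekday (1 = Monday ... 7 = Sunday), $2 = hour (0-23)
# Assumption: Sunday is excluded outright, and starts before 08:00 are
# refused because the previous night's genbank update may still be running.
ok_to_start() {
    dow=$1
    hour=$2
    [ "$dow" -ne 7 ] && [ "$hour" -ge 8 ]
}

# Example use (commented out so that pasting this block changes nothing):
#   dir=$(run_dir keys)
#   mkdir -p "$dir" && cd "$dir"
#   if ok_to_start "$(date +%u)" "$(date +%H)"; then
#       nice nohup joinerBatchKeys.csh hg &
#   fi
```

The guard only covers the start of a batch; whether a batch will finish before the next genbank update still has to be judged from earlier runs.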

Evaluate -keys result

  • Create working directory to investigate problems at: /hive/groups/qa/joinerCheck/rules/YYYYMM
  • Review results of getJoinerKeyErrors.pl summary report
  • Compare to the last results to find out what is known, what is new, and what has been fixed since the last run
  • Analysis hints:
    • Review makedocs and all.joiner for tracks/assemblies with issues
    • Attempt to isolate exact problem. Is it data or a rule problem?
    • Original developer can help confirm track problems
    • If it is a genbank table, confirm that genbank updates are on, then contact Mark Diekhans
    • Data fixes can often be made by QA; rule fixes are usually made by a developer
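
The comparison against the previous run can be sketched with standard tools. This is a minimal sketch under assumptions: it supposes that each run's errors have been saved one per line in a plain text file (the actual layout of the getJoinerKeyErrors.pl summary report is not described here), and the function name compare_runs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify error lines from two runs.
# $1 = file of error lines from the previous run
# $2 = file of error lines from the current run
# Assumption: one error per line, with identical text for the same error.
compare_runs() {
    prev=$(mktemp) || return 1
    curr=$(mktemp) || { rm -f "$prev"; return 1; }
    # comm requires sorted input; sort -u also drops duplicate lines.
    # LC_ALL=C keeps the sort order and comm's expectations consistent.
    LC_ALL=C sort -u "$1" > "$prev"
    LC_ALL=C sort -u "$2" > "$curr"
    # comm columns: 1 = only in first file, 2 = only in second, 3 = in both.
    LC_ALL=C comm -12 "$prev" "$curr" | sed 's/^/known: /'
    LC_ALL=C comm -13 "$prev" "$curr" | sed 's/^/new: /'
    LC_ALL=C comm -23 "$prev" "$curr" | sed 's/^/fixed: /'
    rm -f "$prev" "$curr"
}
```

Lines tagged new: are the natural place to start the investigation; lines tagged fixed: confirm that earlier corrections took effect.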

Run joinerCheck -times

  • Create new directory at: /hive/groups/qa/joinerCheck/times/YYYYMM
  • Execute Times program hgwbeta:/hive/groups/qa/joinerCheck/times/YYYYMM> nice nohup joinerBatchTimes.csh [hg | mm_rn | other | extra] &

Evaluate -times result

  • Use -keys working directory to investigate problems at: /hive/groups/qa/joinerCheck/rules/YYYYMM
  • Review results
  • Compare to the last results to find out what is known, what is new, and what has been fixed since the last run
  • Analysis hints:
    • Any and all output is considered an error
    • Priority of found errors can be high if they involve proteome/known genes data
    • Makedoc notes are the primary source for tracking changes
    • Analysis hints from -keys analysis can be helpful for isolating root cause of -times errors, too
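
Because any output from a -times run counts as an error, a quick first triage is to list the result files that are not empty. This is a minimal sketch, assuming the batch script leaves its output in regular files under the run directory (the actual file layout is not described here); the function name nonempty_outputs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list non-empty regular files under a run directory.
# $1 = run directory, e.g. the times/YYYYMM directory
# Exit status is 0 when at least one non-empty file is found, 1 otherwise.
nonempty_outputs() {
    # -size +0c matches files larger than zero bytes.
    found=$(find "$1" -type f -size +0c | sort)
    [ -n "$found" ] || return 1
    printf '%s\n' "$found"
}
```

A non-zero exit status then means the run produced no output at all, which under the rule above means no errors were reported.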