Threestatemetadb

Three State Meta DB

This is a proposal for instituting a metaDb release process, developed by Brian, with all the real programming to implement the metaDb table done by Tim. Please see the ThreeStateTrackDb page for a description of how the three state process works.

This addresses the following issue:

  • track metadata is stored in the trackDb table, ignoring the fact that we have metadata for downloads that aren't tables.

Updated trackDb release process

The three state metaDb release process parallels the trackDb make process, so when you use the make alpha, beta, and public targets in the trackDb directory, the corresponding metaDb table is built. The RA files that are used to build the metaDb are in a subdirectory under each assembly with the name {org}/{assembly}/metaDb/{state}.

  • alpha state: make alpha run on hgwdev loads all tracks with release alpha into trackDb and hgFindSpec, and loads the metaDb from the RA files that are in {org}/{assembly}/metaDb/alpha
  • beta state: make beta run on hgwbeta loads all tracks with release beta into trackDb and hgFindSpec and loads the metaDb from the RA files that are in {org}/{assembly}/metaDb/beta
  • public state: make public run on hgwbeta loads all tracks with release public into trackDb_public and hgFindSpec_public and loads the metaDb from the RA files that are in {org}/{assembly}/metaDb/public

QA can look at metaDb_public on hgwbeta-public.cse.ucsc.edu as a final check before pushing it to the RR. Then trackDb_public, hgFindSpec_public, and metaDb_public will be pushed from mysqlbeta to mysqlrr, renaming them to trackDb, hgFindSpec, and metaDb.
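The three states above can be summarized in a small lookup. This is only an illustrative sketch: the `STATES` table and the helper names `meta_db_dir` and `rr_name` are hypothetical and are not part of the trackDb makefile or the kent source tree, and `human/hg19` is just an example {org}/{assembly} pair.

```python
# Hypothetical summary of the three release states described above.
# Names and layout are illustrative; the real work is done by the make targets.
STATES = {
    "alpha": {"host": "hgwdev", "tables": ("trackDb", "hgFindSpec", "metaDb")},
    "beta": {"host": "hgwbeta", "tables": ("trackDb", "hgFindSpec", "metaDb")},
    "public": {
        "host": "hgwbeta",
        "tables": ("trackDb_public", "hgFindSpec_public", "metaDb_public"),
    },
}


def meta_db_dir(org, assembly, state):
    """Return the RA directory that feeds the metaDb for the given state."""
    if state not in STATES:
        raise ValueError(f"unknown release state: {state}")
    return f"{org}/{assembly}/metaDb/{state}"


def rr_name(table):
    """Return the name a table takes on the RR (the "_public" suffix is dropped)."""
    suffix = "_public"
    return table[: -len(suffix)] if table.endswith(suffix) else table


# Example: where the beta metaDb comes from, and what the public tables become.
beta_dir = meta_db_dir("human", "hg19", "beta")
pushed = [rr_name(t) for t in STATES["public"]["tables"]]
```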

Tracks that don't already exist on the RR follow this process:

  • Developer creates table(s) and adds a corresponding stanza to trackDb.ra, with no release tags. If there is metadata, the developer creates a file in the metaDb/alpha directory called {compositeName}.ra that has all the metadata for that composite.
  • When the track is ready for QA, the developer copies the {compositeName}.ra file from the alpha directory to the beta directory and checks it in.
  • QA pushes the table(s) to mysqlbeta, does a make beta, and QAs the track.
  • When the track is ready, QA copies the {compositeName}.ra file from the beta directory to the public directory and does a make public, then pushes the tables, trackDb_public, hgFindSpec_public, and metaDb_public to mysqlrr, renaming them to remove the "_public".

If a developer subsequently wants to make changes to the track, she will change the release tags as described on the ThreeStateTrackDb page:

  • the old stanza:
track someTrack
shortLabel Mediocre RNAs
visibility hide
  • becomes two stanzas:
track someTrack
release alpha
shortLabel Great RNAs
visibility pack

track someTrack
release beta,public
shortLabel Mediocre RNAs
visibility hide
  • nothing leaks out to the RR before it is ready. QA looks at the changes on hgwbeta by changing the first stanza to release alpha,beta and the second stanza to release public
  • when it is deemed worthy, the trackDb.ra entry can be collapsed back to one stanza, with no release tags (although leaving release alpha,beta,public in there would have the exact same effect):
track someTrack
shortLabel Great RNAs
visibility pack
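The effect of the release tags can be sketched in a few lines. This is not the hgTrackDb implementation; it is a minimal illustration that assumes stanzas are separated by blank lines and that a stanza with no release tag applies to all three states. The function names are hypothetical.

```python
ALL_STATES = frozenset({"alpha", "beta", "public"})


def parse_stanzas(text):
    """Split RA text into stanzas (blank-line separated) of key -> value."""
    stanzas, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # skip comment lines
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def releases(stanza):
    """Return the states a stanza applies to; no tag means every state."""
    tag = stanza.get("release")
    if tag is None:
        return ALL_STATES
    return frozenset(part.strip() for part in tag.split(",") if part.strip())


def visible(stanzas, state):
    """Keep only the stanzas that would be loaded for the given state."""
    return [s for s in stanzas if state in releases(s)]


EXAMPLE = """\
track someTrack
release alpha
shortLabel Great RNAs
visibility pack

track someTrack
release beta,public
shortLabel Mediocre RNAs
visibility hide
"""

stanzas = parse_stanzas(EXAMPLE)
alpha_labels = [s["shortLabel"] for s in visible(stanzas, "alpha")]
public_labels = [s["shortLabel"] for s in visible(stanzas, "public")]
```

With the two-stanza example, only the alpha build sees "Great RNAs", while beta and public still see "Mediocre RNAs".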

Managing large composite tracks

To make large composite tracks easier to manage, each one will be moved to its own file along with all of its contained track stanzas. To minimize the amount of editing required, the include directive will be modified to accept a release attribute. Since includes are processed line by line, not as part of a stanza, an attribute on the include line is an easier approach than adding a release tag.

For example, if a developer created the file encGencode.ra, the following line could be added to trackDb.ra (or trackDb.wgEncode.ra, or whatever):

include encGencode.ra alpha

When QA gets it, this will become:

include encGencode.ra alpha,beta

And when it is released:

include encGencode.ra alpha,beta,public
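How such an include line might be interpreted is sketched below. Since the directive is only proposed here, this is an assumption about its behavior rather than the real parser: `parse_include` and `files_for` are hypothetical names, a missing attribute is treated as alpha,beta,public, and anything after a `#` is treated as a comment.

```python
ALL_STATES = frozenset({"alpha", "beta", "public"})


def parse_include(line):
    """Parse 'include <file> [releases]' into (file name, set of states)."""
    # Assumption: text after '#' is a trailing comment and is ignored.
    words = line.split("#", 1)[0].split()
    if len(words) < 2 or words[0] != "include":
        raise ValueError(f"not an include directive: {line!r}")
    if len(words) == 2:
        # No release attribute: equivalent to alpha,beta,public.
        return words[1], ALL_STATES
    states = frozenset(s for s in words[2].split(",") if s)
    unknown = states - ALL_STATES
    if unknown:
        raise ValueError(f"unknown release state(s): {sorted(unknown)}")
    return words[1], states


def files_for(lines, state):
    """List the files that would be included when building the given state."""
    return [name for name, states in map(parse_include, lines) if state in states]


# A developer's first check-in is visible only to the alpha build.
dev_lines = ["include encGencode.ra alpha"]
alpha_files = files_for(dev_lines, "alpha")
public_files = files_for(dev_lines, "public")
```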

Updating existing public tracks

When changes need to be made to an already-released composite track, the composite trackDb file is copied to a new file name and added to CVS. Two copies of the entire file will then exist, and trackDb.ra will look like:

 include encGencode.new.ra alpha
 include encGencode.ra beta,public

When the track is in QA and staged on hgwbeta, this will become:

 include encGencode.new.ra alpha,beta
 include encGencode.ra public

To release the new version, QA will:

 cp encGencode.new.ra encGencode.ra
 cvs commit encGencode.ra

And change the include line back to:

 include encGencode.ra

The changes won't leak to the RR before QA approves them and removes the release restrictions on the include directive (which is equivalent to a release restriction of alpha,beta,public).

Updating a track that is mid-QA

Often, a developer needs to start more work on a track before it has been through the QA process and released, and needs to make those changes visible on genome-test for data contributors. In this case, a third file can be created and committed to CVS:

 include encGencode.latestForContributor.ra alpha  #name is arbitrary
 include encGencode.new.ra beta
 include encGencode.ra public

When QA of the original update is complete and that change is released, the developer is free to create a pushQ entry for it. QA can copy the latest version to encGencode.new.ra when ready to move it to beta. Once the latest changes have been committed to CVS as encGencode.new.ra and there are no include directives for the arbitrarily-named encGencode.latestForContributor.ra file, the latter can be removed from CVS.

Migration Plan

The goal of the migration plan is to minimize any disruption in tracks currently being developed.

Changes that can be made while the current release model continues to function:

  • Check for any existing tables where tracks have overlapping tags. For instance, there are composite tracks where some tables have the beta tag and some have no tag. Correct any that are found.
  • Add generic release tag changes to the library code and hgTrackDb program
  • Modify all instances of release beta to be release beta,public.
  • Modify trackDb/makefile where make strict requests the beta,public tags
  • Set up hgwbeta-public
  • Convert ENCODE composite tracks to one per include.
  • edit any trackDb that is staged on hgwbeta but not yet released back to release beta (remove the public)
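The first check in the list above (composites whose tables carry inconsistent tags) could be scripted along these lines. This is a sketch under stated assumptions: it reads "overlapping tags" as "tables of one composite do not all share the same release set", it takes already-extracted (composite, table, release tag) rows as input, and the composite and table names in the example are made up.

```python
from collections import defaultdict

ALL_STATES = frozenset({"alpha", "beta", "public"})


def release_set(tag):
    """Turn a release tag string (or None for no tag) into a set of states."""
    if tag is None:
        return ALL_STATES
    return frozenset(part.strip() for part in tag.split(",") if part.strip())


def mixed_tag_composites(rows):
    """Return composites whose tables do not all share one release set.

    rows: iterable of (composite name, table name, release tag or None).
    """
    seen = defaultdict(set)
    for composite, _table, tag in rows:
        seen[composite].add(release_set(tag))
    return sorted(name for name, sets in seen.items() if len(sets) > 1)


# Hypothetical input: compositeA mixes a beta-tagged table with an untagged one.
rows = [
    ("compositeA", "tableA1", "beta"),
    ("compositeA", "tableA2", None),
    ("compositeB", "tableB1", "beta"),
    ("compositeB", "tableB2", "beta"),
]
to_fix = mixed_tag_composites(rows)
```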

At this point, all mechanisms are in place:

  • replace make strict with make beta and make public
  • switch to new methodology