Local tracks at mirror sites

From genomewiki

Mirror sites that load their own tables may have meta-data shared between tables overwritten when the mirrored tables are updated from UCSC. This page outlines solutions to some of these collisions.

Local track group and track definitions

The UCSC genome browser supports multiple trackDb, hgFindSpec, and grp tables, which are combined at run time to define the set of track groups and tracks for a genome assembly. This allows local track definitions to be stored in local tables that will not be overwritten when databases are updated from UCSC.

For example, a mirror site might have lab-wide tracks configured in the tables:

  • grp_lab - contains the definitions of the track groups for the lab
  • trackDb_lab - contains the lab's track definitions
  • hgFindSpec_lab - contains search specs for the tracks in trackDb_lab. The genome browser locates this table by replacing the sub-string "trackDb" in the name of the trackDb table with the string "hgFindSpec".
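
The naming rule for the search-spec table can be sketched in shell. This is only an illustration of the substitution described above, not the browser's own code; the function name find_spec_table is hypothetical, and the sketch assumes only the first occurrence of "trackDb" is replaced.

```shell
# Derive the hgFindSpec table name that pairs with a trackDb table name
# by replacing the sub-string "trackDb" with "hgFindSpec".
# Assumption: only the first occurrence in the name is replaced.
find_spec_table() {
    printf '%s\n' "$1" | sed 's/trackDb/hgFindSpec/'
}

find_spec_table trackDb_lab   # prints hgFindSpec_lab
find_spec_table trackDb       # prints hgFindSpec
```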

The key concept here is that these tables contain meta-data only for local tracks. This approach differs from that used by the UCSC genome browser staff for developing new tracks, where tables like trackDb_user contain copies of all track definitions, both current and ones they are developing.

These tables are configured in the genome browser hg.conf file using the db.trackDb and db.grp variables. Their values are comma-separated lists of table names. The first occurrence of a track is used. Thus mirror sites can use this facility to override the default settings of standard tracks as well as manage their own tracks. If the tables named in the variables don't exist in a particular genome, they are ignored.

The hg.conf file for the above example contains:

 db.grp=grp_lab,grp
 db.trackDb=trackDb_lab,trackDb

Any number of tables can be supplied in these lists. So if Joe the bioinformatician has his own tracks, his hg.conf could contain:

 db.grp=grp_joe,grp_lab,grp
 db.trackDb=trackDb_joe,trackDb_lab,trackDb
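
The "first occurrence is used" rule can be illustrated with a small shell sketch. It is a simulation only, not how the browser reads its tables: the input is a hypothetical list of "table track" pairs written in the same order as the db.trackDb value, and the track names and the function name resolve_tracks are made up for the example.

```shell
# Read "table track" pairs in db.trackDb order from stdin and print,
# for each track, the first table that defines it.
resolve_tracks() {
    awk '!seen[$2]++ { print $2 " <- " $1 }'
}

# Hypothetical contents of trackDb_joe, trackDb_lab, and trackDb, in that order.
printf '%s\n' \
    'trackDb_joe joeTrack' \
    'trackDb_lab labTrack' \
    'trackDb_lab knownGene' \
    'trackDb knownGene' \
    'trackDb refGene' | resolve_tracks
```

In this made-up input, knownGene is defined in both trackDb_lab and trackDb, so the trackDb_lab definition is the one reported, which is how a mirror would override a standard track.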


The best approach to building local tracks is to create a build system completely divorced from the one in kent/src/hg/makeDb/trackDb/. This directory hierarchy is both very large and oriented to UCSC genome browser development.

To explain the setup, let's assume you have a directory browser/ where you want to build your local track meta-data. It's suggested that you keep the files in this directory in your source control tree and write makefiles to run the commands described here.

Let's assume you have your track information in a directory labTracks/, and under this you have a directory trackDb/. Under this directory, you need an organism and genome structure similar to that in the kent tree. Let's assume you have local tracks for human hg18 and mouse mm9, with tracks in each browser. This would require files such as:

 labTracks/trackDb/trackDb.ra
                   human/trackDb.ra
                         hg18/trackDb.ra
                   mouse/trackDb.ra
                         mm9/trackDb.ra
                            

Note that the trackDb.ra files must exist, although they may be empty. It may also be desirable to add HTML pages for each track. See kent/src/hg/makeDb/trackDb/README for details on configuring tracks.
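
As a minimal sketch, the hierarchy shown above could be created with a few shell commands. The function name make_skeleton is hypothetical, and the organism/assembly pairs are simply the ones from this example; adjust them to your own genomes.

```shell
# Create the trackDb.ra hierarchy under <root>/trackDb for the example
# organisms and assemblies. Missing files are created empty; touch does
# not change the content of files that already exist.
make_skeleton() {
    root=$1
    for pair in human/hg18 mouse/mm9; do
        org=${pair%/*}                      # organism part, e.g. "human"
        mkdir -p "$root/trackDb/$pair"
        touch "$root/trackDb/trackDb.ra" \
              "$root/trackDb/$org/trackDb.ra" \
              "$root/trackDb/$pair/trackDb.ra"
    done
}

make_skeleton labTracks
```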

To load lab tracks for hg18, assuming one is running in the labTracks/ directory and has the kent source tree location in a variable $KENT, run:

 hgTrackDb human hg18 trackDb_lab ${KENT}/src/hg/lib/trackDb.sql ./trackDb
 hgFindSpec human hg18 hgFindSpec_lab ${KENT}/src/hg/lib/hgFindSpec.sql ./trackDb
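
With more than one assembly, the same two commands are repeated per genome. The sketch below only prints the commands so they can be reviewed first; the function name load_cmds and the default value for KENT are assumptions, and the mm9 lines simply follow the pattern of the hg18 commands above.

```shell
# Location of the kent source tree; the default here is only a placeholder.
KENT=${KENT:-$HOME/kent}

# Print the hgTrackDb and hgFindSpec commands for one organism/assembly.
load_cmds() {
    org=$1
    db=$2
    echo "hgTrackDb $org $db trackDb_lab $KENT/src/hg/lib/trackDb.sql ./trackDb"
    echo "hgFindSpec $org $db hgFindSpec_lab $KENT/src/hg/lib/hgFindSpec.sql ./trackDb"
}

load_cmds human hg18
load_cmds mouse mm9
```

Once the printed commands look right, the output can be piped to sh from the labTracks/ directory to run them.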

Group definitions are created using the appropriate SQL commands. The following will create the grp_lab table:

 DROP TABLE IF EXISTS grp_lab;
 CREATE TABLE grp_lab (
     name varchar(255) not null,
     label varchar(255) not null,
     priority float not null,
     PRIMARY KEY(name)
 );
 INSERT grp_lab VALUES("lab", "Lab Tracks", 0.5);

Local MAF tracks

By default, many tracks, including MAF tracks, locate data files using the seq and extFile tables. When these tables are updated from UCSC, local MAF tracks stop functioning.

MAF tracks can be configured to access the external file directly using the following steps.

  1. Create a single MAF file, not split by chromosome. For this example, let's say the location is /gbdb-lab/hg18/ourTrack/ourTrack.maf.
  2. Configure the absolute path to the MAF in the mafFile property in the track's trackDb.ra entry and load the trackDb tables. For example:
    mafFile /gbdb-lab/hg18/ourTrack/ourTrack.maf
    Note that this is not the full stanza for the MAF conservation track. For an example of what MAF trackDb entries look like, see the hg18 conservation track.
  3. Use the -custom option of hgLoadMaf:
    hgLoadMaf -custom -loadFile=/gbdb-lab/hg18/ourTrack/ourTrack.maf hg18 ourTrack
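
Because the mafFile setting must be an absolute path to a single file, it can help to check the path before loading. The following is a sketch of such a pre-flight check, not part of the kent tools; the function name check_maf_path is hypothetical, and the path is the one from the example above.

```shell
# Succeed only if the argument is an absolute path to a readable file.
check_maf_path() {
    case $1 in
        /*) ;;                                         # absolute path
        *)  echo "not an absolute path: $1" >&2; return 1 ;;
    esac
    [ -r "$1" ] || { echo "not readable: $1" >&2; return 1; }
}

maf=/gbdb-lab/hg18/ourTrack/ourTrack.maf
if check_maf_path "$maf"; then
    # Same hgLoadMaf invocation as in step 3.
    hgLoadMaf -custom -loadFile="$maf" hg18 ourTrack
fi
```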