Editing trackDb.chainNet.ra

From Genecats
Revision as of 00:11, 29 November 2011 by Steve (New page documenting how to edit a trackDb.chainNet.ra file)

When you perform QA on chains and nets and the output of getChainLines.csh and/or getMatrixLines.csh does not match the description page, you need to directly edit the trackDb.chainNet.ra file of the organism being aligned to. For example, if you are adding a Gorilla Chain/Net track to the marmoset assembly, you need to edit the marmoset trackDb.chainNet.ra file.

Location of the file

The path to trackDb is /kent/src/hg/makeDb/trackDb. For brevity, the rest of this page refers to it as /trackDb.

Each organism typically has two trackDb.chainNet.ra files: one in the /trackDb/[organism] directory and one in the /trackDb/[organism]/$db directory. For example, the latest marmoset assembly has a file in /trackDb/marmoset and a file in /trackDb/marmoset/calJac3. The file in /trackDb/marmoset/calJac3 applies only to calJac3, whereas the file in /trackDb/marmoset applies to calJac1, calJac3, and any future release of the marmoset. Unless there is a specific reason to put an entry in the database-level file, it is fine to make additions to the organism-level file, which typically contains most if not all of the entries.

Structure of the file

The trackDb.chainNet.ra file contains stanzas that represent chain/net tracks for other organisms. Here is an example of some entries you might see in a trackDb.chainNet.ra file:

track chainNetGorGor3 override
priority 220.3
matrix 16 90,-330,-236,-356,-330,100,-318,-236,-236,-318,100,-330,-356,-236,-330,90

track chainNetGorGor1 override
priority 220.3
shortLabel $o_db Chain/Net

track chainNetPonAbe2 override
priority 230.3
chainMinScore 5000

track chainNetNomLeu1 override
priority 235.3
matrix 16 90,-330,-236,-356,-330,100,-318,-236,-236,-318,100,-330,-356,-236,-330,90
chainMinScore 5000
chainLinearGap medium

Note that stanzas do not all contain the same entries, and there is no rule as to which entries must be present in an individual stanza. The only requirements are that each stanza is declared in the format "track chainNet[$db] override" and that stanzas are separated by a blank line.
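
To make the stanza layout concrete, here is a minimal Python sketch that splits the text of a trackDb.chainNet.ra file into stanzas and checks the declaration format described above. It is not the parser the Browser uses. The function names parse_ra and is_override_declaration are hypothetical, and treating lines that start with "#" as comments is an assumption.

```python
def parse_ra(text):
    """Split .ra text into stanzas, returning one dict per stanza."""
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the stanza in progress.
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # assumption: '#' starts a comment line
        # First word is the entry name, the rest of the line is its value.
        parts = line.split(None, 1)
        current[parts[0]] = parts[1] if len(parts) > 1 else ""
    if current:
        stanzas.append(current)
    return stanzas


def is_override_declaration(stanza):
    """True if the stanza is declared as 'track chainNet[$db] override'."""
    parts = stanza.get("track", "").split()
    return (
        len(parts) == 2
        and parts[0].startswith("chainNet")
        and len(parts[0]) > len("chainNet")
        and parts[1] == "override"
    )


SAMPLE = """\
track chainNetGorGor1 override
priority 220.3
shortLabel $o_db Chain/Net

track chainNetPonAbe2 override
priority 230.3
chainMinScore 5000
"""

stanzas = parse_ra(SAMPLE)
for stanza in stanzas:
    print(stanza["track"], "->", sorted(k for k in stanza if k != "track"))
```

Because each stanza becomes a plain dictionary, a missing entry is simply a missing key, which matches the point that no entry other than the track line is required.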

priority

This number represents the evolutionary distance between the two organisms being compared. It also determines the order of the track listings on the Browser page. Note in the example above that gorGor1 and gorGor3 have the same priority value (since they are just different releases of the same organism) which differs from the priority values for ponAbe2 and nomLeu1. When adding a new assembly where a previous assembly exists, you can just copy the priority value from the previous assembly. If you are adding a new assembly where a previous assembly does not exist and you are unsure of the priority value, you can run the following command:

hgwdev> chainNet.pl $db

So if you were adding chain/net tracks to the marmoset assembly and you wanted the priority values, you would type "chainNet.pl calJac3".

chainMinScore and chainLinearGap

These values represent the output of getChainLines.csh. If the output of getChainLines.csh does not match the description page, you need to add one or both of these lines to the override stanza. When in doubt, always trust the output of getChainLines.csh.

matrix

This value represents the output of getMatrixLines.csh. If the output of getMatrixLines.csh does not match the description page, you need to add this line to the override stanza. The simplest thing to do is to just paste the output of getMatrixLines.csh directly into trackDb.chainNet.ra. When in doubt, always trust the output of getMatrixLines.csh.
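
Before pasting a matrix line, it can help to sanity-check it. The sketch below assumes, based only on the example stanzas above, that the number after "matrix" is the count of comma-separated scores that follow. The 4x4 row layout is also an assumption, and the function names are hypothetical.

```python
def parse_matrix_line(line):
    """Return the scores from a 'matrix N s1,s2,...' line as a list of ints."""
    parts = line.split()
    if len(parts) != 3 or parts[0] != "matrix":
        raise ValueError("expected 'matrix <count> <comma-separated scores>'")
    declared = int(parts[1])
    scores = [int(value) for value in parts[2].split(",")]
    # Assumption: the declared number is the count of scores that follow.
    if len(scores) != declared:
        raise ValueError(f"declared {declared} scores, found {len(scores)}")
    return scores


def as_rows(scores, width=4):
    """Group the flat score list into rows (assumed 4 per row)."""
    return [scores[i:i + width] for i in range(0, len(scores), width)]


LINE = "matrix 16 90,-330,-236,-356,-330,100,-318,-236,-236,-318,100,-330,-356,-236,-330,90"

scores = parse_matrix_line(LINE)
for row in as_rows(scores):
    print(row)
```

A line that was truncated during copy and paste fails the count check and raises ValueError instead of silently ending up in the file.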

shortLabel

This value represents the track name that is displayed in the Browser's track listings. When no shortLabel entry is present, the track name defaults to one that includes the organism name. In the example above, there is no shortLabel entry in the nomLeu1 override stanza, so the track name would be displayed as "Gibbon Chain/Net". In the gorGor1 override stanza, the $o_db tag in the shortLabel entry forces that track name to be displayed as "gorGor1 Chain/Net".
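
The labeling rule can be sketched in a few lines of Python. This only mimics the behavior described here and is not how the Browser itself resolves labels. The function display_label and its arguments are hypothetical, and the organism name is passed in by hand because this page does not say where the Browser looks it up.

```python
def display_label(stanza, organism_name, other_db):
    """Return the track name implied by a stanza's shortLabel entry, if any."""
    label = stanza.get("shortLabel")
    if label is None:
        # No shortLabel entry: the default name uses the organism name.
        return f"{organism_name} Chain/Net"
    # A $o_db tag is replaced by the database name of the other organism.
    return label.replace("$o_db", other_db)


gorgor1 = {"track": "chainNetGorGor1 override", "shortLabel": "$o_db Chain/Net"}
nomleu1 = {"track": "chainNetNomLeu1 override", "priority": "235.3"}

print(display_label(gorgor1, "Gorilla", "gorGor1"))
print(display_label(nomleu1, "Gibbon", "nomLeu1"))
```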

Replacing an old assembly with a new assembly

When replacing an old assembly with a new assembly, the old tables (and thus the old track) are removed from beta and the RR once it is verified that the new track functions properly. The old tables and track are left on dev, however, potentially resulting in multiple chains/nets from the same organism. As a rule, only the most recent assembly bears the organism name in the track name, so in the case of gorGor1 and gorGor3, gorGor1 would be displayed as "gorGor1 Chain/Net" and gorGor3 would be displayed as "Gorilla Chain/Net". Any time a new assembly chain/net track is added, a new stanza should be added for the new assembly and the previous assembly's stanza should have a shortLabel entry added to maintain this convention.

In a case where there are multiple chain/net tracks from the same organism, each same-organism track should appear sequentially in the track listing. If they do not, the priority entry is either missing or incorrect for one or more of the override stanzas.
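
One way to catch this before looking at the Browser is to group stanzas by organism and flag any group whose priority values differ or are missing. The sketch below assumes that releases of the same organism share a database name up to the trailing release number (for example gorGor1 and gorGor3). The function names are hypothetical, and the stanzas are plain dictionaries keyed by entry name.

```python
import re
from collections import defaultdict


def organism_prefix(track_value):
    """'chainNetGorGor3 override' -> 'GorGor' (release number stripped)."""
    name = track_value.split()[0]
    if name.startswith("chainNet"):
        name = name[len("chainNet"):]
    # Assumption: the trailing digits are the assembly release number.
    return re.sub(r"\d+$", "", name)


def inconsistent_priorities(stanzas):
    """Map each organism prefix to its priority values when they disagree."""
    seen = defaultdict(set)
    for stanza in stanzas:
        # A missing priority entry is recorded as None so it gets flagged too.
        seen[organism_prefix(stanza["track"])].add(stanza.get("priority"))
    return {org: values for org, values in seen.items() if len(values) > 1}


stanzas = [
    {"track": "chainNetGorGor3 override", "priority": "220.3"},
    {"track": "chainNetGorGor1 override"},  # priority entry missing
    {"track": "chainNetPonAbe2 override", "priority": "230.3"},
]

for org, values in inconsistent_priorities(stanzas).items():
    print(org, "has mismatched or missing priority entries")
```

An empty result means every group of same-organism stanzas agrees on a single priority value.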