Editing trackDb.chainNet.ra

From Genecats


When performing QA on chains and nets, you will need to edit a trackDb.chainNet.ra file directly if the output of getChainLines.csh and/or getMatrixLines.csh does not match the description page, or if the chain/net track names are incorrect or out of order. The file to edit is the one belonging to the organism being aligned to. For example, if you are adding a Gorilla Chain/Net track to the marmoset assembly, you would need to edit the marmoset trackDb.chainNet.ra file.

Location of the file

The full path to trackDb is /kent/src/hg/makeDb/trackDb. For simplicity's sake, the rest of this page refers to it as /trackDb.

Each organism typically has two trackDb.chainNet.ra files: one in the /trackDb/[organism] directory and one in the /trackDb/[organism]/$db directory. For example, the latest marmoset assembly has a file in /trackDb/marmoset and a file in /trackDb/marmoset/calJac3. The file in /trackDb/marmoset/calJac3 applies only to calJac3, whereas the file in /trackDb/marmoset applies to calJac1, calJac3, and any future release of the marmoset. Unless there is a specific reason to put an entry into the database-level file, it is fine to add it to the organism-level file. You will likely notice that the organism-level file contains most, if not all, of the entries.
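The two locations described above can be sketched in a few lines of Python. This is only an illustration: the helper name candidate_files is hypothetical, and the root path is the one quoted on this page, which may differ in your checkout.

```python
from pathlib import PurePosixPath

# Root as quoted on this page; adjust to your own checkout (assumption).
TRACKDB_ROOT = PurePosixPath("/kent/src/hg/makeDb/trackDb")
RA_NAME = "trackDb.chainNet.ra"

def candidate_files(organism, db, root=TRACKDB_ROOT):
    """Return (organism-level file, database-level file) for one assembly.

    The organism-level file comes first because that is where entries
    normally go unless there is a specific reason to use the other one.
    """
    organism_level = root / organism / RA_NAME
    database_level = root / organism / db / RA_NAME
    return organism_level, database_level

for path in candidate_files("marmoset", "calJac3"):
    print(path)
```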

Structure of the file

The trackDb.chainNet.ra file contains stanzas that represent chain/net tracks for other organisms. Here is an example of some stanzas you might see in a trackDb.chainNet.ra file:

track chainNetGorGor3 override
priority 220.3
matrix 16 90,-330,-236,-356,-330,100,-318,-236,-236,-318,100,-330,-356,-236,-330,90

track chainNetGorGor1 override
priority 220.3
shortLabel $o_db Chain/Net

track chainNetPonAbe2 override
priority 230.3
chainMinScore 5000

track chainNetNomLeu1 override
priority 235.3
matrix 16 90,-330,-236,-356,-330,100,-318,-236,-236,-318,100,-330,-356,-236,-330,90
chainMinScore 5000
chainLinearGap medium

Note that stanzas do not all contain the same entries, and the order of entries within a stanza is unimportant. The only entries every stanza should have are a stanza declaration in the form of "track chainNet$db override" (with the first letter of the database name capitalized, as in chainNetGorGor3) and a priority value. Beyond that, you only need to add whatever entries are required for the chain/net track to be correct. Before you add a new stanza, be sure that it does not already exist in either trackDb.chainNet.ra file.
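One way to check whether a stanza already exists is to split the file into stanzas and look for the track name. The following is a minimal sketch, not an official tool: parse_ra and has_stanza are hypothetical helpers, and the parser assumes that stanzas are separated by blank lines and that lines starting with "#" are comments.

```python
def parse_ra(text):
    """Split .ra text into a list of stanzas, each a dict of key -> value."""
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # assumed comment syntax; skipped without ending the stanza
        if not line:
            # A blank line closes the stanza collected so far.
            if current:
                stanzas.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas

def has_stanza(stanzas, db):
    """True if a chainNet stanza for db (e.g. 'gorGor3') is present."""
    wanted = "chainNet" + db[0].upper() + db[1:]
    return any(s.get("track", "").split()[:1] == [wanted] for s in stanzas)

SAMPLE = """track chainNetGorGor3 override
priority 220.3

track chainNetPonAbe2 override
priority 230.3
chainMinScore 5000
"""

stanzas = parse_ra(SAMPLE)
print(has_stanza(stanzas, "gorGor3"))  # True
print(has_stanza(stanzas, "nomLeu1"))  # False
```

Run the same check against both the organism-level and the database-level file before adding anything.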

priority

This value represents the evolutionary distance between the two organisms being compared and also determines the order of the track listings on the Browser page. Note in the example above that gorGor1 and gorGor3 share the same priority value (since they are just different releases of the same organism) which differs from the priority values for ponAbe2 and nomLeu1. When adding a new stanza where a stanza from a previous assembly exists, you can just copy the priority value from the previous assembly's stanza. If you are adding a new stanza where a stanza from a previous assembly does not exist and you are unsure of what priority value you should assign, you can run the following command:

hgwdev> chainNet.pl $db

So if you were adding chain/net tracks to the marmoset assembly and you wanted to list the priority values, you would type "chainNet.pl calJac3".
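To see how priority values translate into listing order, the sketch below sorts the example stanzas by priority. It assumes that lower values are listed first and says nothing about how the Browser orders tracks that share a value; the helper listing_order is hypothetical and is not related to chainNet.pl.

```python
# Priority values copied from the example stanzas on this page.
stanzas = [
    {"track": "chainNetNomLeu1", "priority": "235.3"},
    {"track": "chainNetGorGor1", "priority": "220.3"},
    {"track": "chainNetPonAbe2", "priority": "230.3"},
    {"track": "chainNetGorGor3", "priority": "220.3"},
]

def listing_order(stanzas):
    """Track names sorted by ascending priority (assumed listing order).

    sorted() is stable, so tracks with equal priority keep their input order.
    """
    ranked = sorted(stanzas, key=lambda s: float(s["priority"]))
    return [s["track"] for s in ranked]

print(listing_order(stanzas))
```

Because gorGor1 and gorGor3 share 220.3, they end up next to each other, ahead of ponAbe2 and nomLeu1.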

chainMinScore and chainLinearGap

These values represent the output of getChainLines.csh. If the output of getChainLines.csh does not match the description page, you need to add one or both of these lines to the override stanza. When in doubt, always trust the output of getChainLines.csh.

matrix

This value represents the output of getMatrixLines.csh. If the output of getMatrixLines.csh does not match the description page, you need to add this line to the override stanza. The simplest thing to do is to just paste the output of getMatrixLines.csh directly into trackDb.chainNet.ra. When in doubt, always trust the output of getMatrixLines.csh.
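Before pasting a matrix line, it can help to sanity-check that it was copied intact. The sketch below is a hypothetical helper, not part of the kent tools. It assumes that the first number after "matrix" is the count of scores that follow and that the scores form a square grid, which is consistent with the 16-value lines in the example stanzas.

```python
import math

def parse_matrix_line(line):
    """Parse 'matrix <count> <s1,s2,...>' into a list of rows of ints."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "matrix":
        raise ValueError("expected: matrix <count> <comma-separated scores>")
    count = int(fields[1])
    scores = [int(v) for v in fields[2].split(",")]
    if len(scores) != count:
        raise ValueError(f"declared {count} scores but found {len(scores)}")
    size = math.isqrt(count)
    if size * size != count:
        raise ValueError(f"{count} scores do not form a square grid")
    # Slice the flat list into rows of equal length.
    return [scores[r * size:(r + 1) * size] for r in range(size)]

LINE = ("matrix 16 90,-330,-236,-356,-330,100,-318,-236,"
        "-236,-318,100,-330,-356,-236,-330,90")

for row in parse_matrix_line(LINE):
    print(row)
```

A truncated or mis-pasted line raises ValueError instead of silently producing a wrong entry.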

shortLabel

This value represents the track name that is displayed in the Browser's track listings. When no shortLabel entry exists, the track name defaults to include the organism name. In the example above, there is no shortLabel entry in the nomLeu1 override stanza, so the track name would be displayed as "Gibbon Chain/Net". In the gorGor1 override stanza, the $o_db tag in the shortLabel entry forces that track name to be displayed as "gorGor1 Chain/Net".
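The naming behaviour described above can be mimicked with a small function. This is an illustrative sketch only: display_name is a hypothetical helper, it handles only the $o_db tag mentioned on this page, and it takes the organism name as an argument rather than looking it up.

```python
def display_name(stanza, organism, db):
    """Return the track name the listing would show for one override stanza.

    Without a shortLabel entry the name uses the organism; with one,
    the $o_db tag is replaced by the database name.
    """
    label = stanza.get("shortLabel")
    if label is None:
        return f"{organism} Chain/Net"
    return label.replace("$o_db", db)

nomleu1 = {"track": "chainNetNomLeu1", "priority": "235.3"}
gorgor1 = {"track": "chainNetGorGor1", "priority": "220.3",
           "shortLabel": "$o_db Chain/Net"}

print(display_name(nomleu1, "Gibbon", "nomLeu1"))   # Gibbon Chain/Net
print(display_name(gorgor1, "Gorilla", "gorGor1"))  # gorGor1 Chain/Net
```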

Replacing an old assembly with a new assembly

When a new assembly is released that replaces an old one, the old chain/net tables (and thus the old chain/net track) are removed from beta and the RR once it is verified that the new chain/net track functions properly. The old tables and track are left on dev, however, resulting in multiple chain/net tracks from the same organism in the track listing. As a rule, only the most recent assembly's chain/net track should bear the organism name in the track name, so in the case of gorGor1 and gorGor3, the gorGor1 track should be displayed as "gorGor1 Chain/Net" and the gorGor3 track should be displayed as "Gorilla Chain/Net". Any time a new assembly chain/net track is added, a new override stanza should be added for the new assembly and the previous assembly's stanza should have a shortLabel entry added to maintain this convention.
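The convention above amounts to two changes: add a stanza for the new assembly with the same priority, and add a shortLabel entry to the old assembly's stanza. The sketch below shows that edit on stanzas held as dictionaries. The helpers track_name, replace_assembly and render are hypothetical, and the new stanza may still need matrix, chainMinScore or chainLinearGap entries depending on the script output.

```python
def track_name(db):
    """gorGor3 -> chainNetGorGor3, following the example stanzas."""
    return "chainNet" + db[0].upper() + db[1:]

def replace_assembly(stanzas, old_db, new_db):
    """Return a new stanza list with new_db added and old_db relabelled."""
    old_name, new_name = track_name(old_db), track_name(new_db)
    if any(s["track"] == new_name for s in stanzas):
        raise ValueError(f"{new_name} already has a stanza")
    result = []
    for stanza in stanzas:
        if stanza["track"] == old_name:
            # The new assembly copies the priority and keeps the default name.
            result.append({"track": new_name, "priority": stanza["priority"]})
            # The old assembly is relabelled with its database name.
            stanza = {**stanza, "shortLabel": "$o_db Chain/Net"}
        result.append(stanza)
    return result

def render(stanzas):
    """Format stanzas as .ra text, one blank line between stanzas."""
    blocks = []
    for s in stanzas:
        lines = [f"track {s['track']} override"]
        lines += [f"{k} {v}" for k, v in s.items() if k != "track"]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"

before = [{"track": "chainNetGorGor1", "priority": "220.3"}]
after = replace_assembly(before, "gorGor1", "gorGor3")
print(render(after))
```

For gorGor1 and gorGor3 this produces a chainNetGorGor3 stanza with priority 220.3 followed by the chainNetGorGor1 stanza with its added shortLabel entry.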

In a case where there are multiple chain/net tracks from the same organism, each same-organism track should appear sequentially in the track listing. If they do not, the priority value is most likely either missing or incorrect for one or more of the override stanzas.
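A quick way to spot this problem is to sort the stanzas by priority and check whether any organism shows up in more than one run. The sketch below is a rough heuristic with hypothetical helper names. It assumes that track names start with "chainNet", that two assemblies belong to the same organism when their names differ only in the trailing digits (GorGor1, GorGor3), and that lower priorities are listed first.

```python
import re
from itertools import groupby

def organism_key(track):
    """chainNetGorGor3 -> GorGor (drop the prefix and the trailing digits)."""
    return re.sub(r"\d+$", "", track[len("chainNet"):])

def missing_priority(stanzas):
    """Tracks whose stanza has no priority entry."""
    return [s["track"] for s in stanzas if "priority" not in s]

def scattered_organisms(stanzas):
    """Organisms whose tracks are not adjacent when sorted by priority."""
    ranked = sorted((s for s in stanzas if "priority" in s),
                    key=lambda s: float(s["priority"]))
    # groupby yields one key per consecutive run of the same organism.
    runs = [key for key, _ in
            groupby(ranked, key=lambda s: organism_key(s["track"]))]
    return sorted({key for key in runs if runs.count(key) > 1})

good = [
    {"track": "chainNetGorGor3", "priority": "220.3"},
    {"track": "chainNetGorGor1", "priority": "220.3"},
    {"track": "chainNetPonAbe2", "priority": "230.3"},
    {"track": "chainNetNomLeu1", "priority": "235.3"},
]
# Same stanzas, but gorGor1 has been given a wrong priority value.
bad = [dict(s, priority="240.3") if s["track"] == "chainNetGorGor1" else s
       for s in good]

print(scattered_organisms(good))  # []
print(scattered_organisms(bad))   # ['GorGor']
```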