Editing the human trackDb.ra file


This might be out of date as include statements often control these stanzas now.

Human assemblies do not display chain/net tracks the way other assemblies do. In most assemblies, each chain/net track is listed individually in the Browser's track listings. In human assemblies, they are grouped into three composite tracks called "Primate Chain/Net", "Placental Chain/Net" and "Vertebrate Chain/Net". To add a new chain/net track to one of these composite tracks, so that it does not appear as an individual track, you must edit the human trackDb.ra file.

Location of the file

The path to trackDb is /kent/src/hg/makeDb/trackDb, but for simplicity's sake, when referring to trackDb, I will just say /trackDb.

As with trackDb.chainNet.ra, there are two versions of trackDb.ra. One is located in the /trackDb/[organism] directory and one is located in the /trackDb/[organism]/$db directory. For example, for hg19 there is a file in /trackDb/human and a file in /trackDb/human/hg19. The file in /trackDb/human/hg19 applies only to hg19, whereas the file in /trackDb/human applies to every human assembly: previous ones, hg19, and any future ones. Unlike with trackDb.chainNet.ra, make sure you edit the version of trackDb.ra in the database-level directory (here, /trackDb/human/hg19).
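To make the two levels concrete, here is a minimal sketch that builds both paths from an organism and a database name. The function name track_db_ra_paths is made up for illustration, and the root is the path quoted above, which you may need to adjust to wherever your kent checkout lives.

```python
from pathlib import PurePosixPath

# Root of the trackDb tree as quoted in the text; adjust to your own checkout.
TRACKDB_ROOT = PurePosixPath("/kent/src/hg/makeDb/trackDb")


def track_db_ra_paths(organism, db):
    """Return the organism-level and database-level trackDb.ra paths."""
    organism_dir = TRACKDB_ROOT / organism
    return {
        "organism": organism_dir / "trackDb.ra",      # applies to every assembly
        "database": organism_dir / db / "trackDb.ra",  # applies to one assembly
    }


paths = track_db_ra_paths("human", "hg19")
print(paths["database"])  # the file this page tells you to edit
```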

Structure of the file

The trackDb.ra file is long and complex, but there are only three sections to be concerned with:

  • primateChainNet
  • placentalChainNet
  • vertebrateChainNet

The organism you are adding belongs in one of those three sections. Whichever section it is, three entries need to be modified (if you're replacing an older assembly) or added (if you're adding a brand new assembly that does not yet exist in trackDb.ra):

  • The subGroup2 line
  • The track chain$db entry
  • The track net$db entry
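If you would rather locate these entries with a script than by scrolling, the following is a rough sketch. It assumes stanzas are separated by blank lines and that each setting is a name followed by its value, and it ignores backslash line continuations. The helper names are invented for this example, and SAMPLE is a shortened excerpt, not the real file.

```python
def split_stanzas(text):
    """Split .ra text into stanzas: runs of non-blank lines."""
    stanzas, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)
    return stanzas


def stanza_fields(stanza):
    """Map each setting name to its value, skipping comment lines."""
    fields = {}
    for line in stanza:
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        key, _, value = stripped.partition(" ")
        fields[key] = value.strip()
    return fields


def find_track(stanzas, name):
    """Return the fields of the stanza whose track setting equals name."""
    for stanza in stanzas:
        fields = stanza_fields(stanza)
        if fields.get("track") == name:
            return fields
    return None


# Shortened excerpt for illustration; the real subGroup2 line is much longer.
SAMPLE = """\
track placentalChainNet
compositeTrack on
subGroup2 species Species kSheep=Sheep lCow=Cow mPig=Pig

track chainBosTau4
subTrack placentalChainNetViewchain off
subGroups view=chain species=lCow clade=bLaura
type chain bosTau4
otherDb bosTau4

track netBosTau4
subTrack placentalChainNetViewnet on
subGroups view=net species=lCow clade=bLaura
type netAlign bosTau4 chainBosTau4
otherDb bosTau4
"""

stanzas = split_stanzas(SAMPLE)
print(find_track(stanzas, "placentalChainNet")["subGroup2"])
print(find_track(stanzas, "chainBosTau4")["otherDb"])
print(find_track(stanzas, "netBosTau4")["type"])
```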

Adding an assembly that replaces an old assembly

Adding an assembly to replace an old one is fairly simple. As an example, let's assume you're replacing bosTau4 with bosTau6.

1. Search the file for "Cow" which should bring you to the top of the placentalChainNet section. You will see that Cow already conveniently exists in the subGroup2 line, so you don't need to add or change anything there:

track placentalChainNet
compositeTrack on
shortLabel Placental Chain/Net
longLabel Non-primate Placental Mammal Genomes, Chain and Net Alignments
subGroup1 view Views chain=Chains net=Nets
subGroup2 species Species aTree_Shrew=Tree_Shrew bMouse=Mouse cRat=Rat dKangaroo=Kangaroo_Rat eGuinea=Guinea_Pig fSquirrel=Squirrel gRabbit=Rabbit hPika=Pika iAlpaca=Alpaca jDolphin=Dolphin kSheep=Sheep lCow=Cow mPig=Pig nHorse=Horse oCat=Cat pDog=Dog qPanda=Panda rMicrobat=Microbat sMegabat=Megabat tHedgehog=Hedgehog uShrew=Shrew vElephant=Elephant wRock_Hyrax=Rock_Hyrax xTenrec=Tenrec yArmadillo=Armadillo zSloth=Sloth
subGroup3 clade Clade aEuarch=Euarchontoglires bLaura=Laurasiatheria cAfro=Afrotheria dXenar=Xenarthra

2. Open the Placental Chain/Net description page in the Browser. In the track listing below the box containing all of the organisms, the entry for Cow should say bosTau4.

3. Scroll down to the track chainBosTau4 entry and replace every instance of "bosTau4" with "bosTau6". There are only 3 instances in this entry, including the capitalized "BosTau4" inside the track name, which becomes "BosTau6":

track chainBosTau4
subTrack placentalChainNetViewchain off
subGroups view=chain species=lCow clade=bLaura
shortLabel $o_Organism Chain
longLabel $o_Organism ($o_date) Chained Alignments
type chain bosTau4
otherDb bosTau4

4. Scroll down to the track netBosTau4 entry and again replace every instance of "bosTau4" with "bosTau6". There are 4 instances in this entry, two of them capitalized inside the names netBosTau4 and chainBosTau4:

track netBosTau4
subTrack placentalChainNetViewnet on
subGroups view=net species=lCow clade=bLaura
shortLabel $o_Organism Net
longLabel $o_Organism ($o_date) Alignment Net
type netAlign bosTau4 chainBosTau4
otherDb bosTau4

5. Save the file and do a make alpha.

6. Go back to the Placental Chain/Net description page and refresh the page. Verify that bosTau6 now appears in the track listing.

7. Go back to the main hg19 Browser page and verify that neither bosTau4 nor bosTau6 (or Cow) appears as an individual Chain/Net track on that page.

8. If everything looks good, commit and push your changes.
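The replacements in steps 3 and 4 can also be sketched in code, which makes the capitalization issue explicit: the track names contain "BosTau4" while the type and otherDb settings contain "bosTau4". This is only an illustration on copies of the two entries shown above. The function name rename_assembly is made up, and nothing here reads or writes trackDb.ra.

```python
import re


def cap_first(db):
    """bosTau4 -> BosTau4, the form embedded in names such as chainBosTau4."""
    return db[0].upper() + db[1:]


def rename_assembly(stanza, old_db, new_db):
    """Return (new_stanza, count) with every old_db reference changed to new_db."""
    # Capitalized form inside track names (chainBosTau4, netBosTau4).
    stanza, n_cap = re.subn(re.escape(cap_first(old_db)) + r"\b",
                            cap_first(new_db), stanza)
    # Bare database name in the type and otherDb settings.
    stanza, n_bare = re.subn(r"\b" + re.escape(old_db) + r"\b", new_db, stanza)
    return stanza, n_cap + n_bare


CHAIN = """\
track chainBosTau4
subTrack placentalChainNetViewchain off
subGroups view=chain species=lCow clade=bLaura
shortLabel $o_Organism Chain
longLabel $o_Organism ($o_date) Chained Alignments
type chain bosTau4
otherDb bosTau4
"""

NET = """\
track netBosTau4
subTrack placentalChainNetViewnet on
subGroups view=net species=lCow clade=bLaura
shortLabel $o_Organism Net
longLabel $o_Organism ($o_date) Alignment Net
type netAlign bosTau4 chainBosTau4
otherDb bosTau4
"""

new_chain, n_chain = rename_assembly(CHAIN, "bosTau4", "bosTau6")
new_net, n_net = rename_assembly(NET, "bosTau4", "bosTau6")
print(n_chain, n_net)  # 3 4, matching the counts given in steps 3 and 4
print(new_net)
```

Comparing the returned count against the number you expect is a cheap way to catch a missed or extra replacement before running make alpha.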

Adding a new assembly that does not exist in trackDb.ra

This is a bit more complicated than simply modifying entries that already exist in the file. As an example, let's say we're adding nomLeu1.

1. Determine which of the 3 sections your organism belongs in - primate, placental or vertebrate. It should be fairly obvious, but if you're not sure, ask someone. In this instance, we're adding Gibbon which belongs in the primate section.

2. Search the file for "primateChainNet" which should take you to the top of the primateChainNet section. Gibbon will not appear in the subGroup2 line, so it will need to be added. The key here is to add it in the proper place in the list. The order of the list is determined by the phylogenetic tree generated by the data in /kent/src/hg/utils/phyloTrees/##way.nh where ## represents the highest number in the file list in that directory.

3. Change to the phyloTrees directory and enter the command "cat ##way.nh", substituting the number from step 2 for ##. Copy the output.

4. Go to the Phylogenetic Tree Gif Maker utility at http://genome.ucsc.edu/cgi-bin/phyloGif. Change both the width and height to 1000 so the resulting tree is actually visible. Clear any existing text from the large text box and then paste the output from step 3 into it. Click the submit button to generate the tree.

5. The order of organisms listed on the right side of the tree is the order in which they should appear in the composite Chain/Net tracks.

6. Now go back to the trackDb.ra file and insert Gibbon into the subGroup2 line. In the tree, nomLeu1 appears between ponAbe2 and rheMac2, so Gibbon should be inserted between Orangutan and Rhesus. Gibbon takes the letter "d", so Rhesus becomes "e", Baboon becomes "f", and so on down the rest of the list. Before the change:

subGroup2 species Species aChimp=Chimp bGorilla=Gorilla cOrangutan=Orangutan dRhesus=Rhesus eBaboon=Baboon fMarmoset=Marmoset gTarsier=Tarsier hMouse_lemur=Mouse_lemur iBushbaby=Bushbaby

After the change:

subGroup2 species Species aChimp=Chimp bGorilla=Gorilla cOrangutan=Orangutan dGibbon=Gibbon eRhesus=Rhesus fBaboon=Baboon gMarmoset=Marmoset hTarsier=Tarsier iMouse_lemur=Mouse_lemur jBushbaby=Bushbaby

7. Next, you will need to add a track chainNomLeu1 entry. The easiest thing to do is to just copy an existing entry from another organism. The important thing here is to make sure the organism you're adding is assigned to the proper clade (in the subGroups line). Examine the actual Primate Chain/Net description page and notice how the clades are grouped together as you go down the list. The clade will usually be determined by an organism's position in the list, but if you're not sure, ask someone.

track chainNomLeu1
subTrack primateChainNetViewchain off
subGroups view=chain species=dGibbon clade=aHom
shortLabel $o_Organism Chain
longLabel $o_Organism ($o_date) Chained Alignments
type chain nomLeu1
otherDb nomLeu1

8. Repeat step 7 to add a track netNomLeu1 entry.

track netNomLeu1
subTrack primateChainNetViewnet off
subGroups view=net species=dGibbon clade=aHom
shortLabel $o_Organism Net
longLabel $o_Organism ($o_date) Alignment Net
type netAlign nomLeu1 chainNomLeu1
otherDb nomLeu1

9. If you copied the track chainNomLeu1 and track netNomLeu1 entries from another organism's entry, be sure to replace every instance of the other organism's database name with nomLeu1 (or NomLeu1 where it appears inside a track name). Be sure the clade designation is correct and be sure you change the species tag to "dGibbon".

10. Note the subTrack line of both the track chainNomLeu1 and track netNomLeu1 entries. The "on" or "off" switch at the end of the line determines whether or not the checkboxes are checked by default in the Primate Chain/Net description page.

11. Go through the list of track chain$db entries and be sure to change the species tag in the subGroups line to the correct letter for any species whose letter was changed by the insertion of Gibbon in step 6. For example, species=dRhesus becomes species=eRhesus.

12. Repeat step 11 for the list of track net$db entries.

13. Save the file and do a make alpha.

14. Go back to the Primate Chain/Net description page and refresh the page. Verify that Gibbon now appears in the box containing all of the organisms at the top of the page. Verify that it appears in the correct place in the box. Verify that Gibbon now appears in the correct place in the track listing below the box.

15. Go back to the main hg19 Browser page and verify that nomLeu1 (or Gibbon) does not appear as an individual Chain/Net track on that page.

16. If everything looks good, commit and push your changes.
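The re-lettering in steps 6, 11 and 12 is easy to get wrong by hand, so here is a sketch of the bookkeeping. All three function names are made up for illustration. TOY_TREE is an invented five-leaf tree, not the contents of any ##way.nh file, and leaf_order assumes the order of leaves in the file is the order the tree is drawn in. The tree only gives database names, so mapping nomLeu1 to "Gibbon" is still up to you.

```python
import re
import string


def leaf_order(newick):
    """Leaf labels in file order: a leaf name directly follows '(' or ','."""
    return re.findall(r"[(,]\s*([A-Za-z_]\w*)", newick)


def insert_species(subgroup2_line, new_name, after_name):
    """Insert new_name after after_name in a subGroup2 line and re-letter the tags.

    Returns the new line and a dict of old tag -> new tag for every changed tag.
    """
    words = subgroup2_line.split()
    head, members = words[:3], [w.split("=", 1) for w in words[3:]]
    labels = [label for _tag, label in members]
    position = labels.index(after_name) + 1
    # Placeholder first character for the new species; replaced by its letter below.
    members.insert(position, ["?" + new_name, new_name])
    if len(members) > len(string.ascii_lowercase):
        raise ValueError("more than 26 species: single-letter prefixes run out")
    renames, out = {}, []
    for letter, (tag, label) in zip(string.ascii_lowercase, members):
        new_tag = letter + tag[1:]
        if tag != new_tag and not tag.startswith("?"):
            renames[tag] = new_tag
        out.append(new_tag + "=" + label)
    return " ".join(head + out), renames


def retag_subgroups(text, renames):
    """Apply the tag renames to every species=... value in a single pass."""
    return re.sub(r"(?<=species=)\w+",
                  lambda m: renames.get(m.group(0), m.group(0)), text)


# Invented toy tree with made-up branch lengths, for illustration only.
TOY_TREE = "(((hg19:0.01,panTro2:0.01):0.02,ponAbe2:0.03):0.01,nomLeu1:0.04,rheMac2:0.05);"
print(leaf_order(TOY_TREE))

BEFORE = ("subGroup2 species Species aChimp=Chimp bGorilla=Gorilla "
          "cOrangutan=Orangutan dRhesus=Rhesus eBaboon=Baboon fMarmoset=Marmoset "
          "gTarsier=Tarsier hMouse_lemur=Mouse_lemur iBushbaby=Bushbaby")
new_line, renames = insert_species(BEFORE, "Gibbon", "Orangutan")
print(new_line)

# A subGroups line with the clade setting left out, since it does not change.
print(retag_subgroups("subGroups view=chain species=dRhesus", renames))
```

The renames dictionary doubles as a checklist for steps 11 and 12: every old tag in it must be updated in both the chain and the net entry for that species. Note that the placental section shown earlier already uses all 26 letters, which is why the sketch refuses to go past "z".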