Mm9 multiple alignment
Revision as of 20:21, 21 August 2007
Mouse Mm9 multiple alignment/conservation track
To avoid artifacts in downstream processing of the UCSC multiple alignments, the parameters used in the blastz processing pipeline must be chosen with care. The pipeline has a number of steps and a variety of tunable parameters. This page tracks the parameters used in the alignments as they proceed toward a completed multiple alignment conservation track on the mm9 mouse (NCBI build 37) assembly. I believe we use "syntenic" nets on the organisms that are assembled into chromosomes.
axtChain parameters and end results
name db | chrom count (*) | genome size | tree distance | axtChain minScore | axtChain linearGap | % of mm9 matched | % of other matched by mm9 | done |
---|---|---|---|---|---|---|---|---|
mouse mm9 | 21 | 2654 Mb | 0.0 | | | | | |
rat rn4 | 21 | 2718 Mb | 0.160657 | 3000 | medium | 68.357 | 69.541 | 16 August |
human hg18 | 24 | 3080 Mb | 0.452619 | 3000 | medium | 38.499 | 35.201 | 16 August |
Rhesus rheMac2 | 22 | 2864 Mb | 0.452745 | 3000 | medium | xx.123 | xx.456 | tbd |
Orangutan ponAbe1 | 79553 | 3240 Mb | 0.453809 | 3000 | medium | xx.123 | xx.456 | tbd |
Marmoset calJac1 | 49724 | 3029 Mb | 0.454272 | 3000 | medium | xx.123 | xx.456 | tbd |
Chimp panTro2 | 25 | 3175 Mb | 0.454514 | 3000 | medium | xx.123 | xx.456 | tbd |
GuineaPig cavPor2 | 295514 | 3403 Mb | 0.479871 | 3000 | medium | xx.123 | xx.456 | tbd |
Horse equCab1 | 32 | 2056 Mb | 0.483517 | 3000 | medium | xx.123 | xx.456 | tbd |
TreeShrew tupBel1 | 150851 | 3660 Mb | 0.494934 | 3000 | medium | xx.123 | xx.456 | tbd |
Bushbaby otoGar1 | 120882 | 3420 Mb | 0.498957 | 3000 | medium | xx.123 | xx.456 | tbd |
Armadillo dasNov1 | 304391 | 3856 Mb | 0.517360 | 3000 | medium | xx.123 | xx.456 | tbd |
Rabbit oryCun1 | 215471 | 3464 Mb | 0.519779 | 3000 | medium | xx.123 | xx.456 | tbd |
Cat felCat3 | 217790 | 4045 Mb | 0.530610 | 3000 | medium | xx.123 | xx.456 | tbd |
Dog canFam2 | 39 | 2445 Mb | 0.533544 | 3000 | medium | xx.123 | xx.456 | tbd |
Elephant loxAfr1 | 233134 | 3707 Mb | 0.536627 | 3000 | medium | xx.123 | xx.456 | tbd |
Cow bosTau3 | 30 | 2434 Mb | 0.540852 | 3000 | medium | xx.123 | xx.456 | tbd |
Hedgehog eriEur1 | 379801 | 3367 Mb | 0.632457 | 3000 | medium | xx.123 | xx.456 | tbd |
Shrew sorAra1 | 262057 | 2936 Mb | 0.658734 | 3000 | medium | xx.123 | xx.456 | tbd |
Tenrec echTel1 | 325491 | 3823 Mb | 0.666303 | 3000 | medium | xx.123 | xx.456 | tbd |
Opossum monDom4 | 9 | 3431 Mb | 0.909852 | 5000 | loose | xx.123 | xx.456 | tbd |
Platypus ornAna1 | 201522 | 1996 Mb | 1.165888 | 5000 | loose | xx.123 | xx.456 | tbd |
Chicken galGal3 | 33 | 1032 Mb | 1.285399 | 5000 | loose | xx.123 | xx.456 | tbd |
Lizard anoCar1 | 7233 | 1781 Mb | 1.404225 | 5000 | loose | xx.123 | xx.456 | tbd |
X. tropicalis xenTro2 | 19759 | 1513 Mb | 1.726205 | 5000 | loose | xx.123 | xx.456 | tbd |
Stickleback gasAcu1 | 21 | 400 Mb | 2.012649 | 5000 | loose | xx.123 | xx.456 | tbd |
Zebrafish danRer4 | 25 | 1547 Mb | 2.027153 | 5000 | loose | xx.123 | xx.456 | tbd |
Tetraodon tetNig1 | 21 | 217 Mb | 2.051015 | 5000 | loose | xx.123 | xx.456 | tbd |
Fugu fr2 | 1 | 400 Mb | 2.086669 | 5000 | loose | xx.123 | xx.456 | tbd |
Medaka oryLat1 | 24 | 724 Mb | 2.200402 | 5000 | loose | xx.123 | xx.456 | tbd |
(*) chrom count does not include haplotypes, chr*_random, chrUn or chrM unless chrUn or scaffolds are the only sequences for that assembly.
The genome size has the same limitation as the chrom count, no randoms.
Tree distances are from the hg18 28-way measurements, with ponAbe1 and calJac1 manually inserted into the tree.
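As an illustration of how the minScore and linearGap columns feed into chaining, here is a minimal sketch that assembles an axtChain command line for one query assembly. The lookup table repeats a few rows from above; the file names and the use of `-psl` input are assumptions for the example, not the paths or settings actually used for this track.

```python
import shlex

# (minScore, linearGap) per query db, copied from a few rows of the table above.
CHAIN_PARAMS = {
    "rn4": (3000, "medium"),
    "hg18": (3000, "medium"),
    "monDom4": (5000, "loose"),
    "galGal3": (5000, "loose"),
}


def axt_chain_command(db, in_psl, target_2bit, query_2bit, out_chain):
    """Build an axtChain command line for the given query db.

    All file names are hypothetical placeholders supplied by the caller.
    """
    min_score, linear_gap = CHAIN_PARAMS[db]
    args = [
        "axtChain",
        "-psl",  # assumption: alignments are supplied in PSL format
        f"-minScore={min_score}",
        f"-linearGap={linear_gap}",
        in_psl,
        target_2bit,
        query_2bit,
        out_chain,
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(axt_chain_command("monDom4", "mm9.monDom4.psl", "mm9.2bit",
                            "monDom4.2bit", "mm9.monDom4.chain"))
```

The command is only printed, not executed, so the sketch can be checked without the kent tools installed.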
blastz alignment parameters details
query | abridged repeats | M | K | L | Q | Y |
---|---|---|---|---|---|---|
Rat rn4 | yes | 40M | 3K | 3K | default | 9400 |
Human hg18 | yes | 40M | 3K | 3K | default | 9400 |
Rhesus rheMac2 | no | 40M | 3K | 3K | default | 9400 |
Orangutan ponAbe1 | no | 50 | 3K | 3K | default | 9400 |
Marmoset calJac1 | no | 50 | 3K | 3K | default | 9400 |
Chimp panTro2 | yes | 40M | 3K | 3K | default | 9400 |
GuineaPig cavPor2 | no | 50 | 3K | 3K | default | 9400 |
Horse equCab1 | no | 40M | 3K | 3K | default | 9400 |
TreeShrew tupBel1 | no | 50 | 3K | 3K | default | 9400 |
Bushbaby otoGar1 | no | 50 | 3K | 3K | default | 9400 |
Armadillo dasNov1 | no | 50 | 3K | 3K | default | 9400 |
Rabbit oryCun1 | no | 50 | 3K | 3K | default | 9400 |
Cat felCat3 | no | 50 | 3K | 3K | default | 9400 |
Dog canFam2 | yes | 40M | 3K | 3K | default | 9400 |
Elephant loxAfr1 | no | 50 | 3K | 3K | default | 9400 |
Cow bosTau3 | no | 40M | 3K | 3K | default | 9400 |
Hedgehog eriEur1 | no | 50 | 3K | 3K | default | 9400 |
Shrew sorAra1 | no | 50 | 3K | 3K | default | 9400 |
Tenrec echTel1 | no | 50 | 3K | 3K | default | 9400 |
Opossum monDom4 | no | 50 | 2200 | 6000 | HoxD55 | 3400 |
Platypus ornAna1 | no | 50 | 2200 | 6000 | HoxD55 | 3400 |
Chicken galGal3 | yes | 40M | 2200 | 6000 | HoxD55 | 3400 |
Lizard anoCar1 | no | 50 | 2200 | 6000 | HoxD55 | 3400 |
X. tropicalis xenTro2 | no | 50 | 2200 | 6000 | HoxD55 | 3400 |
Stickleback gasAcu1 | no | 40M | 2200 | 6000 | HoxD55 | 3400 |
Zebrafish danRer4 | yes | 40M | 2200 | 6000 | HoxD55 | 3400 |
Tetraodon tetNig1 | no | 40M | 2200 | 6000 | HoxD55 | 3400 |
Fugu fr2 | no | 40M | 2200 | 6000 | HoxD55 | 3400 |
Medaka oryLat1 | no | 40M | 2200 | 6000 | HoxD55 | 3400 |
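A row of this table maps onto blastz options of the form `NAME=value`. The sketch below turns the Opossum monDom4 row into an option list; the matrix file name `HoxD55.q` is a hypothetical placeholder, and the helper itself is illustrative rather than part of the pipeline.

```python
def blastz_options(m, k, l, y, q_file=None):
    """Return blastz NAME=value options for one row of the table.

    q_file is the scoring matrix file for the Q option; None means the
    default matrix, in which case Q is omitted.
    """
    opts = [f"M={m}", f"K={k}", f"L={l}", f"Y={y}"]
    if q_file is not None:
        opts.append(f"Q={q_file}")
    return opts


if __name__ == "__main__":
    # Opossum monDom4 row: M=50, K=2200, L=6000, Q=HoxD55, Y=3400.
    # "HoxD55.q" is a hypothetical file name for the HoxD55 matrix.
    print(" ".join(blastz_options(50, 2200, 6000, 3400, "HoxD55.q")))
```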
default blastz parameters
m=80 v=0 B=2 C=0 E=30 G=0 H=0 K=3000 L=K M=0 O=400 P=1 R=0 T=1 W=8 X=10*(A-to-A match score) Y=O+300*E Z=1

From the blastz usage message (default values are given in parentheses):

option | default | meaning |
---|---|---|
m | 80M | bytes of space for trace-back information |
v | 0 | 0: quiet; 1: verbose progress reports to stderr |
B | 2 | 0: single strand; >0: both strands |
C | 0 | 0: no chaining; 1: just output chain; 2: chain and extend; 3: just output HSPs |
E | 30 | gap-extension penalty |
G | 0 | diagonal chaining penalty |
H | 0 | interpolate between alignments at threshold K = argument |
K | 3000 | threshold for MSPs |
L | K | threshold for gapped alignments |
M | 0 | mask any base in seq1 hit this many times; 0 = no dynamic masking |
O | 400 | gap-open penalty |
P | 1 | 0: entropy not used; 1: entropy used; >1: entropy with feedback |
Q | | load the scoring matrix from a file |
R | 0 | antidiagonal chaining penalty |
T | 1 | 0: W-bp words; 1: 12of19; 2: 12of19 without transitions; 3: 14of22; 4: 14of22 without transitions |
W | 8 | word size (unused unless T=0) |
X | 10*(A-to-A match score) | X-drop parameter for ungapped extension |
Y | O+300E | X-drop parameter for gapped extension |
Z | 1 | increment between successive words in sequence 1 |
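The default for Y is defined in terms of O and E, which explains the value 9400 in the Y column for the rows that use default scoring. A small worked check, using only the defaults quoted above (the function name is illustrative):

```python
def default_gapped_xdrop(gap_open=400, gap_extend=30):
    """Default blastz Y value: O + 300*E, per the usage message above."""
    return gap_open + 300 * gap_extend


if __name__ == "__main__":
    # With O=400 and E=30: 400 + 300*30 = 9400
    print(default_gapped_xdrop())
```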
[Tables: the default scoring matrix and the HoxD55 scoring matrix]
matrix parameters
The "medium" gap score matrix, tuned for the mouse-human distance, is:

tableSize 11 smallSize 111

position | 1 | 2 | 3 | 11 | 111 | 2111 | 12111 | 32111 | 72111 | 152111 | 252111 |
---|---|---|---|---|---|---|---|---|---|---|---|
qGap | 350 | 425 | 450 | 600 | 900 | 2900 | 22900 | 57900 | 117900 | 217900 | 317900 |
tGap | 350 | 425 | 450 | 600 | 900 | 2900 | 22900 | 57900 | 117900 | 217900 | 317900 |
bothGap | 750 | 825 | 850 | 1000 | 1300 | 3300 | 23300 | 58300 | 118300 | 218300 | 318300 |
The "loose" gap score matrix, tuned for the chicken-human distance, is:

tablesize 11 smallSize 111

position | 1 | 2 | 3 | 11 | 111 | 2111 | 12111 | 32111 | 72111 | 152111 | 252111 |
---|---|---|---|---|---|---|---|---|---|---|---|
qGap | 325 | 360 | 400 | 450 | 600 | 1100 | 3600 | 7600 | 15600 | 31600 | 56600 |
tGap | 325 | 360 | 400 | 450 | 600 | 1100 | 3600 | 7600 | 15600 | 31600 | 56600 |
bothGap | 625 | 660 | 700 | 750 | 900 | 1400 | 4000 | 8000 | 16000 | 32000 | 57000 |
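To get a feel for how the two gap matrices differ, the sketch below computes a gap cost for an arbitrary gap length by interpolating linearly between the listed positions. That axtChain treats the table exactly this way is an assumption suggested by the option name linearGap; the extrapolation beyond the last position is likewise an assumption. Only the qGap rows are used.

```python
from bisect import bisect_right

# Positions and qGap rows copied from the two matrices above.
POSITIONS = [1, 2, 3, 11, 111, 2111, 12111, 32111, 72111, 152111, 252111]
MEDIUM_QGAP = [350, 425, 450, 600, 900, 2900, 22900, 57900, 117900, 217900, 317900]
LOOSE_QGAP = [325, 360, 400, 450, 600, 1100, 3600, 7600, 15600, 31600, 56600]


def gap_cost(size, positions, costs):
    """Piecewise-linear gap cost for a gap of `size` bases.

    Assumption: costs are interpolated linearly between table positions,
    and sizes past the last position continue along the final segment.
    """
    if size <= positions[0]:
        return float(costs[0])
    if size >= positions[-1]:
        i = len(positions) - 1          # reuse the slope of the last segment
    else:
        i = bisect_right(positions, size)
    x0, x1 = positions[i - 1], positions[i]
    y0, y1 = costs[i - 1], costs[i]
    return y0 + (y1 - y0) * (size - x0) / (x1 - x0)


if __name__ == "__main__":
    for size in (1, 61, 2111, 50000):
        print(size,
              gap_cost(size, POSITIONS, MEDIUM_QGAP),
              gap_cost(size, POSITIONS, LOOSE_QGAP))
```

Under this reading, long gaps are far cheaper with the loose matrix, which is consistent with its use for the more distant species.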
the tree diagram
(((((((( (((Mouse_mm9:0.076274,Rat_rn4:0.084383):0.200607, GuineaPig_cavPor2:0.202990):0.034350, Rabbit_oryCun1:0.208548):0.014587, ((((((Human_hg18:0.005873,Chimp_panTro2:0.007668):0.013037, Orangutan_ponAbe1:0.02):0.013037,Rhesus_rheMac2:0.031973):0.0365, Marmoset_calJac1:0.07):0.0365,Bushbaby_otoGar1:0.151185):0.015682, TreeShrew_tupBel1:0.162844):0.006272):0.019763, ((Shrew_sorAra1:0.248532,Hedgehog_eriEur1:0.222255):0.045693, (((Dog_canFam2:0.101137,Cat_felCat3:0.098203):0.048213, Horse_equCab1:0.099323):0.007287, Cow_bosTau3:0.163945):0.012398):0.018928):0.030081, (Armadillo_dasNov1:0.133274,(Elephant_loxAfr1:0.103030, Tenrec_echTel1:0.232706):0.049511):0.008424):0.213469, Opossum_monDom4:0.320721):0.088647, Platypus_ornAna1:0.488110):0.118797, (Chicken_galGal3:0.395136,Lizard_anoCar1:0.513962):0.093688):0.151358, Frog_xenTro2:0.778272):0.174596, (((Tetraodon_tetNig1:0.203933,Fugu_fr2:0.239587):0.203949, (Stickleback_gasAcu1:0.314162,Medaka_oryLat1:0.501915):0.055354):0.346008, Zebrafish_danRer4:0.730028):0.174596);
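The tree distance column in the first table is the sum of branch lengths on the path between mm9 and each species. The sketch below shows the calculation on a small subtree copied from the diagram above, using a minimal hand-written Newick reader. The parser and function names are illustrative and handle only the simple syntax used here.

```python
def parse_newick(text):
    """Parse Newick text into {node: (parent, branch_length)} plus the root name.

    Unnamed internal nodes receive synthetic names (_node1, _node2, ...).
    """
    text = "".join(text.split()).rstrip(";")
    parents = {}
    pos = 0
    counter = 0

    def read_node():
        nonlocal pos, counter
        children = []
        if text[pos] == "(":
            pos += 1                      # consume "("
            while True:
                children.append(read_node())
                if text[pos] == ",":
                    pos += 1
                    continue
                pos += 1                  # consume ")"
                break
        start = pos
        while pos < len(text) and text[pos] not in ":,()":
            pos += 1
        name = text[start:pos]
        if not name:
            counter += 1
            name = f"_node{counter}"
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        for child, child_length in children:
            parents[child] = (name, child_length)
        return name, length

    root, _ = read_node()
    return parents, root


def tree_distance(a, b, parents):
    """Sum of branch lengths on the path between leaves a and b."""
    up_a = {a: 0.0}
    node, dist = a, 0.0
    while node in parents:
        node, length = parents[node]
        dist += length
        up_a[node] = dist
    node, dist = b, 0.0
    while node not in up_a:
        node, length = parents[node]
        dist += length
    return up_a[node] + dist


if __name__ == "__main__":
    # Subtree copied from the tree diagram above.
    subtree = ("((Mouse_mm9:0.076274,Rat_rn4:0.084383):0.200607,"
               "GuineaPig_cavPor2:0.202990);")
    parents, _ = parse_newick(subtree)
    for other in ("Rat_rn4", "GuineaPig_cavPor2"):
        print(other, round(tree_distance("Mouse_mm9", other, parents), 6))
```

For this subtree the results agree with the rat and guinea pig rows of the first table (0.160657 and 0.479871).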