Pegasoferae?

From genomewiki
Revision as of 15:00, 4 May 2008 by Tomemerald

Can rare genomic events establish Pegasoferae?

Pegasoferae is a novel proposal for the phylogenetic ordering within Laurasiatheres, grouping bats, perissodactyls, and carnivores to the exclusion of the other hoofed mammalian group, the artiodactyls. Bats have previously been placed at many positions in the mammalian tree, notably in Archonta (with primates). While that particular idea is clearly refuted by many lines of evidence, the proper placement of bats remains under discussion.

Pegasoferae.png

Rare genomic events may be more useful for this than maximum likelihood because the orders of Laurasiatheres may have diverged relatively rapidly. Retroposon insertions, however, are numerous enough per million years that they may be able to resolve branching at these tight nodes. They nonetheless suffer from homoplasy in two ways: separate insertion events from a given parental element can look very similar, and deletions over time (there is no selection for retention) can erase an insertion, so that the lineage is confused with lineages that never had it.

Qualifying retroposons need to be situated between two well-conserved flanking markers because orthology is otherwise difficult to establish decisively in intergenic regions. These markers are ideally no more than 1500 bp apart, to allow tiling of traces for species without assemblies (e.g. vicugna, pig, dolphin, macrobat in Laurasiatheres) and spanning PCR runs. Higher sampling density greatly enhances the ability to correctly infer the sequence of events.
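The 1500 bp spacing rule can be checked mechanically. The following is a minimal sketch, assuming the two flanking markers are supplied as exact substrings of the sequence (real markers would be located by alignment rather than exact match). The helper names `marker_span` and `qualifies` and the toy sequence are hypothetical.

```python
def marker_span(seq, left_marker, right_marker):
    """Distance in bp from the start of left_marker to the end of
    right_marker, or None if either marker is missing or out of order."""
    seq = seq.upper()
    left_marker = left_marker.upper()
    right_marker = right_marker.upper()
    left = seq.find(left_marker)
    if left < 0:
        return None
    # Look for the right marker only downstream of the left marker.
    right = seq.find(right_marker, left + len(left_marker))
    if right < 0:
        return None
    return right + len(right_marker) - left


def qualifies(seq, left_marker, right_marker, max_span=1500):
    """True when both markers are found and lie within max_span bp."""
    span = marker_span(seq, left_marker, right_marker)
    return span is not None and span <= max_span


# Toy example: two 6 bp markers separated by 40 bp of lowercase filler.
toy = "AAGGTC" + "acgt" * 10 + "TTCCAG"
print(marker_span(toy, "AAGGTC", "TTCCAG"))   # 52
print(qualifies(toy, "AAGGTC", "TTCCAG"))     # True
```

The same check applies whether the sequence comes from an assembly or from tiled traces, since only the distance between the markers matters.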

Short indels in coding exons can also be phylogenetically informative. If the exon is otherwise quite conserved, the risk of homoplasy (recurrent events at the same or an indistinguishably similar position) is fairly low. These events are inherently rare, first because conserved regions of a protein may not tolerate indels structurally (i.e. the protein is inactivated), and second because the window of relevance for a given tree topology issue may be only a small fraction of elapsed evolutionary time (e.g. a 1 million year stem on an 85 myr branch, a little over 1%).
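Candidate coding indels can be listed from an existing pairwise alignment. This is a sketch only, assuming the two nucleotide sequences are already aligned with `-` as the gap character. The function name `find_indels` is hypothetical, and the in-frame test (length divisible by 3) is just a first filter, not a judgement of phylogenetic value.

```python
def find_indels(ref_aln, qry_aln):
    """List gap runs in a pairwise nucleotide alignment.

    Returns (kind, column, length, in_frame) tuples. kind is 'deletion'
    when the query carries the gap and 'insertion' when the reference
    does; column is the 0-based alignment column where the run starts.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    events = []
    i, n = 0, len(ref_aln)
    while i < n:
        if qry_aln[i] == "-" and ref_aln[i] != "-":
            kind, gapped, other = "deletion", qry_aln, ref_aln
        elif ref_aln[i] == "-" and qry_aln[i] != "-":
            kind, gapped, other = "insertion", ref_aln, qry_aln
        else:
            i += 1
            continue
        start = i
        # Extend over the whole run of gap columns of this kind.
        while i < n and gapped[i] == "-" and other[i] != "-":
            i += 1
        length = i - start
        events.append((kind, start, length, length % 3 == 0))
    return events


print(find_indels("ATGGCCAAATTT", "ATG---AAATTT"))
# [('deletion', 3, 3, True)]
```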

Coding indels can exhibit the usual problems of lineage sorting: two alleles co-existing at the time of speciation that resolve differently in descendant lineages. Insertions, while a third as common as deletions and so less likely to have arisen multiple times, are more subject to subsequent confusing reversion; deletions are less likely to revert to ancestral length for lack of a genetic mechanism. Indels from repetitive regions or in DNA of anomalous composition are wholly unsuitable for taxonomic purposes.

Analysis of L1MA9 retroposon INT189

The phylogenetic distribution of the L1MA9 retroposon INT189 has been taken as evidence for bats being the immediate outgroup of horse + dog. That interpretation can be revisited using newly available genomes. Only two sequences representing perissodactyl and carnivore are at GenBank, however, as the cat assembly has a gap in the critical region. Other new data from 3 bats, 4 cetartiodactyls, and 2 shrew/hedgehog species confirm the lack of L1MA9 near the distal exon.

The trouble is that a second L1MA9 element lies upstream of the MER58A middle marker. It is lacking in both carnivores. Evidently it was deleted in stem carnivores; otherwise it would provide evidence for carnivores being the outgroup to cows + bats + horse. In short, this single intron provides 'support' for two contradictory topologies.
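The conflict between the two elements can be made explicit by counting the deletions each topology would require. The sketch below assumes each element inserted exactly once (a Dollo-style simplification chosen here for illustration, not a method stated on this page) and reads presence calls from the summary table further down, with dog standing in for carnivores and a single bat for bats. The function names and tree labels are hypothetical, and the internal arrangement of cow, bat, and horse in the second tree is arbitrary.

```python
def leaves(node):
    """Taxon names under a node of a nested-tuple tree."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def single_gain_losses(tree, present):
    """Deletions required if the element inserted exactly once.

    The insertion is placed on the stem of the smallest clade containing
    every taxon in `present`; each maximal subclade inside it that lacks
    the element then costs one deletion.
    """
    def smallest_clade(node):
        if not isinstance(node, str):
            for child in node:
                if present <= set(leaves(child)):
                    return smallest_clade(child)
        return node

    def count(node):
        if not present & set(leaves(node)):
            return 1          # whole subclade lacks it: one deletion
        if isinstance(node, str):
            return 0
        return count(node[0]) + count(node[1])

    return count(smallest_clade(tree))


distal_l1ma9 = {"dog", "horse"}            # INT189, near the distal exon
upstream_l1ma9 = {"horse", "bat", "cow"}   # element upstream of MER58A

bat_outside_horse_dog = ("cow", ("bat", ("horse", "dog")))
carnivore_outgroup = ("dog", ("cow", ("bat", "horse")))

for label, tree in [("bat + (horse + dog)", bat_outside_horse_dog),
                    ("carnivore outgroup", carnivore_outgroup)]:
    print(label,
          single_gain_losses(tree, distal_l1ma9),
          single_gain_losses(tree, upstream_l1ma9))
```

Under these assumptions the distal element needs no deletion on the first tree and two on the second, while the upstream element needs one on the first and none on the second, so each element is deletion-free only on a different topology.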

The sizes of many bat genomes have been experimentally determined: the 30-genus average of 2.6 Gbp is about 500,000,000 bp less than human. Since bats in essence have the same 20,000 coding genes as other mammals, that discrepancy has to arise from less intronic and intergenic DNA. Possibly bats had fewer active retroposing elements. Far more likely, they have had an average number and the discrepancy arises from a rate of deletion that outpaces insertion.
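The size comparison works out as follows. This is plain arithmetic on the two figures quoted above; the human genome size is only implied by them, not measured here.

```python
BAT_GENOME_BP = 2.6e9        # 30-genus average quoted above
DEFICIT_BP = 500_000_000     # shortfall relative to human, quoted above

implied_human_bp = BAT_GENOME_BP + DEFICIT_BP
deficit_fraction = DEFICIT_BP / implied_human_bp

print(f"implied human genome: {implied_human_bp / 1e9:.1f} Gbp")   # 3.1 Gbp
print(f"bat genomes are about {deficit_fraction:.0%} smaller")     # 16%
```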

Thus for taxonomically informative (ancestral laurasiathere) retroposons, many millions of deletion events have occurred. Since the L1MA9 elements here are only 100 bp or so, it would come as no surprise if a high percentage of the older relevant ones have experienced partial (or full) deletions, making them unrecognizable with RepeatMasker.

Thus the presence of a retroposon at a given orthologous position in bat can be informative, but its absence is much less so. INT189 is an absence. That one event is insufficient anyway to establish branching order, so the bat/horse/carnivore tree topology remains unresolved. If horse is the outgroup to carnivore + bat, and cow the outgroup to all of these, then hooves are parsimoniously ancestral (rather than arising twice by convergent evolution) and bat and carnivore lost them (a bit unreasonable, as dog and bats retain the ancestral 5 digits).


Summary of the phylogenetic distribution of the L1MA9 retroposon INT189:

Sequence                                      upstream L1MA9        MER58 middle marker  L1MA9 near distal exon
>PGM2_canFam Canis familiaris (dog)           absent                -MER58 182-265 23%   -L1MA9 6069-6302 27%
>PGM2_felCat Felis catus (cat) genomic del    absent                -MER58               no data
>PGM2_equCab Equus caballus (horse)           -L1MA9 6172-6264 26%  -MER58A 1-145 23%    -L1MA9 6050-6302 23%
>PGM2_myoLuc Myotis lucifugus (microbat)      -L1MA9 6174-6264 20%  -MER58A 38-157 26%
>PGM2_pteVam Pteropus vampyrus (macrobat)     -L1MA9 6161-6291 25%  -MER58A 35-145 29%
>PGM2_pipAbr Pipistrellus abramus (microbat)  -L1MA9 6180-6301 25%  -MER58A 38-157 24%
>PGM2_bosTau Bos taurus (cow)                 -L1MA9 6155-6263 28%  -MER58A 7-157 21%
>PGM2_turTru Tursiops truncatus (dolphin)     -L1MA9 6155-6265 29%  -MER58A 37-148 21%
>PGM2_susScr Sus scrofa (pig) cdna + tiled    -L1MA9 6159-6264 27%  -MER58 212-271 28%
>PGM2_vicVic Vicugna vicugna (vicugna) tiled  -L1MA9 6162-6310 24%  -MER58A 35-157 20%
>PGM2_ateAlb Atelerix albiventris (hedgehog)  ...                   ...
>PGM2_sorAra Sorex araneus (shrew)            ...                   ...

>PGM2_canFam Canis familiaris (dog) -MER58 182-265 23% -L1MA9 6069-6302 27% VISAELASFLATKNLSLSQQLKAIYVE YGYHITKASYFICHDQGTIKKLFENLRNY 
GTCATCAGCGCCGAGTTGGCTAGCTTTCTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAAGGCCATTTATGTTGAGTACGTTTCTATTAACTCTG 
TTTAATTGAAATAATACTTTTTAAAAGTTTTATTATGTTTTTATGTGTGACACTAATATTCTAACCCTCTTACTTTGGGTGAGGGTTCTTCTGAAAACTA 
AAGGATCACTTTTTCTTTTAATGCTTAACTATTCAATACTAATTATCACTTATGACTGTGTTAATCCTTAACAAATGAGAACATCAGTTGCAGAAATAGC 
TAATTGAGGAGGGTGATTCCCTGATGTCAGAAAGGACAAAGGTTTTCGTGAAACATCTATTACGTGTTTAGAgccactagtcaagtctgcctttgtagtg 
caaaagcagctgatggcaagacgtacaggaatgggtgtggtgtggctgcaatgaaaTGAAACTTTCACCTCCCAAGATAGGCCGAAGGCCAGGCAGCAGT 
TTGGCAATACCTGGGGTCAATAGTTATACCTCTTTTTTATGCTAAATTATTCCTTTGAAGCTAGTCATTGTTATCGTTTCATTTAGCTTAAAATATACTG 
ATTGCTACATGTTCTGTATACACCACGTGAGATTATTTGTTCCTCATTTTGCATATTTGTACTTTTtttattgagatgtaattgacattaatgtcaggta 
taataacataatgattcgatatttatatattattacaaagtgatcaccatagtaagtcgagttaacatccacaccacatataatcacaaatattcattct 
tgtgatgatagcttttatgatctgtggtcttagcaactttcaaatatacagtacaatactagtagatacagtcaccaagttatatatATATAATTTTATT 
TCTTTTGATAGATATGGCTACCATATTACCAAAGCTTCCTATTTTATCTGCCATGATCAAGGCACCATTAAAAAATTGTTTGAAAACCTTAGAAACTAC 

>PGM2_felCat Felis catus (cat) genomic del incomplete coverage -MER58 ASFLATKNLsLSQQLKAIYGE YGYRITKASYFICHDQGTIKQLFENLRNY 
GCTAGCTTTCTAGCAACCAAGAATTTGTTTGTCTCAGCAGCTAAAGGCCATCTACGGCGAGTAAGTGTCTTCTAACCTGGTAAAGAAGTAATAG 
TGTTAAATATTTTCTTATGGTTCTACGTGTGAGATATTAATATTCTTTCTAATGCTCTTTGGTTGTGAATTCTATTTCTTTTTCTTTTTTTAATGTTTAT 
TTATTTTTGAGAGAGAGAGAGAGAGAGATGGAGTATGAGCAGGGGAGGGGCAGAGAGAGAGGGAGATACAGAATCCAAAGCAGGCTCCAGGCTCTGAGCT 
GTCAGCACAGAGCTCCACACGGGGCTTAAACTCACAAACCATGAGATCATGACCTGAGCTGAAGTCAGACACTCAACCGTTTGAGCCACCCACGTGCCCC 
ATGAATTCTATTTCTTATGAAACTAAATAATCATCTTTTCTTTTGATACTTAACCATGTAATGGTAATTATCATTCACGATTGCACGAATCCTTAACAAA 
TGAGGGCATCAGTTGCAGAAATAGCTAATTGAAGAATGTGATTTTAAGTGTGTGATGTCAAAAAAGATTAAAGGTGTTCATGAAATCTCTATTAAGTTTT 
TAGAGCAATGACCCAGGTCTGCCTTTATAAAGTGCAAAAGCAGCCCGTGGCAACACGTTGCAGTAAGACTCTTACTTACAAATACAGGCTAAAGGCCAGG 
CAGCAGTTTGGCAATCCCCAGGGTTAATTGTTGTACCTCTTTTTTATGCTAAATTATTCCTTTGAAGGTACTCATGGCTATTTGTTTCATTTGGTTTAAA 
ATATACTGGTTGACAAATGTACACTGTGTGGAATTATGTGTTCCTCATTTTGCATATTTGTATTTCCTTAACTGAGATATAACTGACATTAGTTTCAGGT 
ATGCGATACAGTGTTTCAATATCTGTATATATTACAAAATGATCATCACAGTACATCTAGTAACAGTCGCACCACACTTAATACAAAAGT  
TCCaTATGGCTACCGTATTACCAAAGCTTCATATTTTATTTGCCATGATCAAGGCACCATTAAACAATTATTTGAAAACCTTAGAAACTAT 

>PGM2_equCab Equus caballus (horse) -L1MA9 6172-6264 24% -MER58A 1-145 23% -L1MA9 6172-6264 26%-L1MA9 6050-6302 23% VISAELASFLATKNLSLSQQLKAIYVE YGYHITKASYFICYDQDTIKKLFENLRNY 
GTCATAAGCGCAGAGTTGGCTAGCTTTCTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAAGGCCATCTATGTTGAGTAAGTTTCTATTAACTCTC 
TTTAACTGAGGTAATTTTTTTTATTAGtttcaaatgtacaacataatgattcaatgtatgtatatattttgaaatgatcgccacaataagtctggctaac 
ctgtatcaccgacatagGGCTCTTTTTAAATGTTTTATGTTCTTTTGCATGAAACAGTAATATTCTTTTGAATGCTCTTACTTTAGCTATGAATTGTTCC 
TTATGAAAACTAAGTAAGAGATCACTTTTTCCTTTCGATACTTAACCACTTAGTAGTATTACCCTTTGTGATTGCATTAATCCTTAACAAATGAGAACAT 
TAGTCACGGAAATGGTGAAGTGAAGAATGTAATTTTCAGTGTCTGAGGTCAAAAAAGATTAAATGTGTTCATGAAACATCTATTTAGTCTTTAACTTCat 
tgctcagctctgcctttgtagtgcagaaacagccggggacaatacataatgtaatgggtgtggggtggctgtgttccagtagatcttttacttaaaaata 
caggccgaaggccaggcagcagtttggcaatccctgGGGGAGATTATTGTACCTTTTTTTAATGTTAAATTATCCCTTTGAAGTTAGTCATGGTTATTTC 
ATTTAGTTTAGAATATAATGGTTAATACATAGTGTATGTACACCATGTGGAATTATTTTTTCCCATTTTGCATTTCTTCTtttgttgagatataattaac 
atagaacattatattagcttcaggtgtacagtgtaattatttgataattgtatatattgcagattgatcaccaccataagactagttaacatccatcacc 
acacatagttataaatttttttcttgtgatgagaacttttaaggtctattctcttagcaaccttcaaatatacaatacagtattattaattctagtcacc 
gtgctgtgtattatatcctcatgacccattTTATTATTTTGTTTCGAAAGGTATGGCTACCATATTACCAAAGCTTCATATTTTATCTGCTATGATCAAG 
ACACCATTAAAAAATTGTTTGAAAACCTTAGAAACTAC 

>PGM2_myoLuc Myotis lucifugus (microbat) -L1MA9 6174-6264 20% -MER58A 38-157 26% VISAELASFLATKNLSLSQQLKAIYVE YGYHITKASYFICHDQGTIKKLFENLRNY AAPE01636299 
GTCATAAGCGCAGAGCTGGCTAGCTTTCTTGCAACCAAAAATTTGTCTCTGTCTCAGCAGCTAAAGGCCATCTACGTTGAGTAAGTTTCTATTGATTATTG 
AATTGAAGTAATATAGTTTGATTAGTTTCATGTGTACAATGTAATGATTCAATATGTGTATATATTGGGACATGGTTGCCACAATAAGTCGTTAACATAC 
ATTACCACATGTGGCAATGTATTTTAAGTGTATTATGTTCTTGCGTATGAGATGCTAATGTTCTTTCCAAAGCTCTGACTTTAGTTATGAATTCTATTTC 
TTAAGAAAACGAAACGAGATTATCTTTTCCTTTTGATACTTACCATTTGTGATAGCACTAATCTTTACTAAATGAGAACATGACACAGAATGTGATTTTA 
AGTGTCTGATGCCAAAAAAGATTAAATGTGTTCATGAAACGTCTATTTAGTCTTTATAGCAGTTTCTCAACTCTTGCCTTTCTGATGCAAAAGGAGCCAG 
ACACAGTACATAATGCAATGGGCGTGGTATGGCTGTTCCAGTATAATTTTACTTACAAGTATAGGCTGAAGGCAAGGTAGCAGCTTGGTGAGCCCTCGGG 
TAAATTGTTGCACCTCCTTTTAATGCTAAATGATTGCTTTGAAGCTAGTCATGGTCATTTGTCTCATTACGTATTTGAGAATGTGCTGGTTGGTGCCCGT 
TCTGTATATGCTATGCATAATTATTTGTTCCTCATTTTGCATGTATTTGTATTTGTTTTGATAGGTATGGCTACCATATTACCAAAGCTTCATATTTTAT 
CTGCCATGATCAAGGAACCATTAAGAAATTATTTGAGAACCTTAGAAACTAT 

>PGM2_pteVam Pteropus vampyrus (macrobat) -L1MA9 6161-6291 25% -MER58A 35-145 29% VISAELASFLATKNLSLSQQLKAIYVE YGYHITKASYFICHDQGTIKKLFENLRNY 
GTCATAAGCGCGGAGTTGGCTAGCTTTTTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAAGGCCATCTATGTTGAGTAAGTTTCTATTGACTCTA 
CATAACTGAAATAATATTTTTTATTAGTTTCAGGTGTACAGCACAGTGATTCGGTATATGTATATATTATGACATGATTGCTATAAGTCTATTGCATGCA 
TCAGTCTATTACTACATGCATCACCACACGTAGTAATATTTTTAAATGTATTATGTACTTGTGCACAAGATACTAATATTCTTTCCAATGCTCTTACTTT 
AGTTATGAATTCTATTTCTTATAAAAACCAAATAAGAAATTACCTTTTCCCTTTGATACTTAGCCATTTAATAGTAATTACCATTTGTGATGACAGTAAC 
CTTTACCAGATGAGACATTAGCCACAGAAACAGCTAAAGAATATGATTTTAAGTGTCCGATGTCAAAAGATTAAATGTGTTTATGAAACATCCTATTTAG 
TCTTTTTATAGCATTATTCAGCTGTGCCTTTGTAGTACAAAAGCAGCCAGACCCGATGCATATGTAATGGGTGCAGCGTGGCTACATTTCTGTAAAATTT 
TTACTTACAAATATAGGCTGAAGGCCAGGCAACAGTTTGGTGATCCCCTGAGTAAATTGTTATACTTCTTTCTTAATGCTGAACTATTCCTTTGAAGCTA 
GTCATGGTCATTTGTTTCATTAAGCGTTTTAGAATGTACTGGTTGATACATGTTCTGTGTACACTATGCAGAATGATTTGTTCCTTATTTTGCATGTGTT 
TGTATTTATTTTGATAGGTATGGCTACCATATTACCAAAGCTTCATATTTCATCTGCCATGATCAAGGCACCATCAAAAAATTATTTGAAAACCTTAGAAACTAT 

>PGM2_pipAbr Pipistrellus abramus (microbat) -L1MA9 6180-6301 25% -MER58A 38-157 24% AB258957 AIYVE YGYHITKASYFICHDQGTIKKLFENLRNY 
GGCCATCTATGTCGAGTAAGTTTCTATTGATTATTGAATTAAAGTAATATAATTTGATTAGATTCATGCGTACAGTGTAATGATTCAATACATGTATATA 
ATGGGACATGGTTGCCACAATAAGTCGTTAACATACATCACCACCTGTGGCAATATATTTTAGGTGTATTATGTTCTTTAGTATGAGACACTAGTACTAA 
TATTCTTTCCAAGGCTCTGACTTTAGTTATGAATTCTATTTCTTAAGAAAATGAAACGAGATTATCTTTTCCTTTGGATACTTACCATTTGTGATTGCAC 
TAATCTTGATTAAACGAGAACATTACACAGAATGTGATTTTAAGTGTCTGATGCCAAAAAAGATTACATGTGTTCATGAAACATCTATTTAGTCTTTATA 
GCAATTTCTCAACTCTTGCCTTTCTGGTGCAAAAGCAGCCTGACACAATACATAATGTAATCGGCGAGGGATGGCTGGTCCAATAAAACTGTACTTACCA 
ATGTAGGCTGAAGGCAAGGTAGCAGCGTGGTGTTCCCTCAGAATTATTTGTTCCTCATTTTGCACGTATTATTTGTTTTGATAGGTATGGCTACCATATT 
ACCAAAGCTTCATATTTTATCTGCCATGATCAAGGCACCATTAAGAAATTATTTGAAAACCTAAGAAACTACGATGGGAAGAATAATTAT 

>PGM2_bosTau Bos taurus (cow) -L1MA9 6155-6263 28% -MER58A 7-157 21% VITAELASFLATKNLSLSQQLKAIYVE  YGYHITRASYFICHDQETIKQLFENLRNY AB258958 [L1_Carn7] 
GTCATAACTGCAGAGTTGGCCAGCTTTCTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAAAGCCATCTATGTTGAGTAAGTTTCTATTGACTATT 
TAATTGAAGTAATTTTTTTTTATCAGttcaggtatacaacacagtgattcagtgtatgtctatattgtgaaatgatcacagtggatacaattaacatgca 
tccccacacaggaatattttttaatgtTTTACTCTCTTCTTGTGCACCCGATACTCATATTCTTTCTGATGCTCTTGCTTTAGTTATGAATTCTATTTCG 
TATGAAAACTAAATAAGAGATCACCTTTTCCTTTTGCTACTTAAGCAGTTAATAGTAATTACCATTCATGATGACGTTAATCCTTAATAAATGAGAACGT 
TAGCTGCAGAAATGGCTAAGGGAAGAATGTGATTTTTTAAATGTCCAGTGTTGAAAAAGACTAAATGTGTTCATTAAACATCTATTTAgtctttgtagca 
attacttatttctgcctttctagtgcaaaagcaaccagacacaaggtaatgggcatgacgtggctgtattccaatgataaaacttttacttacaaacaga 
gactgagggccACACAGCAGGGCAGTGATTCCTGGTGTAGATTGTTGGACCTCTTTATTTAATGCTGAATTACTCCTTTGAAATTAGTCATGGTTGTTTG 
TTTTAGAATATACTGTTTGATAGATACATGTTCAGTGTACACTGTGCCCAATTATTTGTCCCTCATTTGCATGTAACCATGTTTGTATTGATAGGTATGG 
CTACCATATCACCAGAGCTTCGTATTTTATCTGCCATGATCAAGAAACTATTAAACAATTATTTGAAAACCTTAGAAACTAT 

>PGM2_turTru Tursiops truncatus (dolphin) -L1MA9 6155-6265 29% -MER58A 37-148 21% FISAEVGSFLAQNCLVSAAKAIYV YGYHITKASYFICHDQGTIKKLFENLRNY 
TTCATAAGTGCAGAGGTTGGCAGCTTTCTAGCACAGAATTGTCTTGTTTCAGCAGCTAAAGCCATCTATGTTGAGTAAGTTCTTCTATGACTGTTAAATG 
AGTAATGTTTTTTTTCATTTCAGTTGTGCAACACAATGATTCAATGTATATCTATTATTGTGAAATGATTGCAACAAATACAGTTTACATGTATCCCCAC 
ATGTAGTAATATTTTTTAATGTTTTACTCCGTTCTTATGCATGAGATACTAATATTCTTTCTGATGTCCTTACTTTGGCTATGAATTCTATTGCCTATAA 
AAACTAAATAAGGGATCACCTTTTCCTTTCGATATTTAACTACTTAATAGTAGTTACCCCTTCATGATGACATTGATTCTTAACAAATGAGAACATTAGT 
TGCAGAAATGGCTAAGGGAAGAATGTGATTTTTAAGTGTCCAATGTCAAAAAAGACACATGTGTTCACAAAACATGTTTAGCCTTTAAAGCAATTATTCA 
CCAGTGTCTTTGTAGTGCAAAAGCAGCCAGACACAATACATAAGGTAATGGGCATGGCATGGCTACGTTCCAATAGAGAAACTTTTACTTAGAAATACAG 
GCTGAGGGCCACAGAGCAGTTCAGCGATCCCTGGGGTAGATTGTTGGACCTCTTTTATAAAATTGGACCTCTTTTTTTTTTTTTTTTTTTTTGGCGGGGG 
GTACGTGGACCTCTCACTGTTGTGGCCTCTCCCGTTGCAGAGCACAGGCTCCAGACGCGCAGGCTCAGTGGCCATGGCTCGCGGGCCCAGCCGCTCCACG 
GCATGTGGGATCTTCCCAGACCGGGGCACGAACCCGTGTCCCCTGCGTCGGCAGGCGGACTCTCAACCACTGCGCCACCAGGGAAGCCCTGAACCTCTTT 
TTTAATGCTGAATTATTCCTTTGAAATTAGTCGCGGTTATTTGTTTTAGAATATACTGGTTGATACATGTTCAGTGTACACTGTGCAGAATTATTTGTTC 
CTCGTTTTGCATGTAATTGTGTTTGTATTGATAG GTATGGCTACCATATCACCAAGGCTTCGTATTTTATCTGCCACGATCAAGGCACTATTAAAAAATTATTTGAAAACCTTAGAAACTAC 

>PGM2_susScr Sus scrofa (pig) cdna + tiled -L1MA9 6159-6264 27% MER58 212-271 28% VISAELASFLATKNLSLSQQLNAIYVE YGYHVTKGTYFICHDQGNVKKLFENLRNY 
GTCATAAGCGCAGAGTTGGCCAGCTTTCTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAATGCCATCTATGTTGA 
GTAAGTTTCTATTGACTGCATTTAATTGAAGTAATTTTTTTAATCAGTTTCGGGTGTACGACATAATGATTCAGTGTATATGTATTGTGAAATGATCCCAA 
TGAGTACAGCTAACATGCATCCCACACGTAATAATATTTTTTTTTCTTTCTTTTTCTTTTTTTAGGGCTACTCCTGTGGCATATGGAAGTTCCCAGGCTA 
AGGGTCGAATAGGATCCATAGCCGCTAGCCTAAGCCACAGCCACAGCAGCACGGAATTCGAGCCACATCTTTGACCTCCGCTACAGCTCATGGCAATGCC 
AGATCCTTAACCCACTGAGCAAAGCCAGGGATCAAACCCAACATCTCATGGATCCTAGTCGGGTTTGTTAACCCTTGAGCTGCAAAGGGAACTCCCATAA 
TAATCCTTTTAAATGTTTTACTCTGTTCTGATGCATGAGACTAATATTCTTTCTGATACTCTCATTTTAGCTATAAAGTTGATTTCTTATGAAAACTCAG 
TAAGAGATCACTCTTTCCTTTTGATATTTAACCCCTTAATAGTAATTACCATTCATGATGACATTAATCCATAACAGATGAGAACAGTAGTTGCAGAAATGGGTAAT 
GGAAGAATGTGATTTCAACTAAATGTCCAATATCAAAAAAGACTAAGTGTGTTCATGAAACATCTATTTACTATTTATAGCAGTTATTCAGCTCTGCCTT 
TGTAGTGGTAAAGTGGTCAGACACAATACTTAAGGTAAAAGTTTCCAGTTATGAAACTTTTACTTACAAATATGGGCTGAGACTGGGCAATAGTTCAGTG 
ATTCCTTGGGGTAGATTCTTGGACCTCTTTTTTTAAATGTTGGACCTCTTTTTTAATGCTAAGTTATTCCTTTGAAATTAGTCTTGCTTATTTGTGTCAT 
TTGTATTGAAGTATACTGGTGAATTACATGTTCTGTGTATGCTGTGTGGAATTATTTGTTCCTCATTTTGCATGTAATTGTATTTGTATTGATAGG 
TATGGCTACCATGTTACCAAAGGTACATATTTTATCTGCCATGATCAAGGCAATGTTAAAAAATTATTTGAAAACCTTAGAAACTACGATGGGAAGAATAATTAT 

>PGM2_vicVic Vicugna vicugna (vicugna) tiled -L1MA9 6162-6310 24% -MER58A 35-157 20% VITAELASFLATKNLSLSQQLKAIYVE YGYHITKASYFICHDQGTVKKLFENLRNY 
GTCATAACTGCAGAGTTGGCTAGCTTTCTAGCAACCAAGAATTTGTCTTTGTCTCAGCAGCTAAAGGCCATTTACGTTGAGTAAGTTTCTATTAATGCTG 
TTTAATTGGAGTAAGCTTTTTATCCATTTCAGATGTACCACATTATGACTCAGTATACGTCTACATTGTGAAATGATCACAATTAGTAAAGTTAACGTGT 
ATCATCACACATAGTAATATTTTATAATGCTTTACTCTGTTCTTGTGCATGGGACACTAATGTTCTTTCTGATGCTCTTTCTTTAGTTATGAATTCTGTT 
TCTTATGAGAACTAGATAAGAGATCATCTTTTCCTTTTGATACCTAATCACTTAATAGTAATTACCATTCATGATGACATTAATCCTTACAAATGAGAAA 
ATTAGTTGCAGAAATGGCTAATGGAAGAATGCGATTTTAAGTGTCTAATGTCAAAAAAGACTAAATGTGTTCATGAAACATCTGTTTAGTCTTTATAGCA 
ATTACTCAACTCTACCTTTGTAGTGCAGAAGCAGCCAGACTCAACACATAAGGTAATGATGTGGCTGTTCCACTAATAAAACTTTTACTCAAAAACACTG 
GCTGAGGGCCAGGCAACAGTTCAGCAATCCCTGGGGTAGATAGTTGGACCTCTTTTTTTTAATTCTAAATTATTCCTTTGAAACTCATCATGGTTATTTG 
TGTCATTTATTTTAGAGTATACTGGTTGATGACATGTTCAGTGTACACTGTGCAGAATTCTTTGTTCCTTGTTTGCATGTAATTGTATTTGTATTGATAG 
GTATGGCTACCATATTACCAAAGCTTCATATTTCATCTGTCACGATCAAGGCACTGTTAAAAAATTATTTGAAAACCTTAGAAACTAC 

>PGM2_ateAlb Atelerix albiventris (African hedgehog) No repetitive sequences [VISAELASFLATKNLSLSQQLKAIYVE] YGYHITKASYFICHDQVTIKKLFENLRNY AB258952  
TTAATGTGTTTGTTAAACATCTATTTATTCTTTACAGCTATCACTCAACTCTGACTTTGTAATACAAAATAGCCACACTTAGTCCATGAGGTCATGGACC 
TGATGTGACTGCCCCAATAAAACTTATACCTACAGATATAATCAAAATAAGATAAAATGGATGCTATCAATACTTAAGAATATTGGCTAAGTAAAAACAA 
AGAACTAGTTTAGAAACCTACAGGGGGTTATTGTTCTTCCTTTTTTTCATGCTATATTATTCCTTTGAAGCCAGTCATAGTTATTAGTCTCATTAACTTT 
ATAATATACTGGTTATATATGTTCTGTGTATACTAGGTAAAGTTATTTCTACCTAATTTTGCATACGTTTTATTTGTTTGCTAGGTATGGCTACCATATA 
ACCAAAGCTTCATATTTTATCTGCCATGATCAAGTCACCATTAAAAAATTATTTGAAAACCTTAGAAATTAT 

>PGM2_sorAra Sorex araneus (shrew) +SOR1_SINE +SOR1_SINE VISAELASFLATRNLSLSQQLKAIYVE YGYHITKASYFICHDQSIIKKLFENLRNY AALT01183695 AALT01470682 
GTCATTAGCGCGGAGCTGGCCAGCTTTCTCGCCACCAGGAACCTGAGTTTGTCCCAGCAGCTAAAGGCCATCTATGTGGAGTAAGTTCCCTACTGACTGT 
GCTTAATCAAAATAACCCGTATTTTTGGATCCATTTTTAACGGTTTATTATGCTCTTGTGTGTGTGATACTGATAGTCTCTCTAATGTCCTCACTTCAGT 
TATAAATCCTATTTCTTAAAAACATGAAGTTAAGGGGCTGAACCGATAGCACAGCGATAGCAAGGTTTGCCTTGCATGTGACCGATCTGGGTTCGATTCC 
CAGCATCCCATTTGGTCCCCTGAGCACTGCCAGGAGTAATTCCTGAGTGCATGAGTCAGGAGTAATCCCTGTGCATCGCTGGGTGTCACCAAAAAAAAAA 
AAACCATGAAGTTAAAAAATCACCTTTTGGGGGGGGTCGGAGAGATAGTGCAGCAGTGGGTAGGGAGCTTGAGTCATTCATGGGTCACCCAGCTTCAATC 
CCTGGCACGCCCTGTGGCCTCCCAAGTCCCGCCAGGAGTGATCCCTGAGCTCAGAACCATAAGCAAGCCCTGAGCACCATTGGTGTGGCCCCAGAATAAA 
TAAATTAGAGATAGAAATCACTTTTTCATGCTTAACTACTTAATAATACTTATGATTGCCATACTCCCTAATGAATGAGATCTAATCGCAGAACTAGTTA 
TTAGTTAAAAGTGTGAATTTAAATGTGTAGTGTCAAAAAAATG ACCAAGATAACCAGCTTATAACTTAGACTTATAAATGACTGCTTATCAATATATCT 
AAGGCCAGACAGCAGTTTACTTTAGCAGTTCCTAGGATAGGTTATTGTTCCTCTTTTTTTTTTCCCTTTATTTTTTGCCCCCTAGAGATCTACCTTTTAA 
AAAAATATTTTTTTAATTGAATCACAATGAGATACACAGTTACAAATTGTTTCTGATTTGATTTCAGTCAGACAATGTTCAAATATCTGTCCCTTTCACA 
GTGTACATTTCCCACCACCAGTGTCCCCACTTTCCTTCCTTGTTCCTCTTTTTTCATGCTCAGTGATTCCTTTGAAGCGAATTATGGTCATTCGCTTCAC 
TTGCTTAAAAGCAAATGAATCAGCGGCCGATTGATGTCCTGTGTTCGAAAGACAGAATTCTTTGTTCCTTATTTTGCGTGTATTTGTATTGATAGGTATG 
GCTACCATATAACCAAAGCTTCGTATTTTATCTGTCACGATCAAAGCATCATTAAAAAGTTGTTTGAAAACCTTAGAAATTAT


Analysis of L1MA9 retroposon INT391

Extended validation of this L1MA9 insertion, which occurs in a short intron of the gene ACSL5, is feasible in May 2008 because of newly available genomes. The basic technique consists of first establishing the comparative genomics of the two flanking coding exons. These are needed to reliably probe contig assemblies and trace archives with tblastn and blastn respectively. A complete intron can often be obtained by tiling in toward the center from the two ends. It is imperative to avoid paralogous exons in doing so.
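Tiling from an exon toward the center of the intron amounts to repeatedly extending a seed with reads that overlap its growing end. The following is a toy sketch, assuming error-free reads on the same strand and exact suffix/prefix overlaps (real traces need alignment-based overlap detection and quality trimming). The function name `extend_right` and the `min_overlap` parameter are hypothetical; the test sequence is the first 39 bases of the dog record below.

```python
def extend_right(contig, reads, min_overlap=20):
    """Greedily extend contig to the right with reads whose prefix
    exactly matches the contig's current suffix."""
    remaining = list(reads)
    extended = True
    while extended:
        extended = False
        for read in remaining:
            # Longest overlap that still leaves at least one new base.
            max_k = min(len(read) - 1, len(contig))
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    contig += read[k:]
                    remaining.remove(read)
                    extended = True
                    break
            if extended:
                break   # restart the scan with the longer contig
    return contig


target = "GGTGACCCTAAAGGAGCCATGCTGACCCATCAAAATATT"
seed = target[:12]                      # stands in for the 5' exon probe
reads = [target[18:], target[6:24]]     # unordered overlapping reads
print(extend_right(seed, reads, min_overlap=5) == target)   # True
```

Tiling from the other exon is the mirror image: reverse the seed and the reads, extend to the right, and reverse the result. The two extensions meet when one contig contains the other's seed.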

Pegasoferae.png

In the case of INT391, dog, cat, horse, microbat, and macrobat have the retroposon, judging by location, fragment coordinates relative to the full-length retroposon, and strand orientation relative to the coding exons (minus strand here). Cow, dolphin, pig, vicugna, and shrew do not have it. Since cetartiodactyl L1MA9s might have been interrupted by a later retroposon, breaking the L1MA9 into two unrecognizable shorter pieces, it is necessary to remove other repeats and re-run RepeatMasker.
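Removing other repeats before re-masking can be sketched as cutting annotated intervals out of the intron so that sequence on either side of a younger insertion becomes contiguous again. The example below assumes the intervals to cut are already known (for instance from a first RepeatMasker pass) and are given as 0-based half-open coordinates; `excise` is a hypothetical helper and the re-run itself is not shown.

```python
def excise(seq, intervals):
    """Remove 0-based half-open (start, end) intervals from seq."""
    kept = []
    pos = 0
    for start, end in sorted(intervals):
        if start < pos:
            start = pos          # tolerate overlapping intervals
        if end <= start:
            continue             # interval already fully removed
        kept.append(seq[pos:start])
        pos = end
    kept.append(seq[pos:])
    return "".join(kept)


# Toy case: an older element split in two by a 6 bp younger insertion.
older = "AAAACCCCGGGG"
interrupted = older[:6] + "tttttt" + older[6:]
print(excise(interrupted, [(6, 12)]) == older)   # True
```

The rejoined sequence is what would be passed back to RepeatMasker, which then has a chance of recognizing the two halves as a single element.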

Despite more intensive phylogenetic sampling, INT391 continues to support Pegasoferae as Nishihara et al. originally stated. It should be noted that the MER-class retroposon, while not at issue here, exhibits the type of homoplasy that makes retroposons dicey as tree topology markers. Introns are often susceptible to multiple insertions of similar retroposons as well as to complicated patterns of microdeletions that prevent their recognition even if they aren't fully deleted.
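The repeat annotations carried on the summary lines of this section (strand, repeat name, fragment coordinates, percentage) can be pulled into structured form for comparison across species. This is a sketch with a hypothetical `parse_header` helper; the meaning of the percentage is not stated on this page, so it is kept under the neutral key `pct`.

```python
import re

# strand, repeat name, start-end within the full-length element, percentage
ANNOT = re.compile(r"([+-])(\S+) (\d+)\s*-\s*(\d+) (\d+)%")


def parse_header(line):
    """Split a summary line into its record name and repeat annotations."""
    name = line.lstrip(">").split()[0]
    hits = [
        {"strand": s, "repeat": r, "start": int(a), "end": int(b), "pct": int(p)}
        for s, r, a, b, p in ANNOT.findall(line)
    ]
    return name, hits


lines = [
    ">INT391_ACSL5_Peg_felCat -MER91C 97-140 27% -L1MA9 6082-6301 22%",
    ">INT391_ACSL5_Peg_equCab -MER91B 62-128 26% -L1MA9 6082-6302 21%",
    ">INT391_ACSL5_Peg_turTru -MER91 261-311 22%  No L1MA9",
]
for line in lines:
    name, hits = parse_header(line)
    l1 = [(h["strand"], h["start"], h["end"]) for h in hits if h["repeat"] == "L1MA9"]
    print(name, l1)
```

Matching strand and near-identical fragment coordinates across species, as in the cat and horse lines, are what one would expect from a single shared insertion.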

Summary of the phylogenetic distribution of the L1MA9 retroposon INT391:

>INT391_ACSL5_Peg_canFam +MER91 85-140 23%   -L1MA9 6082-6298 24% span 762bp
>INT391_ACSL5_Peg_felCat -MER91C 97-140 27%  -L1MA9 6082-6301 22%
>INT391_ACSL5_Peg_equCab -MER91B 62-128 26%  -L1MA9 6082-6302 21%
>INT391_ACSL5_Peg_myoLuc no MER              -L1MA9 6079-6277 23%
>INT391_ACSL5_Peg_pteVam -MER91B 8-62 24%    -L1MA9 6060-6302 25%

>INT391_ACSL5_Peg_bosTau -MER91C 55-85 28%   No L1MA9
>INT391_ACSL5_Peg_turTru -MER91 261-311 22%  No L1MA9
>INT391_ACSL5_Peg_susScr -MER91 257-306 8%   No L1MA9
>INT391_ACSL5_Peg_vicVic -MER91 284-337 24%  No L1MA9
>INT391_ACSL5_Peg_sorAra no MER91            No L1MA9

Markup of exons and intronic retroposons of INT391 within ACSL5:
  blue: coding exons
  magenta: L1MA9 INT391
  red: MER91 retroposon

>INT391_ACSL5_Peg_canFam +MER91 85-140 23% -L1MA9 6082-6298 24% span 762bp GDPKGAMLTHQNIISNVSSFLKCME YTFKPTPEDVTISYLPLAHMFERIVQ
ACAGGTGACCCTAAAGGAGCCATGCTGACCCATCAAAATATTATTTCAAATGTTTCTTCTTTCCTCAAATGTATGGAGGTCAGTGGTCAATTGTCAAGGA
GGTCTTCATTAAAATGTAAATCTGTCATAAGATTTTAATCCTGATGTAAGAGGAGTCAGAGACTAACACAAAACAAAACAAAAACAAAACTCATGATAAA
GGCCTGAAGAAGGGACAAATAGTGGTGTCTCTTTGTCCAGAGGACTGTGCATTTTCAAGCCTTGGCCTTTTAGAATCACTGCACATCTCTACACTCAGTG
AAATTAAGGggcacctctcagagttatacagtgcaccacctgtacaactgggtgtggcagtcctgGGAAGGAGCAGTTTTTTTTAAATTAAAGAAAAAAT
Tttgagatacaattaacataacactatattaatttcagatacacaacataatgatttcatatatatgttgcaaaatggttcccacaataaatctaacatc
cattatcacacatagctatagtttctttttcttgtgatgagaatttttaagatctgctcacttactaacttgcagatatgcaatacagtattattaacta
tagttaACGGGAGTTACTTTTAAGTCTCCTTCGGAAGAGAAAGTTGGCATTAACACAATGTCTCCTCCTTGTTCTAATCTACAGTATACTTTCAAGCCCA
CCCCTGAAGATGTGACCATATCCTACCTGCCCTTGGCTCATATGTTTGAGAGGATTGTACAG

>INT391_ACSL5_Peg_felCat -MER91C 97-140 27% -L1MA9 6082-6301 22% GDPKGAMLTHENIVANSSAFLKCME CIFKPTTEDVSISYLPLAHMFERIVQ
GGTGACCCTAAAGGAGCCATGTTGACCCATGAAAATATTGTTGCAAACAGTTCTGCTTTTCTCAAATGTATGGAGGTCAGTGGTCAATTTAAAAAGAGGT
AGTCATTAAAATGTAAATCCATCATAAGATTTTGATCTTGATGTCAGAGGAGGCAGAGACAAAAAACAAAACAAAACCAAAAGCCACGTTAAAGGCCTGA
CAATGAATCAGTGTGGACAAATACTGGTGCATCTTTGTCCAGAGGACTGTGCATTTTCCAGCCTTGGTCTCTTAGAATCACTGCATGTATCTACACTCAG
TGAAGTTAAGGAGCACCTTAACTTCagtcatacagtgcaaaacctgtgcaactatgtgtggcaatcctgGCAATTTCTTTAAAAGTAAAGAAAAAAAttt
gttgagatattattgacgtattaatttcaggtgtacaacgtgattccatatatgtatgtactgcaaaatggtccctgtgataaattccaagtccatcaac
acacataattttttttcttgtgatgagaacttttcagatctactcacttaacaactttcaaatctgcaacacagcattattaactgtagttaATAGGAGC
TGCTTTTAAATCTCCTTTAGAATAGAAAGTTAGCACTAATCCAATGGTGTCTCTTTCTTGTTCTGGTCTATAGTGTATTTTCAAGCCCACCACTGAGGAT
GTGTCCATTTCCTACCTCCCCTTGGCTCATATGTTTGAGAGGATTGTACAG

>INT391_ACSL5_Peg_equCab -MER91B 62-128 26% -L1MA9 6082-6302 21% GDPKGAMITHQNITSNTAAFLRSME GTFEINLEDVTISYLPLAHMFERVVQ
GGTGACCCCAAAGGAGCCATGATAACCCATCAAAATATTACTTCAAATACTGCTGCTTTTCTTAGATCTATGGAGGTCAGTGATCAATTGAAAAAGAGGA
ATTCCTAATTAAATTTCAATTGAAAATTCCTAATTAAAATAGGAATCTGCCATAAGATTTTAATCTTGAAATTAGAGAAGGCATAGAGGAAAAAAATAGG
TTTAAGGCCTAAGTATGCACACATATCAGTGCCTCTTTGTCCAGAGGACTGTGCATTTTCACGTCTTGGTCTTTTAGGATCACTGCAGAGCTCTACACTC
TGTGCAgttaagggtacctcttacagttgtacagtacatcacctgcacaaccatatgtggcagttctgGGAAGGAGTAGttttttaaaaattaaaaaaat
attttattgagatatgattgacatataacattatgctagtttcagatgtacaacataatgatttgaggtttgggtatattgcaaaatgatccccacaata
agtctagttaacatccatcaccacgcatagttacaaattttttcttgtgatgaaaacgtttaagatctactctcttagcaaatttctaatatataataca
gtattactaactagaattaATAGTAGTTTTTAAATCTCCTTCGAAGAGAAAGTTGGATTAATACAATGTTGTCTCCTCTTTGTTCCCTGATCTGTAGGGT
ACTTTTGAGATCAACCTTGAGGATGTGACCATATCCTACCTCCCCTTGGCTCATATGTTTGAAAGGGTTGTACAG

>INT391_ACSL5_Peg_myoLuc no MER -L1MA9 6079-6277 23% GDPKGAMLTHQNVVSNASAFLRCVE ESFAPTPEDVSISYLPLAHMFERVVQ AAPE01034117
GGTGACCCCAAAGGAGCCATGCTAACCCATCAAAATGTTGTTTCAAATGCTTCAGCTTTCCTCAGATGCGTGGAGGTTAGTGGTAGCTTGAAAAAGAGGT
CTTCGTTAGAATGTGACTCTGTCATAAGATTTTAATCTTGAAGCTAGAGGAGGCAGAGAAGAAAAAAACCAAAACAGGTTAAGGGCCTGAGTGTGGACAA
ACACATGTGCATCTTTGTGTGGAGGGCTGTGCATTTTCAAGCCGTGATCTTTGAGGATCCCTGCAGACCTCTACTCCAGCGCAGTCCAGGGCACCTCTCC
CAGTTCTTCAGGGCACCCCCTGCATGACTGTATGGGGCACTCATGGAAGGAAATAGTTAAAAAAAAATTTAAATTTTAAATGAGATGTAACGATGCCTaa
cattataatagtttcaggtgtgcaacataatgattcaatatttatatgtattgcaaaatgatcctcatagtaagtgtagttaatatccatcactgcacac
agttacaaattctttgttcttgtgatcagaacttctaagatcaactctctcagcaactttcgaatatacaatagagtgttattaactatagttaacaAGG
GTAGTTCTTAAATCTCTTTGGTAAAGAAGGTTGGCATTAATCCGATTTTGTCTCCTCCCCCTTCCCGATCTGTAGGAAAGCTTTGCACCCACCCCCGAGG
ATGTGAGCATATCCTACCTCCCCTTGGCTCATATGTTTGAGAGGGTTGTACAG

>INT391_ACSL5_Peg_pteVam -MER91B 8-62 24% -L1MA9 6060-6302 25% GEPKGAVLTHQNVISNAAAFLKLLEVS DSFQVTPKDVTISYLPLAHMFERIVQ ti|1386642117 ti|1371644127
GGTGAGCCCAAAGGGGCCGTGCTAACCCATCAAAATGTCATTTCAAATGCTGCTGCTTTTCTCAAACTTTTGGAGGTCAGTCGATCAAATGAAAAAGAAG
TCCTGATCAAAATGTGAATTTGTCATAAGATTTTAATCTTGAAGTCAGAGGAGGCAGAGAGGGGGAAAAAAAACAGGTTAAGGGCCTGAATGTGGGCAAA
TATTTGTGCATCTTTGTCTGGAGGACTGTGCATTTTCAAGCCTTGGTCTTTTAGGATCACTGCAGACCTTTGTACTCAGTTAAGGGCACCTCTTAGAGTG
ATGCAGTGTACCGCCCGCACAACTGTATGTGGCCCACCTAGAAAGAAGTAGCTTAAATTTTTTAAAAATTTTAATTGAGATATAATTGATATCTAACATT
GCCTTAGTTTCAGGTGTACAATGTAATGATTCAATATTTGTATATGTTGCTAAACGATCCTCAAAATAAGTCTAGCTAAGAAAGATCACCACACTTAGAT
AAAAACTCTTTTTTTGTGTGTGACAAGAACTTTTAGCAACTTTCATTATTAACTGTCGTTAACAGGGTAGTTCTTAAATCTCCTTTGGAAGAGAAAGTTG
GCATTAATCCAATGTCATTTCCTCTTTGTTCTTTATCTATAGGACAGCTTCCAGGTCACTCCCAAGGATGTGACCATATCCTACCTCCCCTTGGCTCATA
TGTTTGAGAGGATTGTACAGGTGAGT

>INT391_ACSL5_Peg_bosTau -tRNA-GluSine -MER91C 55-85 28% No L1MA9 GDPKGAMLTHANIVSNASGFLKCME GVFEPNPEDVCISYLPLAHMFERIVQ
GGTGATCCCAAAGGAGCCATGTTAACCCATGCAAATATTGTTTCCAATGCTTCTGGTTTTCTCAAATGTATGGAGGTCAGTGGTCAATTGAAAACAAGGC
CCTCATTAAAATGTAAATCTGTCGTAAGATTTTAATCTTAAAGTGAGAGGAGGCAGAGAGGGAAAAAACTGATTGAAGGCCTGAGTGTGGATGAATACCA
GTACATCTTTGTCTGGAGTTTTGCCCTTTTATTTATTTATTAatatatatatatatatatatatatTTTTTAATCTGGACCATTTTTAAAGTTTTTATCG
AATGTGTTATAGTATTGGTTCTGTTTTATGTTTTGATTTTTGGGGGGCTACAAGgtacatgggatctcagctccctgaccaggggtagaactcacaccct
ctgcattggaaggtgaagtcttaaccactggacctctggggaagtccCATAGAGTTTTGCTGTGTTAGGGTCACTGCAGATCTCCACACTCAATGCAGTT
AGAgcagcccttagatttacacagggcacatctgcacagctgtatgcagcagtcctAGAAAGAAGTGTTTAAATCCTCTTTGGAAGAGGAAATTGACATT
AACCCATTGTTGTCTCTTTTCCATTTCCTGATCTCTAGGGTGTTTTTGAGCCCAATCCTGAGGACGTGTGTATATCCTACCTCCCCTTGGCTCATATGTT
TGAAAGGATTGTACAG

>INT391_ACSL5_Peg_turTru -MER91 261-311 22% No L1MA9 GDPKGAMLTHENIVSNAAAFLKCVE HTFEPSSEDVTISYLPLAHMFERVVQ
GGTGACCCCAAAGGAGCCATGTTAACCCATGAAAATATCGTTTCAAATGCTGCTGCTTTTCTCAAATGTGTGGAGGTCAGTGGTCAATTGAAAAGGAGGC
CCTCGTTAAAATGGGAATCTGTCATAAGATTTTAAAGTTAGAGGAGGCAGAGGGGGAAGAAACAGGTTGAAGGCCTGAGTGTGGACAAATACTGGTGCAT
CTTTGTCTAGAGTTTTGCTCTTTTAGGGTCACTGCAGATCTCTGCACTCAGTGCAGTTAGGGCACCCCTTAGGGCACAGTGCACACCTGTACAACTGTAT
GCAGCAGTCCTAGAAAGAAGAAGTGTTTAAATCTTCTTTGGAAGAGAAAGTTGGCATTAATCCACTGTTGTCTCCTTTCCATTTCCTGATCTATAGCATA
CTTTTGAGCCCAGTTCTGAGGACGTGACCATATCCTACCTCCCCTTGGCTCATATGTTTGAGAGGGTTGTACAG

>INT391_ACSL5_Peg_susScr -MER91 257-306 8% No L1MA9 GDPKGAMITHQNIVSNVASFLKRLE YTFQPTPEDVSISYLPLAHMFDRIVQ ti|2023263948
GGTGACCCCAAAGGAGCCATGATAACCCATCAAAATATTGTTTCAAATGTTGCTTCTTTTCTCAAACGTCTGGAGGTCAGTGGTCGACTGAAAAAGAAGC
CCCTGTTGAAATGTGAATCTGTTATAAGATTTTAAAGTTAGAGGAGGCAGAGAGGAAAGAACCAGGTCAAAGCCCCAAGTATGGGAAAATACTAGTGCAT
CTTTGGAGTTTTGCTCTTCTAGGGTCACTATAGATCTCTACACTCAGTGTAATTAGGGCACCCCCCAGAGTTGTGCAGTGCACACCTGCACAACTGTATG
TGGCAGTACTAGAAAGTAGTGTTTAAATCTTCTTTGGAGGAAAAAGTTGGCATTAATCCATTGTTGTCTCCTTTCCCTTTCCTGATCTACAGTACACTTT
TCAGCCCACCCCTGAGGACGTGTCCATATCCTACCTCCCCTTGGCTCATATGTTTGATAGGATCGTACAG

>INT391_ACSL5_Peg_vicVic -MER91 284-337 24% No L1MA9 GDPKGAMITHENVVSNVAAFLKFME YSFEPTPEDVAISYLPLAHMFERVVQ ti|1970855441 
GGTGACCCCAAAGGAGCCATGATAACCCATGAAAATGTTGTTTCAAATGTTGCTGCTTTTCTCAAATTTATGGAGGTCAGTGATCAACTGAAAAAGACAC
CCTCGTTAAAATGTGAATCTGTCATAAGACTTTAATCTTCAGGTTAGAGGAGGCAGAGAGGGAAAATGACAGGTTTAAAGCCTGAGGGTTGACAAAGACT
GGTGCATCTTTGTCTGGAGGACTGTGCGTTTCCAAGTTTTACTCTTAAGAATCACTGCCGGTCTCTCCACCCAGTGCAGTTAGGGCATCTCTTAGATTTG
CGCAGTGCACACTTGTGCAACTGTATGTGGCGGTCCTAGAAAGAAGTAGTGCTTAAATCTTCTTTGGAAGAGAAAGTTGGCATTAATCGAATGTTGTCTT
CCTCCCATTCCCTGATCTCTAGTATTCTTTCGAGCCCACCCCTGAGGATGTGGCCATATCCTACCTCCCCTTGGCTCATATGTTTGAGAGGGTTGTACAG

>INT391_ACSL5_Peg_sorAra -SOR1SINE No L1MA9 WGPKGAKITHEILSSKAZAFLNSVE YAFEPTPEDVSISYLPLAHMFERVVQ AALT01576933
GGGCCTAAGTGGTGCTGAGGATGGAACCCAGGCCTTCTGCAGCTCCAACCCCCTGGGCCAGCTCTCCAGCTCTAAAGTGCCCCTAATGTAAGGGGAT
GCAGGAAATATGGCAGAGCTGAAGTCATGAACCCAGAAACAACAGGAGGAGGTGATGGGCTTTTCTTTGTAACTGCATCTGTGATTGTGGTCTTGTGGAA
TGTCGCTGCACATTGCAAAGCCAAAGACGGGCTGTGTGCTTTATAAAGGGTCTTTCTCTCCACCTCTTGTCTCCTCCAGGTGACCCCAAAGGAGCCATGA
TCACGCATGAAAATATTGTTTCAAACGCCTCTGCTTTCCTCAAGTGTGTGGAGGTCAGTGGATGTGGGAAAAGAGGTCCTAGCAAAAGGGTGGATGCCAC
AAAGTTCAGAAGTGGAAGTTAGAGCAGCAGCAGGGCTGGAGGGTGGCGTTCAAAGGGCTGTGTGTGTGCAGATGCCCCGACAGCTTGGGACATCAGTGTT
ATCATTATCATTATTATTATTACCATTTTGGTTTTTGGGGTACACTTGGGAATGGACAGGGGGCACTTCTGGCTTATGCACTCAGGAATTACTCCTGGTG
GTGCTCAGGGAACCATGTGGGATGCTGGGAATCAAGCCACATGCAAGGCAAATGCCCTACCCACTGTGCTATTGCTCCAGTCTCATCAGTGTTTTAGGAA
GCTGTGTATGTTGCTGCCTTGATATCCAGCACCTCTCTGCTCTCGGCGTGTAACAGCGCCCCTCAGAGCTCCACGGGGGGTCTAGCCTGCACACCCAGGT
GTGGCCCTGCTGGAAATGCCTGGTCTTTAGGTCTTCTTTGTCTGGGGAAATTTGGCATTGATCGATGGTCTCTTTCCTCTGTGCCCTGATCTGTAGTATG
CGTTCGAGCCCACGCCTGAGGATGTGAGCATCTCCTACCTCCCCTTGGCACACATGTTTGAGAGGGTCGTGCAG


Phylogenetically informative coding insertions and deletions

introduction


Analysis of indel 1

etcetc


Analysis of indel 2

etcetc