Opsin evolution: RBP3 (IRBP)

From genomewiki
Revision as of 21:41, 3 March 2009 by Tomemerald

RBP3 (IRBP) (96 marsupials): introduction

Interphotoreceptor retinol-binding protein, poorly named RBP3 by HGNC despite its complete lack of paralogs, is a 4-exon, 1247-residue glycoprotein that shuttles retinoids between the photoreceptor cells and the retinal pigment epithelium (RPE). The protein's size results from four ancient internal tandem duplications that became established prior to the intronation era. That is, the gene structure does not reflect the repeat structure: the repeats arose first, and introns were later inserted randomly within the fourth repeat (lamprey, however, lacks the final intron). Across species, each repeat clusters best with the same-numbered repeat, consistent with this early origin.

Thus the origin of RBP3 seems to have preceded the origin of the RPE by an immense time span, suggesting an earlier non-visual function, possibly related to beta-carotene metabolism. However, the gene cannot really be traced back earlier than amphioxus (where the match is strong but the genomic situation is quite confused). There is no significant match in cnidarians or protostomes. X-ray crystallography establishes a curious fold relationship to crotonase/tail-specific proteases in plants, suggesting recruitment of a pre-existing protein. It does not appear that all four domains separately bind retinol.

The first three homology domains and part of the fourth are all encoded by the first large exon of 1090 amino acids. This exon has been much used in marsupial phylogeny (along with the first intron of transthyretin). Indeed, the 96 marsupial species (in 51 genera) with IRBP sequences at GenBank include a December 2008 partial sequence for Thylacinus cynocephalus, as well as one for Sarcophilus harrisii.

The closest matches to the thylacine IRBP are shown in the difference alignment of the first 60 residues below. These species all lie within the Dasyuromorphia. The indicated E-->K may be one of several phyloSNPs breaking this group into blue and green subclades.

The numbat Myrmecobius fits implausibly (its amino-terminal sequence EF028750 needs verification); its affinities seem to lie with the Didelphimorphia. Using IRBP, Thylacinus is not basal within Dasyuromorphia relative to Myrmecobius. However, this may be a case of mis-comparison of genes.

  *           *                                     *
STSKAPQHDSKFTNATQEELLALFQQIIKYQVLEGNVGYLRVDYIPGREMIEEVGEFLVN EU091365  0 Thylacinus cynocephalus
.........P..A..................I............................ AY532676  3 Myoictis wallacei
........NP..A............................................... AY532687  3 Neophascogale lorentzii
........NP..A........T...................................... AY532686  4 Phascolosorex dorsalis
.........P..V............................................... AY532670  2 Parantechinus apicalis
....V....P..A..................I.....................L...... AY532675  5 Myoictis melas
.........P..A...................................D........... AY532679  3 Dasyurus hallucatus
...E.....P..A............K........D.............D........... AY532685  6 Sarcophilus harrisii
...E.......RA..........L............................Q..K.... EF028748  6 Sminthopsis crassicaudata
.......R.P.LA.........SL.......................Q....Q....... EF028749  8 Planigale ingrami
..A......P.LA.V.....................................K....... EF028736  6 Antechinus stuartii
..A......P.L..V.....................................K....... EF028743  5 Micromurexia habbema
..A......P.LA.V.....................................K....... EF028744  6 Murexchinus melanurus
..A......P.L..V....V................................K....... EF028746  6 Paramurexia rothschildi
..A......P.LA.V.....................................K....... EF028747  6 Phascogale calura
..A......P.LA.V.....................................K....... EF028745  6 Phascomurexia naso
.SA......P.LA.V.....................................K....... AY532667  7 Murexia longicaudata
......K..PNLA........T.L..R....................Q.VV.K....... EF028750 12 Myrmecobius fasciatus
..PET...VP..A.V........L..M....................Q.VV.K....... AY233765 13 Caluromys philander
..PET...VP.LA.V.......QL..M....................Q.VV.K....... AF257675 15 Caluromysiops irrupta
..PET...VP.LA.V......T.L..M....................Q.VV.K....... AF257688 15 Glironia venusta  
.IPET...VP..A.V.R....T.L..M....................Q.VV.K....... AF257683 16 Didelphis albiventris
.IPE....VP.LA.I......T.L..M....................Q.VV.K....... AF257686 15 Gracilinanus microtarsus
.IPET...VP..A.V......T.L..M....................Q.VV.K....... AF257676 15 Marmosops noctivagus
.IPET...VP.LA.V........L..M....................Q.VV.K....... AY233788 15 Philander opossum
.IPET...VP.LA.I......T.L..M....................Q.VV.K....... AF257689 16 Thylamys pallidior
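The dotted rows above follow a simple convention: a dot means "same residue as the thylacine reference", and the number after the accession counts the non-dot positions. A minimal Python sketch of that convention follows. The function names are illustrative, not from any library, and the two non-reference rows are re-typed from the alignment with dot runs written as counts, so they should be checked against the rows above before reuse. Column 52 (0-based) is used as an example because the reference has E there and the Antechinus row has K.

```python
from collections import defaultdict

# Thylacinus cynocephalus EU091365, first 60 residues (reference row), in 10-mers.
REFERENCE = ("STSKAPQHDS" "KFTNATQEEL" "LALFQQIIKY"
             "QVLEGNVGYL" "RVDYIPGREM" "IEEVGEFLVN")


def diff_row(reference: str, other: str) -> str:
    """Render `other` against `reference`: '.' where identical, residue otherwise."""
    if len(reference) != len(other):
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


def expand_row(reference: str, row: str) -> str:
    """Inverse of diff_row: restore the full sequence from a dotted row."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have equal length")
    return "".join(r if c == "." else c for r, c in zip(reference, row))


def count_differences(row: str) -> int:
    """Number of positions that differ from the reference (non-dot characters)."""
    return sum(1 for c in row if c != ".")


def group_by_column(reference: str, rows: dict, column: int) -> dict:
    """Partition species by the residue they carry at one alignment column."""
    groups = defaultdict(list)
    for name, row in rows.items():
        groups[expand_row(reference, row)[column]].append(name)
    return dict(groups)


# Rows re-typed from the alignment (assumption: positions transcribed correctly).
ROWS = {
    "Thylacinus cynocephalus": "." * 60,
    "Myoictis wallacei": "." * 9 + "P" + ".." + "A" + "." * 18 + "I" + "." * 28,
    "Antechinus stuartii": (".." + "A" + "." * 6 + "P" + "." + "LA" + "."
                            + "V" + "." * 37 + "K" + "." * 7),
}

if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{row} {count_differences(row):2d} {name}")
    # Split the species by the residue at the E/K column.
    print(group_by_column(REFERENCE, ROWS, 52))
```

Grouping on a single column only illustrates how a candidate phyloSNP partitions the species; it is not a substitute for a tree built from the full exon.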

Using Sarcophilus as probe in a different region, residues 721-900, we find a peculiar outcome: what appears to be a second, very odd gene, or else an XY difference, pseudogene, unusual balanced polymorphism, nonhomologous recombination, sequence submission error, frameshifts, or systematic experimental error (e.g. Dasyurus maculatus AY532680 is identical to AY243439 outside the 15 amino acid block). However, the genomic reads from the individual Sarcophilus used in this project show no sign of this gene despite excellent coverage of the second type of gene.

Macropus and Monodelphis genomes contain only the second type of gene. All Didelphimorphia and Diprotodontia are of this type, as are platypus and all placentals. With the Sarcophilus genome this can be resolved, as it should have both and would be the first such genome. Perhaps the alignment above is a mixture of type 1 and type 2 genes (or, respectively, alleles). The Myrmecobius anomaly makes it more likely that two distinct genes are present.

A definite peculiarity seen in blast searches is the occurrence, earlier in the sequence, of a very homologous segment for this very block, likely the homologous part of another of the internal tandem repeats. It is seen in both types of genes. Possibly internal non-homologous recombination or gene conversion has inserted first-repeat sequence again in this distal block, in place of what was relatively diverged sequence. Internal gene conversion would make IRBP extremely difficult to use in alignment-based phylogeny. As a rare genomic event, it unites the species that have it, but species that do not have it would have to be re-examined to exclude the possibility that only the type 2 gene happened to be sequenced.
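The internal match can be illustrated without blast by sliding the distal block along an upstream stretch and counting identities at each ungapped offset. The sketch below is only a stand-in for the blast searches mentioned above, not the method actually used. It takes the 17-residue block from the Sarcophilus AY532685 row of the 721-900 alignment and scans a first-repeat stretch copied from the RBP3_monDom reference sequence on this page; the cross-species pairing and the function name are assumptions made for illustration.

```python
def best_ungapped_match(query: str, subject: str) -> tuple:
    """Return (offset, identities) of the best ungapped placement of query in subject.

    Ties are resolved in favour of the leftmost offset.
    """
    if len(query) > len(subject):
        raise ValueError("query must not be longer than subject")
    best_offset, best_identities = 0, -1
    for offset in range(len(subject) - len(query) + 1):
        window = subject[offset:offset + len(query)]
        identities = sum(1 for q, s in zip(query, window) if q == s)
        if identities > best_identities:
            best_offset, best_identities = offset, identities
    return best_offset, best_identities


# Distal block as it appears in the Sarcophilus harrisii AY532685 row.
DISTAL_BLOCK = "SLVLDLQHSRGGEVSGT"

# Stretch of the first repeat from the RBP3_monDom (opossum) reference sequence.
FIRST_REPEAT_FRAGMENT = "VGEFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLH"

if __name__ == "__main__":
    offset, identities = best_ungapped_match(DISTAL_BLOCK, FIRST_REPEAT_FRAGMENT)
    matched = FIRST_REPEAT_FRAGMENT[offset:offset + len(DISTAL_BLOCK)]
    print(DISTAL_BLOCK)
    print(matched)
    print(f"{identities}/{len(DISTAL_BLOCK)} identical at offset {offset}")
```

An ungapped identity count ignores conservative substitutions and indels, so it understates similarity compared with a scored alignment; it is enough to show that the distal block has a close counterpart in the first repeat.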

It emerges from direct tblastn that the Sarcophilus individual sequenced was female. That is, ATRX is well represented but ATRY is not (though the situation is somewhat confused by additional paralogs). Marsupial X and Y chromosomes are quite different from those of placentals:

"Many or most genes on the mammal Y chromosome evolved a testis-specific function after diverging from an X-borne copy with a general function in both sexes. In marsupial but not eutherian mammals, a testis-specific orthologue (ATRY) of the widely expressed X-borne ATRX gene lies on the Y chromosome. Since mutations in human ATRX cause sex reversal, it is possible that one function of ATRY in marsupials is testicular differentiation. We report here the isolation and sequencing of the tammar wallaby (Macropus eugenii) ATRY cDNA, and comparison of its sequence with that of tammar ATRX. The evolution of a testis-specific function for the ATRY protein distinct from the general role of ATRX in both sexes has been accompanied by sequence changes in many protein domains that would alter protein binding partners. A large open reading frame encodes a 1771 amino acid ATRY protein that has diverged extensively from ATRX. The conservation and loss of particular motifs identify those required for testicular function (ATRY) and function in other tissues (ATRX)."

AY532685 MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE Sarcophilus harrisii
AY532684 ....E................................S....................P. Dasyurus geoffroii
AY532681 ....E................................S....................P. Dasyurus albopunctatus
AY532683 ....E................................S....................P. Dasyurus viverrinus
AY532682 ....E........................P.......SE...................P. Dasyurus spartacus
AY532680 ....E..............R.................SR...................P. Dasyurus maculatus
AY532678 ..V..................................S....................P. Dasycercus cristicauda
AY532669 ..V..................................S....................P. Dasykaluta rosamondae
AY532676 ..V..................S...............S....................P. Myoictis wallacei
AY532675 ..V..................S...............S....................P. Myoictis melas
AY532687 ..V........N.L.......................S....................P. Neophascogale lorentzii
AY532671 ..V..................................S....................P. Parantechinus bilarni
AY532670 ..V.................................TS.........RG.........P. Parantechinus apicalis
AY532686 ..V..................................S........P...........p. Phascolosorex dorsalis
AY532674 ..V.......................................................P. Pseudantechinus ningbing
AY532672 ..V..................................S....................P. Pseudantechinus woolleyae
AY532673 ..V........N..R......................S...................SP. Pseudantechinus roryi
454 read MEILQKYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAMLQAVSEDP           Sarcophilus harrisii
EF028739 ............................V.TEEDLAAKLNAMLQA.............P. Antechinus minimus
AY243439 ....E..............R........V.TEEDLAAKLNAMLQA.............P. Dasyurus maculatus
EF028750 ....K................KT.....I.TEEDLAAKLNAILQA.............P. Myrmecobius fasciatus
EF028737 ..V.........................V.TEEDLAAKINAMLQA.............P. Antechinus flavipes
EF028748 ..V.........................V.TEEDLAAKLNA.LQA.............P. Sminthopsis crassicaudata
AY243438 ..V.........................V.TEEDLAAKLNA.LQA.............P. Planigale sp.
EF028749 ..V.........................V.TEEDLAAKLNA.LQA.............P. Planigale ingrami
AY532679 ..V.........................V.TEEDLAAKLNAMLQA............... Dasyurus hallucatus
AF025382 ..V.........................V.TEEDLAAKLNAMLQA.............P. Phascogale tapoatafa
EF028741 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus godmani
AY532666 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus swainsonii
EF028736 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus stuartii
EF028742 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus agilis
EF028738 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus bellus
EF028740 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus leo
EF028747 ..V.........................V.TEEDLAAKLNAMLQA.............P. Phascogale calura
EF028744 ..V.........................V.TEEDLAAKLNAMLQA.............P. Murexchinus melanurus
EF028743 ..V.........................V.TEEDLAAKLNAMLQA.............P. Micromurexia habbema
EU086688 ..V.........................V.TEEDLAAKLNAMLQA.............P. Pseudantechinus macdonnellensis
EU086689 ..V.........................V.TEEDLAAKLNAMLQA.............P. Pseudantechinus roryi
EU086686 ..V.........................V.TEEDLAAKLNAMLQA............SP. Pseudantechinus macdonnellensis
EU086687 ..V.........................V.TEEDLAAKLNAMLQA..........G..P. Pseudantechinus mimulus
AY532667 ..V.........................V.TEEDLAAKLNAMLQA.............P. Murexia longicaudata
EF028746 ..V.........................V.TEEDLAAKLNAMLQA.............P. Paramurexia rothschildi
AY532677 ..V.........................V.TEEDLAAKLNAMLQA.............P. Dasyuroides byrnei
EF028745 ..V..........I..............V.TEEDLAAKLNAMLQA.............P. Phascomurexia naso

Macropus eugenii assembly         
sacHar   MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE 
         ME+LQ YYTLVDRVPALLHHLTAIDYSS L  +   ++       VSEDPRLLVRVLR E
macEug   MEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPE  

Monodelphis domestica assembly     TSSLVLDLQHSSGGEISG 
sacHar   MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE  
         ME+LQ YYTLVDRVPALLHHLTAIDYSS L  +   ++       VSEDPRLLVRVLR E
monDom   MEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPE  

Ornithorhynchus anatinus assembly
sacHar    EILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE 
          ++L+ YY LVDRVPALL HL A+D SS L  +   SR        SEDPRLLVR L  E
ornAna    DLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPE  

Equus caballus assembly
sacHar    EILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE  
          E LQ YYTLVDRVPALLHHL ++D+SS +  D   ++       VSEDPRLLV V+RS+
equCab    EALQDYYTLVDRVPALLHHLASMDFSSVVSEDDLVAKLNAGLQAVSEDPRLLVWVVRSK
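Whether a given sequence carries the type 1 or type 2 form of the block can be decided mechanically once it is aligned to the 60-residue window used above. The sketch below is one way to do that under stated assumptions: the query must already start at the M of MEILQ, the two references are the sacHar (AY532685) and macEug rows shown above, and the block is taken as positions 28-44 (0-based) of the window, which is where those two rows differ. The names `BLOCK`, `block_distances` and `classify` are illustrative.

```python
# Reference windows, written as 10-mers so their length is easy to verify.
TYPE1_REF = ("MEILQKYYTL" "VDRVPALLHH" "LTAIDYSSSL"
             "VLDLQHSRGG" "EVSGTVSEDP" "RLLVRVLRSE")   # sacHar AY532685
TYPE2_REF = ("MEVLQNYYTL" "VDRVPALLHH" "LTAIDYSSVL"
             "TEEDLAAKLN" "AGLQAVSEDP" "RLLVRVLRPE")   # macEug assembly

# Sarcophilus 454 read as shown in the alignment (50 residues).
READ_454 = ("MEILQKYYTL" "VDRVPALLHH" "LTAIDYSSVL"
            "TEEDLAAKLN" "AMLQAVSEDP")

# Span of the window where the two reference rows differ (assumed boundaries).
BLOCK = slice(28, 45)


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def block_distances(window: str) -> tuple:
    """Mismatch counts of the query block against the type 1 and type 2 blocks."""
    if len(window) < BLOCK.stop:
        raise ValueError("query does not cover the variable block")
    block = window[BLOCK]
    return mismatches(block, TYPE1_REF[BLOCK]), mismatches(block, TYPE2_REF[BLOCK])


def classify(window: str) -> str:
    """Assign the query to whichever reference block it is closer to."""
    d1, d2 = block_distances(window)
    if d1 == d2:
        return "ambiguous"
    return "type 1" if d1 < d2 else "type 2"


if __name__ == "__main__":
    print(block_distances(READ_454), classify(READ_454))
```

Restricting the comparison to the block keeps flanking substitutions (such as I/V at the third residue) from influencing the call; a sequence that matches neither form well would still be forced into one class, so the raw distances are worth inspecting alongside the label.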

RBP3 reference sequences: human to amphioxus

>RBP3_homSap human
0 MMREWVLLMSVLLCGLAGPTHLFQPSLVLDMAKVLLDNYCFPENLLGMQEAIQQAIKSHEILSISDPQTLASVLTAGVQSSLNDPRLVISYEPSTPEPPPQV
PALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG
ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAILTLRSALPGVVHCLQ
EVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSFADA
SVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVG
EITAGNLLHTRTVPLLDTPEGSLALTVPVLTFIDNHGEAWLGGGVVPDAIVLAEEALDKAQEVLEFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESL
ASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCS
YFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVGSSPLYASMPTQMAMSATTGKAWDLAGVEP
DITVPMSEALSIAQDIVALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ 0
0 IPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQVSRLLVEHIWKKIMHTDAMIIDMR 2
1 FNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVV 1
2 GERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTNLYLTIPTARSVGASDGSSWEGVGVTPHVVVPAEEALARAKEMLQHNQLRVKRSPGLQDHL* 0
 
>RBP3_bosTau cow run-on terminal exon
0 MVRKWALLLPMLLCGLTGPAHLFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP
RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY
DRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDFFLTVPVSRSLGPLGEGSQTWEGSG
VLPCVGTPAEQALEKALAVLMLRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPE
EEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADASVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDAS
PVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHTVSLLETPEGGLALTVPVLTFIDNH
GECWLGGGVVPDAIVLAEEALDRAQEVLEFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVF
HSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFE
AEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVGSSALYASMPTQMAMSAS
TGEAWDLAGVEPDITVPMSVALSTARDIVTLRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAH
IPEDAKDRIPGIVPMQ 0
0 IPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQVSELLVEHVWKKIVHTDALIVDMR 2
1 FNIGGPTSSISALCSYFFDEGPPILLDKIYNRPNDSVSELWTLSQLE 1
2 GERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTDLYLTIPTARSVGAADGSSWEGVGVVPDVAVPAEAALTRAQEMLQHTPLRARRSPRLHGRRKGHHRQSQGRAGSLGRNQGVgRPEVLTEAPSGQKRGLLQCG* 0

>RBP3_monDom opossum
0 MTSQCLLLFSALLFSLAHAEQIFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQ
HVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVGEFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVY
DRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDFFITVPVSRSLSPLGGGSQTWEGSG
VLPCVGIPAEQALGKALAILTLRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEA
EEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSSVLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGP
IRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTFPLLQPPNGNLVLTVPILTFIDNNG
ECWLGGGVVPDAIVLAEEALDKAKEVLEFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFH
SPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETVRAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEA
EPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVENSPLYASMPTQVAISPVT
GKAWDMAGVEPDVSVLSSEALMTTQGIVALRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI
PEDAKDRIPGIVPMQ 0
0 LPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR 2
1 FNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVA 1
2 GERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTDLYITIPTARSVGSGDKPSWEGVGVAPHVEVPADQALSKAKEMFNHHLQRAK* 0

>RBP3_ornAna platypus genome rife with frameshifts, dels, misassembly frag
0 MGVCLPLLLVAQFSLTGHVEPVSQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQ
QPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGAVLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVY
DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG
VLPCVAVPADRALDEALDILALRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPP
RKEEEQKEEEEEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYF
QDPAAGPIRLFTTYNRPADVTREYASRAGALEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFL
TLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVLAFHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAALRSKVP
TVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ 0 
0 IPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQVSELMVEHVWKKIVHTDGLIIDMR 2
1 NIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIA 1
2 GERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHLYITIPTSRSVGSEDGSSWEGVGVTPHLVVPADVALSRAKDLFRAHLEHRD* 0

>RBP3_taeGut Taeniopygia guttata
0 MIRTHFLLLSALIMCSIPAEEIFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPK
QEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGAFLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYN
RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV
MPCVATEAEQALQKSLDILAVRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE
KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH
LFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFPLLQPGPGITRGLTITVPVITFIDNH
GESWMGGGVVPDAIVLAEDALEKAEEVLAFHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVF
HSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPE
PRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSSLYASIPSQVVLSPVTG
KVWSVSGVEPHITIQASEAMAAAQHIANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY
IPEHAKDSIPGIMPK 0
0 QIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSELLTQLSDLMIEHVWKKIFHTDALIIDLR 2
1 YNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLK 1
2 GERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQVDETNFYVVIPTSRSVTSADSTSWEGKGVSPHIETPAETALIKAKEMLNAHLHSSR* 0

>RBP3_galGal Gallus gallus 1236 aa N-terminal 21 aa signal peptide 5 glyc (3 unique) two W per repeat
0 MRTYFFLFSVLIVCSISAEEIFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ
EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR
PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM
PCVASEAEQALKKSLDILAVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM
PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL
FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG
ESWMGGGVVPDAIVLAEDALEKAEEVLTFHRKMGILLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH
SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP
RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK
VWSVSGAEPHITIQASEALAAAKHIASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI
PEHAKDSIPGILPK 0
0 QIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCELLTQVSDLLVEHVWKKIVHTDALIIDMR 2
1 YNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLR 1
2 GERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQVDDTNFYIIIPTARSVISAESASWEGKGVPPHMETPAVTALIKAKEVLSAHLHSSR* 0

>RBP3_anoCar lizard
0 MLRKCLWLSIVLVCCSSYADSVLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPK
QRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGSFLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYN
RPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFYMTVPLSCSVSPLSGSGQSWEISGV
TPCVVISAEQALDKALAILSLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTD
EKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATVLVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVH
FFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTFCILELGGGCGLLINVPVITLIDNHGE
YWLGGGVVPDSIVLADEALEKAREVLEFHKGMGSLIERVGQLLEAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHS
HVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETSAMIVDMRYNTGSYSTAVPMFCSYFFDAEPQ
QHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSGKV
WSLSGIQPHVTTQSNEALASAQNIILFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP
ERSKGRILGLVPMQ 0
0 QIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCELMSQVSELLVQHVWNKIVNTDALIIDMR 2
1 YNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLA 1
2 GKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHVDGTNLYITTPVSRSVFSVNDSWEGVGVSPHLDVSTDVALIKAKEMLKAHLH* 0

>RBP3_xenLae Xenopus laevis
0 MPPLFQALTTALFFCGIASNPLFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQT
EQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPHLVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPS
NTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITVPVSRSLSPLTGQSWEVSGVSPCVV
VNAKDALDKAQAILAVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENI
PEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKLAPFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNS
TDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQLSVTVPIINFIDSNGDYWLGGGVVPDAIVLA
DEALDKAKEIIAFHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP
SPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGK
DIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSNLYVTIPNQVVISSVTGKVWSVSGVEPHVIIQAN
EAMNIAHRIIKLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM 0
0 QIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLLNQVSDLLVEHVWKKVVDQDALIIDMR 2
1 FNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLV 1
2 GKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAEPGESWEGKGVLPDLEISSETALLKAKEILESQLEGRR* 0

>RBP3_xenTro Xenopus tropicalis 89% xenLae
0 MSPLFKALTTVLFFCIVASNPVFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRK
EQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPLLVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPS
NTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITVPVSRSISPLTGQSWEVAGVFPCVV
VNANNALNKAQGILAVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENL
PDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKLGPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNS
TNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTRLSVTVPIMNFIDNNGDYWLGGGVVPDAIVLA
DEALDKAKEIIAFHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP
SAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGT
DIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESNLYVSIPNQVVISSVTGKVWSVSGVEPHVIAQAS
EAMNVAHHIIKLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ 0
0 IPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLLNQVSDLLVEHVWKKVVNQDALIIDMr 2
1 FNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLV 1
2 GNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAKPGESWEGKGVLPDLEITSETALMKAKEILVSQLEGR* 0

>RBP3_tetNig frameshifts in genome two domains: 23-324,326-612 no upstream dup
0 MAKALFTVASLLLLANGFFVGAAFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG
PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD
RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP
HVEVNAEDALATAIKIVNLRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM 0
0 DYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLr 2
1 NNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELT 1
2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIPTVHSDTGAGPAWEGAGIAPHIPASAEAALGTARAILNKHFAGQK* 0

>RBP3_takRub fugu two domains:  23-324,326-612 plus upstream dup
0 MAKALFLVASLLLLANDVLVRAAFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV
PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD
RPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYITVPTAKSINPVTGSSWEITGVKP
DVEVNAEDALATAIKIVSLRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM 0
0 DYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLR 2
1 NNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELT 1
2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIPTVHSDTAAGPAWEGTGITPHIPVSAEAALGTAKGILNKHFGGQK* 0

>RBP3_gasAcu stickleback two domains: 27-317,323-612 no upstream dup
0 MAKLIFLVAPLLVLGNIAFIHAGFAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV
PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD
RPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVTVPTAKSVNPITGSTWEVTGVTP
NVEVNAEDALATAIKIVTLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM 0
0 NYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDAMIVDLR 2
1 NNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELT 1
2 GTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSIPTVHSDTAAGPAWEGAGVAPHIPVPADAALETAKGIFKKHFAGQK* 0
 
>RBP3_oryLat medaka two domains: 28-314,320-605  no upstream dup
0 MAKTLFLVASLLVLGNVVFLHASFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV
PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL
NTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLPTAKSINPITGSSWEVTGVKPNVE
VNAEEALATALKIINLRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM 0
0 EYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVEHVWNKVLHTDAMIIDLR 2
1 NNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELT 1
2 GERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIPTVHSDTSAGPGWEGAGVAPHIPVAAGAALETAKAILNKHIGGQQHAAS* 0

>RBP3_danRer zebrafish upstream frag as well two domains: 22-322,324-609
0 MAQALVLLVSLLFFSNVAHCNFSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP
AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT
ADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVTVPVAKSINPITGKSWEINGVAPDV
DVAAEDALDAAIAIIKLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM 0
0 NPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIVEHVWNKVVDTDALIIDLr 2
1 NNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLT 1
2 GRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSIPISHSTAQGPSWEGAGIAPHIPVPAGAALDTAKGMLNKHFSGQK* 0

>RBP3x_takRub fugu single upstream exon 42% frameshift no transcripts three domains: 23-323,325-615,618-907
MAPRTPVLLLVLLFCALPVRSFYQHTLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN
SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK
TFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPVARSVSPITGKSWEVSGISPTVNVAAREALAKAQTFLA
VRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV
LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER
GVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFITVPFINFLDNNGEYWLGGGVVPDAIVLAEEALEHVNRTATFHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV
LLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAAPRLVKSVWNKLVDTDSLIIDMRYNA
GGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVLYASIPNQVVTSAATGKL
WSISGVEPDVFAQARDALPVAQRIISARLLKREKGR* 0

>RBP3x_danRer zebrafish single upstream exon 55%/41% transcript DN857398 3 domains: 21-321,324-609,612-901 expressed: inner nuclear layer and ganglion cell layer
MAGVFVFILVTYRVLLVNASFQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA
LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT
NITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYITVPVARSVNPVTGQSWEVSGVAPSVT
VNPKESIAKAKSLISVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSD
DLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKIWQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLN
NISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPIINFIDVNGECWLGGGVVPDAIVLAEEALERAH
EIIAFHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIV
EALFKVDVMSGNIGYLRFDMMEDIKVLQAINPEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVL
GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKI
IATKQQKKNSGK* 0

>RBP3x_salSal Salmo salar transcript frag DY725143
EETAAKLGPLLRENIWTKVTHASSLIFDLRYSTAGELSGVPFIISYFSDPEPLIHIDTVFDRPSNTTKELWTMSSIMGERYGKRKDLIVLTSKRTMGAAEAIAYTLKHLNRAIIVGERSA
GGSVKVQKIRIGDSGFYITVPVARSVNPITGQSWEVSGVSPSVNINAKEAVANAKNLLAVRSAIPNAVQSVSDIIRQYYSFTDRVPALLQHLESTDFFSVISEEDLANKFNNELQSVSEDPRLMIKL

>RBP3_calMil elephantfish frag 2 domains 6-243,334-531
PPVTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP
PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW
LGGGVVPDSIVLAEDTLERTKEIIGFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVI
VQANEAMTVALGIINLRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ 0
0 QMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMITQVSELMAKHVWNVIASTSSLIVDLR 2
1 YNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI 1
2  * 0

>RBP3_petMar lamprey exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217
0 MAGSREQRTAFSTRLLLLLLLPLATCPSQAPYKFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV
SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY
RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP
SERALDKALEILNARGVARKAVEAAGELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP
DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA
EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEIIEFHAKMASL
LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV
GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG
TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMVALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHL
KAVCVPEHATDRMPGIVPMQ 0
0 MPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEALEHVAHLLVEHVWKKICDTEILIIDLR 2
1 YNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVL 1^2 GQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMVALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQVNVVRTRI* 0
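
The digits that flank many sequence lines above (for example `0 ... 2` or `1 ... 1`) and the `1^2` token in the lamprey record appear to be exon phase annotations rather than residues, and `*` appears to mark the stop codon. The page does not define this notation here, so the following Python sketch rests on that assumption. It strips those tokens and joins the wrapped lines of one record into a single residue string. The name `join_exon_lines` is hypothetical.

```python
import re

# Tokens treated as annotation, not residues (assumed reading of the notation):
# a lone phase digit 0/1/2, or a fused-junction marker such as "1^2".
_MARKER = re.compile(r"^(?:[012]|[012]\^[012])$")

def join_exon_lines(lines):
    """Concatenate the sequence lines of one record into one residue string."""
    residues = []
    for line in lines:
        for token in line.split():
            if _MARKER.match(token):
                continue  # skip phase annotation
            residues.append(token.rstrip("*"))  # drop a trailing stop marker
    return "".join(residues)

# Toy lines in the same layout as the records on this page (not real data).
toy = ["MKT 0", "0 AAGL 2", "1 YNIG 1^2 GQRY 1", "2 VV* 0"]
print(join_exon_lines(toy))
```

Letter case is left untouched, because a few records end an exon with a lowercase residue and that may be deliberate.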

>RBP3_braFlo Branchiostoma floridae Region: 9 exons 1 domain: 83-381 ClpP/crotonase e-38 419-630; misfused to PAPS sulfotransferase
0 MTRPSKVDIVFPIKPFTIPTAHEQVKGEGPVDINKNALCKSADEGHTHP 1
2 VSIAMAPTAYIVFVALVPTVLSVDWLDVVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQ 0
0 SWSQDDMSDSLRMEEGLTSELRNITGDETIK 0
0 VWDFGVYENTTQEPVPREFYNFSTFVDNFK 2
1 KNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESK 0
0 AIILDLRYNLGGDREGVVHWASFFFNATPSVPLSDVYYRDGVNQYWTLLE 0
0 VPGGIRFPDMPLYLLTSNRTSREAEEFAYAMQVVNRTTIIGETT 1
2 AGEEFTGMWFPIDQTDVHLLTRTNVVRNPITQDSWSGK 1
2 GVTPDIIVPSEKALTVALRKIQGSEDTKMAASSGNIEPPRWTVYLVFICTSIAILTYPTFM* 0
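
Several headers above give domain coordinates as comma-separated spans (for example `four domains: 34-312,327-615,625-914,916-1217`). The following Python sketch pulls those spans out of a header and cuts a sequence with them. It assumes the spans are 1-based inclusive positions on the full-length protein, which the page does not state, and the names `domain_ranges` and `slice_domains` are hypothetical.

```python
import re

def domain_ranges(header):
    """Return (start, end) pairs for every 'N-M' span found in a header line."""
    return [(int(a), int(b)) for a, b in re.findall(r"\b(\d+)-(\d+)\b", header)]

def slice_domains(sequence, ranges):
    """Cut a sequence by 1-based inclusive ranges (an assumed convention)."""
    return [sequence[start - 1:end] for start, end in ranges]

header = (">RBP3_petMar lamprey exon3/4 fused, exon4 run-on, fixed genomic "
          "frameshift; four domains: 34-312,327-615,625-914,916-1217")
print(domain_ranges(header))
# Toy sequence, only to show the indexing convention.
print(slice_domains("ABCDEFGHIJ", [(2, 4), (6, 10)]))
```

The pattern requires digits on both sides of the hyphen, so text such as `e-38` or `run-on` in a header is not picked up as a span.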

RBP3 proteins parsed into constituent modules

>M1_homSap 
GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG
ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL
>M2_homSap
TLRSALPGVVHCLQ
EVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSFADA
SVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVG
EITAGNLLHTRTVPLLDTPEGSLALTVPVLTFIDNHGEAWLGGGVVPDAIVLAEEALDKAQEVL
>M3_homSap
EFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESL
ASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCS
YFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVGSSPLYASMPTQMAMSATTGKAWDLAGVEP
DITVPMSEALSIAQDIV
>M4_homSap
ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ 0
0 IPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQVSRLLVEHIWKKIMHTDAMIIDMR 2
1 FNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVV 1
2 GERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTNLYLTIPTARSVGASDGSSWEGVGVTPHVVVPAEEALARAKEML 
 
>M1_bosTau 
LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP
RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY
DRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDFFLTVPVSRSLGPLGEGSQTWEGSG
VLPCVGTPAEQALEKALAVL
>M2_bosTau
LRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPE
EEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADASVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDAS
PVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHTVSLLETPEGGLALTVPVLTFIDNH
GECWLGGGVVPDAIVLAEEALDRAQEVL
>M3_bosTau
EFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVF
HSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFE
AEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVGSSALYASMPTQMAMSAS
TGEAWDLAGVEPDITVPMSVALSTARDI
>M4_bosTau
LRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAH
IPEDAKDRIPGIVPMQIPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQVSELLVEHVWKKIVHTDALIVDMRFNIGGPTSSISALCSYFFDE
GPPILLDKIYNRPNNSVSELWTLSQLEGERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTDLYLTIPTARSVGAAD
GSSWEGVGVVPDVAVPAEAALTRAQEML

>M1_monDom
IFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQ
HVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVGEFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVY
DRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDFFITVPVSRSLSPLGGGSQTWEGSG
VLPCVGIPAEQALGKALAIL
>M2_monDom
LRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEA
EEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSSVLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGP
IRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTFPLLQPPNGNLVLTVPILTFIDNNG
ECWLGGGVVPDAIVLAEEALDKAKEVL
>M3_monDom
EFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFH
SPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETVRAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEA
EPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVENSPLYASMPTQVAISPVT
GKAWDMAGVEPDVSVLSSEALMTTQGI
>M4_monDom
LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI
PEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR 2
1 FNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVA 1
2 GERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTDLYITIPTARSVGSGDKPSWEGVGVAPHVEVPADQALSKAKEM 

>M1_ornAna genome rife with frameshifts, dels, misassembly
SQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQ
QPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGAVLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVY
DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG
VLPCVAVPADRALDEALDIL
>M2_ornAna 
LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEE
EEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGA
LEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFLTLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVL
>M3_ornAna frag
FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA
>M4_ornAna
LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ 0
0 IPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQVSELMVEHVWKKIVHTDGLIIDMR 2
1 NIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIA 1
2 GERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHLYITIPTSRSVGSEDGSSWEGVGVTPHLVVPADVALSRAKDL 

>M1_taeGut Taeniopygia guttata
IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPK
QEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGAFLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYN
RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV
MPCVATEAEQALQKSLDIL
>M2_taeGut 
VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE
KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH
LFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFPLLQPGPGITRGLTITVPVITFIDNH
GESWMGGGVVPDAIVLAEDALEKAEEVLA
>M3_taeGut
FHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVF
HSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPE
PRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSSLYASIPSQVVLSPVTG
KVWSVSGVEPHITIQASEAMAAAQHI
>M4_taeGut
ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY
IPEHAKDSIPGIMPK 0
0 QIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSELLTQLSDLMIEHVWKKIFHTDALIIDLR 2
1 YNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLK 1
2 GERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQVDETNFYVVIPTSRSVTSADSTSWEGKGVSPHIETPAETALIKAKEM 

>M1_galGal 
IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ
EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR
PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM
PCVASEAEQALKKSLDIL
>M2_galGal 
AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM
PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL
FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG
ESWMGGGVVPDAIVLAEDALEKAEEVL
>M3_galGal 
LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH
SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP
RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK
VWSVSGAEPHITIQASEALAAAKHI
>M4_galGal 
ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI
PEHAKDSIPGILPK 0
0 QIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCELLTQVSDLLVEHVWKKIVHTDALIIDMR 2
1 YNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLR 1
2 GERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQVDDTNFYIIIPTARSVISAESASWEGKGVPPHMETPAVTALIKAKEVL 

>M1_anoCar
VLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPK
QRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGSFLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYN
RPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFYMTVPLSCSVSPLSGSGQSWEISGV
TPCVVISAEQALDKALAIL
>M2_anoCar
SLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTD
EKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATVLVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVH
FFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTFCILELGGGCGLLINVPVITLIDNHGE
YWLGGGVVPDSIVLADEALEKAREVLE
>M3_anoCar
EAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHS
HVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETSAMIVDMRYNTGSYSTAVPMFCSYFFDAEPQ
QHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSGKV
WSLSGIQPHVTTQSNEALASAQNII
>M4_anoCar
LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP
ERSKGRILGLVPMQ 0
0 QIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCELMSQVSELLVQHVWNKIVNTDALIIDMR 2
1 YNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLA 1
2 GKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHVDGTNLYITTPVSRSVFSVNDSWEGVGVSPHLDVSTDVALIKAKEML 

>M1_xenLae Xenopus laevis
LFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQT
EQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPHLVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPS
NTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITVPVSRSLSPLTGQSWEVSGVSPCVV
VNAKDALDKAQAIL
>M2_xenLae
AVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENI
PEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKLAPFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNS
TDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQLSVTVPIINFIDSNGDYWLGGGVVPDAIVLA
DEALDKAKEII
>M3_xenLae
FHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP
SPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGK
DIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSNLYVTIPNQVVISSVTGKVWSVSGVEPHVIIQAN
EAMNIAHRII
>M4_xenLae
KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM 0
0 QIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLLNQVSDLLVEHVWKKVVDQDALIIDMR 2
1 FNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLV 1
2 GKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAEPGESWEGKGVLPDLEISSETALLKAKEIL 

>M1_xenTro Xenopus tropicalis 89% xenLae
VFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRK
EQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPLLVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPS
NTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITVPVSRSISPLTGQSWEVAGVFPCVV
VNANNALNKAQGIL
>M2_xenTro
AVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENL
PDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKLGPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNS
TNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTRLSVTVPIMNFIDNNGDYWLGGGVVPDAIVLA
DEALDKAKEII
>M3_xenTro
FHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP
SAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGT
DIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESNLYVSIPNQVVISSVTGKVWSVSGVEPHVIAQAS
EAMNVAHHII
>M4_xenTro
KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ 0
0 IPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLLNQVSDLLVEHVWKKVVNQDALIIDMr 2
1 FNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLV 1
2 GNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAKPGESWEGKGVLPDLEITSETALMKAKEIL 

>M1_tetNig  
AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG
PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD
RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP
HVEVNAEDALATAIKIV
>M4_tetNig 
LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM 0
0 DYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLr 2
1 NNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELT 1
2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIPTVHSDTGAGPAWEGAGIAPHIPASAEAALGTAR 

>M1_takRub two domains:  23-324,326-612 plus upstream dup
AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV
PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD
RPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYITVPTAKSINPVTGSSWEITGVKP
DVEVNAEDALATAIKIV
>M4_takRub
LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM 0
0 DYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLR 2
1 NNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELT 1
2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIPTVHSDTAAGPAWEGTGITPHIPVSAEAALGTAK

>M1_gasAcu two domains: 27-317,323-612 no upstream dup
FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV
PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD
RPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVTVPTAKSVNPITGSTWEVTGVTP
NVEVNAEDALATAIKIV
>M4_gasAcu 
TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM 0
0 NYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDAMIVDLR 2
1 NNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELT 1
2 GTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSIPTVHSDTAAGPAWEGAGVAPHIPVPADAALETAKGIFKKHFAGQK* 0
 
>M1_oryLat two domains: 28-314,320-605 no upstream dup
SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV
PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL
NTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLPTAKSINPITGSSWEVTGVKPNVE
VNAEEALATALKII
>M4_oryLat
LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM 0
0 EYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVEHVWNKVLHTDAMIIDLR 2
1 NNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELT 1
2 GERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIPTVHSDTSAGPGWEGAGVAPHIPVAAGAALETAK

>M1_danRer upstream frag as well two domains: 22-322,324-609
FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP
AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT
ADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVTVPVAKSINPITGKSWEINGVAPDV
DVAAEDALDAAIAII
>M4_danRer
KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM 0
0 NPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIVEHVWNKVVDTDALIIDLr 2
1 NNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLT 1
2 GRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSIPISHSTAQGPSWEGAGIAPHIPVPAGAALDTAK

>M1x_takRub single upstream exon 42% frameshift no transcripts three domains: 23-323,325-615,618-907
TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN
SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK
TFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPVARSVSPITGKSWEVSGISPTVNVAAREALAKAQTFL
>M2x_takRub
AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV
LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER
GVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFITVPFINFLDNNGEYWLGGGVVPDAIVLAEEALE 
>M3x_takRub
FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV
LLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAAPRLVKSVWNKLVDTDSLIIDMRYNA
GGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVLYASIPNQVVTSAATGKL
WSISGVEPDVFAQARDALPVAQRII

>M1x_danRer  
FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA
LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT
NITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYITVPVARSVNPVTGQSWEVSGVAPSVTVNPKESIAKAKSLI
>M2x_danRer
SVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSD
DLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKIWQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLN
NISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPIINFIDVNGECWLGGGVVPDAIVLAEEALERAHEII
>M3x_danRer
FHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIV
EALFKVDVMSGNIGYLRFDMMEDIKVLQAINPEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVL
GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKII

>M2_calMil frag 2 domains 6-243,334-531
VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP
PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW
LGGGVVPDSIVLAEDTLERTKEII
>M3_calMil frag
GFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVI
VQANEAMTVALGIIN
>M4_calMil frag
LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ 0
0 QMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMITQVSELMAKHVWNVIASTSSLIVDLR 2
1 YNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI 1
2  * 0

>M1_petMar  exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217
KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV
SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY
RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP
SERALDKALEIL
>M2_petMar 
ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP
DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA
EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEII
>M3_petMar 
FHAKMASL
LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV
GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG
TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMV
>M4_petMar 
ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQ 0
0 MPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEALEHVAHLLVEHVWKKICDTEILIIDLR 2
1 YNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVL 1^2 GQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSA

>M3/4_braFlo Branchiostoma floridae Region: 9 exons 
0 MTRPSKVDIVFPIKPFTIPTAHEQVKGEGPVDINKNALCKSADEGHTHP 1
2 VSIAMAPTAYIVFVALVPTVLSVDWLDVVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQ 0
0 SWSQDDMSDSLRMEEGLTSELRNITGDETIK 0
0 VWDFGVYENTTQEPVPREFYNFSTFVDNFK 2
1 KNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESK 0
0 AIILDLRYNLGGDREGVVHWASFFFNATPSVPLSDVYYRDGVNQYWTLLE 0
0 VPGGIRFPDMPLYLLTSNRTSREAEEFAYAMQVVNRTTIIGETT 1
2 AGEEFTGMWFPIDQTDVHLLTRTNVVRNPITQDSWSGK 1
2 GVTPDIIVPSEKALTVALRKIQGSEDTKMAASSGNIEPPRWTVYLVFICTSIAILTYPTFM* 0
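
The per-species module records above carry their module label in the header (`M1` to `M4`, an `x` suffix on the extra teleost copies, and `M3/4` for amphioxus). Regrouping them by module class can be sketched in Python as below, assuming every header follows the `>Mn_species` pattern seen on this page. The name `group_by_module` is hypothetical.

```python
import re
from collections import defaultdict

# Header pattern assumed from the records on this page:
# ">M1_homSap", ">M2x_takRub", ">M3/4_braFlo ..."
_HEADER = re.compile(r">(M\d(?:/\d)?x?)_(\w+)")

def group_by_module(text):
    """Map each module label to the species codes that carry it, in file order."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = _HEADER.match(line)
        if match:
            groups[match.group(1)].append(match.group(2))
    return dict(groups)

# Toy input with shortened sequences, only to show the grouping.
toy = """>M1_homSap
GPTHPALT
>M2_homSap
TLRSALPG
>M1_bosTau
LFQPSLVL
>M3/4_braFlo Branchiostoma floridae Region: 9 exons
0 MTRPSK 1
>M1x_danRer
FQSALVLD
"""
for label, species in group_by_module(toy).items():
    print(label, species)
```

Only header lines are inspected, so plain-text headings and phase-annotated sequence lines between records do not affect the grouping.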

RBP3 proteins parsed by module class


>M1_homSap 
GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLGERYGADK
DVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL

>M1_bosTau 
LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPPRAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLR
SFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVYDRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDF

>M1_monDom
IFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQHVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVG
EFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVYDRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDF

>M1_ornAna genome rife with frameshifts, dels, misassembly
SQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQQPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGA
VLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVYDRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFF

>M1_galGal 
IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQEAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGA
FLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNRPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFY

>M1_taeGut Taeniopygia guttata
IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPKQEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGA
FLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYNRPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFY

>M1_anoCar
VLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPKQRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGS
FLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYNRPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFY

>M1_xenTro Xenopus tropicalis 89% xenLae
VFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRKEQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPL
LVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPSNTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITV

>M1_xenLae Xenopus laevis
LFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQTEQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPH
LVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPSNTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITV

>M1_danRer upstream frag as well two domains: 22-322,324-609
FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPPAMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLL
LEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRTADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVT

>M1_takRub two domains: 23-324,326-612 plus upstream dup
AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAVPPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPS
LIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYDRPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYI

>M1_gasAcu two domains: 27-317,323-612 no upstream dup
FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVVPPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLL
LDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYDRPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVT

>M1_tetNig 
AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAGPHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPA
LIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYDRPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYI

>M1_oryLat two domains: 28-314,320-605 no upstream dup
SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVVPPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLE
LVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPLNTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLP

>M1_petMar exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217
KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAVSYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLV
DTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYYRPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPA

>M1x_danRer 
FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPALHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLL
HNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPTNITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYIT

>M1x_takRub single upstream exon 42% frameshift no transcripts three domains: 23-323,325-615,618-907
TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRNSIKLDILDSDVGYLRIDRIIDEETLLKFGPLLREN
VWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGKTFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPV
  
>M2_homSap
TLRSALPGVVHCLQEVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSF
ADASVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVGEITAGNLLH

>M2_bosTau
LRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPEEEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADA
SVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDASPVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHT

>M2_monDom
LRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEAEEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSS
VLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTF

>M2_ornAna 
LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEEEEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCF
DEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGALEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGR

>M2_galGal 
AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEMPIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASV
LVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHLFTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTF

>M2_taeGut 
VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAEKPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVL
VKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVHLFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFP

>M2_anoCar
SLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTDEKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATV
LVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVHFFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTF

>M2_xenTro
AVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENLPDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKL
GPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNSTNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTR

>M2_xenLae
AVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENIPEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKL
APFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNSTDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQ

>M2_calMil frag 2 domains 6-243,334-531
VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEPPVGLFTVYNRLTNTTSHTTLPGVGQHVYG
SRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECWLGGGVVPDSIVLAEDTLERTKEII

>M2_petMar 
ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLPDDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIH
PTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTAEFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNN

>M2x_danRer
SVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSDDLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKI
WQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLNNISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPII

>M2x_takRub
AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQVLPHNTGYLRLDRFVRCSEGDKLEEIV
AEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSERGVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFIT
 
>M3_homSap
EFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESLASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETV
KAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCSYFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVG

>M3_bosTau
EFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVFHSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETV
KAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFEAEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVG

>M3_monDom
EFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFHSPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETV
RAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEAEPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVE

>M3_ornAna frag
FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA

>M3_galGal 
LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFHSHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQM
VWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEPRQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPS

>M3_taeGut
FHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVFHSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAI
GPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPEPRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSS

>M3_anoCar
EAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHSHVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETS
AMIVDMRYNTGSYSTAVPMFCSYFFDAEPQQHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSG

>M3_xenTro
FHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIPSAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAI
GPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGTDIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESN

>M3_xenLae
FHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIPSPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAI
GPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGKDIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSN

>M3_calMil frag
GFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVIVQANEAMTVALGIIN

>M3_petMar 
FHAKMASLLELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKH
VGPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDS

>M3/4_braFlo Branchiostoma floridae Region: 9 exons 
MTRPSKVDIVFPIKPFTIPTAHEQVKGEGPVDINKNALCKSADEGHTHPVSIAMAPTAYIVFVALVPTVLSVDWLDVVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQSWSQDDMSDSLR
MEEGLTSELRNITGDETIKVWDFGVYENTTQEPVPREFYNFSTFVDNFKKNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESKAIILDLRYNLGGDREGVVHWASFFFNA

>M3x_takRub
FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEVLLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAA
PRLVKSVWNKLVDTDSLIIDMRYNAGGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVL

>M3x_danRer
FHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIVEALFKVDVMSGNIGYLRFDMMEDIKVLQAIN
PEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVLGQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSIL
 
>M4_homSap
ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQIPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQ
VSRLLVEHIWKKIMHTDAMIIDMRFNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVVGERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDT

>M4_bosTau
LRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAHIPEDAKDRIPGIVPMQIPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQV
SELLVEHVWKKIVHTDALIVDMRFNIGGPTSSISALCSYFFDEGPPILLDKIYNRPNNSVSELWTLSQLEGERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTD

>M4_monDom
LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHIPEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQV
SDLLVEHVWKKVVHTDGMIIDMRFNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVAGERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTD

>M4_ornAna
LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQIPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQV
SELMVEHVWKKIVHTDGLIIDMRNIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIAGERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHL

>M4_galGal
ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYIPEHAKDSIPGILPKQIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCEL
LTQVSDLLVEHVWKKIVHTDALIIDMRYNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLRGERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQV

>M4_taeGut
ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLYIPEHAKDSIPGIMPKQIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSEL
LTQLSDLMIEHVWKKIFHTDALIIDLRYNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLKGERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQV

>M4_anoCar
LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIPERSKGRILGLVPMQQIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCEL
MSQVSELLVQHVWNKIVNTDALIIDMRYNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLAGKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHV

>M4_xenTro
KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQIPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLL
NQVSDLLVEHVWKKVVNQDALIIDMrFNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLVGNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVD

>M4_xenLae
KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPMQIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLL
NQVSDLLVEHVWKKVVDQDALIIDMRFNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLVGKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVD

>M4_danRer
KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPMNPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIV
EHVWNKVVDTDALIIDLrNNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLTGRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSI

>M4_takRub
LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPMDYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVE
HVWNKVVNTDALILDLRNNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELTGVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIP

>M4_tetNig
LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPMDYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVE
HVWNKVVNTDALILDLrNNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELTGVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIP

>M4_gasAcu
TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPMNYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIV
EHVWNKVVNTDAMIVDLRNNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELTGTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSI

>M4_oryLat
LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPMEYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVE
HVWNKVLHTDAMIIDLRNNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELTGERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIP

>M4_calMil frag
LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQQMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMI
TQVSELMAKHVWNVIASTSSLIVDLRYNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI

>M4_petMar
ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQMPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEAL
EHVAHLLVEHVWKKICDTEILIIDLRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRI