Opsin evolution: origins of opsins
Tomemerald (talk | contribs) |
Melanopsins are well-represented in all three bilateran clades -- the only sequenced genome to date lacking a melanopsin is the acornworm Saccoglossus. Many erratically named genes in arthropods and molluscs are actually simple orthologs at the bilateran ancestor to the first described melanopsin locus in Xenopus. A single melanopsin locus existed in ur-Bilatera. It is not currently possible to specify its syntenic relationships.
In ecdysozoa, the melanopsin locus duplicated early on with copies specializing to ultraviolet and long wavelengths but evidently remaining under the same strong and unusual selection in the third cytoplasmic loop attributable perhaps to protein-protein interaction involving the alpha protein specialized to Gq signaling. The ultraviolet melanopsin largely retains the ancestral intronation (based on deuterostome and lophotrochozoa outgroup sequences) whereas the longwave form largely lost these but acquired others. The duplication process was segmental rather than retropositional because a distal [[Opsin_evolution:_ancestral_introns#Ancestral_melanopsin_intronation|intron at EVTR 252 00]] is still shared (3' introns are the first to be lost in retropositioning).
These ecdysozoan melanopsin paralogs in turn underwent additional expansion -- in some cases [http://www.ncbi.nlm.nih.gov/pubmed/19534844 many duplications] -- depending on the specific lineage. These refine imaging vision and do not have auxiliary functions. These end-leaf specializations and their evolutionary adaptiveness are best pursued in the original journal articles -- the emphasis here is the broader sweep of opsin evolution.
Melanopsins in lophotrochozoans have a simpler history. Taxonomic sampling leaves something to be desired at this point. Like in ecdysozoans, they seem to provide all the opsins used in imaging vision. (Peropsins and rarely cilopsins are also found in this clade.)
Revision as of 13:14, 15 January 2010
See also: Curated Sequences | Tetrachromatic Ancestral Mammal | Ancestral Introns | Informative Indels | Update Blog
Introduction: the origin of opsins
The origin of the first opsins is a bit murky. Opsins are operationally defined here as 7-transmembrane proteins structurally and sequentially homologous to GPCR with (Schiff base) lysine in TM7 in alignment with K296 of bovine rhodopsin (or any established opsin).
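The operational definition above can be sketched as a simple test on a pairwise alignment: locate the alignment column that holds a chosen reference residue, then ask whether the query has lysine there. This is only a minimal sketch. The function names are hypothetical, and the toy fragments are made up for illustration (residue 6 of the toy reference stands in for K296 of bovine rhodopsin); they are not real opsin sequences.

```python
def column_of_reference_position(ref_aligned: str, position: int) -> int:
    """Return the 0-based alignment column of the 1-based reference residue."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":          # gaps do not count as reference residues
            seen += 1
            if seen == position:
                return col
    raise ValueError("reference is shorter than the requested position")


def has_schiff_base_lysine(ref_aligned: str, query_aligned: str, position: int) -> bool:
    """True if the query carries K in the column of the given reference residue."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    col = column_of_reference_position(ref_aligned, position)
    return query_aligned[col] == "K"


# Toy alignment (made-up fragments, NOT real sequences).
REF = "MA-PFSKTA"        # toy reference; its residue 6 is the lysine of interest
CANDIDATE_A = "MAAPF-KTA"  # lysine present in the aligned column
CANDIDATE_B = "MAAPFSITA"  # isoleucine in the aligned column

print(has_schiff_base_lysine(REF, CANDIDATE_A, 6))
print(has_schiff_base_lysine(REF, CANDIDATE_B, 6))
```

In practice the reference row would be bovine rhodopsin in a curated multiple alignment and the position would be 296; the quality of the alignment around TM7 decides whether the test means anything.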
This section moves forward in time from the parental gene content of the immediate ancestral genome (greatly facilitated by the new Trichoplax and Monosiga assemblies) that gave rise to the first opsin via gene duplication and neofunctionalization of one copy to photoreception. Subsequent sections work backwards in time, first coalescing separate gene trees of ciliary, melanopsic and other opsins to their respective ur-opsins and ultimately deducing properties of the crown group opsin.
The opsin origination event was not necessarily unique -- GPCR always retain many essential properties via their own evolutionary constraints and conceivably could have given rise to opsins at widely scattered intervals from rather different parental genes. In this type of history, the minimal gene tree containing all opsins is not 'monophyletic' but instead contains embedded non-opsin GPCR. Nothing prevents an established opsin from later giving rise to a gene duplicate that 'reverts' to a non-K296 GPCR. Conceivably the lysine could be retained even as the photobiological functionality is lost.
In the case of multiple such opsins surviving to the present day, branches will coalesce first to separate parental non-lysine GPCRs, which in turn eventually coalesce -- as all GPCR must do -- to a master parental gene.
A gene tree illustrates these hypothetical complexities at left:
-- opsins arose independently from GPCR at nodes 2,3 and 4
-- these opsins initially coalesce to 3 ancestral opsins
-- the first two groups of opsins coalesce to a parental gene at node 1 whose descendants include 7 GPCR
-- at node 5, an opsin has 'reverted' to a new GPCR, also a descendant of these opsins' parent gene
-- the full set of opsins coalesces at a master parental gene at node 0 with numerous non-separable GPCR descendants
This scenario -- the molecular version of whether 'vision' arose once vs multiple times -- can be ruled out for bilateran opsins (provided the relevant GPCR outgroups have left descendant genes) but still must be considered seriously in the case of cnidarian opsins and perhaps ctenophores and sponges as well. It appears today however that the entire bilateran opsin set forms a single branch exclusive of all non-K296 GPCR in the tree generated from the roughly 100,000 known GPCR. (For practical reasons, only near-opsin GPCR can be considered.)
Events 600 million years ago may seem hopelessly inaccessible and indeed many uncertainties will remain even after every relevant genome has been sequenced. However sequencing to date has been phylogenetically lopsided with far too little effort expended on early diverging non-model organisms with strategic tree positions. Yet comparative genomics has already provided substantial insights into certain aspects of opsin evolution:
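Whether a set of opsins forms "a single branch" exclusive of non-opsin GPCR amounts to a monophyly test on a rooted gene tree. The sketch below assumes a tree written as nested tuples with string leaves; the leaf names and both toy trees are hypothetical and only illustrate the single-origin versus multiple-origin cases.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def is_monophyletic(tree, group):
    """True if some subtree of the rooted tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


OPSINS = {"opsinA", "opsinB", "opsinC"}

# Hypothetical single-origin history: all opsins share one exclusive branch.
single_origin = ((("opsinA", "opsinB"), "opsinC"), ("gpcr1", "gpcr2"))

# Hypothetical multiple-origin history: non-opsin GPCR are embedded among opsins.
multiple_origin = ((("opsinA", "gpcr1"), "opsinB"), ("opsinC", "gpcr2"))

print(is_monophyletic(single_origin, OPSINS))
print(is_monophyletic(multiple_origin, OPSINS))
```

The test is only as good as the rooting and the node support of the tree it is applied to; a weakly supported deep node can flip the answer.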
- The first opsins were not associated with gross morphological structures (such as stalked eyes) that could possibly leave a fossil record (as in trilobites) -- key events took place strictly at the molecular subcellular level. Genomes of extant species (some more than others) are not exactly living fossils because the evolutionary accrual of mutations never ceases.
Cases exist of opsins demonstrably obliterated both by gradual pseudogenization and large scale deletions, confusing the record. Yet opsin genes and even their regulatory regions, when compared across the entire metazoan tree, can furnish reliable reconstructions of opsin content and even sequence at ancestral species divergence nodes.
- Opsins are definitely not the 'original' GPCR because these were already widely deployed at much earlier divergence nodes -- yeast, protozoa, choanoflagellates, trichoplax have GPCR but lack opsins. Nor are opsins the prototype for the 'rhodopsin class' R of the GRAFS classification of GPCR which again was established far earlier. Indeed, even the Ralpha subgroup of rhodopsin class GPCR was well-established prior to the first metazoan opsin.
- Opsins are thus latecomers, not pioneers, to a rapidly expanding paralogous gene clade within already full-featured GPCR. Judging by their closest extant blastp relatives among tens of thousands of GPCR at GenBank, opsins specifically arose as a gene duplication within the peptide receptor subgroup PEP. Indeed, certain of these proteins list opsins among their top ten best back-blast matches (ie have better matches than to almost all non-opsin GPCR). Note here that blast scores can be misleading because the 'floor' of percent identity is about 25% just due to universal conserved residues plus accidental matches.
- Note an 'intermediate' GPCR does not exist: either lysine is present at K296 or it isn't. Reconstructing ancestral states from the best contemporary set of GPCR proteins lacking K296 cannot produce a lysine there by any rational methodology. The 20 encoded amino acids can be clustered into subgroups (eg by polarity or bulk) but ultimately form an unorderable discrete set not furnishing continuum transitional states.
- Most likely the parental gene had several introns and the original opsins inherited this pattern (ie the duplication was segmental rather than retroprocessional as in some cnidarian opsins). The history of introns within opsins is already complex and becomes quite problematic within the enveloping GPCR gene family. Opsins (with the exception of a fragmentary sea urchin melanopsin) lack the ubiquitous phase 21 intron breaking the DRY motif arginine.
- Intracellular targeting of early opsins was likely to cytoplasmic or endoplasmic reticulum membranes as isolated monomers, with limited microvillar or especially ciliary specialization (to motile larva) also plausible. These opsins were the first eyes to the world but only in the sense of indicating the intensity (and later directionality) of sunlight striking the cell utilizing already refined GPCR second messenger signal transduction.
- Opsin creation does not imply saltatory evolution because the basics had been established far earlier -- the 7-transmembrane helical structure with fixed topology, the TM1-TM2 salt bridge N55-D83 that could serve as initial counterion, the DRY ionic lock, the GWS.Y..E.....C..DW........SY region of EC2, the NPxxY terminal helix, the conformational shift upon binding of ligand that could trigger signaling, the Galpha protein binding site needed for the signaling cascade, and an arrestin-type mechanism for signaling termination. The earliest opsins contained and continued all of these features from the get-go, adapting them over the course of time to various photoreceptive functions.
- Opsins are unique among GPCR in several respects: they catalyze a mild in-situ enzymatic reaction -- cis-trans photoisomerization -- that furnishes the signaling agonist. (This reaction also occurs thermally without enzyme but so does carbon dioxide dissolution in water yet humans have 15 carbonic anhydrases). Cis-retinal, being lipid soluble, does not diffuse through the extracellular milieu to reach its receptor binding site as in all other GPCR. Instead it is covalently bound to a lysine deeply internal to TM7, again unprecedented among GPCR (though other internal charged amino acids can occur, notably the D83 aspartate salt bridge and K90 of ultraviolet opsins).
- Opsins did not arise from flavin-based cryptochromes, mechanistically different photoreceptors that evolved much earlier to establish circadian rhythm and eventually magneto-sensing. Cryptochromes are homologous to DNA photolyase repair enzymes, not GPCR.
- Although literature searches turn up scattered assertions about 'opsins' in species such as Chlamydomonas ('chlamyopsin' Z48968) and 'volvoxopsin', not to mention bacterial 'rhodopsins', these amount to abusive terminological metaphors, unwelcome additions to an already complex gene family. These proteins do not have seven transmembrane helices in the same arrangement as GPCR nor possess the slightest sequence homology at deeply conserved GPCR residues, so represent independent evolution of photobiology (along the lines of bat and butterfly wings representing independent origins of flying).
- Conceivably forerunners of opsins bound a related chromophore non-covalently, perhaps an all-trans retinoid in the manner of peropsins. Retinoic acid is sometimes proposed as ancestral ligand but retinoic acid receptors (RAR and RXR) are non-GPCR nuclear hormone receptors that bind all trans-RA or 9-cis-RA but not 13-cis-RA. Furthermore, the GPCR receptors inducible by retinoic acid -- RAIG1 proteins (GPRC5C etc) -- belong elsewhere in the GRAFS classification, have no particular affiliation with opsins and again do not bind retinoids themselves. The fact that pseudo-opsin chromophores are similar retinoids may be coincidence arising from the ubiquity of metabolic carotenoids (availability) and the restricted number of biochemicals (isoprenoids but not amino acids) with tunable absorption in the visual range (suitability).
- A very recent experiment has investigated the consequences of K296G in conjunction with replacement of retinal with its ethylamine Schiff base (which mimics the previous situation but with non-covalently bound chromophore). This had no effect on site specificity of photoisomerization nor quantum yield but greatly reduced activation, suggesting the K296 covalent bond transmits structural changes within the protein, with the bond retaining the low-affinity agonist enhancing the duration of activation. This both suggests an intermediate evolutionary stage of inefficient but region-specific photoisomerization prior to the acquisition of K296 and raises the issue of whether some early opsins acquired a distinct enhancement mechanism not involving K296.
- In principle, GPCRs could continue to spawn new clades of opsins from time to time. However, they did not in bilaterans. That is, no gene tree of a bilateran opsin coalesces with a GPCR gene later than the bilateran common ancestor. All bilateran opsins are descended from one of five opsin classes present in the ur-bilateran. Indeed the gene tree comprising all opsins excludes all GPCR, consistent with a unique K296 origination event. However, it remains possible that some cnidarian or ctenophoran opsins arose from a second wing of GPCR with no representative of this opsin surviving in bilaterans.
Two genes in separate species are by definition orthologous only when descended vertically from a single gene in their last common ancestor. It appears that all bilateran opsins -- after accounting for later clade-specific expansions and losses -- are orthologous to either a cilopsin, melanopsin, peropsin, rgropsin, or neuropsin at the bilateran common ancestor. ('Rhabdomeric' protostome opsins do not define a separate class but instead coalesce with vertebrate melanopsins.)
These 5 opsin classes appear not fully coalesced even at the last common ancestor of bilaterans with cnidarians -- while sequence data is woefully limited today in early taxa, it seems both melanopsin and cilopsin classes existed in this ancestor, perhaps in addition to other opsin classes no longer represented in bilaterans. Conversely, peropsins have been retained in lophotrochozoan, ecdysozoan, and deuterostome lineages but not in any cnidarian sequence to date. Neuropsins survived solely in chordates, whereas rgropsins are even more restricted to vertebrates, even though they could not have originated there. These latter genes are conceptual analogs of cnidarian-only opsin classes.
All opsins are homologous so any given pair is ultimately orthologous at some earlier common ancestor -- but which one? The species tree itself is confused here on sistering vs independent nodes at cnidarian/ctenophore. The single ctenophore opsin available -- regrettably just a distal fragment -- is difficult to classify. The fact that its best blast matches cluster about equally well with melanopsins and cilopsins (to the exclusion of other bilateran classes) suggests that their merger is not far off.
The opsin gene tree can largely be worked out and coordinated with species tree divergences. Despite many efforts at this, some deeper topology remains problematic. It appears from sequence clustering, indel analysis, and especially intron conservation that ((peropsin, rgropsin),neuropsin) is a valid subgroup. Further, this assemblage associates more closely with cilopsins, leaving a final topology to be superimposed on the phylogenetic tree:
gene tree: ((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);
species tree: (((((((((echinoderm,acornworm),amphioxus),tunicate),vertebrate),((chelicerate,(crustacean,insect)),(mollusc,annelid))),cnidaria),ctenophore),trichoplax),sponge);
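Both topologies are written in Newick parenthesis notation, so they can be read mechanically. The following is a minimal sketch of a parser for the simple case used here (leaf names only, no branch lengths or internal labels); `parse_newick` and `leaf_names` are hypothetical helper names, not part of any library, and the gene tree string is copied from the text.

```python
def parse_newick(text):
    """Parse a Newick string with leaf names only into nested tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def peek():
        # Current character, or "" once the end of the string is reached.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # consume ")"
            return tuple(children)
        start = pos
        while peek() not in ("", ",", "(", ")"):
            pos += 1
        if start == pos:
            raise ValueError(f"expected a leaf name at position {pos}")
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return tree


def leaf_names(tree):
    """List leaf names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaf_names(child))
    return names


GENE_TREE = "((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);"
gene_tree = parse_newick(GENE_TREE)
print(leaf_names(gene_tree))
```

A real analysis would more likely rely on an established phylogenetics library; the hand-written parser is used here only to keep the example self-contained.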
Nearest neighbors of opsins among GPCR
The immediate outgroup of opsins lies among a vast number of GPCR. The reference collection defines a close-in subset utilizing human GPCR which have the best prospects for determined ligand. Note blast score order is not ideal because scores are squeezed between a 'floor' of ~23% identity attributable to universally conserved residues plus accidental matching, and a 'ceiling' of ~30% to remain non-opsin.
None of these GPCR represent the actual parental gene to opsin because they have themselves evolved forward some 600 million years from the putative opsin creation event. Conceivably one or more is also directly descended from it. The consensus line of the alignment below perhaps represents a better approximation to the desired ancestral sequence. It is difficult to reconstruct an ancestral sequence accurately because non-adjacent opsin residues co-evolve, creating algorithmic errors in methods that neglect this. Some co-evolving residues are suggested by structural studies but not all relationships can be described.
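The narrow window between floor and ceiling can be made concrete with a percent identity calculation over an existing pairwise alignment. This is a rough sketch: the function names are hypothetical, the default thresholds of 23 and 30 are simply the approximate figures quoted above, and identity is computed over columns where neither sequence has a gap, which is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def in_near_opsin_window(identity: float, floor: float = 23.0, ceiling: float = 30.0) -> bool:
    """True if identity falls between the background floor and the non-opsin ceiling."""
    return floor < identity <= ceiling


# Toy strings (made up) just to exercise the arithmetic.
print(percent_identity("ACDE", "ACDF"))
print(in_near_opsin_window(26.5))
```

With only about seven percentage points separating noise from opsin-level similarity, small alignment differences can reorder candidates, which is why rank order alone is treated cautiously in the text.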
Opsins are not the 'original' GPCR (which are trackable, barely, to yeast) even for the 'rhodopsin' group R (or even its Ralpha subgroup) within the GRAFS classification but rather form a specialized set that arose later as the rhodopsin gene class (which contains the AMIN cluster [adrenalin, serotonin, dopamine, and histamine receptors], MECA branch [peptide and lipid binding receptors] in addition to opsins) underwent significant expansions.
This expansion of the Ralpha class had largely taken place in the last common metazoan ancestor shared with Monosiga and Trichoplax (which do not contain opsins), implying the ancestral metazoan lacked them as well. The orphan receptors GPR21 and GPR52 form the immediate outgroup (within the 800 human GPCR) in an oft-cited 2003 study. These have isoleucine at K296; their ligands are still not known as of Dec 2009. Conservation is high throughout deuterostomes; blast matches are restricted within opsins to molluscan melanopsins suggesting Gq signaling.
The melatonin receptor MTNR1A emerges as a close relative to opsins. Curiously it plays a key role in circadian rhythms and so needs to coordinate with opsin photosensors. N-acetyl-5-methoxytryptamine, the ligand, bears no obvious relationship to cis-retinal however and K296 is lacking, making an immediate parent gene relationship problematic.
Another clue to the origin of opsins might be provided by examining GPCR intron positions and phases to see if shared with ancient introns in opsins. Many non-olfactory GPCR with sequence similarity to opsins have no introns or just one, suggesting the genes duplicated by retroprocessing, perhaps acquiring an intron at unrelated position later. UROPS2 has an intron but it does not seem to correspond to one in any opsin. Cnidarian opsins are either intronless (Nematostella) or undetermined (just known from processed transcripts).
Closeness in the GRAFS tree does not fully accord with closeness of blastp hit and relatedness of diagnostic regions, suggesting (unsurprisingly) that its topology is slightly wrong at some internal nodes. On average rank in blastp top scores (or by average 5 best blast expectation values), as representatives of all opsin classes are aligned with the GPCR below, the highest scoring ones by far are the Trichoplax opsins followed by various peptide receptors:
Rank  Gene          Exp   Exons  Receptor      Ligand
4.2   UROPS2_triAd  e-29  2      orphan        histamine? (HRH2: best human non-opsin blast match)
5.4   UROPS1_triAd  e-28  1      orphan        peptide? (SSTR1: best human non-opsin blast match)
5.6   SSTR1_homSap  e-26  1      somatostatin  peptide
7.2   TACR2_homSap  e-25  5      tachykinin    peptide
8.1   GALR1_homSap  e-24  3      galanin       peptide
8.9   MTNR1A_homSa  e-23  2      melatonin     N-acetyl-5-methoxytryptamine
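The Rank column is an average over queries from the different opsin classes. One way such an average might be computed is sketched below, assuming each query yields an ordered hit list (best first). The query labels, gene names and orderings are invented placeholders rather than the actual blast results, and restricting the average to genes found by every query is a design choice of this sketch.

```python
from statistics import mean


def average_ranks(rankings):
    """Return (average rank, gene) pairs, best first.

    rankings maps a query label to its hit list ordered best-first.
    Only genes present in every hit list are kept, so averages stay comparable.
    """
    positions = {}
    for hits in rankings.values():
        for rank, gene in enumerate(hits, start=1):
            positions.setdefault(gene, []).append(rank)
    n_queries = len(rankings)
    scored = [(mean(ranks), gene) for gene, ranks in positions.items()
              if len(ranks) == n_queries]
    return sorted(scored)


# Invented placeholder hit lists, one per opsin-class query.
toy_rankings = {
    "cilopsin_query":   ["geneA", "geneB", "geneC"],
    "melanopsin_query": ["geneB", "geneA", "geneC"],
    "peropsin_query":   ["geneA", "geneC", "geneB"],
}

for avg, gene in average_ranks(toy_rankings):
    print(f"{avg:.1f}  {gene}")
```

Averaging ranks rather than raw scores damps the effect of one query class matching a candidate unusually well.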
The biological literature contains various scattered claims about 'opsins' in species such as Chlamydomonas (chlamyopsin Z48968), not to mention bacterial 'rhodopsins'. These do not have the seven transmembrane helices in the same arrangement as GPCR nor significant sequence homology and may represent independent evolution of photobiology (just as bat and butterfly wings represent independent origins of flying).
Trichoplax has two very curious 7-transmembrane proteins that emerge as its best genomic match to opsin queries. While lacking K296 for a Schiff base, their best back-blast to all of GenBank returns almost entirely opsins (rather than nest within other GPCR receptors). While Trichoplax is 600+ million years removed from the common ancestor with eumetazoa, these genes could still offer clues about the immediate GPCR ancestor to opsins.
These Trichoplax genes retain uncanny similarities to opsins in otherwise rapidly changing regions. These two genes are not plausibly derived from an opsin expansion with subsequent loss of K296 because Trichoplax and other early diverging lineages lack opsins. Perhaps these genes should be considered opsins in spite of lacking K296. Recall here Schiff base formation dramatically redshifts the absorption spectrum, yet non-covalently bound retinal still has significant absorption at optical wavelengths which might be further tuned by Trichoplax binding pocket residues.
Conversely, several cnidarian species exhibit far too many K296-type GPCR for their apparent photoreceptive needs and accompanying lack of overt photobiological anatomical specializations. These may represent divergent gene duplications of valid opsins that have evolved into some other type of GPCR; alternatively they could represent a lineage of pre-opsin GPCR that developed K296 but never acquired an opsinlike light-sensing role nor served as parental gene to bona fide opsins.
Together the Trichoplax pre-opsins lacking K296 and putative cnidarian non-opsins possessing K296 push the opsin-defining envelope to its limits. Given the immense time span separating contemporary genes from ancestral, we can anticipate their computed nesting arrangement within the opsin gene tree relative to a close-in GPCR outgroup with known non-retinal ligands will lack convincing statistical support at the critical nodes. The best way forward is additional sequencing and experimentation with cubomedusae, ctenophores and sponges because these seem to contain conventional opsins that can clarify the positions of the outliers.
In summary, the parental GPCR that gave rise to the first opsin can be localized fairly reliably to the PEP subgroup of R class GPCR within GRAFS but no particular gene there stands out as the definitive pre-opsin. The time span involved is immense and this gene class has experienced much churning through expansion and contraction cycles, as well as moderately rapid pointwise residue change.
An independent approach to opsin origins might compare intron positions and phases of candidate parental GPCR to those of opsins. The ancestral introns of opsins are easily reconstructed, reducing noise and potential coincidence, but that program is quite difficult to extend to GPCR. Too often, GPCR with relevant sequence similarity to opsins have no introns or just one, suggesting gene duplication by retroprocessing followed by a later intron acquisition at non-historic position followed by more rounds of duplication (as seen in sulfatases).
UROPS2 of trichoplax has one intron but unfortunately it does not correspond to any in opsins. Cnidarian opsins to date have been either intronless (Nematostella) or not determined (known only from processed transcripts). Thus the intronic approach to parental GPCR awaits more extensive sequencing of early genomes.
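The intron comparison described above reduces to mapping each gene's intron positions onto shared alignment columns and intersecting the resulting (column, phase) pairs. The sketch below assumes introns are given as (1-based residue index, phase) in each protein's own coordinates; the helper names, toy sequences and intron coordinates are all hypothetical.

```python
def to_alignment_columns(aligned_seq, introns):
    """Map (residue index, phase) introns to (alignment column, phase) pairs."""
    residue_to_column = {}
    residue = 0
    for column, ch in enumerate(aligned_seq):
        if ch != "-":
            residue += 1
            residue_to_column[residue] = column
    return {(residue_to_column[pos], phase) for pos, phase in introns}


def shared_introns(aligned_a, introns_a, aligned_b, introns_b):
    """Introns at the same alignment column AND the same phase in both genes."""
    a = to_alignment_columns(aligned_a, introns_a)
    b = to_alignment_columns(aligned_b, introns_b)
    return sorted(a & b)


# Toy aligned proteins and intron coordinates (made up for illustration).
ALIGNED_A = "MKT-AILV"
INTRONS_A = [(3, 1), (6, 0)]   # residue 3 phase 1, residue 6 phase 0
ALIGNED_B = "M-TGAILV"
INTRONS_B = [(2, 1), (4, 2)]   # residue 2 phase 1, residue 4 phase 2

print(shared_introns(ALIGNED_A, INTRONS_A, ALIGNED_B, INTRONS_B))
```

Requiring an exact column match is strict; where the alignment is uncertain, a tolerance of a few columns might be more appropriate, at the cost of more coincidental matches.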
A third approach to opsin origins considers informative indels and diagnostic residues in the set of all opsins expanded by select GPCR. While perhaps subject to more homoplasy than introns, regions such as TM2 and the extracellular loop EC2 do illuminate issues such as ancestral length and define signature residues of opsin classes.
Origin of contemporary opsin classes
Traceback of opsins can begin by selecting certain 'index sequences'. It ultimately does not matter which or how many, but for historical reasons bovine rhodopsin, frog melanopsin, human peropsin, mouse neuropsin and so forth might be used.
Each index sequence is then built out to a larger class of orthologs in nearby species using flanking gene synteny to confirm best-blast. Lineage-specific gene duplications with close affinities (eg from recent clade-specific paralogous expansions such as teleost fish whole genome duplications) are added. Eventually the set collides with an expanding set of another index sequence and all bilateran opsin sequences fall into one of five clusters.
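The best-blast step of this build-out is commonly approximated by a reciprocal best hit test between two species. The sketch below assumes blast bit scores are already available as nested dictionaries; gene names and scores are invented, the function names are hypothetical, and the flanking-gene synteny check mentioned above is not modeled.

```python
def best_hit(scores, query):
    """Return the highest-scoring subject for a query, or None if it has no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (gene_a, gene_b) that are each other's best hit between two species."""
    pairs = []
    for gene_a in sorted(a_vs_b):
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs


# Invented scores: species 1 carries a lineage-specific duplicate (dup_sp1).
SP1_VS_SP2 = {
    "rho_sp1": {"rho_sp2": 500, "mel_sp2": 120},
    "mel_sp1": {"mel_sp2": 450, "rho_sp2": 110},
    "dup_sp1": {"rho_sp2": 300},
}
SP2_VS_SP1 = {
    "rho_sp2": {"rho_sp1": 500, "dup_sp1": 300, "mel_sp1": 110},
    "mel_sp2": {"mel_sp1": 450, "rho_sp1": 120},
}

print(reciprocal_best_hits(SP1_VS_SP2, SP2_VS_SP1))
```

The duplicate `dup_sp1` fails the reciprocal test, which is the situation where lineage-specific paralogs have to be added to a cluster by hand, as the paragraph describes.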
Ciliary opsins (generated from RHO1) form a cohesive gene clade called here cilopsins that does not coalesce with melanopsins, peropsins, neuropsins, or rgropsins within vertebrates, deuterostomes, or even bilatera. The index gene picks up rod and cone imaging opsins, pinopsin, parapinopsin, parietopsin, very ancient opsin, encephalopsin, teleost multiple tissue, and certain ciliary opsins from protostomes.
Hardly a vertebrate innovation, ciliary opsins appear in early deuterostomes lacking imaging eyes, in both branches of protostomes (initially bee and ragworm), in pre-bilateran cnidarians and quite possibly ctenophores. Sponges are still uncertain because of a 5 year wait on the assembly but the very earliest metazoan genomes (Monosiga and Trichoplax) definitely lack ciliary, indeed any K296 GPCR. If those genomes are representative, then ciliary opsins emerged on the post-Trichoplax stem. Certain cnidarian opsins -- but not all -- already exhibit certain sequence specializations of ciliary opsins.
Ciliary opsins have been totally lost on numerous occasions in numerous lineages, notably 'model' organisms like drosophila and worse nematodes, which have lost all opsins. Hemichordates and non-annelid lophotrochozoans have lost ciliary opsins independently. Other explanations (such as multiple re-emergences of ciliary-like opsins from GPCR or distantly related opsins) are manifestly impossible given intron structure alone.
The earliest deuterostome ciliary ur-opsin is best represented by the TMT class of opsins, in particular by the TMT1 subgroup that has retained important ancestral characteristics in the diagnostic TM2 region. Sequential expansion of TMT1 gave rise to all the other ciliary opsins found in vertebrates, including all rod and cone opsins. This fundamental gene, though retained through amphibian and amniote, curiously was eventually lost in birds and mammals. Transcripts are often annotated as testis libraries suggesting a function in gamete release timing. Its immediate descendant gene TMT2, whose subfunctionalization is unknown, is retained in monotremes and marsupials but lost in all placentals. The best experimental organism for studying TMT1 is probably Xenopus.
Melanopsins, discovered in 1998 in frog lateral line dermal melanophores (as well as hypothalamus, iris, and retinal horizontal cells) form another ancient opsin class. Melanopsins include rhabdomeric arthropod opsins (which have an unnecessary dual nomenclature -- they're melanopsins by multiple independent criteria) and lophotrochozoan melanopsins (which other than scallop, squid and octopus genes lie undocumented within genome projects). One cnidarian opsin from coral classifies as a melanopsin yet closely shares other properties with cnidarian opsins that don't.
Peropsins are a third major class of opsins in the sense of broad but not universal retention. Expanded in deuterostomes, they occur rarely in arthropods but are quite important in lophotrochozoa. Peropsins are the only opsin class retained in hemichordates. Nothing resembling them has been retained in cnidaria, reflecting loss in the two genomes available because their coalescence with cilopsins lies much further in the past.
Neuropsins are a much expanded but little studied group of opsins restricted to living deuterostomes though they did not originate there (unless divergence from another opsin class was exceedingly abrupt and then immensely slowed). The neuropsin expansion to 4 genes in the lamprey stem continued unchanged to the amniote ancestor but subsequently contracted to 2 in monotremes and only 1 in marsupials and placentals.
Rgropsins constitute another little-studied group represented today only beyond the tunicate-vertebrate last common ancestor. Again these opsins must have originated far earlier in pre-bilaterans because their ancestral reconstructed sequence is still far from coalescence with other ancestral opsin classes.
Conceivably rgropsins and neuropsins are retained in other bilatera but diverged to the point of unrecognizability. This scenario can be rejected because analytic methods of complete genomes are sufficiently sensitive to locate all GPCR and screen them for K296. This reasoning is applicable to peropsins as well -- they have definitely been lost in all insect and molluscan genomes though fortunately retained in two chelicerate arachnids.
Peropsins, neuropsins and rgropsins are unified by their intronation, sharing three ancestral introns despite numerous differences. This indicates -- given the slow rate of intron gain and loss in most metazoan clades -- that they share deep roots in pre-bilatera, implying near total loss of neuropsins and rgropsins in invertebrates. None of these introns are shared with cilopsins or melanopsins or for that matter known GPCR.
Opsins, plagued again and again by losses on stem lineages, illustrate why ancestral node-based sequencing is a far better strategy than terminal speciation-based. If sequencing effort is proportional to contemporary species numbers, yielding millions of opsins from insects (respectively ray-finned fish), opsin evolution will never be illuminated. Each major node requires equal sequencing intensity. Thus onychophoran and tardigrade opsins have a far greater priority than more butterflies or cichlid fish.
Cnidopsins are a taxonomically based collection of opsins that do not all classify satisfactorily within the bilateran opsin system. Much more intensive sampling is needed here because neither Hydra nor Nematostella has remotely the cubomedusan repertoire. Ctenophores currently have a single unexpected opsin gene obtained accidentally in a shotgun project -- obviously much greater sequencing and structural effort is warranted given their currently basal position within the opsin-containing species.
In hindsight, large scale loss of opsin classes should not come as a surprise -- humans lost 12 of 20 opsin loci that otherwise persisted from lamprey stem to amniote ancestor to living frogs, lizards and birds. This is characteristic of GPCR evolution overall (notably the olfactory subgenome): collapse of a large gene clade, followed by later massive expansion but retention to contemporary species only in scattered lineages.
This can result in two species having similar number of GPCR genes but a very poor correspondence between them. This pattern of gene churning (cycles of explosive expansion followed by mass die-offs) differs dramatically from gene histories of ribosomal proteins or catabolic enzymes (eg homogenistate dioxygenase) retained in all species as single copies (ancient birth, never death). Other genes like globins exhibit moderate expansion to several copies accompanying a trend to organismal specializational complexity with little evidence of contraction (occasional births, rarely deaths). Still other gene classes, for example selenoproteins, seem headed systemically for oblivion (births but trending to extinction) in the sense of the cysteine replacement rachet.
Once over this conceptual hurdle, cycles of expansion and contraction in the GPCR gene family can be repeatedly invoked on various branches of the phylogenetic tree to explain many aspects of opsin classification. After several such cycles, the utility of terms such as ortholog and paralog is stretched to the breaking point -- words become inadequate to describe the gene tree.
Vertebrate ciliary opsins
Vertebrates have a very peculiar history of deuterostome ancestors (echinoderms, hemichordates, cephalochordates, and urochordates) that, despite retaining opsins, never had nor developed imaging vision, followed by rapid expansion and divergence of the opsin gene family to contemporary gene numbers and four-color, oil drop-enhanced imaging vision by the time of divergence from jawed vertebrates.
This cannot be attributed to supposed 1R and 2R whole genome expansions -- indeed the observed lack of sistering in the ciliary opsin gene tree actively conflicts with such a scenario (if all 'ohnlogs' are lost, whole genome duplication is irrelevant). The lamprey genome itself is a completely unsatisfactory assembly in terms of small contig size (almost never allowing adjacent gene determination) and lacking coverage of genes that must be present but are not. The assembly is not in fact high coverage (6x claimed) because that calculation erroneously assumes genome size similar to mammal (3 Gbp).
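The coverage argument is simple arithmetic: a coverage figure is total sequenced bases divided by genome size, so a figure quoted against an assumed genome size has to be rescaled once a different size is adopted. The sketch below shows that rescaling. The function name and the 4.5 Gbp value are hypothetical placeholders chosen only to make the arithmetic visible; no actual lamprey genome size is asserted here.

```python
def effective_coverage(claimed_coverage, assumed_size_bp, actual_size_bp):
    """Rescale a claimed fold-coverage to a different genome size."""
    if actual_size_bp <= 0:
        raise ValueError("genome size must be positive")
    total_bases = claimed_coverage * assumed_size_bp  # bases actually sequenced
    return total_bases / actual_size_bp

# 6x claimed against an assumed 3 Gbp genome, rescaled to a
# hypothetical 4.5 Gbp genome (illustrative number only).
print(effective_coverage(6, 3_000_000_000, 4_500_000_000))  # 4.0
```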
The amphioxus assembly, while far better, suffers from numerous paired end read misassemblies and lack of followup. The first two assemblies of tunicates, though deeply flawed, established that their genome is extremely diverged by any measure. Thus tunicates serve as a poor but essential immediate vertebrate outgroup. Given these facts (and a comparable human coding gene count of 20,000), one of the great mysteries of the genomic era is how hundreds of journal claims for 1R and 2R survived peer review.
It is sometimes argued that effective predation on protostomes was the driving force for the evolution of imaging vision. If so, this was lamprey-style suction and boring because from the opsin standpoint, eyes had fully matured well before the appearance of any jaws that could seize prey. Since the fossil record establishes protostomes developed imaging vision far earlier in the Cambrian than deuterostomes, perhaps evading arthropod predation provided the selective advantage.
The next 500 million years of vertebrate evolution did not bring any further innovation (other than clade specific specializations) and some major clades, notably mammals, backslid to the point where their color vision today is far inferior to that of the common ancestor with lamprey. This period of relative stasis in opsin gene number has a long list of narrow exceptions. These exceptions have perhaps drawn too much experimental attention away from the broader sweep of opsin evolution.
Examples of gains include sharks (extra copy of LWS), zebrafish (greatly expanded RHO2 repertoire and retained RHO1 retrogene), primates (tandem LWS copy) and so forth, with the most sweeping gain being whole genome duplication at some point after teleost fish divergence. A duplicated gene can only be retained for an adaptive reason.
Examples of losses include dolphin (SWS1), chickens (TMT1), cave fish and blind mole rat (opsin pseudogenes), platypus (SWS1) and so forth, with the most remarkable loss being the massive attrition era in mammals (60% of all opsin loci lost in placentals, foreseen by G Wall in 1942). Opsin gene loss is generally not adaptive ('less is more') but simply neutral drift ('use it or lose it').
It follows from both standard tree analysis and consideration of diagnostic regions that the very earliest ciliary opsins in deuterostomes were of TMT1 class. Only these opsins continue the ancestral pattern in the N D P C iron triangle centered in the second transmembrane helix, which was established already in pre-cnidaria, indeed predecessor GPCR, and continues today in most other opsin classes, including ecdysozoan and cnidaria ciliary opsins.
Encephalopsin is not basal because of two key derived features: T/S/N substituted for the immensely conserved diagnostic proline kink in TM2 with its profound impact on signaling and the carboxy terminal VxPx* ciliary targeting motif. These are features of all other imaging and near-imaging ciliary opsins, unlike basal TMT genes. Encephalopsin thus served as immediate parental gene for the cascade of gene duplications giving rise to these specialized opsins.
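The two derived features named above can be screened for mechanically. The following is a minimal sketch, assuming the VxPx* motif means valine, any residue, proline, any residue, then the stop codon, and assuming the residue at the TM2 kink position has already been extracted from an alignment. The function names and the toy peptide strings are hypothetical and are not taken from the reference collection.

```python
import re

# V, any residue, P, any residue, anchored at the carboxy terminus.
VXPX_CTERM = re.compile(r"V.P.\Z")

def has_vxpx_terminus(protein):
    """True if the protein ends in VxPx; a trailing '*' stop symbol is ignored."""
    return VXPX_CTERM.search(protein.rstrip("*")) is not None

def kink_state(residue):
    """Classify the residue found at the TM2 kink position."""
    if residue == "P":
        return "ancestral proline"
    if residue in "TSN":
        return "derived (T/S/N)"
    return "other"

# Toy C-terminal fragments, invented for illustration only.
print(has_vxpx_terminus("MNGTEGVAPA*"))   # ends in VAPA then stop
print(has_vxpx_terminus("MNGTEGVAPAK"))   # extra residue after the motif
print(kink_state("T"), "|", kink_state("P"))
```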
Encephalopsin has possible early counterparts in two amphioxus opsins (Amphiop4 and Amphiop5) also exhibiting threonine substitution of this proline and close but not definitive blast clustering but lacking any sign of VxPx* motif. However support otherwise is ambiguous, perhaps due to rapid divergence in amphioxus. In particular, no flanking gene synteny occurs between these amphioxus opsins and any vertebrate (or invertebrate) opsin. Vertebrate encephalopsins themselves provide no clue to their parent gene through flanking genes or intron structure.
Tunicate genomes have not retained any gene resembling encephalopsin but instead contain a parapinopsin-like gene with valine for proline and a full VAPA* motif in C. savignyi. No echinoderm or acornworm opsin bears any particular affinity to encephalopsin. Protostomal ciliary opsins most closely align with TMT and share most of its diagnostic residues, though again no synteny remains.
So when did the encephalopsin locus arise? That might be clarified by sequencing hagfish and additional early deuterostomes, but for now it appears that a distinctive encephalopsin locus was formed by an intron-preserving segmental genome duplication in the cephalochordate stem that subsequently lost the ancestral proline. This gene diverged in quite different ways in amphioxus, tunicates, and vertebrates. In tunicates, the locus duplicated again with one copy specializing to today's parapinopsin and the other being lost (along with all TMT loci). In vertebrates, both copies were retained, one descending to contemporary encephalopsins and the other to parapinopsins and their various imaging and near-imaging ciliary opsins.
Through cascading segmental gene duplications, the TMT1 ciliary ur-opsin gave rise directly and indirectly to all other ciliary opsins observed in living deuterostomes. The ur-opsin likely retained the ancestral ciliary opsin form and function even as its daughter genes have neofunctionalized to new roles. It cannot be naively modeled from bovine RHO1 (the most recently derived of all imaging opsins) because the latter lacks the induced proline kink in TM2.
The current phylogenetic distribution of TMT1 extends from sea urchin (but not acornworm) to amphioxus and tunicate through chondrichthyes and teleost fish to frog and lizard, with orthology mostly validatable by syntenic location. Remarkably birds, platypus, marsupials, and all placental mammals have lost the ciliary ur-opsin which requires two independent events. Lizard flanking gene order is fully preserved in chicken but no pseudogene debris remains at the site.
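Validation by syntenic location amounts to comparing ordered lists of flanking genes. As a rough sketch, the `syn(...)` strings used in the alignment table further down this page can be parsed and intersected as follows. The three strings are copied from the encephalopsin rows for opossum, chicken and lizard purely as sample input; the function names are hypothetical, and reading a leading + or - as strand orientation is an assumption about the notation.

```python
def parse_synteny(field):
    """Turn 'syn(-EXO1 +ENC ...)' into an ordered list of (gene, strand)."""
    inner = field.strip()
    if inner.startswith("syn("):
        inner = inner[4:]
    inner = inner.rstrip(")")
    genes = []
    for token in inner.split():
        if token[0] in "+-":
            genes.append((token[1:], token[0]))
        else:
            genes.append((token, None))   # orientation not recorded
    return genes

def shared_neighbours(field_a, field_b, focal):
    """Flanking genes present in both neighbourhoods, excluding the focal gene."""
    names_a = {name for name, _ in parse_synteny(field_a)}
    names_b = {name for name, _ in parse_synteny(field_b)}
    return (names_a & names_b) - {focal}

opossum = "syn(-EXO1 -WDR64 +ENC -KMO +FH +RGS7)"
chicken = "syn(-EXO1 -WDR64 +ENC -PIGM +RGS7)"
lizard = "syn(-EXO1 -WDR64 +ENC -PIGM +RGS7)"

print(sorted(shared_neighbours(opossum, chicken, "ENC")))
# Identical parsed lists mean the same genes in the same order and orientation.
print(parse_synteny(chicken) == parse_synteny(lizard))
```

A set intersection ignores gene order, so it is the more forgiving test; the list equality in the last line is the stricter check of fully preserved flanking order.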
This is a familiar story in opsins ... an old gene fades out mid-amniote but otherwise continues on for another 310 million years (Wall hypothesis plus birds). The graphic below summarizes the history of gene gain and loss at 24 opsin loci in deuterostomes from the earliest Cambrian to the present day. The genes and species are sorted to illustrate sequential loss in the human lineage. The other sort order would provide a conventional species tree vs gene tree view. Species are clustered by color to indicate subtrees, for example (lizard,(finch,chicken)).
What is the function of the ciliary ur-opsin in the contemporary organisms that retain it? To determine whether it has ever been studied under another name, tblastn of each TMT1 in the curated reference collection can be used against GenBank transcripts and gene deposits (which often provide the necessary PubMed id). Transcript data from non-pooled tissues might at least determine some sites of expression; however no data is available for frog or lizard (ie no data for tetrapods since other species have lost the gene).
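A search of this kind could be scripted around the NCBI BLAST+ `tblastn` program. The sketch below only assembles the command line and does not run it. The query file name, database name, e-value cutoff and output path are hypothetical placeholders; the options used (`-query`, `-db`, `-evalue`, `-outfmt`, `-out`) are standard BLAST+ options, with `-outfmt 6` requesting tabular output.

```python
import shlex

def tblastn_command(query_fasta, database, evalue=1e-10, out_path="tmt1_hits.tsv"):
    """Build a tblastn argument list (protein query against translated nucleotides)."""
    return [
        "tblastn",
        "-query", query_fasta,    # protein sequences, e.g. curated TMT1 set
        "-db", database,          # nucleotide database of transcripts
        "-evalue", str(evalue),   # arbitrary illustrative cutoff
        "-outfmt", "6",           # tabular hits, one per line
        "-out", out_path,
    ]

cmd = tblastn_command("TMT1_reference.fa", "transcripts_db")
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and a formatted database are installed.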
Note first that the gene appears to have undergone a segmental duplication in the chondrichthyes stem. Both loci persisted through teleost fish; copies created by whole genome duplication were not retained. Only one locus, presumably more fundamental, persisted into frog and lizard. It defines the TMT locus in earlier diverging species via best-blast and synteny and so the second TMT locus by default. These are not to be confused with a later TMT gene duplication that arose in fish and persisted through lizard, birds, platypus and marsupials but not placentals.
One of the three distinct TMT genetic loci was specifically studied in adult eyes and embryonic cell lines of zebrafish. Little has happened since (other than bioinformatics): TMT genes from Tetraodon, Gasterosteus and Oryzias still lack articles and informative transcripts. Zebrafish TMT genes remain crazily annotated at GenBank (as adiponectin, with pipeline analysis of genomic DNA mislabelled as mRNA). None of the zebrafish genes has never been studied though two transcripts are available from delimited libraries (developing eggs with support cells). Pimephales promelas has TMT transcripts from brain and testis; Oncorhynchus mykiss has transcripts from testis.
In summary, while thousands of articles address the highly derived RHO1 locus, the much more fundamental ciliary ur-opsin TMT has scarcely been investigated. Until its core function is better understood, the early history of ciliary opsins in vertebrates will remain a mystery. Without this core role for TMT that allowed it to be retained for tens of millions of years in the pre-vision era, no ciliary opsin would have been available for later expansion into imaging ciliary opsins.
Below is a preliminary assessment of the three TMT loci and the related encephalopsin locus. The right column revises nomenclature relative to that of the reference collection (second column). Gene names arose in the pre-genomic era when the full paralog complement, phylogenetic distribution and sites of expression were not understood. With complete genomes now in hand, a final and sensible nomenclature can be envisioned.
The TM2 region aligned proves useful for defining the diagnostic residues and indels of these four gene classes necessary to sort out their evolutionary relationships. As the names suggest, this is (TMT1,(TMT2,(TMT3,ENC))) rooted by protostome ciliary opsins which indicate TMT1 is the original ur-opsin. The final column contains information on synteny that validates the proposed history of gene duplication. Gene order is barely conserved but enough for the duplication history to be unravelled.
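The rows of the alignment that follows are whitespace-delimited, so the diagnostic column can be tallied per gene class with a few lines of code. This is a sketch only: the six rows are copied from that table with the synteny column dropped, the function names are hypothetical, and it is an assumption that the isolated single-residue column (P in the TMT1 and TMT2 rows) is the TM2 kink position discussed in the text.

```python
from collections import defaultdict

# Sample rows from the TM2 alignment (synteny column omitted).
ROWS = """\
ENC_homSap ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC
ENC_galgal ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC
ENC_danRer ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC
TMT3_galGal TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC
TMT2_xenTro TMTa_xenT NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC
TMT1_danRer TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC
"""

def parse_row(line):
    """Split one alignment row into gene class, species code and kink residue."""
    fields = line.split()
    gene_class, species = fields[0].split("_")
    return {"gene_class": gene_class, "species": species, "kink": fields[6]}

def kink_residues_by_class(text):
    """Collect the set of residues seen in the kink column for each gene class."""
    tally = defaultdict(set)
    for line in text.splitlines():
        if line.strip():
            row = parse_row(line)
            tally[row["gene_class"]].add(row["kink"])
    return dict(tally)

for gene_class, residues in kink_residues_by_class(ROWS).items():
    print(gene_class, sorted(residues))
```

In these sample rows TMT1 and TMT2 carry P while TMT3 and ENC do not, which is the kind of shared derived state the (TMT1,(TMT2,(TMT3,ENC))) grouping rests on; the full table should be consulted before drawing any conclusion.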
ENC_homSap   ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_otoGar   ENCEPH_oto NN LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_loxAfr   ENCEPH_lox NN LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_pteVam   ENCEPH_pte NN LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLRNGWVWDT VGC ...
ENC_musMus   ENCEPH_mus GN LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLRNGWVWDA VGC ...
ENC_canDom   ENCEPH_can CH FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLRNGWVWDS VGC ...
ENC_monDom   ENCEPH_mon NN LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLRSGWVWDS VGC syn(-EXO1 -WDR64 +ENC -KMO +FH +RGS7)
ENC_galgal   ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC syn(-EXO1 -WDR64 +ENC -PIGM +RGS7)
ENC_anoCar   ENCEPH_ano NN LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLRHRWVWDA AGC syn(-EXO1 -WDR64 +ENC -PIGM +RGS7)
ENC_xenTro   ENCEPH_xen NN LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVRGSWAFSV EMC syn(-ASAH3L -ACER2 +ENC -ADFP -DENND4C)
ENC_danRer   ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC syn(-MTRF1L -TMEM63B +ENC -KMO +IDE -MARCH5 +CPEB3 -BTAF1)
ENC_takRub   ENCEPH_tak NN FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQGRWTWTQ ATC syn(-ABLIM1 -PTK7 +ENC -KMO +IDE +CPEB4 -CCNJ)
ENC_gasAcu   ENCEPH_gas NN VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIRGGWTWSR ATC syn(FAM82A CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4 -CCNJ)
ENC_oryLat   ENCEPH_ory NN LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVKGRWMWSQ ATC syn(CYP1B1 CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4 -BTAF1 -CCNJ)
ENC_calMil   ENCEPH_cal NN ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQGRWGWDS AAC ---
ENC_squAca   ENCEPH_squ NN LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVKGRWVWDS AAC ---
ENC_petMar   ENCEPH_pet NN LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVRRRWVWGN GGC ---
ENC_braFlo   ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn( -ZFYVE1 +RTF1 +ENC -CES1 -POMT2)
ENC_braBel   ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQHRWLWGR PGC ---
ENC_braFlo   TMT5_braFl SN GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(NKX2 -ZFYVE1 +RTF1 +ENC ERF1 TMED9 LARS2)
ENC_braBel   TMT5_braBe SN GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC ---
TMT3_monDom  TMT_monDom SN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQGRWIGGK HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2 RALY SLC5A7 +SULT1E1)
TMT3_macEug  TMT_macEug NN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIRGRWIAGY HGC ---
TMT3_ornAna  TMT_ornAna NN LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNISGRWIGGD PGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -UCHL3 +TBCID4)
TMT3_galGal  TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7 +SULT1C4)
TMT3_taeGut  TMT_taeGut NN LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIRGKWIGGD HAC ...
TMT3_anoCar  TMT_anoCar NN LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIYGRWIGGE HGC syn( +TMT3 -ST6GAL2_overlap +SLC5A7 RANBP2)
TMT3_xenTro  TMT_xenTro NN FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIKGKWIGGE YGC syn( -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7)
TMT3_danRer  TMT_danRer NN LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVRGRWLLGR HGC syn( -UXS1 +TMT3 -ST6GAL2_overlap +GPR89A -PDZK1l)
TMT3_tetNig  TMT_tetNig NN FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLRGRWLLGR SGC syn( -UXS1 +TMT3 -ST6GAL2_overlap)
TMT3_takRub  TMT_takRub NN FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR IGC ...
TMT3_gasAcu  TMT_gasAcu NN LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLRGKWLLGR SGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -TFDP2 POU2)
TMT3_oryLat  TMT_oryLat NN FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR GGC ...
TMT2_anoCar  TMTa_anoC  NN LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT P FSFAASTQGKWLIGP AGC syn(VAMP PER2 HES6 TUBA1 GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_xenTro  TMTa_xenT  NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC syn(VAMP PER2 HES6 GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_danRer  TMTb_danRe NN TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC syn(-PTCHD1 -PHEX -CNKSR2 SH3KBP1 -MAP3K15 TNK2 +TMT2 -MYEOV2 -MAP4K4 PRMT6)
TMT2_tetNig  TMTb_tetNi SN LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLLGR AGC syn(MYO3A GAD2 ARHGAP21 TFR2 +TMT2 MYEOV2 SH3KBP1 MAP3K15 PHEX PTCHD1)
TMT2_takRub  TMTb_takRu SN FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT2_gasAcu  TMTb_gasAc SN FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT P FSFAASTQGRWLIGR AGC syn(PTCHD1 PHEX MAP3K15 SH3KBP1 +TMT2 -MYEOV2 ARHGAP21 MYO3A)
TMT2_oryLat  TMTb_oryLa SN LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT1_danRer  TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC syn( RAB25 PBX3 TNK2 +TMT1 WAC +LPPR4 +AGL PTBP1)
TMT1_tetNig  TMTa_tetNi SN LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT P FSFAASTQGRWLIGA AGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1 WAC RAB18 YME1L1 ABI1)
TMT1_takRub  TMTa_takRu NN LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQGRWLIGE AGC syn( LRRN3 CALD1 TNK2 +TMT1 RAB18 YME1L1 ABI1 TLK1 EDRNB)
TMT1_gasAcu  TMTa_gasAc NN LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1 WAC RAB18 YME1L1 ABI1)
TMT1_oryLat  TMTa_oryLa NN LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC ...
TMT1_pimPin  TMTa_pimPr NN TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC ---
TMT1_oncMyk  TMTa_oncMy SN LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT P FSFAASLYGRWLIGA HGC ---
TMT1_calMil  TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC ---
TMT2_calMil  TMTa2_calM NN LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT P FSFAASIYRRWLIGH KGC ---
TMT1_braFlo  TMTx_braFl NN STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT P FSFVSSLHGRWMFGH SGC syn(TNPPO2 HECTD3 ABCCA4 PRPRA TMT ATP5D TMT PTPRA FDE4A PTPRA PYRNXN1
TMT2_braFlo  TMTy_braFl TN LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS P FAVASAVGERWLFGR AVC syn(ZFYVE1 FBXL4 RTF1 TMT CES4 TMTY POMT2 GSTZ1)
TMT1_strPur  TMTPIN_str NN GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS P FTFASNFAGRWLFGD LGC syn(ARG2 NEK9 FAM164A ZC3H14 TMT PRPF39 YIPF4 SPATA5)
TMT2_strPur  ENCEPH_str GN SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI P FVTASIYAHRWLFAH VGC ---
TMT1_apiMel  TMT_apiMel AN LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAATGGWYWGY KMC syn(HEX MAK FASN SPTBN4 PSMA3 TMT LSM11 SEC23A KNSL8)
TMT1a_anoGam TMT1_anoGa LN IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWLYGK SIC ...
TMT1b_anoGam TMT2_anoGa LN LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWIFGR TLC ...
TMT1_aedAeg  TMT_aedAeg LN LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN P FTLTSAISRHWIFGR TVC ...
TMT1_culPip  TMT_culPip LN LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN P FTLSSAISHRWLFGR KLC ...
TMT1_triCas  TMT_triCas LN LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAINYGWIFGE TGC ...
TMT1_bomMor  TMT_bomMor LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGH TMC ...
TMT1_helVir  TMT_helVir LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGK TMC ...
TMT1_rhoPro  TMT_rhoPro GN LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN P FTLASAIAKRWFFGQ SMC ...
TMT1_acyPis  TMT_acyPis FN TCVIFIMIRDTRLWTPQNVIIFNLATS D LAVS-VLGN P VTLAAAITKGWIFGQ TIC ...
TMT1a_dapPul TMTa_dapPu MN IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT P ISAAAALKFTWPFSH ELC ...
TMT1b_dapPul TMTb_dapPu MN VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT P ISTAAALEFGWPFSQ ELC ... 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 | | | | | | | | | | | | | | | ENCEPH_homSap FSPGTYERLALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI ENCEPH_loxAfr FRSGTYERLALLVGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLFLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSSSLFGIASITTLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYI ENCEPH_canFam IPAAVLDIESQAPKDESLYFSICHFCPQKGFLEFQRLRTPTHLLLVNLSLSDLLVSLFGVTFTFVSCLRNGWVWDSVGCVWDGFSSSLFGIVSITTLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYI ENCEPH_monDom FSPGTYELLALLIATIGLLGLCNNLLVLVLYYKFQRLRTPTHLFLVNISFNDLLVSLFGVTFTFVSCLRSGWVWDSVGCAWDGFSNTLFGIVSIMTLTVLAYERYNRIV-----HAKVINFSWAWRAITYIWLYSLVWTGAPLLGWNRYT ENCEPH_galGal FSAGTYELLALLIATIGTLGVCNNLLVLVLYYKFKRLRTPTNLFLVNISLSDLLVSVCGVSLTFMSCLRSRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVV-----HAKVIDFSWSWRAITYIWLYSLAWTGAPLLGWNRYT ENCEPH_anoCar FSAGTYELLALLVAAIGLLGLCNNLLVLVLYAKFKRLRTPTHLFLVNISLSDLLVSLFGVSFTFGSCLRHRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVV-----HARVIDFSWSWRAITYIWLYSLAWTGAPLLGWNHYT ENCEPH_danRer FADETYKLLTFTIGSIGVLGFCNNIIVIILYSRYKRLRTPTNLLIVNISVSDLLVSLTGVNFTFVSCVKRRWVFNSATCVWDGFSNSLFGIVSIMTLSGLAYERYIRVV-----HAKVVDFPWAWRAITHIWLYSLAWTGAPLLGWNRYT ENCEPH_tetNig FAVHTYRLLAAAIGAIGVLGFCNNLAVAALYWRFRRLRTPTNLLLLNISLSDLLVSLLGVNFTFAACVQGRWTWNQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLAWTGAPLLGWNRYT ENCEPH_takRub FSGDTYRVLAFTIGTIGAFGFCNNFVVLALYCRFKRLRTPTNLLLVNISLSDLLVSLFGINFTFAACVQGRWTWTQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYALAWTGAPLLGWNRYT ENCEPH_gasAcu FAVGTYKLLAFAIGTIGVFGFCNNVVVIVLYCKFKRLRTPTNLLVVNISLSDLLVSVIGINFTFVSCIRGGWTWSRATCIWDGFSNSLFGIVSIMTLASLAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLVWTGAPLLGWNRYT ENCEPH_oryLat FAVGTYKLLTVIIGTIGVFGFCNNLLVILLYCKFKRLRTPTSLLLVNISLSDLLVSVVGINFTLASCVKGRWMWSQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLAWTGAPLLGWNRYT 
ENCEPH_choDri FSPNTYKLLAVIIGTIGIVGFCNNILVLLLYYKFKRLRTPTNLLLVNISVSDLLVSVFGLSFTFVSCTQGRWGWDSAACVWDGFSHSLFGTVSIVTLTVLAYERYIRVV-----NAKATNFPWAWRAITYTWFYSLAWSGAPLLGWNRYT ENCEPH_petMar FSAATFRLLAGVVGTIGVAGFLNNLLLVALFVGFKRLQTPTNLLLVNISLSDLLVSVFGNTLTLVSCVRRRWVWGNGGCVWDGFSNSLFGIVSISTLTALSYERYARLI-----KAQVLDFSWAWRAVTYTWLYSAAWTGAPLLGWSRYV ENCEPH_xenTro FTEDTYHFLALIVATVGFLGLVNNLLVLILYCKFKRLQTPTNLLFFNTSLCHFVFSLLAITFTFMSCVRGSWAFSVEMCVFHGFSKNLLGIVSFGTLTVVAYERYARVV-----YGKYVNSSWSKRSITFVWVYSLAWTGFPLIGWNLYT TMT2_monDom LSRTGHTIVAVFLGIILIFGSISNFIVLVLFCKFKVLRNPVNMLLLNISISDMLVCLSGTTLSFASSIQGRWIGGKHGCRWYGFANSCFGIVSLISLAILSYERYRTLTLC--PGQ-GADYQKALLAVAGSWLYSLVWTVPPLIGWSSYG TMT2_macEug LSRTGHTVTAVFLGLILILGVINNFIVLVLFCKFKVLRNPVNMLLLNISISDMLVCLTGTTLSFASSIRGRWIAGYHGCRWYGFANSCFGIVSLISLAVLSYERYRTLTLC--PRQ-GTDYHKALLAVAGSWLYSLIWTVPPLIGWSSYG TMT2_ornAna LSRTGHTMVAVFLGIILVFGFMNNLIVLILFCKFKALRNPVNMIMLNISASDMLVCVSGTTLSFASNISGRWIGGDPGCRWYGFVNSCLGIVSLISLAVLSYERYRTLTLH--PKQ-STDYQKAVLAVGASWIYSLIWTIPPLLGWSSYG TMT2_galGal LSRNGHTVVAVFLGFILFFGFLNNLIVLILFCKFKTLRNPVNMLLLNISISDMLVCISGTTLSFASNIHGKWIGGEHGCRWYGFVNSCFGIVSLISLAVLSYERYSTLTLC--NKR-SDDYRKALLAVGGSWVYSLLWTVPPLLGWSSYG TMT2_taeGut LSRSGHTVVAVFLGLILFFGFLNNLIVLILFCKFKTLRNPVNMLLLNISVSDMLVCISGTTLSFASNIRGKWIGGDHACRWYGFVNSCFGVVSLISLAVLSYERYNTLTLC--HKR-SDDFRKALLAVAGSWIYSLVWTVPPLLGWSSYG TMT2_anoCar LSRMGHNIVAVFLGLILVFGFLNNLVVLILFCKFKTLRNPVNMLLLNISASDMLVCISGTTLSFVSNIYGRWIGGEHGCRWYGFVNSCFGIVSLISLAILSYERYSTLTQT--NKR-GSDYQKALLGVGGSWLYSLIWTVPPLIGWSSYG TMT2_xenTro LSRTGHTVVAIFLGFILIFGFLNNFVVLILFCKFKTLRTPVNMMLLNISASDMLVCVSGTTLSFTSSIKGKWIGGEYGCQWYGFVNSCFGIVSLISLAILSYERYSTLTLY--NKG-GPNFKKALLAVASSWLYSLVWTVPPLLGWSSYG TMT2a_danRer LSRAGFIALSVFLGFIMTFGFFNNLVVLVLFCKFKTLRTPVNMLLLNISISDMLVCMFGTTLSFASSVRGRWLLGRHGCMWYGFINSCFGIVSLISLVVLSYDRYSTLTVY--HKR-APDYRKPLLAVGGSWLYSLIWTVPPLLGWSSYG TMT2b_danRer LSRTGHNVVAVILGSILIFGTLNNLVVLVLFCKFKTLRTPVNMLLLNISVSDMLVCLFGTTLSFAASIRGRWLVGRHGCMWYGFVNSCFGIVSLISLAILSYDRYSTLTVY--NKR-APDYSKPLLAVGGSWLYSLFWTVPPLLGWSSYG TMT2_tetNig 
LSPTGFVVLSVVLGFIITFGFLNNFIVLLLFCKFKKLRTPVNVLLLNISVSDMLVCLFGTTLSFASSLRGRWLLGRSGCNWYGFINSCFGIVSLISLVILSHDRYSTLTVY--NKQ-GINYRKPLLAVGGTWLYSLLWTVPPLLGWSSYG TMT2_takRub LSPTGFVVLSVVLGFIMTFGFLNNFVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSIRGRWLLGRIGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKQ-GINYRKPLLAVGGTWLYSLFWTVPPLLGWSSYG TMT2_oryLat LSQAGFVVLSVVLGFIMTFGFLNNFVVLILFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSIRGRWLLGRGGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKG-GLNYRKPLLAVGGSWLYSLFWTVPPLLGWSSYG TMT2_gasAcu LSPTGFVVLSVMLGFIMTFGFVNNLVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSLRGKWLLGRSGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKA-GPDYRKPLLAIGGSWLYSLFWTVPPLLGWSSYG TMT3_xenTro LSRTAHSVVAVCLGCILVLGSLYNSFVLLIFVKFTAIRTPINMILLNISVSDLLVCIFGTPFSFVSSVSGGWLLGQQGCKWYGFCNSLFGLVSMISLSMLSYERYLTVLKC--TKADMTDYKKSWLCIIVSWLYSLCWTLPPLIGWSSYG TMT3_calMil LSQSGHTTVAVFLGIILVLGCVNNLLVLLLFVCFKEIRTPLNMILLNISLSDLSVCVFGTPFSFAASIYRRWLIGHKGCKWYGFANSLFGLVSMISLSMLSYERYLTVLKC--TKADMTDYKKSWLCIIVSWLYSLCWTLPPLIGWSSYG TMT3_danRer LSRTGHTVTAVCLGAILLLGCLNNLFVLLVFARFRTLWTPINLILLNISVSDILVCLFGTPFSFASSLYGKWLLGHHGCKWYGFANSLFGIVSLMSLSILSYERYAALLRA--TKADVSDFRRAWLCVAGSWLYSLLWTLPPFLGWSNYG TMT3_oncMyk LGRTGHTVVAVFLGVIFLLGFLSNLFVLLVFARFQVLRTPINLILLNISVSDMLVCIFGTPFSFAASLYGRWLIGAHGCKWYGFANSLFGIVSLVSLAILSYERYSTILCY--TKADPSDYKKAWLAIAGAWLYSLVWTVPPFFGWSSYG TMT3_tetNig LSRSSHTAVAVLLGVILVAGILSNSLVLLLFVKYRSLWTPINLILLNINLSDILVCVFGTPFSFAASLQGRWLIGEGGCMWYGFANSLFGIVSLVSLSVLSYERCTVVLQP--SQVDVSDFRKARFCVGGSWLYALLWTSPPLLGWSSYG TMT3_takRub MSRTGHTVVAVMLGTILLAGVFGNSVVFLVFVKYRSLRTPINLILLNISLSDILVCVFGTPLSFAASLKGRWLLGERGCEWYGFANSLFGIVSLVSLSVLSYERYTVVLQP--TQVDVSYFRKAWFCVGGSWLYALFWTLPPLLGWSRYG TMT3_oryLat LSRTGHTAVAVCLGFILVAGILNNFLTLLVFAKFRSLWTPINLILLNISLSDILVCVLGTPFSFAASVRGRWLIGESGCKWYAFANSLFGIVSLVSLSVLSYERYITVLHS--SQADLSNFRKAWFCVGGSWLYSLLWTLPPFLGWSSYG TMT1a_anoCar LSPTGHLITAICLGVIGSLGFLNNLLVLVLFCRNKVLRSPINLLLMNISLSDLMICIVGTPFSFAASTQGKWLIGPAGCVWYGFANTFFGTVSLISLAVLSYERYCTMMGT--TEADATNYKKVWMGIFLSWIYSLFWSLPPLFGWSSYG TMT1a_xenTro 
LSPTGHLLVAVFLGVIGSLGFFNNLVVLILFCQYKVLRSPINMLLMNISLSDLMVCILGTPFSFAASTQGHWLIGEIGCIWYGFVNTLFGTVSLVSLAVLSYERYCTMLRS--TEADLTNYKKAWLGILVSWIYSLVWTLPPLFGWSKYG TMT1a1_danRer LSPTGHLVVAVCLGFIGTFGFLNNTLVLVLFCRYKVLRSPMNCLLISISVSDLLVCVLGTPFSFAASTQGRWLIGRAGCVWYGFINSFLGVVSLISLAVLSYERYCTMMGS--TQADSTNYRKVVIGIAFSWIYSMVWTLPPLFGWSCYG TMT1a_pimPro LSPTGHLVVAVCLGFIGTFGFLNNTLVLILFCRYKVLRSPMNYLLVSIAVSDLLVCVLGTPFSFAASTQGRWLIGRAGCVWYGFINSCLGVVSLISLAVLSYERYCTMMGA--TQADSTNYKKVAMGIAFSWIYSMVWTLPPLFGWSCYG TMT1a2_danRer LSPTGHILVAVSLGFIGTFGFLNNLLVLVLFGRYKVLRSPINFLLVNICLSDLLVCVLGTPFSFAASTQGRWLIGDTGCVWYGFANSLLGIVSLISLAVLSYERYCTMMGS--TEADATNYKKVIGGVLMSWIYSLIWTLPPLFGWSRYG TMT1a_tetNig LTPTGNLVVSVFLGLIGTSGLVSNLLVLVLFCRFKVLRSPINLLLVNISVSDLLVCVLGTPFSFAASTQGRWLIGAAGCVWYGFVNSLFGIVSLISLAVLSFERYSTMMTP--TEADSSNYCKVCLGIGLSWVYSLLWTVPPLLGWSSYG TMT1a_takRub LTPTGNLVVSVFLGFIGTFGLVNNLLVLVLFCRYKMLRSPINLLLMNISISDLLVCVLGTPFSFAASTQGRWLIGEAGCVWYGFANSLFGVVSLISLAVLSFERYSTMMTP--TEADPSNYCKVCLGITLSWVYSLVWTVPPLFGWSSYG TMT1a_gasAcu LTPTGHLVVAVCLGFIGTLGLMNNLLVLVLFCRYKMLRSPINLLLINISISDLLVCVLGTPFSFAASTQGRWLIGEGGCVWYGFANSLFGIVSLISLAVLSYERYSTMVAP--TEADSSNYHKISLGITLSWVYSLIWTAPPLFGWSHYG TMT1a_oryLat LTPTGHLIVAVCLGFIGTFGLVNNLLVLVLFCRYKILRSPINLLLINISISDLLVCVLGTPFSFAASTQGRWLIGEGGCVWYGFANSLCGIVSLISLAVLSYERYSTMMTP--AEADSSNYRKISLGIILSWGYSLLWTLPPLFGWSHYG TMT1a_calMil LSRTGLTVVAVCLGIIMVLGFLNNLLVLVLFCKYKVLRSPMNMLLLNISVSDMLVCICGTPFSFAASVQGRWLVGEQGCKWYGFANSLFGIVSLMSLTILSYDRYITITGT--TEADITNYNKTIVGIALSWIYSLMWTLPPLFGWSNYG TMT1b_tetNig LSQRGHLVVAVCLGAIGTVGFLSNLLVLALFCRFRALRTPMNLMLVSISASDLLVSVLGTPFSFAASTQGRWLLGRAGCVWYGFVNACLGIVSLISLAVLSYERYCTMMAS--TMASNRDYRPVLLGICFSWFYSLAWTVPPLLGWSRYG TMT1b_takRub LSQRGHLVVAVCLGFIGTVGFLSNFLVLALFCRYRALRTPMNLMLVSISASDLLVSVLGTPFSFAASTQGRWLIGRAGCVWYGFVNACLGIVSLISLAVLSYERYCTMVSS--TIASNRDYRPVLGGICFSWFYSLAWTVPPLLGWSRYG TMT1b_gasAcu LSPKGHLVVAVCLGFIGTFGFLSNFLVLALFCRYRALRTPMNLLLVSISASDLLVSMVGTPFSFAASTQGRWLIGRAGCVWYGFVNACLGIVSLISLAVLSFERYSTMVKP--TVADGRDFRPALGGIAFSWLYSVAWTVPPLLGWSEYG TMT1b_oryLat 
LSPTGHLVVAVCLGLIGTCGFLSNLLVLALFCRYRALRTPMNLLLVSISVSDLLVSVLGTPFSFAASTQGRWLIGRAGCVWYGFINACLGIVSLISLAVLSYERYSTVMTP--NMADGRDFRPALGGICFSWLYSVAWTVPPLLGWSRYG TMT1a_braFlo LSPTGHLVVAAILALIGVLGIVNNSTTLYLVGRYKQLRTPFNILMVNLSVSDLLMCVLGTPFSFVSSLHGRWMFGHSGCEWYGFICNFLGIVSLITLTVISYERYLLMKRL--PNERILSYRAVALAVVFIWCYSLLWTAPPLVGWSSYG TMTq_braFlo VEFSGFDTVAVVIAAIGIAGFLSNGAVVLLFLKFRQLRTPFNMLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISYERYRMVVKPKGPGSSYLTYNKVGLAIIFIYLYCLLWTTLPIVGWSSYQ TMTq_braBel VEFFGYDAVAGVIAIIGVVGFVSNGAVVVLFLKFPQLRTPFNLLLLNMAVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISFLRYRMVVKPKGPGSSYLTYTKVGLAILFIYLYCLLWTTLPIAGWSSYQ TMTp_braFlo FSDAGYTAIATCLALIGFVGFTNNFVVILLIGCHRQLRTPFNLLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVR----SSDMLTYKSSLVVITFIWLYSLLWTSLPLLGWSSYQ TMTp_braBel FSDAGYTAIATGLALIGLVGSMNNFVVILLIGCHRQLRTPFNLLLLNVSVADLLVSVCGNTLSFASAVQHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVR----SSEMLTYKSSLGMIAFIWMYSLLWTSLPLLGWSSYQ TMT1a_strPur VSRTTYNYLTVYTGFLTIFGILNNGIVMILFARFPSLRHPINSFLFNVSLSDLIISCLASPFTFASNFAGRWLFGDLGCTLYAFLVFVAGTEQIVILAALSIQRCMLVVRP--FTAQKMTHRWALFFISLTWIYSLIICVPPLFGWNRYT TMT1a_plaDum FGPTSYVITAIYLCIVGVIGTLSNGVIMYLYFKDKSLRSPMNLLFVNLAMSDFTVAFFGAMFQFGLTCTRKYMSPGMACDFYGFITFLGGLASEMNLFIISVERYLAVVRP--FDVGNLTNRRVIAGGVFVWLYSLVFAGGPLVGWSSYR TMT1b_plaDum FTATDYNICAAYLFFIACLGVSLNVLVLVLFIKDRKLRSPNNFLYVSLALGDLLVAVFGTAFKFIITARKTLLREEDGCKWYGFITYLGGLAALMTLSVIAFVRCLAVLRL--GSFTGLTTRMGVAAMAFIWIYSLAFTLAPLLGWNHYI TMT1_apiMel VSPVMYIGAAIALGFIGFFGFTANLLVAIVIVKDAQLWTPVNVILFNLVFGDFLVSIFGNPVAMVSAATGGWYWGYKMCLWYAWFMSTLGFASIGNLTVMAVERWLLVARP--MQALSIRPQHAVILASFVWIYALSLSLPPLFGWGSYG TMT1p_anoGam MAPWAYNGAAVTLFFIGFFGFFLNIFVIALMYKDVQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWLYGKSICVAYGFFMSLLGIASITTLTVLSYERFCLISRP--FAAQNRSKQGACLAVLFIWSYSFALTSPPLFGWGAYV TMT1q_anoGam MAPWAYNASAVTLFFIGFFGFFLNLFVIALMCKDMQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWIFGRTLCVAYGFFMSLLGITSITTLTVLSYERYCLISRP--FSSRNLTRRGAFLAIFFIWGYSFALTSPPLFGWGAYV TMT1_aedAeg 
MESWAYVASAVTLFFIGFFGFFLNLFVIALMCKDVQLWTPINIILFNLVCSDFSVSIIGNPFTLTSAISRHWIFGRTVCIAYGFFMSLLGITSITTLTVLSYERFCLISHP--FSSRSLSRRGAVFAILFIWSYSFALTSPPLFGWGAYV TMT1_culPip MPPWAYVATAVVLFFIGFFGFFLNLFVIALMCKEVQLWTPMNIILLNLVCSDFSVSIVGNPFTLSSAISHRWLFGRKLCVAYGFFMSLLGITSITTLTVLSYERFYLISRP--FSSRSLSRRGALGAVLLIWCYSFALTSPPLFGWGAYV TMT1_bomMor MPRWGYVASAFVLFLIGFFGFFLNLMVILLMFKDRQLWTPLNIILFNLVCSDFSVSVLGNPFTLISALFHRWIFGHTMCVLYGFFMALLGITSITTLTVISFERYLMVTRP--LTSRHLSSKGAVLSIMFIWTYSLALTTPPLLGWGNYV TMT1_rhoPro MPSAGFLAASIILFLIGFLGFFGNLIVIIIMCRDKNLWTPVNFILFNVIVSDFSVAALGNPFTLASAIAKRWFFGQSMCVAYGFFMALLGITSINSLTVLALERYLIVSQP--VSHGSLSRPTASDIVGSIWLYSFVITIPPLVGWGEYG TMT1_triCas IPVEGYIAAAVVLFCIGFFGFSLNLTVIIFMLKERQLWSPLNIILFNLVVSDFLVSVLGNPWTFFSAINYGWIFGETGCTIYGFIMSLLSITSITTLTVLAFERYLLIARP--FRNNALNFHSAALSVFSIWLYSLSLTIPPLIGWGEYV TMT1_acyPis ISDAIYLGAAIVLSIIGIVGFIFNTCVIFIMIRDTRLWTPQNVIIFNLATSDLAVSVLGNPVTLAAAITKGWIFGQTICVIYGFFMALFGIASITTLTVLAYDRYLMIRYP--FSSSRLTKETALYAIAGIWIYAFAVTGPPLFGWNRYV TMTr_strPur FTTEAHLLAGSFLTLVFIISIIGNSVVLFLFAWDRHLRTPTNMFLLSLTISDWLVTVVGIPFVTASIYAHRWLFAHVGCIIYAFIMTFLGLNSLMSHAVIAVDRYLVITKP--HFGIVVTYPKAFLMISIPWVFSFAWAVFPLAGWGEFT TMT1c_braFlo FTTEQHLLMAVWLGFIGSFGFVTNLLTVLVFWCFKSLRTPFHLYLGGIALSDLLVAALGSPFAVASAVGERWLFGRAVCVWYAFVNYFLSIVSIVTMATMSFSRYWVIIRPQ-SAPRLDTVYGACVVNALAWCYSFFWTIMPVLGWSRFT PPINa_cioInt ANRSTYSFLCVYMTFVFLLSCSLNILVIVATLKNKVLRQPLNYIIVNLAVVDLLSGFVGGFISIAANGAGYFFWGKTMCQIEGYFVSNFGVTGLLSIAVMAFERYFVICKP--FGPVRFEEKHSIFGIVITWVWSMFWNTPPLIFWDGYD PPINa_cioSav ADRSVYSFLAVYMTFICLISCSLNILVITATLKNKVLRQPLNYIIVNLAVVDLLSGLVGGVISIFANGAGYFFWGKFMCQVEGYTVSNFGVTGLLSIAVMAFERYFVICKP--FGPVRFEEKHAVIGIAVTWIWAMFWNTPPLIFWDGYD PPINb_cioInt AERHIYTILAVYMTFIFLLAVSLNGFVIIATMKNKKLRQPLNYIIINLSIADFLSGLVGGFIGMISNSAGYFYFGKTVCILEGYIVSVAGVCGLMSISVMAFERYFVVCKP--YGPFTLTNTHAALGIGFTWTWSVLWSTPGLIWLDGYV PPINb_cioSav ANRSTYSGLCVFMSFVFVLAVPLNLLVIVATYKNKDLRRPINYIIVNLAVADLTCSVVGGLLGVLNNGAGYYFLGKSVCIFEGYVMSVTGVCGILSITVMAFERYFVVCKP--FGQTNLKWSHAITGIVFTWTWSVIWHTPGLFFWNGYE Consensus av l !g G N V l k LrtP N l n s sDll! 
G f f s rW g gC wygF sl GivS $ vlsy#RY a ! W YSl wt pPL GW Y Prim.cons. LSPTGYLVVAVFLGFIGTFGFLNNLLVLVLFCKFKRLRTPINLLLLNISVSDLLVSVFGTPFSFASSIRGRWLWGRAGCVWYGFANSLFGIVSLISLAVLSYERYSTVVRPKGTKADVLDYRKA2LAIGFSWLYSLAWTVPPLLGWSSYG 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300 | | | | | | | | | | | | | | | ENCEPH_homSap LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQTIQV--IKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRL ENCEPH_loxAfr LDTHGLACTVDWKSNNSSDSSFVLFLFLGCLVVPVGVIAHCYGHILYSIRMLRCVEDLQTIQV--IKILRHEKKLAKMCLFMIFTFLICWMPYIVICFLVVNGYGHLVTPTISIVSYLFAKSSTVYNPVIYTFMIRKFRRSLLQLLCFRL ENCEPH_canFam LDVHGLGCTVDWKSKDANDSFFVLFLFLGCLVVPMGVIVHCYGHILYSIRMLRCVEDLQTIQV--IKILRYEKKVAKMCFLMIFIFLIFWMPYIVICFLVVNGYGHLVTPTVSIVSYLFAKSSTVYNPVIYIIMIRKFRRSLLQLLCFRP ENCEPH_monDom LEIHGLGCSVDWKSKDPNDSSFVIFLFFGCLMLPVGVMAYCYGHILYAIRMLRCVEELQTIQV--IKILRYEKKVAKMCFLMIAIFLFCWMPYAVICLLVANGYGSLVTPTVAIIASLFAKSSTAYNPIIYIFMSRKFRRCLLQLLCFRL ENCEPH_galGal LEIHGLGCSMDWKSKDPNDTSFVLLFFLGCLVAPVVIMAYCYGHILYAVRMLRCVEDFQTSQV--IKLLKYEKKVAKMCFLMISTFLICWMPYAVVSLLVTYGYSNLVTPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRQCLLQLLCFRL ENCEPH_anoCar LEIHGLGCSVDWQSKEPSDSSFVLFFFLGCLAAPVGIMAYCYGHILHAIRMLRCVEDLQSIQV--IKILRYEKKVAKMCFLMVTTFLICWMPYAVVSLLIAYGYGHLITPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRRCLVQLFCVQF ENCEPH_danRer LEVHQLGCSLDWASKDPNDASFILFFLLGCFFVPVGVMVYCYGNILYTVKMLRSIQDLQTVQT--IKILRYEKKVAVMFLMMISCFLVCWTPYAVVSMLEAFGKKSVVSPTVAIIPSLFAKSSTAYNPVIYAFMSRKFRRCMLQMLCSRL ENCEPH_tetNig LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVHMIRSIQDLQTVQI--IKILRYEKKVSVMFFLMISCFLLCWTPYAVVSMMVAFGRKSMVSPTVAIIPSFFAKSSTAYNPVIYVFMSRKFRRCLLQLLCSRL ENCEPH_takRub LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVQMIRSIQDLQTVQI--IKILRYEKKVSVMFFLMISCFLLCWTPYAVVSMMVAFGRRSMVSPTMAIIPSFFAKSSTAYNPLIYVFMSRKFRHCLLQLLCSRL ENCEPH_gasAcu LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVQMLRSIQDLQTVQI--IKILRYEKKVAVMFLLMISCFLLCWTPYAVVSMMEAFGRKNMVSPTVAIIPSFFAKSSTAYNPLICVFMSRKFRRCLMQLLCSRV ENCEPH_oryLat 
LEIHQLGCSLDWASKDPNDAAFILLFLLGCFFVPVGIMIYCYGNILYAVRMLRSIEDLQTVQI--IKILRYEKKVAAMFLLMISCFLVCWTPYAVVSMMEAFGKKSMVSPTVAIVPSFFAKSSTAYNPLIYVFMNRKFRRCFLQLLCSKI ENCEPH_choDri LEMHRLGCSVTWELEKPSDTSFILFLFLGSLLIPVGVIAYCYGNI-YTIRMLQSIEDFQTARF--AKTLTNEMNSSKMCFFMISVAFSCWLPYAVTSFMVVYGCTDVITPTITIIFSLLAKSSAISYPIIYIFMSRKFRWCLMQLLCFRL ENCEPH_petMar LEKHGLGCSIDWASSNPPDAAFVLFFFLGCLAAPLLVMGFCFGRIALAITQFRKLDRLQTPRV--LKARCSERKVSAVCLLMMLLFLLCWSPYAVASLFVASGFEHLVSPPVSIVPSLLAKSNAVCNPLLFLLMSGNFFRCLRTMFFTLR ENCEPH_xenTro FETHKLDCSFEWTATDPKDTAFVLLFFLACITLPLSIMAYCYGYILYEIQKLRSVKNIQNFQE--ITILDYEIKMAKMCLLMMLTFLIGWMPYTILSLLVTSGYSKFITPTITVMPSLLAIASAAYNPVIHIFTIKKFRQCLVQLLFHNF TMT2_monDom TEGAGTSCSVHWTSKSVESVSYIMCLFIFCLVIPILVMVYFYGRLLYAVKQ----VGKIRKTA----ARKREYHVLFMVVTAVICYLICWVPYGMIALLATFGPPGVVSPVANVVPSILAKSSTVCNPIIYVLMNKQFYKCFLILFHCQP TMT2_macEug TEGAGTSCSVHWTSKSVESVSYIMCLFIFCLVIPILFMVYFYGRLLYTVKQ----VGKIRKSA----ARKREYHVLFMVVTAVICYLICWVPYGMIALLATFGPPGVVSPVANVVPSILAKSSTVCNPIIYILMNKQFYKCFLILFHCQP TMT2_ornAna TEGAGTSCSVHWSSKSPVSVSYIVCLFIFCLVIPVLVMIYCYGRLLYAVKQ----IGKARKTA----ARKREYHVLFMVITTVICYLVCWMPYGVTALLATFGQPGTVSPEASVIPSILAKSSTVCNPIIYILMNKQFYKCFLILFHCQP TMT2_galGal IEGAGTSCSVRWSSETAESTSYIICLFIFCLVIPVMVMMYCYGRLLYAVKQ----VGKIHKNT----ARKREYHVLFMVITTVICYLVCWIPYGVIALLATFGKPGVVTPVASIIPSILAKSSTVCNPIIYILMNKQFYKCFRQLFHCQP TMT2_taeGut VEGAGTSCSVRWSSESAESTSYIICLFVFCLVVPVMVMMYCYGRLLYAVKQ----VGKIHKNA----ARKREYHVLFMVIPTVICYLVCWIPYGVIALLATFGKPGAVTPITSIIPSILAKSSTVCNPIIYILMNKQFYKCFRQLFHCQP TMT2_anoCar LEGAGTSCSVRWTSETLESVTYIICLFIFCLAIPVLVMIYCYARLFYAVKQ----VGKLRKTS----ARKREFHVLFMIITTIICYLICWMPYGVIALLATFGRPGLVSPVASVIPSILAKSSTVFNPIIYILMNKQFYKCFLMLLHCQP TMT2_xenTro REGAGTSCSVRWTSESVESVSYIICLFIFCLALPVFVMLYCYGRLLYAVKQ----VGKIRKIA----ARKREYHVLFMVITTVICYLLCWLPYGVVALLATFGRPGVISPVASVVPSILAKSSTVFNPIIYILMNKQFYKCFLILFHCHP TMT2a_danRer LEGAGTSCSVSWTQRTAESHAYIICLFVFCLGLPVLVMVYCYGRLLYAVKQ----VGKIRKTA----ARKREYHVLFMVITTVVCYLLCWMPYGVVAMMATFGRPGIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFRILFCCQR TMT2b_danRer 
LEGAGTSCSVTWTANTPQSHSYIICLFIFCLGIPVLVMVYCYSRLICAVKQ----VGRIRKTA----ARRREYHILFMVITTVVCYLLCWMPYGVVAMMATFGRPGIISPIASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILIHCKH TMT2_tetNig IEGAGTSCSVSWTVQTAQSHAYIICLFIFCLGLPVLVMVYCYSRLLWAVKQ----VGKIRKTS----ARKREYHILFMVVTTAACYLVCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYKCFLILFHCSH TMT2_takRub IEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGIPILVMIYCYSRLLWAVKQ----VGRIRKTA----ARKREYHILFMVVTTAACYLVCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYKCFLILFHCGH TMT2_oryLat LEGAGTSCSVSWTANTAQSHAYIICLFIFCLGLPILVMIYCYSRLLLAVKQ----VGKIRKTA----ARKREYHILFMVLTTAACYLLCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILFHCDH TMT2_gasAcu IEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGLPMLVMIYCYSRLLLAVKQ----VGRIRKTA----ARRREYHILFMVLTTAACYMLCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILFHCKH TMT3_xenTro LESSGTTCSVVWHSKSSNNISYIVCLFLFCLVLPLFIMIFCYGHIVRVIRG----VCRINMTT----AQKREHRLLFMVVCMVTCYLLCWMPYGLVSLMTAFGKPGMITPTVSIIPSILAKSSTFINPLIYIFMNKQFYRCFIALIKCES TMT3_calMil LESSGTTCSVVWHSKSSNNISYIVCLFLFCLVLPLFIMIFCYGHIVRVIRG----VGKINQMT----AQTREHRILLMVISMVTFYLLCWLPYGTVALIGTFGNADLITPTCSVIPSILAKSSTVINPVIYVIMNKQFYRCFIALIKCES TMT3_danRer PEGPGTTCSVQWHLRSTSSISYVMCLFIFCLLLPLVLMIFCYGKILLLIKG----VTKINLLT----AQRRENHILLMVVTMVSCYLLCWMPYGVVALLATFGRTGLITPVTSIVPSVLAKSSTVVNPVIYVLFNNQFYRCFVAFLKCQG TMT3_oncMyk PEGPGTTCSVQWHQRSSGNISYVTCLFIFCLLLPLLLMMFCYGKILFAIRG----VAKINQSS----AQRRETHVLVMVVSMVSCYLLCWMPYGVVALLATFGQVGLVSPTTSIVPSILAKSSTFLNPVIYGLLNNQFYRCFLAFMSCGS TMT3_tetNig PEGAGTTCSVQWQLRSPASVSYVLCLLVFCLLLPFLVMVYSYGRILVAIRR----VGRINQLT----AQRREQHILLMVLSMVSCYMLCWMPYGIMALVATFGKLGLVTPMVSVVPSILAKFSTVVNPIIYMFFNNQFYRCFMAFIRCQK TMT3_takRub PEGPGTMCSVQWHLRSPANISYVLCLFIFCLLLPLVVMVYSYGRIWVAVRR----AGRINLLT----AQRREQHILWMVLSMVSCYMLCWMPYGIIALVATLGRLGPISPAVSVVPSILAKFSTVVNPVIYMFFNNQFYRCFMAFVRCQK TMT3_oryLat PEGPGTTCSIQWHLRSPTSVSYVLCLFIFCLVLPLVLMVYSYGRILVALRR----VGKINLLA----AQRREQRILVMVFSMVSCYILCWMPYGIVALMATFGRKGLVTPLTSVIPSILAKFSTVVNPVIYVFFNSQFYRCLVAFVRCSG TMT1a_anoCar 
PEGPGTTCSVNWHSRDANNISYIICLFIFCLVIPFIVIVYCYGKLLCAIKK----VSGVTQGM----AQTREQRVLIMVVVMIICFLLCWLPYGIVALIATFGKPGLITPSASIIPSVLAKSSTVYNPVIYIFLNKQFYRCFCALLKCGK TMT1a_xenTro PEGPGTTCSVNWHSRDANNISYIVCLFIFCLALPFAVIVYCYGRLLFAIKQ----VSGVSKSS----SRAREQRVLIMVIVMVVCFLLCWLPYGVMALVATFGKPGIISPSASIIPSVLAKSSTVYNPIIYIFLNKQFYRCFTALIHCNK TMT1a1_danRer PEGPGTTCSVNWAARTPNNVSYIVCLFVFCLILPFIVIVYSYGRLLQAITQ----VSRINTVV----SRKREQRVLFMVVTMVVCYLLCWLPYGIMALLATFGHPGLVTPAASIVPSLLAKSSTVINPIIYIFMNKQFCRCFHALIMCTT TMT1a_pimPro PEGPGTTCSVNWAARTANNVSYIICLFFFCLILPFIVIVYSYGRLLQAITQ----VSRINTVV----SRKREQRVLFMVITMVVCYLLCWLPYGIMALLAAFGRPGLVTPAASIVPSVLAKTSTVINPIIYIFMNKQFCRCFHALIMCTT TMT1a2_danRer PEGPGTTCSVDWTTKTANNISYIICLFIFCLIVPFLVIIFCYGKLLHAIKQ----VSSVNTSV----SRKREHRVLLMVITMVVFYLLCWLPYGIMALLATFGAPGLVTAEASIVPSILAKSSTVINPVIYIFMNKQFYRCFRALLNCDK TMT1a_tetNig PEGPGTTCSVNWTAKTANSVSYIICLFVFCLILPFLVIVFCYGKLLCAIRQ----VSGVNASM----SRRREQRVLFMVVVMVICYLLCWLPYGVVALLATFGPPGLVTPAASIIPSILAKSSTVINPVIYVFMNKQFSRCFLSLLCCED TMT1a_takRub PEGPGTTCSVNWTAKTTNSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQ----VSGINAST----SRKREQRVLCMVVIMVICYLLCWLPYGVVALLATFGPPDLVTPEASIIPSVLAKSSTVINPIIYVFMNKQFYRCFLALLCCQD TMT1a_gasAcu PEGPGTTCSVDWTARTANSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQ----VSGINASL----SRKREQRVLFMVVIMVVCYLLCWLPYGIMALMATFGPPGLITPVASIIPSVLAKTSTVINPVIYVFMNKQFYRCFKALLRCEA TMT1a_oryLat PEGPGTTCSVDWTAKTANNISYIICLFVFCLIVPFMVIVFCYGKLLYAIKQ----VSGINVSV----SRKREQRVLFMVVIMVICYLLCWLPYGIMALLATFGPPDLVTPEASIIPSVLAKTSTAINPVIYVFMNKQFYRCFKALLRCEA TMT1a_calMil PEGPGTTCSVNWQSKEVSSKSYIICLFIFCLLMPFLVIVYCYGKLVLAVRK----VSA-NNSM----GRTRENKLLIMVTFMIICFLLCWLPYGIVALLATFGSPGLITPTASIIPSVLAKTSTVYNPIIYIFMNKQFYRCFKALLRCEA TMT1b_tetNig PEGPGTTCSVDWRTQTPNNISYIVCLFAFCLLLPFCVILYSYGKLLHTIRQ----VSSVSSAV----TRRREHRVLVMVVAMVVCYLICWLPYGVTALLATFGPPNLLTPEATITPSLLAKFSTVINPFIYIFMNKQFYRCFRAFLSCSS TMT1b_takRub PEGPGTTCSVDWRTQTPNNISYIVCLFTFCLLLPFFVILYSYGKLLHTIRQ----VRRVSSTV----TRRREHRVLVMVVAMVVCYLICWLPYGVTALLATFGPPNLLTPEATITPSLLAKFSTVINPFIYIFMNKQFYRCFRAFLNCST TMT1b_gasAcu 
PEGPGTTCSVDWKTQTANNISYIVCLFVFCLVLPFCVILYSYSRLLQAIRQ----VSVVSSVV----TRHREQRVLAMVVVMVACYLVCWLPYGVAALLATFGPRDLLSPEASITPSLLAKFSTVVNPFIYIFMNKQFYRCFRAFLSCST TMT1b_oryLat PEGPGTTCSVDWKTQTPNNISYIICLFTFCLLLPFGVIVYSYGKMLRVIRQ----VRSMSSVV----TRRREQRVLVMVVTMVVCYLVCWLPYGIAALLATFGPRDLLTPAASITPSLLAKFSTVINPLIYIFMNKQFYRCFWAFFCCST TMT1a_braFlo PEGYGISCSVNWESRTANDTSYIVAYFVGCLVFPVAIIVISYTRLILYMRQ---QAPSAPMQM----LVRREKRVTKMVVVMIMGFTICWTPYTIVALIVTCGGEGIITPAAATVPALFAKSSVVYNAAIYVAMNNQFRKCFLRSLNCRS TMTq_braFlo LEGPKISCSVAWEEHSLSNTSYIVAIFIMCLLLPLLIIIYSYCRLWYKVKK---GSQNLPPAI--RKSSQKEQKIARMVVVMITCFLVCWLPYGAMALVVSFGGESLISPTAAVVPSLLAKSSTCYNPLVYFAMNNQFRRYFQDLLCCGR TMTq_braBel LEGPKIGCSVAWEEHSWSNTSYIVVLFITCLFAPLLIIVYSYYRLWHKVKQ---GSRNLPAAM--RKSSQKEQKIAMMVIVMITCFMVCWLPYGAMALVVTFGGERLISHTAAVVPSLLAKSSTCYNPVVYFAMNSQFRRYFQDLLCCGR TMTp_braFlo FEGHNVGCSVNWVQHNPDNVSYIVTLMVTCFFVPMVVVCWSYAWIWRTVRM---SSE-AKPEC--GNSQNAGRLVTTMVVVMIICFLVCWTPYAVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLQ--- TMTp_braBel FEGHSVGCSVNWVKHNVNNVSYIITLMVTCFFVPMVVVCWSYACIWRTVRM---SAE-MKSEF--GNPQNTGRLVTTMVVVMIVCFLVCWTPYTVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLR--- TMT1a_strPur YEGPGTACSVAWNSPSPGDTSYIIFIFVLVLVIPFGIIIFCYGLLVYAVKK----ISRTQAAL--SSEAKADRKVSKMIFIMILFFLIAWTPYTGFSLYVTFGKNVVITPLAGTFPPFFAKLCTIHNPIIYFLLNKQFKDALIQLFCCGE TMT1a_plaDum PEGLGTWCSISWQDRSMNTMSYVTAVFLGCYFFPVSIIIFCYFNVWRKVKE----AADAQGGA--GTAGKAEKSIFRMSVIMVTCYLTAWTPYAIVCLIASYGPPNGLPIYAEVLPSLFAKSSQVYNPIIYVLMNKPYRSALVSLVCRGR TMT1b_plaDum PEGLATWCSIDWLSDETSDKSYVFAIFIFCFLVPVLIIVVSYGLIYDKVRK----VAKT---G--GSVAKAEREVLRMTLLMVSLFMLAWSPYAVICMLASFGPKDLLHPVATVIPAMFAKSSTMYNPLIYVFMNKQFRRSLKVLLGMGV TMT1_apiMel PEAGNVSCSVSWEDPVTNSDTYIGFLFVLGLIVPVFTIVSSYAAIVLTLKK------VRK-RA--GASGRREAKITKMVALMITAFLLAWSPYAALAIAAQYFNAK-PSATVAVLPALLAKSSICYNPIIYAGLNNQFSRFLKKIFDA-- TMT1p_anoGam NEAANISCSVNWESQTANATSYIIFLFIFGLILPLAVIIYSYINIVLEMRK------NSA-RV--GRVNRAERRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFWRIRRSNG TMT1q_anoGam 
QEAANISCSVNWESQTKNATTYIIFLFVFGLVVPLIVIVYSYTNIIVNMRE------NSA-RV--GRINRAEQRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFSRVRNK-- TMT1_aedAeg NEAANISCSVNWESQTLNATSYIIFLFVFGLVVPLVVIVYSYTNIVVNMKR------NAA-RV--GRINRAEKRVTRMVFVMVLAFMIAWTPYAVFALIEQFGPTDIISPALGVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRNN-- TMT1_culPip NEAANISCSVNWETQTLNATTYIIYLFVFGLVVPLTVIVYSYTNIIVNMKK------NAA-RV--GRINRAEKRVTTMVAVMVIAFMVAWTPYSVFALMEQFGPPDVIGPGLAVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRHD-- TMT1_bomMor NEAANIQCSVNWHEQSTNTLTYIMFLFAMGQILPLSVITFSYVNIIRTLKR------NSQ-RL--GRVSRAEARATAMVFIMIIAFTVAWTPYSLFALMEQFGGVH-ISPVVSIIPALCAKSSICWNPIIYIGLNTQFRAAFNRVRHD-- TMT1_rhoPro LEAANISCSINWETRSHSSTSYILFLFTFGFFIPIIVISYSYMNIILTMKK------STM-NA--GRVNKAESRVTWMIFVMIFAFFLAWTPYAILALMIAFFDSN-VSPAIATIPAIFAKTSICYNPFIYAGLNTQFRAAFNRVRHD-- TMT1_triCas HEAANLSCSVNWEEKSPNSTSYILYLFAFGLFLPLVIITFSYVNIILTMRR------NAAFRV--GQVSKAENKVAYMIFIMIIAFLTAWSPYAIMALIVQFGDAALVTPGMAVIPALLAKSSICYNPVIYIGLNAQVKGAKWVSGLIYL TMT1_acyPis NESANISCSIDWESGEHS--NYVIYIFVFGLFLPVTVIIYSYVSLVVTVRK------AEK-II--GQATKAECRVAIMVAVMILAFLTAWMPYSVLALMIAFGGVH-ISPVVSIIPALCAKSSICWNPIIYIGLNTQFRSAWKRFLNIQD TMTr_strPur YEGTGAWCSVRWDSDQPQIMSYVLAMMFLTFISSIVIMMYCYICIFLTTRRMPRWATSNSIKTHERNRRRREQKLLKTLIAIAIAFLVAWSPYAITSMIVVFGGSELLSLTATTLPSLFAKSSVMINPIIYAVTSRVFRKSLKKMLTPGC TMT1c_braFlo QVAAMTVCSLDWDHHTPLSKSYIPVAFLTCLFLPLGVIIFSVFKTTMHLRRAAEVEDEVPNEV------RAGRKTTRITLVMAGCWLVAWLPYACMALVIAAGGR--VSPTVEVLATKFAKTSYIVNTIIYLVMEKEFRKSLVLLLFDPF PPINa_cioInt TEGLGTSCAPNWFVKEKRERLFIILYFVFCFVIPLAVIMICYGKLILTLRQ-------IAKESSLSGGTSPEGEVTKMVVVMVTAFVFCWLPYAAFAMYNVVNPEAQIDYALGAAPAFFAKTATIYNPLIYIGLNRQFRDCVVRMIFNGR PPINa_cioSav TEGLGTSCAPNWFVKGNTERLFIILYFVFCFLIPLAIIVLCYGKLILQLRQ-------IAKESSLSGGTSPEGEVTKMVVVMVTAFVICWLPYAAFAMYNVVNPEAQIDYALGAAPAFFAKTATIYNPLIYIGLNRQFRDCVVRMIFNGR PPINb_cioInt PEGLGTSCAPNWFSKNKSERIFIFVYFVFCFFIPLLVIIICYGKIVLFLKQ-------ATRQSSASSNRQADNKVTKMVLVMISAFLICWTPYGVLSLYNAINPDKQLDYGLGAVPVFFAKTANIYNPLIYIGLNKQFRDGVIKMVFRGR PPINb_cioSav 
PEGFGTSCAPNWFSQQKSERIFIFAYFAFCFLTPLTIIFACYLKLILFIRK-------VSKKSMVNEADRRDFEVTRMVFVMIAAFLICWLPYGCLSMYNAIHPDNLLSYGIGSVPAFFAKTATIYNPIIYMGLNKKFRDGVIRMLFKGR Consensus Eg g CSv W s%! lF cl P !i cYg # v Mv M! %$ cW PY a$ fG p Ps AKsSt NP IY $n qFr Prim.cons. PEGAGTSCSVDWTSKTPNS2SYIICLF2FCLVLPVLVIVYCYGR2LYAVRQLRSVVGKINKQVSLGKARRREQRVLFMVVVMVICFLLCWLPYGVVALLATFGPPGLVTPTASIIPSLLAKSSTVYNPIIYIFMNKQFRRCFLALLCC22
Origins of melanopsins
Melanopsins are well represented in all three bilaterian clades -- the only sequenced genome to date lacking a melanopsin is that of the acorn worm Saccoglossus. Many erratically named genes in arthropods and molluscs are actually simple orthologs, traced through the bilaterian ancestor, of the first described melanopsin locus in Xenopus. A single melanopsin locus existed in ur-Bilateria. It is not currently possible to specify its syntenic relationships.
In Ecdysozoa, the melanopsin locus duplicated early on, with the copies specializing to ultraviolet and long wavelengths but evidently remaining under the same strong and unusual selection in the third cytoplasmic loop, attributable perhaps to protein-protein interaction involving the G-protein alpha subunit specialized for Gq signaling. The ultraviolet melanopsin largely retains the ancestral intronation (based on deuterostome and lophotrochozoan outgroup sequences), whereas the longwave form largely lost these introns but acquired others. The duplication was segmental rather than retropositional because a distal intron at EVTR 252 00 is still shared (3' introns are the first to be lost in retropositioning).
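The intron argument above can be sketched as a simple set comparison. This is only an illustration: the function names and the example intron coordinates are hypothetical, apart from the shared distal intron at position 252, phase 0, which is taken from the text. A shared intron is read as consistent with segmental duplication; the absence of shared introns is left undetermined, since introns can also be lost independently.

```python
from typing import Set, Tuple

# An intron is described here by (alignment position, phase).
# This representation is an assumption made for the sketch.
Intron = Tuple[int, int]


def shared_introns(a: Set[Intron], b: Set[Intron]) -> Set[Intron]:
    """Return introns present at the same position and phase in both paralogs."""
    return a & b


def duplication_mode(a: Set[Intron], b: Set[Intron]) -> str:
    """Classify a duplication from intron sharing alone.

    A retroposed copy derives from spliced mRNA and so should share no
    introns with its parent; any shared intron points to segmental
    duplication. No sharing is inconclusive on its own.
    """
    return "segmental" if shared_introns(a, b) else "undetermined"


# Hypothetical intron sets; only (252, 0) comes from the text.
ultraviolet = {(90, 1), (160, 2), (252, 0)}
longwave = {(40, 0), (252, 0)}

print(duplication_mode(ultraviolet, longwave))
```

The real comparison depends on a reliable alignment, so that intron positions in the two paralogs are expressed in the same coordinates.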
These ecdysozoan melanopsin paralogs in turn underwent additional expansion -- in some cases many duplications -- depending on the specific lineage. The resulting copies refine imaging vision and do not have auxiliary functions. These end-leaf specializations and their evolutionary adaptiveness are best pursued in the original journal articles -- the emphasis here is the broader sweep of opsin evolution.
Melanopsins in lophotrochozoans have a simpler history, though taxonomic sampling leaves something to be desired at this point. As in ecdysozoans, they seem to provide all the opsins used in imaging vision. (Peropsins and, rarely, cilopsins are also found in this clade.)
Melanopsins do not provide imaging vision in any living deuterostome and probably never have. Gene retention is very high and the rate of evolution has been quite slow, indicative of important roles. A single locus duplication occurred post-chondrichthyan but pre-teleost, and both copies persist in living fish, frogs, lizards, and birds but not in any mammal. This could not reflect whole genome duplication (WGD) because that would be restricted to fish, in conflict with the observed continuing synteny and BLAST clustering. Indeed, two rounds of WGD would have produced 8 copies from the two pre-existing paralogs, but none of these additional copies survived in any of the five available fish genomes.
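The copy-number arithmetic behind the WGD argument is plain doubling. A minimal sketch, assuming every locus is duplicated at each round and nothing is lost (the function name is hypothetical):

```python
def copies_after_wgd(loci: int, rounds: int) -> int:
    """Expected gene copies after whole genome duplication.

    Assumes each round doubles every locus and no copies are lost,
    which gives an upper bound, not an observed count.
    """
    return loci * 2 ** rounds


# Two pre-existing paralogs, two rounds of WGD.
print(copies_after_wgd(2, 2))  # 8
```

The gap between this upper bound and the two loci actually observed is what the argument above relies on.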
(to be continued shortly)
See also: Curated Sequences | Tetrachromatic Ancestral Mammal | Ancestral Introns | Informative Indels | Update Blog