Opsin evolution: orgins of opsins: Difference between revisions

From genomewiki


In summary, while thousands of articles address the highly derived RHO1 locus, the much more fundamental ciliary ur-opsin TMT has scarcely been investigated. Until its core function is better understood, the early history of ciliary opsins in vertebrates will remain a mystery. Without the core role that allowed TMT to be retained for tens of millions of years in the pre-vision era, no ciliary opsin would have been available for later expansion into imaging ciliary opsins.
Below is a preliminary assessment of the three TMT loci and the related encephalopsin locus. The right column revises nomenclature relative to that of the reference collection (second column). Gene names arose in the pre-genomic era when the full paralog complement, phylogenetic distribution and sites of expression were not understood. With complete genomes now in hand, a final and sensible nomenclature can be envisioned.
The aligned TM2 region proves useful for defining the diagnostic residues and indels of these four gene classes, which are needed to sort out their evolutionary relationships. As the names suggest, the topology is (TMT1,(TMT2,(TMT3,ENC))), rooted by protostome ciliary opsins, which indicate that TMT1 is the original ur-opsin. The final column contains synteny information that validates the proposed history of gene duplication. Gene order is barely conserved, but enough remains for the duplication history to be unravelled.
<font color="green">ENC_homSap    ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_otoGar    ENCEPH_oto NN LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_loxAfr    ENCEPH_lox NN LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_pteVam    ENCEPH_pte NN LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLRNGWVWDT VGC ...
ENC_musMus    ENCEPH_mus GN LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLRNGWVWDA VGC ...
ENC_canDom    ENCEPH_can CH FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLRNGWVWDS VGC ...
ENC_monDom    ENCEPH_mon NN LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLRSGWVWDS VGC syn(-EXO1  -WDR64  +ENC -KMO    +FH +RGS7)
ENC_galgal    ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC syn(-EXO1  -WDR64  +ENC        -PIGM +RGS7)
ENC_anoCar    ENCEPH_ano NN LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLRHRWVWDA AGC syn(-EXO1  -WDR64  +ENC        -PIGM +RGS7)
ENC_xenTro    ENCEPH_xen NN LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVRGSWAFSV EMC syn(-ASAH3L -ACER2  +ENC    -ADFP -DENND4C)
ENC_danRer    ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC syn(-MTRF1L -TMEM63B +ENC -KMO +IDE -MARCH5 +CPEB3 -BTAF1)
ENC_takRub    ENCEPH_tak NN FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQGRWTWTQ ATC syn(-ABLIM1 -PTK7    +ENC -KMO +IDE        +CPEB4        -CCNJ)
ENC_gasAcu    ENCEPH_gas NN VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIRGGWTWSR ATC syn(FAM82A  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4        -CCNJ)
ENC_oryLat    ENCEPH_ory NN LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVKGRWMWSQ ATC syn(CYP1B1  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4 -BTAF1 -CCNJ)
ENC_calMil    ENCEPH_cal NN ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQGRWGWDS AAC ---
ENC_squAca    ENCEPH_squ NN LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVKGRWVWDS AAC ---
ENC_petMar    ENCEPH_pet NN LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVRRRWVWGN GGC ---
ENC_braFlo    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(    -ZFYVE1 +RTF1 +ENC -CES1 -POMT2)
ENC_braBel    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQHRWLWGR PGC ---
ENC_braFlo    TMT5_braFl SN GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(NKX2 -ZFYVE1 +RTF1 +ENC ERF1 TMED9 LARS2)
ENC_braBel    TMT5_braBe SN GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC ---</font>
<font color="blue">TMT3_monDom  TMT_monDom SN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQGRWIGGK HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2 RALY SLC5A7 +SULT1E1)
TMT3_macEug  TMT_macEug NN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIRGRWIAGY HGC ---
TMT3_ornAna  TMT_ornAna NN LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNISGRWIGGD PGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -UCHL3  +TBCID4)
TMT3_galGal  TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7 +SULT1C4)
TMT3_taeGut  TMT_taeGut NN LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIRGKWIGGD HAC ...
TMT3_anoCar  TMT_anoCar NN LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIYGRWIGGE HGC syn(            +TMT3 -ST6GAL2_overlap +SLC5A7 RANBP2)
TMT3_xenTro  TMT_xenTro NN FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIKGKWIGGE YGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7)
TMT3_danRer  TMT_danRer NN LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVRGRWLLGR HGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap    +GPR89A -PDZK1l)
TMT3_tetNig  TMT_tetNig NN FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLRGRWLLGR SGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap)
TMT3_takRub  TMT_takRub NN FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR IGC ...
TMT3_gasAcu  TMT_gasAcu NN LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLRGKWLLGR SGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -TFDP2 POU2)
TMT3_oryLat  TMT_oryLat NN FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR GGC ...</font>
<font color="brown">TMT2_anoCar  TMTa_anoC  NN LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT P FSFAASTQGKWLIGP AGC syn(VAMP PER2 HES6 TUBA1 GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_xenTro  TMTa_xenT  NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC syn(VAMP PER2 HES6      GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_danRer  TMTb_danRe NN TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC syn(-PTCHD1 -PHEX -CNKSR2 SH3KBP1 -MAP3K15 TNK2 +TMT2 -MYEOV2 -MAP4K4 PRMT6)
TMT2_tetNig  TMTb_tetNi SN LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLLGR AGC syn(MYO3A GAD2 ARHGAP21 TFR2 +TMT2  MYEOV2 SH3KBP1 MAP3K15 PHEX PTCHD1)
TMT2_takRub  TMTb_takRu SN FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT2_gasAcu  TMTb_gasAc SN FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT P FSFAASTQGRWLIGR AGC syn(PTCHD1 PHEX MAP3K15 SH3KBP1 +TMT2 -MYEOV2 ARHGAP21 MYO3A)
TMT2_oryLat  TMTb_oryLa SN LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...</font>
<font color="purple">TMT1_danRer  TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC syn(      RAB25 PBX3 TNK2 +TMT1  WAC +LPPR4 +AGL PTBP1)
TMT1_tetNig  TMTa_tetNi SN LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT P FSFAASTQGRWLIGA AGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_takRub  TMTa_takRu NN LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQGRWLIGE AGC syn(    LRRN3 CALD1 TNK2 +TMT1      RAB18 YME1L1 ABI1 TLK1 EDRNB)
TMT1_gasAcu  TMTa_gasAc NN LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_oryLat  TMTa_oryLa NN LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC ...
TMT1_pimPin  TMTa_pimPr NN TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC ---
TMT1_oncMyk  TMTa_oncMy SN LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT P FSFAASLYGRWLIGA HGC ---
TMT1_calMil  TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC ---
TMT2_calMil  TMTa2_calM NN LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT P FSFAASIYRRWLIGH KGC ---
TMT1_braFlo  TMTx_braFl NN STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT P FSFVSSLHGRWMFGH SGC syn(TNPPO2 HECTD3 ABCCA4 PRPRA TMT ATP5D TMT PTPRA FDE4A PTPRA PYRNXN1
TMT2_braFlo  TMTy_braFl TN LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS P FAVASAVGERWLFGR AVC syn(ZFYVE1 FBXL4 RTF1 TMT CES4 TMTY POMT2 GSTZ1)
TMT1_strPur  TMTPIN_str NN GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS P FTFASNFAGRWLFGD LGC syn(ARG2 NEK9 FAM164A ZC3H14 TMT PRPF39 YIPF4 SPATA5)
TMT2_strPur  ENCEPH_str GN SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI P FVTASIYAHRWLFAH VGC ---</font>
<font color="red">TMT1_apiMel  TMT_apiMel AN LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAATGGWYWGY KMC syn(HEX MAK FASN SPTBN4 PSMA3 TMT LSM11 SEC23A KNSL8)
TMT1a_anoGam  TMT1_anoGa LN IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWLYGK SIC ...
TMT1b_anoGam  TMT2_anoGa LN LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWIFGR TLC ...
TMT1_aedAeg  TMT_aedAeg LN LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN P FTLTSAISRHWIFGR TVC ...
TMT1_culPip  TMT_culPip LN LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN P FTLSSAISHRWLFGR KLC ...
TMT1_triCas  TMT_triCas LN LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAINYGWIFGE TGC ...
TMT1_bomMor  TMT_bomMor LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGH TMC ...
TMT1_helVir  TMT_helVir LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGK TMC ...
TMT1_rhoPro  TMT_rhoPro GN LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN P FTLASAIAKRWFFGQ SMC ...
TMT1_acyPis  TMT_acyPis FN TCVIFIMIRDTRLWTPQNVIIFNLATS D LAVS-VLGN P VTLAAAITKGWIFGQ TIC ...
TMT1a_dapPul  TMTa_dapPu MN IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT P ISAAAALKFTWPFSH ELC ...
TMT1b_dapPul  TMTb_dapPu MN VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT P ISTAAALEFGWPFSQ ELC ...</font>
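
The diagnostic-residue idea can be sketched in a few lines of Python. This is only an illustration, assuming the whitespace-separated layout of the rows above (gene name, reference name, then the aligned blocks); a handful of rows are copied from the table, and the helper names `residues_by_class` and `diagnostic_classes` are introduced here, not part of any existing tool. It tabulates the single-residue column that follows the indel-bearing block.

```python
from collections import defaultdict

# A few rows copied from the alignment above (synteny column omitted).
ROWS = """
ENC_musMus   ENCEPH_mus GN LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLRNGWVWDA VGC
ENC_galgal   ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC
TMT3_galGal  TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC
TMT3_danRer  TMT_danRer NN LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVRGRWLLGR HGC
TMT2_xenTro  TMTa_xenT  NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC
TMT2_danRer  TMTb_danRe NN TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC
TMT1_danRer  TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC
TMT1_takRub  TMTa_takRu NN LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQGRWLIGE AGC
"""

def residues_by_class(rows, field_index):
    """Collect the residues seen in one whitespace-separated field, per gene class."""
    by_class = defaultdict(set)
    for line in rows.strip().splitlines():
        fields = line.split()
        gene_class = fields[0].split("_")[0]  # e.g. "TMT1" from "TMT1_danRer"
        by_class[gene_class].add(fields[field_index])
    return dict(by_class)

def diagnostic_classes(by_class, residue):
    """Classes in which `residue` is the only state observed in the sampled rows."""
    return sorted(c for c, states in by_class.items() if states == {residue})

# Field 6 is the single-residue column after the indel-bearing block.
column = residues_by_class(ROWS, 6)
for gene_class, states in sorted(column.items()):
    print(gene_class, sorted(states))
print("fixed proline:", diagnostic_classes(column, "P"))
```

In this sample the proline is fixed in TMT1 and TMT2 and replaced in TMT3 and ENC, which is consistent with (though by itself not proof of) the grouping of TMT3 with ENC.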


=== Origins of melanopsins ===

Revision as of 12:22, 27 December 2009

Introduction: the origin of opsins

OpsinOrigins.jpg

The origin of the first opsins is a bit murky. Opsins are operationally defined here as 7-transmembrane proteins homologous in structure and sequence to GPCR, with a (Schiff base) lysine in TM7 that aligns with K296 of bovine rhodopsin (or the equivalent lysine of any established opsin).
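
The operational definition amounts to a column lookup in a pairwise alignment. A minimal sketch, assuming the query has already been aligned to the reference with '-' as the gap character; the function names and the toy sequences are invented for illustration (a real check would use bovine rhodopsin and position 296).

```python
def residue_at_reference_position(ref_aligned, query_aligned, ref_pos):
    """Return the query character aligned to 1-based position `ref_pos` of the ungapped reference."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    seen = 0  # number of reference residues passed so far
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != "-":
            seen += 1
            if seen == ref_pos:
                return query_char
    raise ValueError("ref_pos lies beyond the end of the reference")

def has_schiff_base_lysine(ref_aligned, query_aligned, ref_pos=296):
    """True if the query carries lysine at the column of the reference lysine."""
    return residue_at_reference_position(ref_aligned, query_aligned, ref_pos) == "K"

# Toy example: the reference lysine is residue 4 of the ungapped reference "ACDKF".
REF = "AC-DKF"
print(has_schiff_base_lysine(REF, "ACGDKF", ref_pos=4))  # query with lysine in that column
print(has_schiff_base_lysine(REF, "AC-DIF", ref_pos=4))  # query with isoleucine in that column
```

A gap in the query at that column returns '-' and so also fails the test, which seems the sensible behaviour for fragmentary sequences.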

This section moves forward in time from the parental gene content of the immediate ancestral genome (greatly facilitated by the new Trichoplax and Monosiga assemblies) that gave rise to the first opsin via gene duplication and neofunctionalization of one copy to photoreception. Subsequent sections work backwards in time, first coalescing separate gene trees of ciliary, melanopsic and other opsins to their respective ur-opsins and ultimately deducing properties of the crown group opsin.

The opsin origination event was not necessarily unique -- GPCR always retain many essential properties through their own evolutionary constraints and conceivably could have given rise to opsins at widely scattered intervals from rather different parental genes. In this type of history, the minimal gene tree containing all opsins is not 'monophyletic' but instead contains embedded non-opsin GPCR. Nothing prevents an established opsin from later giving rise to a gene duplicate that 'reverts' to a non-K296 GPCR. Conceivably the lysine could be retained even as the photobiological functionality is lost.

In the case of multiple such opsins surviving to the present day, branches will coalesce first to separate parental non-lysine GPCRs, which in turn eventually coalesce -- as all GPCR must do -- to a master parental gene.

PolyphylOpsins.png

A gene tree illustrates these hypothetical complexities at left:

-- opsins arose independently from GPCR at nodes 2,3 and 4

-- these opsins initially coalesce to 3 ancestral opsins

-- the first two groups of opsins coalesce to a parental gene at node 1 whose descendants include 7 GPCR

-- at node 5, an opsin has 'reverted' to a new GPCR, also a descendant of these opsins' parent gene

-- the full set of opsins coalesces at a master parental gene at node 0 with numerous non-separable GPCR descendants

This scenario -- the molecular version of whether 'vision' arose once vs multiple times -- can be ruled out for bilaterian opsins (provided the relevant GPCR outgroups have left descendant genes) but still must be considered seriously in the case of cnidarian opsins and perhaps ctenophores and sponges as well. It appears today, however, that the entire bilaterian opsin set forms a single branch exclusive of all non-K296 GPCR in the tree generated from the roughly 100,000 known GPCR. (For practical reasons, only near-opsin GPCR can be considered.)
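
Whether a set of opsins forms a single branch exclusive of other GPCR can be checked mechanically once a rooted gene tree is in hand. A minimal sketch, assuming the tree is written as nested Python tuples with leaf names as strings; the leaf names and both toy trees are invented for illustration.

```python
def leaves(tree):
    """Set of leaf names under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result

def is_monophyletic(tree, group):
    """True if some subtree of the rooted tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

OPSINS = {"opsinA", "opsinB", "opsinC"}

# Single origin: all opsins share a branch that excludes the other GPCR.
single_origin = ((("opsinA", "opsinB"), "opsinC"), ("gpcr1", "gpcr2"))
# Independent origins: opsins are interleaved with non-opsin GPCR.
multiple_origins = ((("opsinA", "gpcr1"), "opsinC"), ("opsinB", "gpcr2"))

print(is_monophyletic(single_origin, OPSINS))
print(is_monophyletic(multiple_origins, OPSINS))
```

The test is only as good as the rooting and the outgroup sampling: if the relevant GPCR lineages left no descendants, an independent origin would go undetected.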

Events 600 million years ago may seem hopelessly inaccessible, and indeed many uncertainties will remain even after every relevant genome has been sequenced. However, sequencing to date has been phylogenetically lopsided, with far too little effort expended on early diverging non-model organisms with strategic tree positions. Yet comparative genomics has already provided substantial insights into certain aspects of opsin evolution:

  • The first opsins were not associated with gross morphological structures (such as stalked eyes) that could possibly leave a fossil record (as in trilobites) -- key events took place strictly at the molecular subcellular level. Genomes of extant species (some more than others) are not exactly living fossils because the evolutionary accrual of mutations never ceases.

Cases exist of opsins demonstrably obliterated both by gradual pseudogenization and large scale deletions, confusing the record. Yet opsin genes and even their regulatory regions, when compared across the entire metazoan tree, can furnish reliable reconstructions of opsin content and even sequence at ancestral species divergence nodes.

  • Opsins are definitely not the 'original' GPCR because these were already widely deployed at much earlier divergence nodes -- yeast, protozoa, choanoflagellates and Trichoplax have GPCR but lack opsins. Nor are opsins the prototype for the 'rhodopsin class' R of the GRAFS classification of GPCR, which again was established far earlier. Indeed, even the Ralpha subgroup within rhodopsin class GPCR was well established prior to the first metazoan opsin.
  • Opsins are thus latecomers, not pioneers, to a rapidly expanding paralogous gene clade within already full-featured GPCR. Judging by their closest extant blastp relatives among tens of thousands of GPCR at GenBank, opsins specifically arose as a gene duplication within the peptide receptor subgroup PEP. Indeed, certain of these proteins list opsins among their top ten best back-blast matches (ie have better matches than to almost all non-opsin GPCR). Note here that blast scores can be misleading because the 'floor' of percent identity is about 25% just due to universal conserved residues plus accidental matches.
  • Note an 'intermediate' GPCR does not exist: either lysine is present at K296 or it isn't. Reconstructing ancestral states from the best contemporary set of GPCR proteins lacking K296 cannot produce a lysine there by any rational methodology. The 20 encoded amino acids can be clustered into subgroups (eg by polarity or bulk) but ultimately form an unorderable discrete set that does not furnish continuum transitional states.
  • Most likely the parental gene had several introns and the original opsins inherited this pattern (ie the duplication was segmental rather than retroprocessional as in some cnidarian opsins). The history of introns within opsins is already complex and becomes quite problematic within the enveloping GPCR gene family. Opsins (with the exception of a fragmentary sea urchin melanopsin) lack the ubiquitous phase 21 intron breaking the DRY motif arginine.
  • Intracellular targeting of early opsins was likely to cytoplasmic or endoplasmic reticulum membranes as isolated monomers, with limited microvillar or especially ciliary specialization (to motile larva) also plausible. These opsins were the first eyes to the world but only in the sense of indicating the intensity (and later directionality) of sunlight striking the cell utilizing already refined GPCR second messenger signal transduction.
  • Opsin creation does not imply saltatory evolution because the basics had been established far earlier -- the 7-transmembrane helical structure with fixed topology, the TM1-TM2 salt bridge N55-D83 that could serve as initial counterion, the DRY ionic lock, the GWS.Y..E.....C..DW........SY region of EC2, the NPxxY terminal helix, the conformational shift upon binding of ligand that could trigger signaling, the Galpha protein binding site needed for the signaling cascade, and an arrestin-type mechanism signaling termination. The earliest opsins contained and continued all of these features from the get-go, adapting them over the course of time to various photoreceptive functions.
  • Opsins are unique among GPCR in several respects: they catalyze a mild in-situ enzymatic reaction -- cis-trans photoisomerization -- that furnishes the signaling agonist. (This reaction also occurs thermally without enzyme, but so does carbon dioxide dissolution in water, yet humans have 15 carbonic anhydrases.) Cis-retinal, being lipid soluble, does not diffuse through the extracellular milieu to reach its receptor binding site as in all other GPCR. Instead it is covalently bound to a lysine deeply internal to TM7, again unprecedented among GPCR (though other internal charged amino acids can occur, notably the D83 aspartate of the salt bridge and K90 of ultraviolet opsins).
  • Opsins did not arise from flavin-based cryptochromes, mechanistically different photoreceptors that evolved much earlier to establish circadian rhythm and eventually magneto-sensing. Cryptochromes are homologous to DNA photolyase repair enzymes, not GPCR.
  • Although literature searches turn up scattered assertions about 'opsins' in species such as Chlamydomonas ('chlamyopsin' Z48968) and 'volvoxopsin', not to mention bacterial 'rhodopsins', these amount to abusive terminological metaphors, unwelcome additions to an already complex gene family. These proteins neither have seven transmembrane helices in the same arrangement as GPCR nor possess the slightest sequence homology at deeply conserved GPCR residues, so they represent independent evolution of photobiology (along the lines of bat and butterfly wings representing independent origins of flying).
  • Conceivably forerunners of opsins bound a related chromophore non-covalently, perhaps an all-trans retinoid in the manner of peropsins. Retinoic acid is sometimes proposed as ancestral ligand but retinoic acid receptors (RAR and RXR) are non-GPCR nuclear hormone receptors that bind all-trans-RA or 9-cis-RA but not 13-cis-RA. Furthermore, the GPCR receptors inducible by retinoic acid -- RAIG1 proteins (GPRC5C etc) -- belong elsewhere in the GRAFS classification, have no particular affiliation with opsins and again do not bind retinoids themselves. The fact that pseudo-opsin chromophores are similar retinoids may be coincidence arising from the ubiquity of metabolic carotenoids (availability) and the restricted number of biochemicals (isoprenoids but not amino acids) with tunable absorption in the visual range (suitability).
  • In principle, GPCRs could continue to spawn new clades of opsins from time to time. However, they did not in bilaterians. That is, no gene tree of a bilaterian opsin coalesces with a GPCR gene later than the bilaterian common ancestor. All bilaterian opsins are descended from one of five opsin classes present in the ur-bilaterian. Indeed, the gene tree comprising all opsins excludes all other GPCR, consistent with a unique K296 origination event. However, it remains possible that some cnidarian or ctenophoran opsins arose from a second wing of GPCR, with no representative of this opsin surviving in bilaterians.

Two genes in separate species are by definition orthologous only when descended vertically from a single gene in their last common ancestor. It appears that all bilaterian opsins -- after accounting for later clade-specific expansions and losses -- are orthologous to either a cilopsin, melanopsin, peropsin, rgropsin, or neuropsin at the bilaterian common ancestor. ('Rhabdomeric' protostome opsins do not define a separate class but instead coalesce with vertebrate melanopsins.)

These 5 opsin classes appear not fully coalesced even at the last common ancestor of bilaterians with cnidarians -- while sequence data are woefully limited today in early taxa, it seems both the melanopsin and cilopsin classes existed in this ancestor, perhaps along with other opsin classes no longer represented in bilaterians. Conversely, peropsins have been retained in lophotrochozoan, ecdysozoan, and deuterostome lineages but not in any cnidarian sequence to date. Neuropsins survived solely in chordates, whereas rgropsins are even more restricted, to vertebrates, even though they could not have originated there. These latter genes are conceptual analogs of cnidarian-only opsin classes.

All opsins are homologous so any given pair is ultimately orthologous at some earlier common ancestor -- but which one? The species tree itself is confused here on sistering vs independent nodes at cnidarian/ctenophore. The single ctenophore opsin available -- regrettably just a distal fragment -- is difficult to classify. The fact that its best blast matches cluster about equally well with melanopsins and cilopsins (to the exclusion of other bilaterian classes) suggests that their merger is not far off.

The opsin gene tree can largely be worked out and coordinated with species tree divergences. Despite many efforts at this, some deeper topology remains problematic. It appears from sequence clustering, indel analysis, and especially intron conservation that ((peropsin, rgropsin),neuropsin) is a valid subgroup. Further, this assemblage associates more closely with cilopsins, leaving a final topology to be superimposed on the phylogenetic tree:

gene tree    ((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);
species tree (((((((((echinoderm,acornworm),amphioxus),tunicate),vertebrate),((chelicerate,(crustacean,insect)),(mollusc,annelid))),cnidaria),ctenophore),trichoplax),sponge);
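
The parenthesized trees above are in Newick notation and can be read mechanically. A minimal sketch of a parser, assuming leaf names only (no branch lengths, support values or internal labels); the function name `parse_newick` is introduced here for illustration and the gene tree string is copied from the text.

```python
def parse_newick(text):
    """Parse a Newick string with bare leaf names into nested tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at offset {pos}")
            pos += 1                      # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]            # a leaf name

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree

GENE_TREE = "((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);"
gene_tree = parse_newick(GENE_TREE)

# Peel off successive outgroups, from the root inward.
ingroup, outgroup = gene_tree
print("outgroup:", outgroup)
print("next to diverge:", ingroup[1])
```

Reading the tree this way makes the nesting explicit: the peptide-receptor GPCR root the tree, cnidarian opsins diverge next, then melanopsins, leaving cilopsins grouped with the peropsin, rgropsin and neuropsin assemblage.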

Nearest neighbors of opsins among GPCR

The immediate outgroup of opsins lies among a vast number of GPCR. The reference collection defines a close-in subset utilizing human GPCR, which have the best prospects for a determined ligand. Note that blast score order is not ideal because the scores are squeezed between a 'floor' of ~23% identity, attributable to universally conserved residues plus accidental matching, and a 'ceiling' of ~30% required to remain non-opsin.
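
The narrowness of that window is easy to appreciate with a direct percent-identity calculation. A minimal sketch, assuming two sequences already aligned to equal length with '-' for gaps; the 23% and 30% defaults come from the paragraph above, the function names are invented here, and the two fragments are the TM2 blocks of human and Otolemur encephalopsin from the alignment on this page.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue                      # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

def in_outgroup_band(identity, floor=23.0, ceiling=30.0):
    """True if identity falls between the background floor and the non-opsin ceiling."""
    return floor < identity <= ceiling

ENC_HOMSAP = "LLVLVLYYKFQRLRTPTHLLLVNISLS"
ENC_OTOGAR = "LLVLVLYYKFPRLRTPTHLFLVNISLS"

identity = percent_identity(ENC_HOMSAP, ENC_OTOGAR)
print(round(identity, 1), in_outgroup_band(identity))
```

Two orthologous opsins sit far above the ceiling, whereas every candidate outgroup GPCR must be ranked inside a band only about seven percentage points wide, which is why rank order alone is weak evidence.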

None of these GPCR represents the actual parental gene of opsins because they have themselves evolved forward some 600 million years from the putative opsin creation event. Conceivably one or more is also directly descended from it. The consensus line of the alignment below perhaps represents a better approximation to the desired ancestral sequence. It is difficult to reconstruct an ancestral sequence accurately because non-adjacent opsin residues co-evolve, creating algorithmic errors in methods that neglect this. Some co-evolving residues are suggested by structural studies but not all relationships can be described.

OpsinOutgroup.jpg

Opsins are not the 'original' GPCR (which are trackable, barely, to yeast) even for the 'rhodopsin' group R (or even its Ralpha subgroup) within the GRAFS classification but rather form a specialized set that arose later as the rhodopsin gene class (which contains the AMIN cluster [adrenalin, serotonin, dopamine, and histamine receptors], MECA branch [peptide and lipid binding receptors] in addition to opsins) underwent significant expansions.

This expansion of the Ralpha class had largely taken place in the last common metazoan ancestor shared with Monosiga and Trichoplax (which do not contain opsins), implying the ancestral metazoan lacked them as well. The orphan receptors GPR21 and GPR52 form the immediate outgroup (within the 800 human GPCR) in an oft-cited 2003 study. These have isoleucine at K296; their ligands are still not known as of Dec 2009. Conservation is high throughout deuterostomes; blast matches are restricted within opsins to molluscan melanopsins, suggesting Gq signaling.

The melatonin receptor MTNR1A emerges as a close relative to opsins. Curiously it plays a key role in circadian rhythms and so needs to coordinate with opsin photosensors. Its ligand, N-acetyl-5-methoxytryptamine, bears no obvious relationship to cis-retinal, however, and K296 is lacking, making an immediate parent gene relationship problematic.

Another clue to the origin of opsins might be provided by examining GPCR intron positions and phases to see whether they are shared with ancient introns in opsins. Many non-olfactory GPCR with sequence similarity to opsins have no introns or just one, suggesting the genes duplicated by retroprocessing, perhaps acquiring an intron at an unrelated position later. UROPS2 has an intron but it does not seem to correspond to one in any opsin. Cnidarian opsins are either intronless (Nematostella) or undetermined (just known from processed transcripts).

Closeness in the GRAFS tree does not fully accord with closeness of blastp hit and relatedness of diagnostic regions, suggesting (unsurprisingly) that its topology is slightly wrong at some internal nodes. Ranked by average position among blastp top scores (or by the average of the 5 best blast expectation values) when representatives of all opsin classes are aligned with the GPCR below, the highest scoring ones by far are the Trichoplax opsins, followed by various peptide receptors:

Rank  Gene          Exp    Exons  Receptor      Ligand

4.2   UROPS2_triAd  e-29   2      orphan        histamine? (HRH2:  best human non-opsin blast match)
5.4   UROPS1_triAd  e-28   1      orphan        peptide?   (SSTR1: best human non-opsin blast match)
5.6   SSTR1_homSap  e-26   1      somatostatin  peptide
7.2   TACR2_homSap  e-25   5      tachykinin    peptide
8.1   GALR1_homSap  e-24   3      galanin       peptide
8.9   MTNR1A_homSa  e-23   2      melatonin     N-acetyl-5-methoxytryptamine 
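
The averaging behind a rank column of this kind can be sketched as follows. The exact scheme used for the table is not spelled out, so this is only one plausible reading: each opsin-class query yields a list of hits ordered best first, and a candidate's score is its mean position across those lists. The query names and hit orders below are invented for illustration and are not the data behind the table.

```python
from collections import defaultdict
from statistics import mean

def average_rank(hit_lists):
    """Mean 1-based rank of each hit across all queries in which it appears.

    hit_lists maps a query name to its hits ordered best first.
    Returns (mean_rank, hit) pairs sorted from best to worst.
    """
    ranks = defaultdict(list)
    for hits in hit_lists.values():
        for rank, hit in enumerate(hits, start=1):
            ranks[hit].append(rank)
    return sorted((mean(r), hit) for hit, r in ranks.items())

# Hypothetical ordered hit lists, one per opsin-class query.
HITS = {
    "cilopsin_query":   ["UROPS2", "SSTR1", "TACR2"],
    "melanopsin_query": ["UROPS2", "TACR2", "SSTR1"],
    "peropsin_query":   ["SSTR1", "UROPS2", "TACR2"],
}

ranking = average_rank(HITS)
for score, hit in ranking:
    print(f"{score:.1f}  {hit}")
```

Averaging over queries from every opsin class damps the effect of any single class having an unusually good or poor match to a given GPCR.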

The biological literature contains various scattered claims about 'opsins' in species such as Chlamydomonas (chlamyopsin Z48968), not to mention bacterial 'rhodopsins'. These do not have the seven transmembrane helices in the same arrangement as GPCR nor significant sequence homology and may represent independent evolution of photobiology (just as bat and butterfly wings represent independent origins of flying).

Trichoplax has two very curious 7-transmembrane proteins that emerge as its best genomic match to opsin queries. While lacking K296 for a Schiff base, their best back-blast to all of GenBank returns almost entirely opsins (rather than nest within other GPCR receptors). While Trichoplax is 600+ million years removed from the common ancestor with eumetazoa, this gene could still offer clues about the immediate GPCR ancestor to opsins.

These Trichoplax genes retain uncanny similarities to opsins in otherwise rapidly changing regions. These two genes are not plausibly derived from an opsin expansion with subsequent loss of K296 because Trichoplax and other early diverging lineages lack opsins. Perhaps these genes should be considered opsins in spite of lacking K296. Recall here that Schiff base formation dramatically redshifts the absorption spectrum, yet non-covalently bound retinal still has significant absorption at optical wavelengths, which might be further tuned by Trichoplax binding pocket residues.

Conversely, several cnidarian species exhibit far too many K296-type GPCR for their apparent photoreceptive needs and accompanying lack of overt photobiological anatomical specializations. These may represent divergent gene duplications of valid opsins that have evolved into some other type of GPCR; alternatively they could represent a lineage of pre-opsin GPCR that developed K296 but never acquired an opsinlike light-sensing role nor served as parental gene to bona fide opsins.

Together the Trichoplax pre-opsins lacking K296 and putative cnidarian non-opsins possessing K296 push the opsin-defining envelope to its limits. Given the immense time span separating contemporary genes from ancestral, we can anticipate their computed nesting arrangement within the opsin gene tree relative to a close-in GPCR outgroup with known non-retinal ligands will lack convincing statistical support at the critical nodes. The best way forward is additional sequencing and experimentation with cubomedusae, ctenophores and sponges because these seem to contain conventional opsins that can clarify the positions of the outliers.

In summary, the parental GPCR that gave rise to the first opsin can be localized fairly reliably to the PEP subgroup of R class GPCR within GRAFS but no particular gene there stands out as the definitive pre-opsin. The time span involved is immense and this gene class has experienced much churning through expansion and contraction cycles, as well as moderately rapid pointwise residue change.

An independent approach to opsin origins might compare intron positions and phases of candidate parental GPCR to those of opsins. The ancestral introns of opsins are easily reconstructed, reducing noise and potential coincidence, but that program is quite difficult to extend to GPCR. Too often, GPCR with relevant sequence similarity to opsins have no introns or just one, suggesting gene duplication by retroprocessing followed by a later intron acquisition at a non-historic position followed by more rounds of duplication (as seen in sulfatases).

UROPS2 of trichoplax has one intron but unfortunately it does not correspond to any in opsins. Cnidarian opsins to date have been either intronless (Nematostella) or not determined (known only from processed transcripts). Thus the intronic approach to parental GPCR awaits more extensive sequencing of early genomes.

A third approach to opsin origins considers informative indels and diagnostic residues in the set of all opsins expanded by select GPCR. While perhaps subject to more homoplasy than introns, regions such as TM2 and the extracellular loop EC2 do illuminate issues such as ancestral length and define signature residues of opsin classes.

Origin of contemporary opsin classes

Traceback of opsins can begin by selecting certain 'index sequences'. It ultimately does not matter which or how many, but for historical reasons bovine rhodopsin, frog melanopsin, human peropsin, mouse neuropsin and so forth might be used.

Each index sequence is then built out to a larger class of orthologs in nearby species using flanking gene synteny to confirm best-blast. Lineage-specific gene duplications with close affinities (eg from recent clade-specific paralogous expansions such as teleost fish whole genome duplications) are added. Eventually the set collides with an expanding set of another index sequence and all bilaterian opsin sequences fall into one of five clusters.
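
Confirming a best-blast ortholog by synteny comes down to asking how many flanking genes two loci share. A minimal sketch, assuming annotations in the 'syn(...)' style used in the TM2 alignment on this page, where a leading + or - gives strand; the helper names are invented here and the three annotations are copied from the encephalopsin rows for opossum, chicken and zebrafish.

```python
def flanking_genes(syn, focal):
    """Neighbour gene names from a 'syn(...)' annotation, without strand signs or the focal gene."""
    body = syn.strip()
    if body.startswith("syn("):
        body = body[len("syn("):]
    body = body.rstrip(")")
    genes = {token.lstrip("+-") for token in body.split()}
    genes.discard(focal)
    return genes

def shared_neighbours(syn_a, syn_b, focal):
    """Flanking genes present at both loci."""
    return flanking_genes(syn_a, focal) & flanking_genes(syn_b, focal)

ENC_MONDOM = "syn(-EXO1  -WDR64  +ENC -KMO    +FH +RGS7)"
ENC_GALGAL = "syn(-EXO1  -WDR64  +ENC        -PIGM +RGS7)"
ENC_DANRER = "syn(-MTRF1L -TMEM63B +ENC -KMO +IDE -MARCH5 +CPEB3 -BTAF1)"

print(sorted(shared_neighbours(ENC_MONDOM, ENC_GALGAL, "ENC")))
print(sorted(shared_neighbours(ENC_MONDOM, ENC_DANRER, "ENC")))
```

Opossum and chicken share three neighbours, while opossum and zebrafish share only KMO, an example of gene order that is barely conserved yet still links the loci. This sketch ignores gene order and strand, both of which carry additional evidence.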

Ciliary opsins (generated from RHO1) form a cohesive gene clade, called here cilopsins, that does not coalesce with melanopsins, peropsins, neuropsins, or rgropsins within vertebrates, deuterostomes, or even Bilateria. The index gene picks up rod and cone imaging opsins, pinopsin, parapinopsin, parietopsin, very ancient opsin, encephalopsin, teleost multiple tissue opsin, and certain ciliary opsins from protostomes.

Hardly a vertebrate innovation, ciliary opsins appear in early deuterostomes lacking imaging eyes, in both branches of protostomes (initially bee and ragworm), in pre-bilaterian cnidarians and quite possibly ctenophores. Sponges are still uncertain because of a 5 year wait on the assembly, but the very earliest metazoan genomes (Monosiga and Trichoplax) definitely lack ciliary opsins, indeed any K296 GPCR. If those genomes are representative, then ciliary opsins emerged on the post-Trichoplax stem. Certain cnidarian opsins -- but not all -- already exhibit certain sequence specializations of ciliary opsins.

Ciliary opsins have been totally lost on numerous occasions in numerous lineages, notably in 'model' organisms like Drosophila and, worse, nematodes, which have lost all opsins. Hemichordates and non-annelid lophotrochozoans have lost ciliary opsins independently. Other explanations (such as multiple re-emergences of ciliary-like opsins from GPCR or distantly related opsins) are manifestly impossible given intron structure alone.

The earliest deuterostome ciliary ur-opsin is best represented by the TMT class of opsins, in particular by the TMT1 subgroup that has retained important ancestral characteristics in the diagnostic TM2 region. Sequential expansion of TMT1 gave rise to all the other ciliary opsins found in vertebrates, including all rod and cone opsins. This fundamental gene, though retained through amphibian and amniote, curiously was eventually lost in birds and mammals. Transcripts often derive from testis libraries, suggesting a function in gamete release timing. Its immediate descendant gene TMT2, whose subfunctionalization is unknown, is retained in monotremes and marsupials but lost in all placentals. The best experimental organism for studying TMT1 is probably Xenopus.

Melanopsins, discovered in 1998 in frog lateral line dermal melanophores (as well as hypothalamus, iris, and retinal horizontal cells) form another ancient opsin class. Melanopsins include rhabdomeric arthropod opsins (which have an unnecessary dual nomenclature -- they're melanopsins by multiple independent criteria) and lophotrochozoan melanopsins (which other than scallop, squid and octopus genes lie undocumented within genome projects). One cnidarian opsin from coral classifies as a melanopsin yet closely shares other properties with cnidarian opsins that don't.

OpsinLoss.jpg

Peropsins are a third major class of opsins in the sense of broad but not universal retention. Expanded in deuterostomes, they occur rarely in arthropods but are quite important in lophotrochozoa. Peropsins are the only opsin class retained in hemichordates. Nothing resembling them has been retained in cnidaria, reflecting loss in the two genomes available because their coalescence with cilopsins lies much further in the past.

Neuropsins are a much expanded but little studied group of opsins restricted to living deuterostomes, though they did not originate there (unless divergence from another opsin class was exceedingly abrupt and then immensely slowed). The neuropsin expansion to 4 genes in the lamprey stem continued unchanged to the amniote ancestor but subsequently contracted to 2 in monotremes and only 1 in marsupials and placentals.

Rgropsins constitute another little-studied group represented today only beyond the tunicate-vertebrate last common ancestor. Again these opsins must have originated far earlier in pre-bilaterans because their ancestral reconstructed sequence is still far from coalescence with other ancestral opsin classes.

Conceivably rgropsins and neuropsins are retained in other bilatera but diverged to the point of unrecognizability. This scenario can be rejected because analytic methods of complete genomes are sufficiently sensitive to locate all GPCR and screen them for K296. This reasoning is applicable to peropsins as well -- they have definitely been lost in all insect and molluscan genomes though fortunately retained in two chelicerate arachnids.
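The K296 screen described above can be sketched in a few lines. This is only an illustration, assuming candidate GPCRs have already been aligned to a reference opsin whose numbering defines position 296 (conventionally bovine rhodopsin); the function names and the short toy sequences are hypothetical, and the toy example screens position 5 rather than 296 to stay readable.

```python
def column_for_position(ref_aligned, position):
    """Return the 0-based alignment column holding the given 1-based
    residue number of the (gapped) reference sequence."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError("reference is shorter than the requested position")


def has_lysine_at(ref_aligned, aligned, position=296):
    """Map each aligned candidate name to True if it carries K in the
    column that corresponds to `position` in the reference."""
    col = column_for_position(ref_aligned, position)
    return {name: seq[col] == "K" for name, seq in aligned.items()}


# Toy data (hypothetical): reference residue 5 is the lysine of interest.
toy_ref = "AC-DEKG"
toy_candidates = {
    "cand1": "ACXDEKG",  # insertion relative to reference, K retained
    "cand2": "AC-DERG",  # K replaced by R
    "cand3": "A--DEKG",  # deletion upstream, K retained
}
screen = has_lysine_at(toy_ref, toy_candidates, position=5)
print(screen)
```

Counting residues along the gapped reference, rather than indexing the candidate directly, keeps the test correct when candidates carry indels upstream of the site.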

Peropsins, neuropsins and rgropsins are unified by their intronation, sharing three ancestral introns despite numerous differences. This indicates -- given the slow rate of intron gain and loss in most metazoan clades -- that they share deep roots in pre-bilatera, implying near total loss of neuropsins and rgropsins in invertebrates. None of these introns is shared with cilopsins or melanopsins or, for that matter, known GPCR.

Opsins, plagued again and again by losses on stem lineages, illustrate why ancestral node-based sequencing is a far better strategy than terminal speciation-based sequencing. If sequencing effort is proportional to contemporary species numbers, yielding millions of opsins from insects (respectively ray-finned fish), opsin evolution will never be illuminated. Each major node requires equal sequencing intensity. Thus onychophoran and tardigrade opsins have a far greater priority than more butterflies or cichlid fish.

Cnidopsins are a taxonomically based collection of opsins that do not all classify satisfactorily within the bilateran opsin system. Much more intensive sampling is needed here because neither Hydra nor Nematostella has remotely the cubomedusan repertoire. Ctenophores currently have a single unexpected opsin gene obtained accidentally in a shotgun project -- obviously much greater sequencing and structural effort is warranted given their currently basal position within the opsin-containing species.

In hindsight, large scale loss of opsin classes should not come as a surprise -- humans lost 12 of 20 opsin loci that otherwise persisted from lamprey stem to amniote ancestor to living frogs, lizards and birds. This is characteristic of GPCR evolution overall (notably the olfactory subgenome): collapse of a large gene clade, followed by later massive expansion but retention to contemporary species only in scattered lineages.

This can result in two species having similar numbers of GPCR genes but a very poor correspondence between them. This pattern of gene churning (cycles of explosive expansion followed by mass die-offs) differs dramatically from gene histories of ribosomal proteins or catabolic enzymes (eg homogentisate dioxygenase) retained in all species as single copies (ancient birth, never death). Other genes like globins exhibit moderate expansion to several copies accompanying a trend toward organismal specialization and complexity, with little evidence of contraction (occasional births, rarely deaths). Still other gene classes, for example selenoproteins, seem headed systemically for oblivion (births but trending to extinction) in the sense of the cysteine replacement ratchet.

Once over this conceptual hurdle, cycles of expansion and contraction in the GPCR gene family can be repeatedly invoked on various branches of the phylogenetic tree to explain many aspects of opsin classification. After several such cycles, the utility of terms such as ortholog and paralog is stretched to the breaking point -- words become inadequate to describe the gene tree.

Vertebrate ciliary opsins

CilOpDecline.jpg

Vertebrates have a very peculiar history of explosive opsin expansion during the brief lamprey stem that reached contemporary gene numbers and adaptive functionalities. This cannot be attributed to supposed 1R and 2R whole genome expansions -- indeed the observed lack of sistering in the ciliary opsin gene tree conflicts with this scenario.

This expansive era was followed by 500 myr of relative stasis in opsin gene number for many lineages, though a long list of mostly narrow exceptions could be drawn up. These exceptions have drawn too much experimental attention away from the broader sweep of opsin evolution.

Examples of gains include sharks (extra copy of LWS), zebrafish (greatly expanded RHO2 repertoire and retained RHO1 retrogene), primates (tandem LWS copy) and so forth, with the most sweeping gain being whole genome duplication in early teleost fish. Each gene gain is presumably retained for some adaptive reason.

Examples of losses include dolphin (SWS1), chickens (TMT1), cave fish and blind mole rat (opsin pseudogenes), platypus (SWS1) and so forth, with the most remarkable loss being the massive attrition era in mammals (60% of all opsin loci lost in placentals, foreseen by G Wall in 1942). Opsin gene loss is generally not adaptive ('less is more') but simply neutral drift ('use it or lose it').

It follows from both standard tree analysis and consideration of diagnostic regions that the very earliest ciliary opsins in deuterostomes were of TMT1 class. Only these opsins continue the ancestral pattern in the N D P C iron triangle centered in the second transmembrane helix, which was established already in pre-cnidaria and continues today in most other opsin classes, including ecdysozoa and cnidaria ciliary opsins.

Through cascading segmental gene duplications, the TMT1 ciliary ur-opsin gave rise directly and indirectly to all other ciliary opsins observed in living deuterostomes. The ur-opsin likely retained the ancestral ciliary opsin form and function even as its daughter genes have neofunctionalized to new roles. It cannot be naively modeled from bovine RHO1 (the most recently derived of all imaging opsins) because the latter lacks the induced proline kink in TM2.

The current phylogenetic distribution of TMT1 extends from sea urchin (but not acornworm) to amphioxus and tunicate through chondrichthyes and teleost fish to frog and lizard, with orthology mostly validatable by syntenic location. Remarkably, birds, platypus, marsupials, and all placental mammals have lost the ciliary ur-opsin, which requires two independent events. Lizard flanking gene order is fully preserved in chicken but no pseudogene debris remains at the site. This is a familiar story in opsins ... an old gene fades out mid-amniote but otherwise continues on for another 310 million years (Wall hypothesis plus birds).
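Validation by syntenic location amounts to asking how many flanking genes two loci share. The sketch below is a minimal illustration, not the procedure actually used to build the reference collection. It reads strings in the `syn(...)` format of the final column of the alignment table later in this section, here the encephalopsin rows for lizard, chicken, opossum and frog. The function names are hypothetical, and strand signs are simply discarded.

```python
def flanking_genes(syn, focal):
    """Return the set of gene names in a 'syn(...)' string,
    without strand signs and without the focal gene itself."""
    inner = syn.strip()
    if inner.startswith("syn("):
        inner = inner[len("syn("):]
    inner = inner.rstrip(")")
    genes = {token.lstrip("+-") for token in inner.split()}
    genes.discard(focal)
    return genes


def shared_flanks(syn_a, syn_b, focal):
    """Flanking genes common to two loci."""
    return flanking_genes(syn_a, focal) & flanking_genes(syn_b, focal)


# Synteny strings copied from the ENC rows of the alignment table.
enc_lizard = "syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)"
enc_chicken = "syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)"
enc_opossum = "syn(-EXO1   -WDR64   +ENC -KMO     +FH +RGS7)"
enc_frog = "syn(-ASAH3L -ACER2   +ENC     -ADFP -DENND4C)"

print(sorted(shared_flanks(enc_lizard, enc_chicken, "ENC")))
print(sorted(shared_flanks(enc_lizard, enc_opossum, "ENC")))
print(sorted(shared_flanks(enc_lizard, enc_frog, "ENC")))
```

With these rows, lizard and chicken share all four listed neighbours, lizard and opossum share three, and lizard and frog share none, which is the kind of partial conservation the table's synteny column records. A stricter check would also compare gene order and strand.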

What is the function of the ciliary ur-opsin in the contemporary organisms that retain it? To determine whether it has ever been studied under another name, tblastn of each TMT1 in the curated reference collection can be used against GenBank transcripts and gene deposits (which often provide the necessary PubMed id). Transcript data from non-pooled tissues might at least determine some sites of expression; however no data is available for frog or lizard (ie no data for tetrapods since other species have lost the gene).

Note first that the gene appears to have undergone a segmental duplication in the chondrichthyes stem. Both loci persisted through teleost fish; copies created by whole genome duplication were not retained. Only one locus, presumably more fundamental, persisted into frog and lizard. It defines the TMT locus in earlier diverging species via best-blast and synteny and so the second TMT locus by default. These are not to be confused with a later TMT gene duplication that arose in fish and persisted through lizard, birds, platypus and marsupials but not placentals.

One of the three distinct TMT genetic loci was specifically studied in adult eyes and embryonic cell lines of zebrafish. Little has happened since (other than bioinformatics): TMT genes from Tetraodon, Gasterosteus and Oryzias still lack articles and informative transcripts. Zebrafish TMT genes remain crazily annotated at GenBank (as adiponectin, with pipeline analysis of genomic DNA mislabelled as mRNA). None of the zebrafish genes has been studied since, though two transcripts are available from delimited libraries (developing eggs with support cells). Pimephales promelas has TMT transcripts from brain and testis; Oncorhynchus mykiss has transcripts from testis.

In summary, while thousands of articles address the highly derived RHO1 locus, the much more fundamental ciliary ur-opsin TMT has scarcely been investigated. Until its core function is better understood, the early history of ciliary opsins in vertebrates will remain a mystery. Without this core role for TMT that allowed it to be retained for tens of millions of years in the pre-vision era, no ciliary opsin would have been available for later expansion into imaging ciliary opsins.

Below is a preliminary assessment of the three TMT loci and the related encephalopsin locus. The right column revises nomenclature relative to that of the reference collection (second column). Gene names arose in the pre-genomic era when the full paralog complement, phylogenetic distribution and sites of expression were not understood. With complete genomes now in hand, a final and sensible nomenclature can be envisioned.

The TM2 region aligned proves useful for defining the diagnostic residues and indels of these four gene classes necessary to sort out their evolutionary relationships. As the names suggest, this is (TMT1,(TMT2,(TMT3,ENC))) rooted by protostome ciliary opsins which indicate TMT1 is the original ur-opsin. The final column contains information on synteny that validates the proposed history of gene duplication. Gene order is barely conserved but enough for the duplication history to be unravelled.

ENC_homSap    ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_otoGar    ENCEPH_oto NN LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_loxAfr    ENCEPH_lox NN LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_pteVam    ENCEPH_pte NN LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLRNGWVWDT VGC ...
ENC_musMus    ENCEPH_mus GN LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLRNGWVWDA VGC ...
ENC_canDom    ENCEPH_can CH FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLRNGWVWDS VGC ...
ENC_monDom    ENCEPH_mon NN LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLRSGWVWDS VGC syn(-EXO1   -WDR64   +ENC -KMO     +FH +RGS7)
ENC_galgal    ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)
ENC_anoCar    ENCEPH_ano NN LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLRHRWVWDA AGC syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)
ENC_xenTro    ENCEPH_xen NN LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVRGSWAFSV EMC syn(-ASAH3L -ACER2   +ENC     -ADFP -DENND4C)
ENC_danRer    ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC syn(-MTRF1L -TMEM63B +ENC -KMO +IDE -MARCH5 +CPEB3 -BTAF1)
ENC_takRub    ENCEPH_tak NN FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQGRWTWTQ ATC syn(-ABLIM1 -PTK7    +ENC -KMO +IDE         +CPEB4        -CCNJ)
ENC_gasAcu    ENCEPH_gas NN VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIRGGWTWSR ATC syn(FAM82A  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4        -CCNJ)
ENC_oryLat    ENCEPH_ory NN LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVKGRWMWSQ ATC syn(CYP1B1  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4 -BTAF1 -CCNJ)
ENC_calMil    ENCEPH_cal NN ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQGRWGWDS AAC ---
ENC_squAca    ENCEPH_squ NN LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVKGRWVWDS AAC ---
ENC_petMar    ENCEPH_pet NN LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVRRRWVWGN GGC ---
ENC_braFlo    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(     -ZFYVE1 +RTF1 +ENC -CES1 -POMT2)
ENC_braBel    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQHRWLWGR PGC ---
ENC_braFlo    TMT5_braFl SN GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(NKX2 -ZFYVE1 +RTF1 +ENC ERF1 TMED9 LARS2)
ENC_braBel    TMT5_braBe SN GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC ---
TMT3_monDom   TMT_monDom SN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQGRWIGGK HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2 RALY SLC5A7 +SULT1E1)
TMT3_macEug   TMT_macEug NN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIRGRWIAGY HGC ---
TMT3_ornAna   TMT_ornAna NN LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNISGRWIGGD PGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -UCHL3  +TBCID4)
TMT3_galGal   TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7 +SULT1C4)
TMT3_taeGut   TMT_taeGut NN LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIRGKWIGGD HAC ...
TMT3_anoCar   TMT_anoCar NN LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIYGRWIGGE HGC syn(            +TMT3 -ST6GAL2_overlap +SLC5A7 RANBP2)
TMT3_xenTro   TMT_xenTro NN FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIKGKWIGGE YGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7)
TMT3_danRer   TMT_danRer NN LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVRGRWLLGR HGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap    +GPR89A -PDZK1l)
TMT3_tetNig   TMT_tetNig NN FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLRGRWLLGR SGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap)
TMT3_takRub   TMT_takRub NN FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR IGC ...
TMT3_gasAcu   TMT_gasAcu NN LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLRGKWLLGR SGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -TFDP2 POU2)
TMT3_oryLat   TMT_oryLat NN FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR GGC ...
TMT2_anoCar   TMTa_anoC  NN LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT P FSFAASTQGKWLIGP AGC syn(VAMP PER2 HES6 TUBA1 GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_xenTro   TMTa_xenT  NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC syn(VAMP PER2 HES6       GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_danRer   TMTb_danRe NN TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC syn(-PTCHD1 -PHEX -CNKSR2 SH3KBP1 -MAP3K15 TNK2 +TMT2 -MYEOV2 -MAP4K4 PRMT6)
TMT2_tetNig   TMTb_tetNi SN LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLLGR AGC syn(MYO3A GAD2 ARHGAP21 TFR2 +TMT2  MYEOV2 SH3KBP1 MAP3K15 PHEX PTCHD1)
TMT2_takRub   TMTb_takRu SN FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT2_gasAcu   TMTb_gasAc SN FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT P FSFAASTQGRWLIGR AGC syn(PTCHD1 PHEX MAP3K15 SH3KBP1 +TMT2 -MYEOV2 ARHGAP21 MYO3A)
TMT2_oryLat   TMTb_oryLa SN LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT1_danRer   TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC syn(      RAB25 PBX3 TNK2 +TMT1  WAC +LPPR4 +AGL PTBP1)
TMT1_tetNig   TMTa_tetNi SN LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT P FSFAASTQGRWLIGA AGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_takRub   TMTa_takRu NN LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQGRWLIGE AGC syn(     LRRN3 CALD1 TNK2 +TMT1      RAB18 YME1L1 ABI1 TLK1 EDRNB)
TMT1_gasAcu   TMTa_gasAc NN LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_oryLat   TMTa_oryLa NN LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC ...
TMT1_pimPin   TMTa_pimPr NN TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC ---
TMT1_oncMyk   TMTa_oncMy SN LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT P FSFAASLYGRWLIGA HGC ---
TMT1_calMil   TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC ---
TMT2_calMil   TMTa2_calM NN LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT P FSFAASIYRRWLIGH KGC ---
TMT1_braFlo   TMTx_braFl NN STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT P FSFVSSLHGRWMFGH SGC syn(TNPPO2 HECTD3 ABCCA4 PRPRA TMT ATP5D TMT PTPRA FDE4A PTPRA PYRNXN1
TMT2_braFlo   TMTy_braFl TN LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS P FAVASAVGERWLFGR AVC syn(ZFYVE1 FBXL4 RTF1 TMT CES4 TMTY POMT2 GSTZ1) 
TMT1_strPur   TMTPIN_str NN GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS P FTFASNFAGRWLFGD LGC syn(ARG2 NEK9 FAM164A ZC3H14 TMT PRPF39 YIPF4 SPATA5)
TMT2_strPur   ENCEPH_str GN SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI P FVTASIYAHRWLFAH VGC ---
TMT1_apiMel   TMT_apiMel AN LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAATGGWYWGY KMC syn(HEX MAK FASN SPTBN4 PSMA3 TMT LSM11 SEC23A KNSL8)
TMT1a_anoGam  TMT1_anoGa LN IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWLYGK SIC ...
TMT1b_anoGam  TMT2_anoGa LN LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWIFGR TLC ...
TMT1_aedAeg   TMT_aedAeg LN LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN P FTLTSAISRHWIFGR TVC ...
TMT1_culPip   TMT_culPip LN LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN P FTLSSAISHRWLFGR KLC ...
TMT1_triCas   TMT_triCas LN LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAINYGWIFGE TGC ...
TMT1_bomMor   TMT_bomMor LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGH TMC ...
TMT1_helVir   TMT_helVir LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGK TMC ...
TMT1_rhoPro   TMT_rhoPro GN LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN P FTLASAIAKRWFFGQ SMC ...
TMT1_acyPis   TMT_acyPis FN TCVIFIMIRDTRLWTPQNVIIFNLATS D LAVS-VLGN P VTLAAAITKGWIFGQ TIC ...
TMT1a_dapPul  TMTa_dapPu MN IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT P ISAAAALKFTWPFSH ELC ...
TMT1b_dapPul  TMTb_dapPu MN VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT P ISTAAALEFGWPFSQ ELC ...
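The diagnostic single-residue column of the alignment above (the seventh whitespace-separated field) can be tallied per gene class with a short script. This is a sketch over a handful of rows copied from the table, not a full analysis. The function name is hypothetical, and the class label is simply the part of the first field before the underscore. Whether this column corresponds exactly to the TM2 proline discussed earlier is an assumption.

```python
from collections import Counter, defaultdict

# Rows copied verbatim from the alignment table (a small sample).
ROWS = """
ENC_homSap    ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_calMil    ENCEPH_cal NN ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQGRWGWDS AAC ---
TMT3_taeGut   TMT_taeGut NN LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIRGKWIGGD HAC ...
TMT3_takRub   TMT_takRub NN FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR IGC ...
TMT2_takRub   TMTb_takRu SN FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT2_oryLat   TMTb_oryLa SN LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT1_oryLat   TMTa_oryLa NN LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC ...
TMT1_calMil   TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC ---
TMT1_triCas   TMT_triCas LN LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAINYGWIFGE TGC ...
"""


def diagnostic_residues(rows):
    """Count the residue in the seventh field of each row, grouped by
    the gene class prefix of the first field (text before '_')."""
    tally = defaultdict(Counter)
    for line in rows.strip().splitlines():
        fields = line.split()
        gene_class = fields[0].split("_")[0]
        tally[gene_class][fields[6]] += 1
    return {gene_class: dict(counts) for gene_class, counts in tally.items()}


for gene_class, counts in diagnostic_residues(ROWS).items():
    print(gene_class, counts)
```

In this sample the TMT1 and TMT2 rows all carry P at the diagnostic position, whereas the TMT3 and ENC rows carry T or S, in line with the grouping (TMT1,(TMT2,(TMT3,ENC))) given above. Running the same tally over the full table would need the class prefix normalized for names such as TMT1a and TMT1b.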

Origins of melanopsins

(to be continued shortly)