Opsin evolution: Cytoplasmic face
Revision as of 11:23, 30 January 2009
Comparative genomics of the cytoplasmic face of GPCR proteins
The cytoplasmic face of an opsin (or any GPCR) comprises three disjoint connecting loops and the carboxy terminus. It is presumably responsible for all interactions with downstream signal-relaying partners, because these are cytoplasmic proteins with no physical access to the extracellular loops or transmembrane segments. Photoisomerization and retinal release from the Schiff base deep within the transmembrane region must therefore drive a significant conformational change in the cytoplasmic face that differentiates its inactive from its active state.
For bioinformatic purposes, it is convenient to 'reorganize' each linear protein sequence into its intracellular, membrane and outer regions for separate consideration. This is done below for the cytoplasmic face of 500 curated opsins representing each of the 20 vertebrate opsin genetic loci, using multiple representatives for each phylogenetic node and intense bracketing at eras of functional transition (e.g. between DRY and GRY opsins of the RGR class). A range of non-opsin GPCRs is included to define properties common to all members of this large gene family (not specific to opsins).
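The 'reorganization' described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_by_topology` is hypothetical, the transmembrane coordinates must be supplied by the user (from crystallography or a prediction tool), and the amino terminus is assumed to be extracellular, as in GPCRs.

```python
from typing import Dict, List, Tuple


def split_by_topology(seq: str, tm_segments: List[Tuple[int, int]]) -> Dict[str, List[str]]:
    """Partition a linear protein sequence into extracellular, membrane and
    cytoplasmic pieces.

    tm_segments holds 1-based inclusive (start, end) coordinates of each
    transmembrane helix, in order. Assumption: the N-terminus is
    extracellular, so the non-membrane pieces alternate sides at each helix.
    """
    regions: Dict[str, List[str]] = {"extracellular": [], "membrane": [], "cytoplasmic": []}
    side = "extracellular"
    prev_end = 0
    for start, end in tm_segments:
        regions[side].append(seq[prev_end:start - 1])      # loop or terminus before this helix
        regions["membrane"].append(seq[start - 1:end])     # the helix itself
        side = "cytoplasmic" if side == "extracellular" else "extracellular"
        prev_end = end
    regions[side].append(seq[prev_end:])                   # carboxy terminus
    return regions


# Toy sequence with made-up coordinates: seven 3-residue "helices" (M)
# separated by 2-residue loops labelled a..h.
toy = "aaMMMbbMMMccMMMddMMMeeMMMffMMMggMMMhh"
toy_tm = [(3, 5), (8, 10), (13, 15), (18, 20), (23, 25), (28, 30), (33, 35)]
cytoplasmic_face = split_by_topology(toy, toy_tm)["cytoplasmic"]
```

For a seven-helix receptor the cytoplasmic list holds four pieces: loops C1, C2 and C3 followed by the carboxy-terminal tail.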
The two critical goals in GPCR research are finding the natural ligands (which largely concerns the extracellular and transmembrane regions), notably for orphan receptors, and determining each receptor's specific Galpha signaling partner among the 17 such paralogs in the vertebrate genome. For vertebrate opsins, the ligand is known (11-cis retinal or related) but the signaling partner generally is not. For example, does RGR opsin signal at all, to what regulatory effect, and what is the meaning of the abrupt shift in the DRY motif to GRY at the boreoeutherian divergence?
Gene        DRY loop motif        transmemb  Len  7  9  signaling
ENCEPH_hom  ERYIRVVHARVINFSW      AWRAITYIW  16   V  A  G?
RGR_homSap  GRYHHYCTRSQLAWNS      AVSLVLFVW  16   C  R  G?
RGR2_gasAc  DRYHQYCTRQKLFWST      TLTMSAIIW  16   C  R  G?
RHO1_homSa  ERYVVVCKPMSNFRFGENH   AIMGVAFTW  19   C  P  GNAT1
RHO2_galGa  ERYIVVCKPMGNFRFSATH   AMMGIAFTW  19   C  P  GNAT2
SWS2_ornAn  ERFLVICKPLGNLSFRGTH   AIFGCAATW  19   C  P  GNAT2
PIN_galGal  ERYVVVCRPLGDFQFQRRH   AVSGCAFTW  19   C  P  G?
SWS1_homSa  ERYIVICKPFGNFRFSSKH   ALTVVLATW  19   C  P  GNAT2
LWS_homSap  ERWMVVCKPFGNVRFDAKL   AIVGIAFSW  19   C  P  GNAT2
VAOP_galGa  ERYIVICRPVGNMRLRGKH   AAQGIAFVW  19   C  P  Gt
PARIE_utaS  ERYNVVCQPLGTLQMSTKR   GYQLLGFIW  19   C  P  Gd+Go
PPIN_xenTr  DRVFVVCKPMGTLTFTPKQ   ALAGIAASW  19   C  P  Gt
PER_homSap  DRYLTICLPDVGRRMTTNT   YIGLILGAW  19   C  P  Go
NEUR1_homS  DRYLKICYLSYGVWLKRKH   AYICLAAIW  19   C  L  G?
NEUR2_galG  VCCLKICFPAYGNRFRRKH   GQILIACAW  19   C  P  G?
NEUR3_galG  IRFLVTNSSKSNSNKISKNT  VHILITFIW  20   N  S  G?
NEUR4_ornA  TRYIKGCHPHRGHFINTAN   ISVALILIW  19   C  P  G?
TMT_monDom  ERYRTL-TLCPGQGADYQK   ALLAVAGSW  19   -  L  G?
MEL1_homSa  DRYLVITRPLATFGVASKRR  AAFVLLGVW  20   T  P  Gq
MEL2_anoCa  DRYCVITKPLQSIKRTSKKR  TCIIIVFVW  20   T  P  Gq
While it might seem straightforward to thread any opsin onto its best fit among the five newly available crystallographic structures, that does not work for distantly related paralogs beyond the universal 7-transmembrane feature: loop regions can differ greatly in length and have diverged so far in amino acid sequence that they lack discernible alignability (even though they are all ultimately homologous).
While these structures entail various compromises (such as replacement of C3 by lysozyme and deletion of the carboxy tail to enable stable crystallization), they are hugely important to annotation transfer of sequence/function relationships via comparative genomics. Yet most of the 18 vertebrate opsin orthology classes have only remote models to date, and even these can be indeterminate for mid-loop C2 residues (indicative of flexible conformation).
Gene            PDB             Protein                     PubMed    Best human opsin  Next best          Signaling
RHO1_bosTau     1JFP 3C9M 2J4Y  bovine rod rhodopsin        17825322  RHO1_homSap 93%   SWS1_homSap 45%    Gt GNAT1 lowers cGMP
MEL1_todPac     2Z73 2ZIY       squid melanopsin            18480818  MEL1_homSap 43%   PER1_homSap 30%    Gq GNAQ? inositol trisphosphate
ADORA2A_homSap  3EML            adenosine receptor 2A       18832607  MEL1_homSap 27%   ENCEPH_homSap 27%  Gs GNAS raises cAMP
ADRB1_melGal    2VT4            beta 1 adrenergic receptor  18594507  MEL1_homSap 29%   ENCEPH_homSap 25%  Gs GNAS raises cAMP
ADRB2_homSap    2R4R            beta 2 adrenergic receptor  17962520  MEL1_homSap 28%   PER1_homSap 29%    Gs GNAS raises cAMP
It has not proven feasible to predict loop conformations ab initio or from peptide libraries; it is folly to consider an individual loop structure in isolation (rather than the cytoplasmic face in its entirety) or to leave unspecified the activation state being computed. Any predicted structure, and any special role proposed for an individual residue, must be consistent with the comparative genomics of close and even distant orthologs, because binding relationships to Galpha and other proteins do not change rapidly in evolutionary time (as seen from heterologous substitution experiments). Even when a cytoplasmic loop seems to lack a definable structure, individual residues can be conserved over vast branch length times. That conservation must ultimately be explained.
Two new high-resolution structures of squid melanopsin establish that the cytoplasmic face is not structurally homologous as a whole across paralogous opsin classes. We knew this already from comparative genomics alone but not specifically why. The X-ray structure exhibits unprecedented rigid extensions of transmembrane helices 5 and 6, on the order of 25 angstroms, out into the cytoplasm, greatly constraining the intermediate residues of cytoplasmic loop C3. The proximal carboxy terminus also contributes importantly to the overall structure here.
The squid melanopsin structure, used at SwissModel, could readily predict the structure of the cytoplasmic face of all opsins of the melanopsin class, of which 48 vertebrate, 9 lophotrochozoan, 43 arthropod, and 1 cnidarian sequence are available here. The Gq signalling partner is presumably used throughout these melanopsins, yet what features the Galpha protein specifically recognizes in the cytoplasmic face remain obscure. It cannot really be the helical extensions per se because the Gq protein is still structurally homologous to its 15 paralogs (in vertebrates) of different signaling types.
The second cytoplasmic loop
In squid melanopsin, the first six residues of cytoplasmic loop C2 also form an extensional helix, beginning with the DRY motif and surprisingly terminating three residues before the deeply conserved proline (normally a helix breaker, as in adrenergic receptors). This proline alone cannot define the two states through its cis and trans configurations because glycine or leucine can also characterize whole opsin orthology classes at this position. The last 3 residues of loop C2, HRR, are basic in character and also preface a transmembrane helix, as RAR do in the turkey beta 1 adrenergic receptor.
Cytoplasmic loop C2 has a conserved length of 16-20 in all opsins, with much more rigid constraint within individual opsin classes (e.g. all vertebrate imaging opsins have length 19). The structure of the C2 loop of over 100 melanopsins can readily be modelled on its closest match among the determined structures, currently squid melanopsin or bovine rhodopsin, with adenosine and adrenergic receptors serving as 'structural outgroup'.
On the basis of length (19 to rhodopsin, 20 to melanopsin), all the opsins except encephalopsin and RGR (both 16 residues) and TMT (18 residues subsequent to a deletion in the amniote stem) have a structural model. This model is further constrained by predictable helical extensions of transmembrane helices into the cytoplasm, leaving only the mid-loop region to be predicted. It's not clear whether observed residue conservation -- both within and across orthology classes -- derives from structural importance or instead from Galpha binding specificity requirements.
The adenosine and adrenergic receptor structures -- however useful they might be for annotation transfer to the other 350 non-odorant human GPCRs -- ultimately will not prove helpful in modeling the second cytoplasmic loop of opsins (squid melanopsin does that better already). Note that C2 in these three structures is consistently stabilized by a mid-loop hydrogen bond to the DRY residues. This constraint is not observed in squid melanopsins or other metazoan opsin classes; indeed it is not feasible because no hydrogen bond-capable residue consistently occurs there (in the comparative genomics sense of conserved residue). Ancestrally, this mid-loop bridge might be a derived feature fairly early in the stem of non-opsin GPCRs.
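The length rule in the paragraph above (19 to rhodopsin, 20 to melanopsin, otherwise no model) can be written as a small lookup. This is a minimal sketch, not a modeling pipeline: the names `TEMPLATE_BY_C2_LENGTH` and `pick_c2_template` are hypothetical, and gap characters are assumed to mark deletions that do not count toward residue length.

```python
from typing import Optional

# Template choice keyed on ungapped C2 loop length, as stated in the text.
TEMPLATE_BY_C2_LENGTH = {
    19: "bovine rhodopsin",
    20: "squid melanopsin",
}


def pick_c2_template(loop: str) -> Optional[str]:
    """Return the structural template suggested by C2 loop length, or None
    when no determined structure matches (e.g. the 16-residue loops of
    encephalopsin and RGR, or the 18-residue loop of TMT)."""
    n_residues = len(loop.replace("-", ""))  # assumption: '-' is an alignment gap
    return TEMPLATE_BY_C2_LENGTH.get(n_residues)


# Loops taken from the table at the top of this page.
examples = {
    "RHO1_homSa": "ERYVVVCKPMSNFRFGENH",
    "MEL1_homSa": "DRYLVITRPLATFGVASKRR",
    "RGR_homSap": "GRYHHYCTRSQLAWNS",
    "TMT_monDom": "ERYRTL-TLCPGQGADYQK",
}
templates = {name: pick_c2_template(loop) for name, loop in examples.items()}
```

Length alone is of course a crude criterion; it only narrows which determined structure is worth trying first.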
(to be continued)
The carboxy-terminal tail
This distinctive region has quite baffling length variation across -- and sometimes within -- opsin classes. The extent of conservation also differs greatly, with no real universally conserved residues past the end of the seventh transmembrane helix. The observed terminal conservation pattern for a given opsin must be indicative of its functional importance, even though it is not yet adequately explained by arrestin phosphoserine or cysteine palmitoylation sites, opsin dimerization or other membrane macro-organization, or interaction with Galpha proteins. Some interactions would seem to require commonality across all orthology classes (or larger assemblages such as ciliary opsins) while others do not.
The first hand-gapped alignment below illustrates these issues using RGR from 53 species. The alignment begins inside the last transmembrane segment with the Schiff base lysine K and continues past the NAxxY motif at a deeply invariant length (totaling 19 residues) to the "YR" motif found in almost all GPCRs. This marks the beginning of the carboxy-terminal cytoplasmic tail, which in RGR is fairly fixed at 23 residues, remains alignable, and may extend the transmembrane helix, but bears no resemblance to that of any other opsin or GPCR.
The degree of conservation establishes that selection is at work. It appears that RGR must terminate in several charged (characteristically basic) residues regardless of length-changing indels. These could possibly associate electrostatically with membrane phospholipid or be important to initial establishment of topology. Mammals have in effect lost the YR motif, though most have an R one residue later. This does not quite coincide with the advent of the ERY or GRY motif in mammalian cytoplasmic loop C2.
Conservation of G.WQ.L..Q has persisted over tens of billions of years of cumulative branch length and cannot be explained by helix or beta sheet per se -- possibly it is constrained by interaction with other parts of the cytoplasmic face. It appears that arrestin could recognize phosphoserine or phosphothreonine in almost all species but palmitoylation cannot be widespread. A few species, such as guinea pig, microbat and armadillo, may be exhibiting early stages of pseudogenization or at least partial loss of function.
Absent any experimental information, relevant 3D structure, or capacity for annotation transfer from homologous regions, the specifics of individual residue and residue patch conservation will remain difficult to explain.
K..PT.NA..YaLG.E.yr .G.Wq.L..q..........k.K >RGR_homSap KMVPTINAINYALGNEMVC RGIWQCLSPQKRE-----KDRTK RGR_homSap KMVPTINAINYALGNEMVC RGIWQCLSPQKREKDRTK >RGR_panTro KMVPTINAINYALGNEMVC RGIWQCLSPQKSE-----KDRTK RGR_panTro ................... ...........S...... >RGR_gorGor KMVPTINAINYALGNEMVC RGIWQCLSPQKSK-----KDRTK RGR_ponPyg ................... ...........S...... >RGR_ponPyg KMVPTINAINYALGNEMVC RGIWQCLSPQKSE-----KDRTK RGR_gorGor ................... ...........SK..... >RGR_nomLeu KMVPTINAVNYALGNEMVC RGIWQCLSPQKSE-----KDRAK RGR_nomLeu ........V.......... ...........S....A. >RGR_macMul KMVPTINAINYALGNEMVC RGIWQCLSPQKSE-----KDRAK RGR_macMul ................... ...........S....A. >RGR_papHam KMVPTINAINYALGNEMVC RGIWQCLSPQKSE-----KDRAK RGR_papHam ................... ...........S....A. >RGR_calJac KMVPTIDAINYALGNEMIC RGIWQCLSPQKSE-----KDRTK RGR_calJac ......D..........I. ...........S...... >RGR_tarSyr KTVPTINAYHYALGSEMVC RGIWQCLSPHSSE-----..... RGR_tarSyr .T......YH....S.... .........HSS. >RGR_otoGar KTVPTINAVNYALGSEMVC RGIWQCLSLQRSK-----QDGAK RGR_otoGar .T......V.....S.... ........L.RSKQ.GA. >RGR_micMur KTVPTINAINYALGSETVC RGIWQCLSPQRSE-----QDRAK RGR_micMur .T............S.T.. ..........RS.Q..A. >RGR_tupBel KMVPTVNAVNYALGSETIC RGIWGCLSP-KRE-----RDRAR RGR_tupBel .....V..V.....S.TI. ....G....KR-.R..AR >RGR_musMus KTMPTINAINYALHREMVC RGTWQCLSPQKSK-----KDRTQ RGR_musMus .TM..........HR.... ..T........SK....Q >RGR_ratNor KTMPTINAINYALRSEMVC RGTWQCRSAQKSK-----QDRTQ RGR_ratNor .TM..........RS.... ..T...R.A..SKQ...Q >RGR_cavPor KTVPTINAINYSLG----- RGPWQSLEMQRSK-----QD RGR_cavPor .T.........S..R---- -.P..S.EM.RSKQ. >RGR_dipOrd KMVPTVNAINYALCNELLC GGFSLGLLPQKGK-----QDRTQ RGR_dipOrd .....V.......C..LL. 
G.FSLG.L...GKQ...Q >RGR_oryCun KTVPTVNAVNYALGSEVIR RGIWQCLLPQRSV-----RGRAQ RGR_oryCun .T...V..V.....S.VIR .......L..RSVRG.AQ >RGR_ochPri KAVPTVNAINYALGSEVIR RGIWQCLLPQRSV-----RDRAQ RGR_ochPri .A...V........S.VIR .......L..RSVR..AQ >RGR_bosTau KAVPTVNAMNYALGSEMVH RGIWQCLSPQRRE-----HSREQ RGR_bosTau .A...V..M.....S...H ..........R..HS.EQ >RGR_susScr KMVPTVNAINYALGGEMVH RGIWQCLSPQRRE-----RDREQ RGR_susScr .....V........G...H ..........R..R..EQ >RGR_canFam KAAPTINAIHYALGGDMVH GGLWQCLSPQRSQ-----PDRAR RGR_canFam .AA......H....GD..H G.L.......RSQP..AR >RGR_felCat kaVPTINAINYALGSEMVH RGIWQCLSPQGSG-----LDRAR RGR_felCat .A............S...H ..........GSGL..AR >RGR_equCab KTVPTINAVNYALGSEMLH RGIWQCLSPQKSE-----RDRAQ RGR_equCab .T......V.....S..LH ...........S.R..AQ >RGR_myoLuc KMVPTVNAVNYALGS---- -GIWQRLSLQ............. RGR_myoLuc .....V..V.....S---- -....R..L. >RGR_pteVam KMAPTINAVNYALGSEMVQ RGIWQCLSPQRSE-----RDHAQ RGR_pteVam ..A.....V.....S...Q ..........RS.R.HAQ >RGR_sorAra KTVPTVNALHYGLGSGMVQ NGFRKGLWLQRRE-----RERAL RGR_sorAra .T...V..LH.G..SG..Q N.FRKG.WL.R..RE.AL >RGR_eriEur ktVPTVNAVHYVLGSEKVH KGFWQCFSPQRSE-----QDRAR RGR_eriEur .T...V..VH.V..S.K.H K.F...F...RS.Q..AR >RGR_loxAfr KAVPVINACHYALGSEVVR GGIWQYLSRQRGESPLRARDRTH RGR_loxAfr .A..V...CH....S.V.R G....Y..R.RG.SPLRAR DRTH >RGR_proCap KAVPIVNACHYALGSETVH RGIWQCLSRQRGESPPRTRDRTQ RGR_proCap .A..IV..CH....S.T.H ........R.RG.SPPRTR DRTQ >RGR_echTel KAVPIVNACHYALGSETVH RGIWQCLSRQRGESPPRTRDRTQ RGR_echTel .A..IV..CH....S.T.H ........R.RG.SPPRTR DRTQ >RGR_choHof KTMPTINAFQYALGSETVC RDIWQCLPRLRSMGRSSGHD RGR_choHof .TM.....FQ....S.T.. 
.D.....PRLRSMGRSSGH D >RGR_dasNov KTMPTVNALYYALGRESVH RNA RGR_dasNov .TM..V..LY....R.S.H .NA >RGR_ornAna KTVPVIDAFTYALRNEDYR GGIWQFLTGQKIERV-EVENKIK RGR_ornAna .T..V.D.FT...R..DYR G....F.TG..I.RVEVEN KIK >RGR_xenTro KTSPAVNAYVYGLGNENYR GGIWQYLTGQKLEKA-ETDNKTK RGR_xenTro .TS.AV..YV.G....NYR G....Y.TG..L..AE.DN KTK >RGR_xenLae KISPAVNAYVYGLGNENYR GGIWLYLTGQKLEKA-ETDSRTK RGR_xenLae .IS.AV..YV.G....NYR G...LY.TG..L..AE.DS RTK >RGR1_danRer KTSPTFNVFVYALGNENYR GGIWQLLTGQKIESP-AIENKSK RGR1_danRe .TS..F.VFV......NYR G....L.TG..I.SPAIEN KSK >RGR1_takRub KTCPTINVFLYALGNENYR GGIWQFLTGEKIEAP-QIENKSK RGR1_gasAc .TS..F.VFL......NYR G....L.TGE.IDVPQIEN KSK >RGR1_tetNig KTCPTVNVFLYALGNENYR GGIWQFLTGEKIETP-QLENKTK RGR1_gadMo .TA..F.VFL......NYR G....L.TGE.I.VPQIEN KSK >RGR1_gasAcu KTSPTFNVFLYALGNENYR GGIWQLLTGEKIDVP-QIENKSK RGR1_takRu .TC....VFL......NYR G....F.TGE.I.APQIEN KSK >RGR1_oryLat KTSPTFNPLLYALGNENYR GGIWQFLTGEKIHVP-QDDNKSK RGR1_tetNi .TC..V.VFL......NYR G....F.TGE.I.TPQLEN KTK >RGR1_gadMor KTAPTFNVFLYALGNENYR GGIWQLLTGEKIEVP-QIENKSK RGR1_oryLa .TS..F.PLL......NYR G....F.TGE.IHVPQDDN KSK >RGR2_danRer KTSPIFHAVLYAYGNEFYR GGVWQFLTGQK-----SAD-KKK RGR2_danRe .TS.IFH.VL..Y...FYR G.V..F.TG..SADKKK >RGR2_pimPro KTSPIFHAAMYAYGNEFYR GGIWQFLTGQK-----PAD-KKK RGR2_pimPr .TS.IFH.AM..Y...FYR G....F.TG..PADKKK >RGR2_tetNig KTNPIFNALLYTFGNEFYR GGVWHFLTGHKIVDP-VLK-KSK RGR2_tetNi .TN.IF..LL.TF...FYR G.V.HF.TGH.IVDPVL.K SK >RGR2_gasAcu kTNPIFNALLYSFGNEFYR GGVWHFLTGQKMVDP-VVK-KSK RGR2_gasAc .TN.IF..LL.SF...FYR G.V.HF.TG..MVDPVV.K SK >RGR2_oryLat KTNPFFNALLYSFGNEFYR GGVWNFLTGQKIVEP-DVK-KSKQK RGR2_hipHi .TN.IF..LL.SF...FYR G.V.HF.TG..IVDPVV.K SK >RGR2_oncMyk KTNPISNAWLYSFGNEFYR GGVWQFLTGQKFTEP-VVV-KLKGR RGR2_oryLa .TN.FF..LL.SF...FYR G.V.NF.TG..IVEPDV.K SKQK >RGR2_espLuc KMNPIFNALLYSFGNEFYR GGVWQFLTGQKFTEL-VVV-KLKGR RGR2_poeRe .TN.IF..FL.SF...FYR G.V.NF.TG..IVEPDV.K SK >RGR2_gadMor KTNPISNALLYSFGNESYR SGVWHFLTGQKFVEP-SFK-KIK RGR2_oncMy .TN.IS..WL.SF...FYR G.V..F.TG..FTEPVVVK LKGR >RGR2_poeRet 
KTNPIFNAFLYSFGNEFYR GGVWNFLTGQKIVEP-DVK-KSK RGR2_espLu ..N.IF..LL.SF...FYR G.V..F.TG..FTELVVVK LKGR >RGR2_hipHip KTNPIFNALLYSFGNEFYR GGVWHFLTGQKIVDP-VVK-KSK RGR2_gadMo .TN.IS..LL.SF...SYR S.V.HF.TG..FVEPSF.K IK
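The dotted rows in the alignment above are a difference display: each residue identical to the human reference is replaced by a period, so that only substitutions stand out. A minimal sketch of that display follows, assuming the two sequences are already aligned to equal length and that comparison ignores letter case (as the lowercase entries above suggest); the function name `difference_row` is hypothetical.

```python
def difference_row(reference: str, aligned: str) -> str:
    """Return `aligned` with every position matching `reference` shown as '.'.

    Both strings must already be aligned to the same length; mismatching
    residues are reported in upper case.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to equal length")
    ref, seq = reference.upper(), aligned.upper()
    return "".join("." if r == s else s for r, s in zip(ref, seq))


# Post-lysine helix segment of human RGR as reference, gibbon and cat as queries.
human = "KMVPTINAINYALGNEMVC"
gibbon_diff = difference_row(human, "KMVPTINAVNYALGNEMVC")
cat_diff = difference_row(human, "kaVPTINAINYALGSEMVH")
```

Applied to the ungapped 19-residue segment this reproduces the corresponding dotted rows; the tail segment would first need its gap columns handled consistently.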
Peropsin exhibits greater conservation both in its post-K helix and in its cytoplasmic tail than RGR. The FR motif is perfectly conserved throughout vertebrates. Length, ancestrally 32 residues, experienced an era of variability in amniotes but then settled down to a fixed 35 residues in mammals. The difference alignment shows that a central motif EITISN conserved in early vertebrates changed character completely (to TMPVTS) in mammals, though the earlier motif still appears faded in platypus. A cysteine conserved back to invertebrates might be palmitoylated; conserved serines and threonines offer potential phosphorylation sites.
The cytoplasmic tail of peropsin is completely unalignable to RGR. Unlike RGR, a tblastn search of the peropsin tail against the whole human genome elicits matches to imaging opsins and a GPCR (neuropeptide Y receptor). While these matches are weak and largely driven by the last transmembrane section alone, 3 early tail residues (*) emerge as possible conserved residues. Whether or not homologically valid, this suggests modeling the first 9 residues of the peropsin tail on the known bovine rhodopsin structure.
* * * peropsin KSSTFYNPCIYVVANKKFR RAMLAMFKC KS+T YNP IYV N++FR +L +F C LWS opsin KSATIYNPVIYVFMNRQFR NCILQLF RHO opsin KSAAIYNPVIYIMMNKQFR NCMLTTICC NPY2R GPCR ..STFANPLLYGWMNSNYR KAFLSAFRC Conserved ksstfynpciyv.ankkFR rAm.aMfkCqthq.mpvts.lpm.vsq.pl.sgr. PER_homSap KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_homSap KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQTMPVTS ILPMDVSQNPLASGRI PER_panTro KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_panTro ................... ................... ................ PER_gorGor ksstfynpciyvvankKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_gorGor ................... ................... ................ PER_ponPyg KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_ponPyg ................... ................... ................ PER_nomLeu KSSTFYNPCIYVVANKKFR KAMLAMFKWPNHQTMPGTSILPMDVSQNPLTSGKI PER_nomLeu ................... K.......WPN.....G.. ...........T..K. PER_macMul KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_macMul ................... ................... ................ PER_papHam KSSTFYNPCIYMVANKKFR RAMLAMFKCQTHQTMPVTSILPMDVSQNPLASGRI PER_papHam ...........M....... ................... ................ PER_calJac KSSTFYNPCIYVVANKKFR RAMLAMLKCQTHQTMPVTSVLPMDISQNPLASGRI PER_calJac ................... ......L............ V....I.......... PER_tarSyr ksstfynpciyvvankKFR RAMFAMLKCQTYQAMPATSSLPMNVSQNPLTSGKN PER_tarSyr ................... ...F..L....Y.A..A.. S...N......T..KN PER_otoGar KSSTFYNPCIYVVANKKFR RAMFAMFKCQTHQAMAVTSILPMDISQNPLASRRI PER_otoGar ................... ...F.........A.A... .....I.......R.. PER_micMur KSSTFYNPCIYVIANKKFR RAMFAMFKCQTHQAMPVTSIFPMGVSQNPLPSGRT PER_micMur ............I...... ...F.........A..... .F..G......P...T PER_tupBel KSSTFYNPCIYVLANKKFR KAMCAMFKCQTHQAMSVTSVLPMASSPRPLAPARV PER_tupBel ............L...... K..C.........A.S... V...AS.PR...PA.V PER_musMus KSSTFYNPCIYVAAHKKFR KAMLAMFKCQPHLAVPEPSTLPMDMPQSSLAPVRI PER_musMus ............A.H.... K.........P.LAV.EP. 
T....MP.SS..PV.. PER_ratNor KSSTFYNPCIYVAANKKFR KAMFAMLKCQPHQAMPEPSTLAMGVPHSPLAPARI PER_ratNor ............A...... K..F..L...P..A..EP. T.A.G.PHS...PA.. PER_ochPri KSSTFYNPCIYVAANKRSR RAMFAMFKCQIPQAKPVTSLSPRDVSQSPLSSGRT PER_cavPor ............I...... ...F...Q.....AV..A. .....A..S....... PER_cavPor KSSTFYNPCIYVIANKKFR RAMFAMFQCQTHQAVPVASILPMDASQSPLASGRI PER_dipOrd ................... ......L......A..... ................ PER_speTri KSSTFYNPCIYVAANKRFR RAMFAMFKCQTHQAMPVTSVLPMDVSQSPRASGRI PER_speTri ............A...R.. ...F.........A..... V.......S.R..... PER_oryCun KSSTFYNPCIYVAANKRFR RAMFAMFKCQTHQAMPVTSVLPMDVSQNPLPSGII PER_ochPri ............A...RS. ...F......IP.AK.... LS.R....S..S...T PER_dipOrd KSSTFYNPCIYVVANKKFR RAMLAMLKCQTHQAMPVTSILPMDVSQNPLASGRI PER_oryCun ............A...R.. ...F.........A..... V..........P..I. PER_bosTau KSSTFYNPCIYVIANKKFR RAMLAMFKCQTTQAMPVTSVLPMDVPQNPLTSGKV PER_bosTau ............I...... ...........T.A..... V.....P....T..KV PER_turTru KSSTFYNPCIYVIANKKFR RAMLAMFKCQTHQAMPMESILPMDVPQNPLTSGKV PER_turTru ............I...... .............A..ME. ......P....T..KV PER_susScr KSSTFYNPCIYVIANKKFR RAMLAMFKCQTHQAMPLESTLPMDVPQNPLASGRV PER_vicVic ............I...... .............A..M.. ......P....T...L PER_vicVic KSSTFYNPCIYVIANKKFR RAMLAMFKCQTHQAMPMTSILPMDVPQNPLTSGRL PER_susScr ............I...... .............A..LE. T.....P........V PER_canFam KSSTFYNPCIYVVANKKFR KAIFAMFKCQTHQAMPGTSILPMDVSQNPLASGRN PER_canFam ................... K.IF.........A..G.. ...............N PER_felCat ksstfynpciyvvankKFR KAMFAMFKCENRQPMPVTSILPMDVSQNPLTSGRK PER_felCat ................... K..F.....ENR.P..... ...........T...K PER_equCab KSSTFYNPCIYVVANKKFR RAMFAMFKCQTHRAMPVTSILPMDVPQNQLASGRI PER_equCab ................... ...F........RA..... ......P..Q...... PER_myoLuc KSSTFYNPCIYVVANKKFR RAMFAMFKCQTHQTMTTMSFLPMDVPQNPLTSGRI PER_myoLuc ................... ...F...........TTM. F.....P....T.... 
PER_pteVam KSSTFYNPCIYVVANKKFR RAMFAMFKCQDHQSMPVTSVLPMDVPQNPLTSGRI PER_pteVam ................... ...F......D..S..... V.....P....T.... PER_eriEur KSSTFYNPCIYVLANKKFR RAMFAMFKCQTHQAMPVTNTLPMDIPQK-LDSRRN PER_eriEur ............L...... ...F.........A....N T....IP.K-.D.R.N PER_sorAra KSSTFYNPCIYVVANKKFR RAMSAMLTCRAQGAMPAASTLPMDAAHSPQASGRN PER_sorAra ................... ...S..LT.RAQGA..AA. T....AAHS.Q....N PER_loxAfr KSSTFYNPCIYVVANKKFR RAMFAMFKCQTHQAEPVTCILPMNVSQNPLAAGRI PER_loxAfr ................... ...F.........AE...C ....N.......A... PER_echTel ksstfynpciyvvankKFR RAMFALLQCQPQEARRVTSILPMNVSQNPMASGRL PER_echTel ................... ...F.LLQ..PQEARR... ....N.....M....L PER_proCap KSSTFYNPCIYVVANKKFR RAMLAMFKCQTHQAVPVTNILPMTVSQNSSASGRI PER_proCap ................... .............AV...N ....T....SS..... PER_choHof KSSTFYNPCIYVVANKKFR TIMFAMLKCQTHQAVPVTSILPMNVSENPLASGRI PER_choHof ................... TI.F..L......AV.... ....N..E........ PER_dasNov KSSTFYNPCIYVVANKKFR RAIFAMLKCQTHQAMPVMSILPMNVSENPLASGRI PER_dasNov ................... ..IF..L......A...M. ....N..E........ PER_monDom KSSTFYNPCIYVAANKKFR RAISAMIRCQTHQSMPISNALPMN PER_monDom ............A...... ..IS..IR.....S..ISN A...N PER_macEug KSSTFYNPCIYVAANKKFR RAISAMMRCETHQSMPVSNALPLNLT PER_macEug ............A...... ..IS..MR.E...S...SN A..LNLT PER_ornAna KSSTFYNPCIYVVANKKFR RAMLSMVQCQTHREITITDVLPMNRSRSPLTL PER_ornAna ................... ....S.VQ....REITI.D V...NR.RS..TL PER_galGal KSSTFYNPCIYVIANKKFR RAILAMVRCQTRQEITISNALPMTVSLSALTS PER_galGal ............I...... ..I...VR...R.EITISN A...T..LSA.T. PER_taeGut KSSTFYNPCIYVIANKKFR RAILAMVRCQTRQEITINNALPMSVSQSALTSQNSSHLPA PER_taeGut ............I...... ..I...VR...R.EITINN A...S...SA.T.QNSSHL PA PER_anoCar KSSTFYNPCIYVIANKRFR RAILAMIRCQTRQEITINNVLPMSVSQSTIA PER_anoCar ............I...R.. ..I...IR...R.EITINN V...S...STI. PER_xenTro KSSTFYNPCIYVIANKKFR RAILSMVQCKSRQEVTLDNHFPMNVSQSTLTT PER_xenTro ............I...... 
..I.S.VQ.KSR.EVTLDN HF..N...ST.TT PER_danRer KSSTFYNPCIYVIANKKFR RAIIGMIRCQTRQRVTINNQLPMMASSVPLNP PER_danRer ............I...... ..IIG.IR...R.RVTINN Q...MA.SV..NP PER_gasAcu KSSTFYNPCIYVIANKKFR RAIIGMVRCQTRQRITINSQVPMTTSQQPLTQ PER_gasAcu ............I...... ..IIG.VR...R.RITIN. QV..TT..Q..TQ PER_oryLat KSSTFYNPCIYVIANKKFR RAIIGMIRCQTRQRITISTQVPMTISQQPLTQ PER_oryLat ............I...... ..IIG.IR...R.RITIST QV..TI..Q..TQ PER_takRub KSSTFYNPCIYVIANKKFR RAIIGMIRCQTRQQMTINTEIPMTTSQQTATQ PER_takRub ............I...... ..IIG.IR...R.Q.TINT EI..TT..QTATQ PER_tetNig KSSTFYNPCIYVITNKKFR QAIIGMIRCQTRQQITINTDIPMTASQQTLTQ PER_tetNig ............IT..... Q.IIG.IR...R.QITINT DI..TA..QT.TQ PER_calMil KSSTFYNPCIYVIANKKFR KAIMAMICCQNRQEITINHTLPMTISRVPLTE PER_calMil ............I...... K.IM..IC..NR.EITINH T...TI.RV..TE PER1b_sacK KIPAVFNPVIYVALNPEFR KYFGKTIGCRRKRKKPIAVRLNGSEQNVENTI PER1b_sacK .IPAVF..V...AL.PE.. KYFGKTIG.RRKRKK.IAV R.NGSEQNVENTI
Reference sequence collection
Cytoplasmic loop C2 from 101 melanopsins
species    helix bridge area  hel transmemb Le 7 9
MEL1_homSa DRYLV ITRPLATFGVAS KRR AAFVLLGVW 20 T P
MEL1_panTr DRYLV ITRPLATFGVAS KRR AAFVLLGVW 20 T P
MEL1_gorGo DRYLV ITRPLATFGVAS KRR AAFVLLGVW 20 T P
MEL1_ponAb DRYLV ITRPLATIGVAS KRR AAFVLLGVW 20 T P
MEL1_rheMa DRYLV ITRPLATIGVAS KRR AAFVLLGVW 20 T P
MEL1_calJa DRYLV ITRPLATIGVAS TKR AAFVLLGVW 20 T P
MEL1_micMu DRYLV ITRPLASVGTAS KRR AGLVLLGVW 20 T P
MEL1_otoGa DRYLV ITRPLTTVGVAS KRR AALVLLGVW 20 T P
MEL1_musMu DRYLV ITRPLATIGRGS KRR TALVLLGVW 20 T P
MEL1_ratNo DRYLV ITRPLATIGMRS KRR TALVLLGVW 20 T P
MEL1_nanEh DRYLV ITRPLATIGVAS KRR TALVLLGVW 20 T P
MEL1_phoSu DRYLV ITRPLATIGMGS KRR TALVLLGIW 20 T P
MEL1_dipOr DRYLV ITRPLATIGVTS KRR TAFVLLGVW 20 T P
MEL1_cavPo DRYLV ITRPLATIGVAS KRQ AALVLLGVW 20 T P
MEL1_speTr DRYLV ITRPLATIGMAS KKR AAFFLLGVW 20 T P
MEL1_oryCu DRYLV ITRPLAAVGMVS KKR AGLVLLGVW 20 T P
MEL1_ochPr DRYLV ITRPLAAVGMVS KRR TGLVLLGVW 20 T P
MEL1_bosTa DRYLV ITRPLATVGMVS KRR AALVLLGVW 20 T P
MEL1_turTr DRYLV ITRPLATVGMVS KRR AALVLLGVW 20 T P
MEL1_susSc DRYLV ITHPLATVGMVS KRR AALVLLGVW 20 T P
MEL1_equCa DRYLV ITRPLATVGVVS KRW AALVLLGIW 20 T P
MEL1_felCa DRYLV ITHPLATIGVVS KRR AALVLLGVW 20 T P
MEL1_canFa DRYLV ITHPLAAVGVVS KRR AALVLLGVW 20 T P
MEL1_myoLu DRYLV ITRPLA-IGVVS KRR AALVLLGVW 19 T P
MEL1_pteVa DRYLV ITRPLAAIGVVS KRR AALVLLGVW 20 T P
MEL1_eriEu DRYLV ITRPLATIGVVS KRR VALVLLGVW 20 T P
MEL1_loxAf DRYLV ITRPLATIGVVS KRR AALVLLGIW 20 T P
MEL1_proCa DRYLV ITRPLATIGVVS KRR TALVLLGTW 20 T P
MEL1_echTe DRYLV ITRPLATIGVVS KRR AALVLLVIW 20 T P
MEL1_smiCr DRYFV ITRPLASIGMIS KKK TGLILLGVW 20 T P
MEL1_monDo DRYFV ITRPLASIGVIS KKK TGFILLGVW 20 T P
MEL1_ornAn DRYFV ITRPLASIGVIS KKR ALLILTGVW 20 T P
MEL1_anoCa DRYFV ITRPLASIGAMS TKK ALLILSGVW 20 T P
MEL1_taeGu DRYFV ITKPLASVGVTS KKK ALIILVGVW 20 T P
MEL1_galGa DRYFV ITKPLASVRVMS KKK ALIILVGVW 20 T P
MEL1_xenTr DRYFV ITRPLTSIGVMS KKR AVLILSGVW 20 T P
MEL1_danRe DRYFV ITRPLASIGVLS QKR ALLILLVAW 20 T P
MEL1_danRe DRYFV ITRPLASIGVMS RKR ALLILSAAW 20 T P
MEL1_takRu DRYFV ITRPLTSIGVLS RKR AFVILMTVW 20 T P
MEL1_gasAc DRYFV ITRPLTSIGMMS RRR ALLILMGAW 20 T P
MEL1_oryLa DRYFV ITRPLTSIGVLS RKR ALLILSAAW 20 T P
MEL1_calMi DRYFV ITRPLASIGVLS HRR AGLIILSLW 20 T P
MEL1_petMa DRYLV LTRPLASIGAMS KRR AMYITAAVW 20 T P
MEL2_galGa DRYLV ITKPLRSIQWTS KKR TIQIIAAVW 20 T P
MEL2_anoCa DRYCV ITKPLQSIKRTS KKR TCIIIVFVW 20 T P
MEL2_xenLa NRYIV ITKPLQSIQWSS KKR TSQIIVLVW 20 T P
MEL2_danRe DRYLV ITKPLQTIQWNS KRR TGLAILCIW 20 T P
MEL2_tetNi DRYVV ITKPLQTIRRSS KRR TALAILMVW 20 T P
MEL2_gasAc DRYLV ITKPLQAIHWGS KRR TTLAILLVW 20 T P
MEL1_plaDu DRFYV ITNPLGAAQTMT KKR AFIILTIIW 20 T P
MEL1_capCa DRYMV IAKPFYAMKHVS HKR SLIQIILAW 20 A P
MEL1_helRo DRYLV VGQPLAMLNQSH FRR SFYHVLIIW 20 G P
MEL1_todPa DRYNV IGRPMAASKKMS HRR AFIMIIFVW 20 G P
MEL1_schMe DRYFV IAQPFQTMKSLT IKR AIIMLVFVW 20 A P
MEL2_schMa DRYLV IATPFESVFQTT PRR TLLLMLFLW 20 A P
MEL1_lotGi DRYLV ITSPFTAMRNMT HKR AFLMIVGVW 20 T P
MEL1_sepOf DRYNV IGRPMAASKKMS HRR AFLMIIFVW 20 G P
MEL1_entDo DRYNV IGRPMAASKKMS HRR AFLMIIFVW 20 G P
UVV_camAb  DRYST IARPLDGKLS   RGQ VLLLIMLIW 18 A P
UVV_catBo  DRYST IARPLDGKLS   RGQ VILLIALIW 18 A P
UVV_apiMe  DRYST IARPLDGKLS   RGQ VILFIVLIW 18 A P
BLU_apiMe  DRYRT ISCPIDGRLN   SKQ AAVIIAFTW 18 S P
BLU_droMe  DRYKT ISNPIDGRLS   YGQ IVLLILFTW 18 S P
BLU_manSe  DRYKT ISSPLDGRIN   TVQ AGLLIAFTW 18 S P
UVV1_droMe DRYNV ITKPMNRNMT   FTK AVIMNIIIW 18 T P
UVV1_pedHu DRCET ITNPL-QKSG   KKK AFLLAAFTW 18 T P
UVV_manSe  DRHST ITRPLDGRLS   EGK VLLMVAFVW 18 T P
UVV_papXu  DRHST ITRPLDGRLS   RGK VLLMMVCVW 18 T P
UVV2_droMe DRFNV ITRPMEGKMT   HGK AIAMIIFIY 18 T P
UVV2_pedHu DRYQV IVHPLER-KT   KAA VYFQILLIW 18 V P
LWS_nemVe  DRYIV IVHPMKKIMT   RKK AALMIVGVW 18 V P
LWS_pedHu  DRYNV IVKGLSAKPMT  IKM ALLNILFVW 19 V G
LWS_vanCa  DRYNV IVKGIAAKPLT  ING AMLRVLGIW 19 V G
LWS_papXu  DRYNV IVKGIAAKPMT  ING ALLRILGIW 19 V G
LWS_helSa  DRYNV IVKGIAAKPMT  ING ALLRVFGIW 19 V G
LWS_pieRa  DRYNV IVKGIAAKPMT  INS ALLRILGVW 19 V G
LWS_manSe  DRYNV IVKGIAAKPMT  SNG ALLRILGIW 19 V G
MWS2_droMe DRYNV IVKGINGTPMT  IKT SIMKILFIW 19 V G
LWS_rhoPr  DRYNV IVKGISAKPMT  NKT AMLRILLVW 19 V G
LWS_meoOe  DRYNV IVKGISGTPLS  QKN TTLQVLFVW 19 V G
LWS_catBo  DRYNV IVKGLSAKPMT  ING ALLRILGIW 19 V G
LWS_schGr  DRYNV IVKGLSAKPMT  NKT AMLRILFIW 19 V G
LWS_triCa  DRYNV IVKGLSAQPLT  KKG AMLRILIIW 19 V G
LWS2_apiMe DRYNV IVKGLSGKPLS  ING ALIRIIAIW 19 V G
LWS_bomTe  DRYNV IVKGLSGKPLT  ING ALLRILGIW 19 V G
MWS_calEr  DRYNV IVKGMAGQPMT  IKL AIMKIALIW 19 V G
MWS1_droMe DRYQV IVKGMAGRPMT  IPL ALGKIAYIW 19 V G
LWS_droMe  DRYCV IVKGMARKPLT  ATA AVLRLMVVW 19 V G
LWS_arcGr  DRYNV IVKGVAAEPLT  SKG ASIRILFVW 19 V G
LWS_eupSu  DRYNV IVKGVAATPLT  NKG AFARNIFSW 19 V G
LWS_camLu  DRYNV IVKGVAGEPLS  TKK ASLWILTVW 19 V G
LWS_proMi  DRYNV IVKGVAGEPLS  TKK ASLWILIVW 19 V G
LWS_holCo  DRYNV IVKGVSAEPLT  SGG AMMRIAGTW 19 V G
LWS_homGa  DRYNV IVKGVSATPLT  TNG AMLRNLFSW 19 V G
LWS_neoAm  DRYNV IVKGVSGEPLT  NSG AMTRIAGTW 19 V G
LWS_neoOe  DRYNV IVKGVSGKPLS  QKN ATLQVLFVW 19 V G
LWS_mysDi  ERYNV IVKGVSSKPLS  VKG AITRIVLTW 19 V G
LWS1_apiMe DRYNV IVKGMSGTPLT  IKR AMLQILGIW 19 V G
LWS_limPo  DRYNV IVRGMAAAPLT  HKK ATLLLLFVW 19 V G
LWS_limPo  DRYNV IVRGMAAAPLT  HKK ATLLLLFVW 19 V G
LWS_ixoSc  DRYNV IVRGVAAAPLT  HKR AALMIFFVW 19 V G
ADRB2_homS DRYFA ITSPFKYQSLLT KNK ARVIILMVW 20 T P
ADRA2A_hom DRYWS ITQAIEYNLKRT PRR IKAIIITVW 20 T A
ADRA2C_hom DRYWS VTQAVEYNLKRT PRR VKATIVAVW 20 T A
HTR1A_homS DRYWA ITDPIDYVNKRT PRR AAALISLTW 20 T P
CHRM1_homS DRYFS VTRPLSYRAKRT PRR AALMIGLAW 20 T P
DRD2_homSa DRYTA VAMPMLYNTRYS KRR VTVMISIVW 21 A P
TAAR9_homS DRYIA VTDPLTYPTKFT VSV SGICIVLSW 20 T P
ADRA2B_hom DRYWA VSRALEYNSKRT PRR IKCIILTVW 20 S A
Reference collection of 352 cytoplasmic loop sequences from all opsins
Each row begins with the sequence name. The second column contains the sequence of the C2 (second cytoplasmic) loop, and the third column shows its continuation into transmembrane helix 4. The end of the loop is located by counting back from the invariant tryptophan (position 160 in squid melanopsin), supported by crystallography and transmembrane prediction tools. The remaining columns give the loop length and the residues at positions 7 and 9 of the loop, which are potentially informative because they are generally characteristic of orthology class.
RHO1_homSa ERYVVVCKPMSNFRFGENH AIMGVAFTW 19 C P
RHO1_bosTa ERYVVVCKPMSNFRFGENH AIMGVAFTW 19 C P
RHO1_ornAn ERYIVVCKPMSNFRFGENH AIMGVAFTW 19 C P
RHO1_monDo ERYVVVCKPMSNFRFGENH AIIGVAFTW 19 C P
RHO1_galGa ERYVVVCKPMSNFRFGENH AIMGVAFSW 19 C P
RHO1_calMi ERYVVVCKPMSNFRFGTNH AIMGVAFTW 19 C P
RHO1_xenTr ERYVVVCKPMANFRFGENH AIMGVVFTW 19 C P
RHO1_latCh ERYVVVCKPMSNFRFGENH AIMGVIFTW 19 C P
RHO1_neoFo ERYIVVCKPISNFRFGENH AIMGVVFTW 19 C P
RHO1_angAn ERWVVVCKPMSNFRFGENH AIMGLAFTW 19 C P
RHO1_takRu ERYIVVCKPMTNFRFGEKH AIAGLVFTW 19 C P
RHO1_leuEr ERYMVVCKPMANFRFGSQH AIIGVVFTW 19 C P
RHO1_petMa ERYIVICKPMGNFRFGSTH AYMGVAFTW 19 C P
RHO1_letJa ERYIVICKPMGNFRFGNTH AIMGVAFTW 19 C P
RHO1_geoAu ERYIVICKPMGNFRFGNTH AIMGVALTW 19 C P
RHO2_galGa ERYIVVCKPMGNFRFSATH AMMGIAFTW 19 C P
RHO2_gekGe ERYIVICKPMGNFRFSATH AIMGIAFTW 19 C P
RHO2_anoCa ERYIVVCKPMGNFRFSATH ALMGISFTW 19 C P
RHO2_taeGu ERYIVICKPMGNFRFSASH ALMGIAFTW 19 C P
RHO2_podSi ERYIVVCKPMGNFRFSSSH ALMGIAFTW 19 C P
RHO2_pheMa ERYIVICKPMGNFRFSSSH AMMGISFTW 19 C P
RHO2_latCh ERYIVVCKPMGNFRFASSH AIMGIAFTW 19 C P
RHO2_geoAu ERYIVVCKPMGNFRFATTH AALGVVFTW 19 C P
RHO2_neoFo ERYIVVCKPMGNFRFSNNH SIIGIVFTW 19 C P
RHO1_anoCa ERYVVICKPMSNFRFGETH ALIGVSCTW 19 C P
RHO1_conMy ERWMVVCKPVTNFRFGESH AIMGVMVTW 19 C P
RHO2_ancDa ERYIVVCKPMGSFKFSSSH AMAGIAFTW 19 C P
RHO2a_danR ERYIVVCKPMGSFKFSANH AMAGIAFTW 19 C P
RHO2b_danR ERYIVVCKPMGSFKFSSNH AMAGIAFTW 19 C P
RHO2c_danR ERYIVVCKPMGSFKFSSNH AFAGIGFTW 19 C P
RHO2d_danR ERYIVVCKPMGSFKFSASH AFAGCAFTW 19 C P
RHO2_oryLa ERYIVVCKPMGSFKFTATH SAAGCAFTW 19 C P
RHO2_takRu ERYVVVCKPMGSFKFTGTH AAVGVAFTW 19 C P
RHO2_gasAc ERYIVVCKPMGSFKFSGTH AGAGVLFTW 19 C P
RHO2_hipHi ERYIVVCKPMGSFKFSGTH AGIGVLFTW 19 C P
RHO2_mulSu ERYIVVCKPMGSFKFSGTH AGAGVAFTW 19 C P
RHO2_oreNi ERYIVVCKPMGSFKFTGAH AGAGVLFTW 19 C P
RHO2_pomMi ERYIVVCKPMGSFKFSGAH AGAGVALTW 19 C P
SWS2_ornAn ERFLVICKPLGNLSFRGTH AIFGCAATW 19 C P
SWS2_anoCa ERYLVICKPLGNFTFRGTH AIIGCAVTW 19 C P
SWS2_utaSt ERFLVICKPLGNFSFRGTH AIIGCIITW 19 C P
SWS2_taeGu ERFLVICKPLGNFTFRGSH AVLGCAITW 19 C P
SWS2_galGa ERFLVICKPLGNFTFRGSH AVLGCVATW 19 C P
SWS2_neoFo ERFLVICKPLGNFTFRSTH AIIGCVATW 19 C P
SWS2_xenTr ERFLVICKPMGNFTFRESH AVLGCILTW 19 C P
PIN_galGal ERYVVVCRPLGDFQFQRRH AVSGCAFTW 19 C P
PIN_pheMad ERYLVICKPVGDFQFQRRH AVIGCLYTW 19 C P
PIN_utaSta ERYLVICKPVGDFRFQQRH AVFGCVFTW 19 C P
PIN_xenTro ERYLVICKPMGDFRFQQKH AILGCSFTW 19 C P
PIN_bufJap ERYIVICKPMGDFRFQQRH AVMGCAFTW 19 C P
PIN_podSic ERYLVICKPVGDFRFPARH AVLGCAFTW 19 C P
PIN_calMil ERYIVICKPMGDFRFQQKH AVWGCLFTW 19 C P
SWS1_homSa ERYIVICKPFGNFRFSSKH ALTVVLATW 19 C P
SWS1_monDo ERFIVICKPFGNFRFNSKH AMMVVLATW 19 C P
SWS1_smiCr ERFIVICKPFGNFRFNSKH AMMVVLATW 19 C P
SWS1_tarRo ERFIVICKPFGNFRFSSKH AMMVVLATW 19 C P
SWS1_taeGu ERYIVICKPFGNFRFNSRH ALLVVAATW 19 C P
SWS1_anoCa ERYIVICKPFGNFRFNSRH ALLVVAATW 19 C P
SWS1_utaSt ERYIVICKPFGNFRFNSKH ALLVVAATW 19 C P
SWS1_galGa ERYIVICKPFGNFRFSSRH ALLVVVATW 19 C P
SWS1_geoAu ERYIVICKPFGNFRFGSKH ALVAVGLTW 19 C P
SWS1_neoFo ERYLVICKPIGNFRFGSKH SMIAVVAAW 19 C P
SWS1_xenLa ERYIVICKPMGNFNFSSSH ALAVVICTW 19 C P
SWS1_petMa ERYIVICKPFGNFRFGSIH SLFAFCLTW 19 C P
SWS1_danRe ERYVVICKPFGSFKFGQGQ AVGAVVFTW 19 C P
SWS1_oryLa ERYLVICKPFGAFKFGSNH ALAAVIFTW 19 C P
SWS2_geoAu ERCLVICKPFGNIAFRGTH ALIRCGFAW 19 C P
SWS2_takRu ERWLVVCKPLGNFIFKPDH AIVCCIFTW 19 C P
SWS2_gasAc ERWLVICKPLGNFIFKPDH ALVCCAFTW 19 C P
LWS_homSap ERWMVVCKPFGNVRFDAKL AIVGIAFSW 19 C P
LWS_monDom ERWVVVCKPFGNVKFDAKL AMVGIIFSW 19 C P
LWS_ornAna ERWIVVCKPFGNVKFDAKL AMVGIVFSW 19 C P
LWS_anoCar ERWVVVCKPFGNVKFDAKL AVAGIVFSW 19 C P
LWS_galGal ERWFVVCKPFGNIKFDGKL AVAGILFSW 19 C P
LWS_xenTro ERWFVVCKPFGNIKFDGKL AATGIIFSW 19 C P
LWS_neoFor ERWVVVCKPFGNIKFDGKW AAGGIIFSW 19 C P
LWS_calMil ERWVVVCKPFGNVKFDGKW AAFGIIFSW 19 C P
LWS_takRub ERWVVVCKPFGNVKFDAKW ATGGIVFSW 19 C P
LWS_gasAcu ERWIVVCKPFGNVKFDAKW ATAGIVFSW 19 C P
LWS_petMar ERWMVVCKPFGNIKFDGKI ATILIVFSW 19 C P
LWS_letJap ERWMVVCKPFGNIKFDGKI AIILIVFSW 19 C P
LWS_geoAus ERWMVVCKPFGNLKFDGKV AIVLIIFSW 19 C P
VAOP_galGa ERYIVICRPVGNMRLRGKH AAQGIAFVW 19 C P
VAOP_anoCa ERYVVICRPLGNMRLNGKH AALGVAFVW 19 C P
VAOP_xenTr ERYIVICRPLGNLRLQGKH SALAIIFVW 19 C P
VAOP_danRe ERFFVICRPLGNIRLRGKH AALGLVFVW 19 C P
VAOP_rutRu ERFFVICRPLGNIRLRGKH AALGLLFVW 19 C P
VAOP_takRu ERFFVICRPLGNMRLQAKH AAIGLLFVW 19 C P
VAOP_petMa ERYFVICRPLGNFRLQSKH AVLGLAVVW 19 C P
PPIN_anoCa DRAIVIAKPMGTITFTTRK AMIGVAVSW 19 A P
PPIN_xenTr DRVFVVCKPMGTLTFTPKQ ALAGIAASW 19 C P
PPIN_ictPu DRYMVVCRPLGAVMFQTKH ALAGVVFSW 19 C P
PPIN_oncMy DRYVVVCRPMGAVMFQTRH AVGGVVLSW 19 C P
PPIN_danRe ERCMVVCRPVGSISFQTRH AVFGVAVSW 19 C P
PPIN_petMa DRFVVVCKPLGTLMFTRRH ALLGITWAW 19 C P
PPIN_letJa DRFVVVCKPLGTLMFTRRH ALLGIAWAW 19 C P
PPIN2_petM ERYVVVCKPLGGVHFGTQH GLCGVAISW 19 C P
PARIE_utaS ERYNVVCQPLGTLQMSTKR GYQLLGFIW 19 C P
PARIE_anoC ERYNVVCQPLGTLQMSTQR AYQLLGFIW 19 C P
PARIE_xenT ERYNVVCEPIGALKLSTKR GYQGLVFIW 19 C P
PARIE_takR ERYNVVCKPRAGLKLTMRR SIIGLLFVW 19 C P
PARIE_gasA ERYNVVCRPRNALKLSMRR SIHGLLIVW 19 C P
PARIE_danR ERYNVVCKPMAGFKLNVGR SCQGLLLVW 19 C P
PER_homSap DRYLTICLPDVGRRMTTNT YIGLILGAW 19 C P
PER_panTro DRYLTICLPDVGRRMTTNT YIGLILGAW 19 C P
PER_nomLeu DRYLTICLPDVGRRMTTNT YIGLILGAW 19 C P
PER_gorGor DRYLTICLPDVGRRMTTNT YIGLILGAW 19 C P
PER_ponPyg DRYLTICLPDIGRRMTTNT YIGLILGAW 19 C P
PER_macMul DRYLTICLPDIGRRMTTNT YIGMILGAW 19 C P
PER_papHam DRYLTICLPDIGRRMTTNT YIGMILGAW 19 C P
PER_otoGar DRYLTICRPDIGRRMTTNS YIGMILGAW 19 C P
PER_tarSyr DRYLTICRPDIGRRMTTNT YVGMILGAW 19 C P
PER_micMur DRYLTICRPDIGRRMTTHT YVGMILGAW 19 C P
PER_cavPor DRYLTICRPDIGRRMTSHS YVGMILGAW 19 C P
PER_ochPri DRYLTICQPDIGRRMTTHT YFGMILGAW 19 C P
PER_oryCun DRYLTICHPDVGRRMTTRT YLGLILGAW 19 C P
PER_calJac DRYLTICLPDIGRRMTTST YIIMILGAW 19 C P
PER_canFam DRYLTICSPDTGRRMTTNT YISMILGAW 19 C P
PER_felCat DRYLTICSPNSGRRMTTNT YISMILGAW 19 C P
PER_susScr DRYLTICRPEAGRRMTTNT YISMILGAW 19 C P
PER_vicVic DRYLTICRPDAGRRMTTNT YISMILGAW 19 C P
PER_turTru DRYLTICCPGAGRRMTTNT YISMILGAW 19 C P
PER_bosTau DRYLTICHPDAGRRMTANT YISMILGAW 19 C P
PER_choHof DRYLTICHPDVGRRMTINT YISMILGAW 19 C P
PER_dasNov DRYLTICRPDTGRRMTINT YISMILGAW 19 C P
PER_echTel DRYLTICHPDRGRRMTSNT YVGMILGAW 19 C P
PER_loxAfr DRYLTICHPHIGRRMTSNT YVSMILGAW 19 C P
PER_sorAra DRYLTLCRPDAGRSMTTNS YVGLILGAW 19 C P
PER_equCab DRYLTTCRPDAGRRMTTST YTSMILGAW 19 C P
PER_dipOrd DRYLTICHPDIGRGMTTRT YVTMILGAW 19 C P
PER_musMus DRYLTISCPDVGRRMTTNT YLSMILGAW 19 S P
PER_ratNor DRYLTISCPDVGRRMTGNT YLSMVLGAW 19 S P
PER_eriEur DRYLTICRPHTGRSMSANS YIAMILGAW 19 C P
PER_tupBel DRYLTLCRPAVGRRMGSST YAAMILGAW 19 C P
PER_monDom DRYLTICQPDLGGRMTSYN YTLMILTAW 19 C P
PER_ornAna DRYLTICRPAIGRKMTRSN YTAMILAAW 19 C P
PER_xenTro DRYLTICRPDIGRRISGRH YTAMILAAW 19 C P
PER_galGal DRYLTICRPDIGRRMTTRN YAALILAAW 19 C P
PER_anoCar DRYLTICKPHIGSRLTATN YTTLILAAW 19 C P
PER_taeGut DRYLTICRPDIGRRMTTRS YATLILAAW 19 C P
PER1_gasAc DRYLTICRPDIGQKMTMQS YNLLILAAW 19 C P
PER_gasAcu DRYLTICRPDIGQKMTMQS YNLLILAAW 19 C P
PER_oryLat DRYLTICRPDLGQKMTMQS YNLLILAAW 19 C P
PER_takRub DRYITICRPDIGRKMTVQS YNLLILAAW 19 C P
PER_tetNig DRYLTICRPDIGRKMTVQS YNLLIAAAW 19 C P
PER_danRer DRYLTICRPDIGQKLTTRS YTLLIVAAW 19 C P
PER1a_sacK DRYWATCSPVEVMELKSKY YTRMTALGW 19 C P
NEUR1_homS DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_nomL DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_panT DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_ponP DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_macM DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_papH DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_calJ DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_tarS DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_cavP DRYLKICYLSYGVWLKRKH AYICLAAIW 19 C L
NEUR1_dasN DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_equC DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_canF DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_susS DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_pteV DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_choH DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_musM DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_ratN DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_loxA DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_felC DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_turT DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_tupB DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_echT DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_dipO DRYLKICYLSYGVWLKRKH AYICLAVIW 19 C L
NEUR1_bosT DRYLKICYLSYGIWLKRKH AYICLAVIW 19 C L
NEUR1_eriE DRYLKICYLSYGVWLKRKH AYLCLAVIW 19 C L
NEUR1_sorA DRYLKICYLSYGVWLKRKH AYICLVVIW 19 C L
NEUR1_speT DRYLKICYLSYGVWLKRKH AFICLAVIW 19 C L
NEUR1_oryC DRYLKICYLSYGVWLKRRH AYICLALIW 19 C L
NEUR1_myoL DRYLKICYLSYGVWLKRKH TYICLAFIW 19 C L
NEUR1_monD DRYLKICHLSYGTWLKRHH AFICLALIW 19 C L
NEUR1_taeG DRYLKICHLSYGTWLKRHH AFICLAIIW 19 C L
NEUR1_galG DRYLKICHLAYGTWLKRHH AFICLALIW 19 C L
NEUR1_ornA DRYLKICHLSYGTWLKRHH AYICLAIIW 19 C L
NEUR1_macE DRYLKICHLSYGTWLKRHH AYICLVIIW 19 C L
NEUR1_gasA DRYLKICHLRYGTWLKRHH AFVCLALVW 19 C L
NEUR1_anoC DRYFKICHLSYGTWLKRHH VFICLGIIW 19 C L
NEUR1_tetN DRYLKICHLRYGAWLKRHH AFLCLASVW 19 C L
NEUR1_xenT DRYLKICHLRYGTWLKRRH AFIALAVIW 19 C L
NEUR1_takR DRYLKICHLRYGTWFKRHH AFLCLVFTW 19 C L
NEUR1_oryL DRYLKICHLRYGTWLKRQH AFLCLVFVW 19 C L
NEUR1_pimP DRYLKICHLRYGTWLKRQH IFLCLVFVW 19 C L
NEUR1_danR DRYLKICHLRYGTWLKRHH AFLSVVFIW 19 C L
NEUR1_calM DRYLKICHLQYGSWLQRRH VFMSLAFIW 19 C L
NEUR2_galG VCCLKICFPAYGNRFRRKH GQILIACAW 19 C P
NEUR2_anoC VCCLKICFPVYGNRFRPGH GWILIACAW 19 C P
NEUR2_oncM VCFVKVCYPLYGNRFNAVH GRLLIACAW 19 C P
NEUR2_xenT VCCLKVCYPAYGNKFSTAH SRILLLGIW 19 C P
NEUR2_danR VCCLKVCFPNYGNKFSSSH ACVMVIGVW 19 C P
NEUR2_pimP VCCLKVCCPNYGNKFSSNH ACVMVIGVW 19 C P
NEUR2_tetN VCCLKVCLPNLGSKFSSSH ARLLVAGVW 19 C P
NEUR2_takR VCCLKVCFPNHGSRFSSSH ARLLVVGVW 19 C P
NEUR2_gasA VCCLKVCFPNHGNRFSSSH ARLLVVAVW 19 C P
NEUR2_oryL VCCLKVCFPNHGNKFSFSH ARLLVAGVW 19 C P
TMT_monDom ERYRTL-TLCPGQGADYQK ALLAVAGSW 19 - L
TMT_macEug ERYRTL-TLCPRQGTDYHK ALLAVAGSW 19 - L
TMT_ornAna ERYRTL-TLHPKQSTDYQK AVLAVGASW 19 - L
TMT_galGal ERYSTL-TLCNKRSDDYRK ALLAVGGSW 19 - L
TMT_taeGut ERYNTL-TLCHKRSDDFRK ALLAVAGSW 19 - L
TMT_anoCar ERYSTL-TQTNKRGSDYQK ALLGVGGSW 19 - Q
TMT_xenTro ERYSTL-TLYNKGGPNFKK ALLAVASSW 19 - L
TMT_danRer ERYCTMMGSTEADATNYKK VIGGVLMSW 19 M S
TMT_pimPro ERYCTMMGATQADSTNYKK VAMGIAFSW 19 M A
TMTa_takRu ERYSTMMTPTEADPSNYCK VCLGITLSW 19 M P
TMT_tetNig ERYSTMMTPTEADSSNYCK VCLGIGLSW 19 M P
TMT_gasAcu ERYSTMVAPTEADSSNYHK ISLGITLSW 19 V P
TMT_oryLat ERYSTMMTPAEADSSNYRK ISLGIILSW 19 M P
TMTb_takRu ERYCTMVSSTIASNRDYRP VLGGICFSW 19 V S
TMTa_calMi DRYITITGTTEADITNYNK TIVGIALSW 19 T T
TMT1_plaDu ERYLAVVRPFDVGNLTNRR VIAGGVFVW 19 V P
TMT2_anoGa ERYCLISRPFSSRNLTRRG AFLAIFFIW 19 S P
TMT_triCas ERYLLIARPFRNNALNFHS AALSVFSIW 19 A P
TMT_bomMor ERYLMVTRPLTSRHLSSKG AVLSIMFIW 19 T P
ENCEPH_hom ERYIRVVHARVINFSW AWRAITYIW 16 V A
TMT_aedAe ERFCLISHPFSSRSLSRRG AVFAILFIW 19 S P
TMT_culPi ERFYLISRPFSSRSLSRRG ALGAVLLIW 19 S P
ENCEPH_lox ERYIRVVHARVINFSW AWRAITYIW 16 V A
TMT1_anoGa ERFCLISRPFAAQNRSKQG ACLAVLFIW 19 S P
ENCEPH_can ERYIRVVHARVINFSW AWRAITYIW 16 V A
TMT_triCa ERYLLIARPFRNNALNFHS AALSVFSIW 19 A P
ENCEPH_oto ERYIRVVHARVINFSW AWRAITYIW 16 V A
ENCEPH_mus ERYIRVVHARVINFSW AWRAITYIW 16 V A
ENCEPH_ano ERYIRVVHARVIDFSW SWRAITYIW 16 V A
ENCEPH_gal ERYIRVVHAKVIDFSW SWRAITYIW 16 V A
ENCEPH_mon ERYNRIVHAKVINFSW AWRAITYIW 16 V A
ENCEPH_pte ERYIRVVQARAIDFSW AWRTITYIW 16 V A
ENCEPH_squ ERYIRVVNATAIDFSW AWRAITYIW 16 V A
ENCEPH_xen ERYARVVYGKYVNSSW SKRSITFVW 16 V G
ENCEPH_dan ERYIRVVHAKVVDFPW AWRAITHIW 16 V A
ENCEPH_tak ERYIRVVHAQVVDFPW AWRAIGHIW 16 V A
ENCEPH_gas ERYIRVVHAQVVDFPW AWRAIGHIW 16 V A
ENCEPH_ory ERYIRVVHAQVVDFPW AWRAIGHIW 16 V A
ENCEPH_cal ERYIRVVNAKATNFPW AWRAITYTW 16 V A
ENCEPH_squ ERYIRVVNATAIDFSW AWRAITYIW 16 V A
ENCEPH_pet ERYARLIKAQVLDFSW AWRAVTYTW 16 I A
RGR_homSap GRYHHYCTRSQLAWNS AVSLVLFVW 16 C R
RGR_panTro GRYHHYCTRSQLAWNS AISLVLFVW 16 C R
RGR_gorGor GRYHHYCTGSTLACKS AVSLVLSGR 16 C G
RGR_macMul GRYHHYCTRSQLAWNS AISLVLFVW 16 C R
RGR_ponPyg GRYHHYCTGSQLAWNS AISLVLFVW 16 C G
RGR_calJac GRYHHYCTGSQLAWNS AISLVLFVW 16 C G
RGR_nomLeu GRYHHYCTGSQLAWNS AISLVLFVW 16 C G
RGR_tarSyr GRYHHYCTGSQLAWNT AISLVLFVW 16 C G
RGR_pteVam GRYHHYCTGSRLAWNT AVSLVLFVW 16 C G
RGR_oryCun GRYHHYCTGSQLAWNT AVLLVLFVW 16 C G
RGR_ochPri GRYHHYCTGSQLAWNT AVLLVLFVW 16 C G
RGR_otoGar GRYHHYCTGRPLAWST AISLVLFVW 16 C G
RGR_micMur GRYHHYCTGSPLAWST AISLVLFVW 16 C G
RGR_musMus GRYHHYCTGRQLAWDT AIPLVLFVW 16 C G
RGR_ratNor GRYHHYCTGRQLAWDT AIPLVLFVW 16 C G
RGR_cavPor GRHQQCCTRGRLTWST AVPLVLFVW 16 C R
RGR_speTri GRYHHYCTGSQLAWNT AIPLVLFVW 16 C G
RGR_sorAra GRYHHYCTGRQLAWDV AIALVIFVW 16 C G
RGR_myoLuc GRYHHYCTGSRLAWRT AASLVLFVW 16 C G
RGR_canFam GRYHHYCTRGQLAWNT AISLVLCVW 16 C R
RGR_felCat GRYHHYCSGSQLAWNT AISLVICVW 16 C G
RGR_bosTau GRYHHFCTGSRLDWNT AVSLVFFVW 16 C G
RGR_turTru GRYHHYCTGSRLDWNT AVSLVFFVW 16 C G
RGR_susScr GRYHHYCTRSRLDWNT AVSLVFFVW 16 C R
RGR_equCab GRYHHYCTRSRLAWNT AVFLVFFVW 16 C R
RGR_eriEur GRYHHHCTRSRLAWNT AVFLVFFVW 16 C R
RGR_dipOrd GRCHHHCTGSLLGWDT AVSLVIFVW 16 C G
RGR_loxAfr ERYHHYCTRSRLAWSS ASALVLFVW 16 C R
RGR_proCap ERYHHYCTGSKLAWSS AGALVLFMW 16 C G
RGR_echTel ERYHHYCTGSQFTWSS ASTLVLFMW 16 C G
RGR_dasNov ERCHRHCIGRRLAWST AGCLVLCLW 16 C G
RGR_choHof ERYRHHCTGSQLSWST AGSLVLCVW 16 C G
RGR_ornAna DRYLRHCSRSKPQWGT AVSTVLFAW 16 C R
RGR_anoCar DRHHQYCTGNKLQWGS VIPMTIFLW 16 C G
RGR_galGal DRYHHYCTRSKLQWST AISMMVFAW 16 C R
RGR_taeGut DRYHHYCTRSRLQWST AVSMMVFAW 16 C R
RGR_xenTro DRYHQYCTRSKLHWST AVSVVFFIW 16 C R
RGR_xenLae DRYHQYCTRSKLHWGT AVSMVLFVW 16 C R
RGR1_gasAc DRYHQYCTRTKLQWSS AITLAVFVW 16 C R
RGR1_takRu DRYHQYCTRTKLQWSS AITLAVFIW 16 C R
RGR1_tetNi DRYHQYCTRTKLQWSS AITLAVFIW 16 C R
RGR1_pimPr DRYHQYCTRTKLQWSS AITLVIFIW 16 C R
RGR1_osmMo DRYHQYCTRTKLQWSS AITLVMFIW 16 C R
RGR1_gadMo DRYHQYCTRTELQWSS AVTLSVFIW 16 C R
RGR1_danRe DRYHQYCTRTKLQWSS AITLVLFTW 16 C R
RGR1_oryLa DRYHQYCTRTKLQWST AITLAVLVW 16 C R
RGR_calMil DRYHQNCSRSRLQWSS AITVTVFIW 16 C R
RGR2_gasAc DRYHQYCTRQKLFWST TLTMSAIIW 16 C R
RGR2_tetNi DRYHQYCTRQKLFWST TLTMSSIIW 16 C R
RGR2_oryLa DRYHQYCTRQKLFWST SITISLIIW 16 C R
RGR2_danRe DRYHQYCTKQKMFWST SITISCLIW 16 C K
RGR2_pimPr DRYHLYCTKQKMFWST SGTISALIW 16 C K
RGR2_gadMo DRYHQYCTRQKLFWST TVTMCCIVW 16 C R
RGR2_hipHi DRYHQYCTRQKLFWST TLTMSGIIW 16 C R
RGR2_oncMy DRYHQYVTNQKLFWST AWTISIIIW 16 V N
RGR2_esoLu DRYHQYVTNQKLFWST AWTFSIIIW 16 V N
RGR2_poeRe DRYHQYCTRQKLFWST TLTMSGIIW 16 C R
MEL1_homSa DRYLVITRPLATFGVASKRR AAFVLLGVW 20 T P
MEL1_panTr DRYLVITRPLATFGVASKRR AAFVLLGVW 20 T P
MEL1_gorGo DRYLVITRPLATFGVASKRR AAFVLLGVW 20 T P
MEL1_ponAb DRYLVITRPLATIGVASKRR AAFVLLGVW 20 T P
MEL1_rheMa DRYLVITRPLATIGVASKRR AAFVLLGVW 20 T P
MEL1_calJa DRYLVITRPLATIGVASTKR AAFVLLGVW 20 T P
MEL1_micMu DRYLVITRPLASVGTASKRR AGLVLLGVW 20 T P
MEL1_otoGa DRYLVITRPLTTVGVASKRR AALVLLGVW 20 T P
MEL1_musMu DRYLVITRPLATIGRGSKRR TALVLLGVW 20 T P
MEL1_ratNo DRYLVITRPLATIGMRSKRR TALVLLGVW 20 T P
MEL1_nanEh DRYLVITRPLATIGVASKRR TALVLLGVW 20 T P
MEL1_phoSu DRYLVITRPLATIGMGSKRR TALVLLGIW 20 T P
MEL1_dipOr DRYLVITRPLATIGVTSKRR TAFVLLGVW 20 T P
MEL1_cavPo DRYLVITRPLATIGVASKRQ AALVLLGVW 20 T P
MEL1_speTr DRYLVITRPLATIGMASKKR AAFFLLGVW 20 T P
MEL1_oryCu DRYLVITRPLAAVGMVSKKR AGLVLLGVW 20 T P
MEL1_ochPr DRYLVITRPLAAVGMVSKRR TGLVLLGVW 20 T P
MEL1_bosTa DRYLVITRPLATVGMVSKRR AALVLLGVW 20 T P
MEL1_turTr DRYLVITRPLATVGMVSKRR AALVLLGVW 20 T P
MEL1_susSc DRYLVITHPLATVGMVSKRR AALVLLGVW 20 T P
MEL1_equCa DRYLVITRPLATVGVVSKRW AALVLLGIW 20 T P
MEL1_felCa DRYLVITHPLATIGVVSKRR AALVLLGVW 20 T P
MEL1_canFa DRYLVITHPLAAVGVVSKRR AALVLLGVW 20 T P
MEL1_myoLu DRYLVITRPLA-IGVVSKRR AALVLLGVW 20 T P
MEL1_pteVa DRYLVITRPLAAIGVVSKRR AALVLLGVW 20 T P
MEL1_eriEu DRYLVITRPLATIGVVSKRR VALVLLGVW 20 T P
MEL1_loxAf DRYLVITRPLATIGVVSKRR AALVLLGIW 20 T P
MEL1_proCa DRYLVITRPLATIGVVSKRR TALVLLGTW 20 T P
MEL1_echTe DRYLVITRPLATIGVVSKRR AALVLLVIW 20 T P
MEL1_smiCr DRYFVITRPLASIGMISKKK TGLILLGVW 20 T P
MEL1_monDo DRYFVITRPLASIGVISKKK TGFILLGVW 20 T P
MEL1_ornAn DRYFVITRPLASIGVISKKR ALLILTGVW 20 T P
MEL1_anoCa DRYFVITRPLASIGAMSTKK ALLILSGVW 20 T P
MEL1_taeGu DRYFVITKPLASVGVTSKKK ALIILVGVW 20 T P
MEL1_galGa DRYFVITKPLASVRVMSKKK ALIILVGVW 20 T P
MEL1_xenTr DRYFVITRPLTSIGVMSKKR AVLILSGVW 20 T P
MEL1_danRe DRYFVITRPLASIGVLSQKR ALLILLVAW 20 T P
MEL1_danRe DRYFVITRPLASIGVMSRKR ALLILSAAW 20 T P
MEL1_takRu DRYFVITRPLTSIGVLSRKR AFVILMTVW 20 T P
MEL1_gasAc DRYFVITRPLTSIGMMSRRR ALLILMGAW 20 T P
MEL1_oryLa DRYFVITRPLTSIGVLSRKR ALLILSAAW 20 T P
MEL1_calMi DRYFVITRPLASIGVLSHRR AGLIILSLW 20 T P
MEL1_petMa DRYLVLTRPLASIGAMSKRR AMYITAAVW 20 T P
MEL2_galGa DRYLVITKPLRSIQWTSKKR TIQIIAAVW 20 T P
MEL2_anoCa DRYCVITKPLQSIKRTSKKR TCIIIVFVW 20 T P
MEL2_xenLa NRYIVITKPLQSIQWSSKKR TSQIIVLVW 20 T P
MEL2_danRe DRYLVITKPLQTIQWNSKRR TGLAILCIW 20 T P
MEL2_tetNi DRYVVITKPLQTIRRSSKRR TALAILMVW 20 T P
MEL2_gasAc DRYLVITKPLQAIHWGSKRR TTLAILLVW 20 T P
MEL1_plaDu DRFYVITNPLGAAQTMTKKR AFIILTIIW 20 T P
MEL1_capCa DRYMVIAKPFYAMKHVSHKR SLIQIILAW 20 A P
MEL1_helRo DRYLVVGQPLAMLNQSHFRR SFYHVLIIW 20 G P
MEL1_todPa DRYNVIGRPMAASKKMSHRR AFIMIIFVW 20 G P
TMT_triCys ERFITIVLPLKRDTILSTKN IYIGLGILW 20 V P
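The column layout of the reference collection can be checked mechanically. The Python sketch below splits one row into its fields, recomputes the length and the residues at positions 7 and 9, and shows the countback from the invariant tryptophan. The function names are hypothetical, and the 9-residue helix 4 stub (`tm4_len`) is an assumption read off the third column of the table rather than a stated rule; gap characters are counted as positions, as the TMT rows suggest.

```python
def parse_row(row):
    """Split 'name loop tm4 length pos7 pos9' into a dictionary."""
    name, loop, tm4, length, pos7, pos9 = row.split()
    return {"name": name, "loop": loop, "tm4": tm4,
            "length": int(length), "pos7": pos7, "pos9": pos9}


def summarize_loop(loop):
    """Return (length, residue 7, residue 9), 1-based, gaps counted as columns."""
    return len(loop), loop[6], loop[8]


def split_loop_and_tm4(segment, w_index, tm4_len=9):
    """Count back tm4_len residues from the invariant Trp at w_index.

    tm4_len=9 is an assumption taken from the table's third column.
    """
    if segment[w_index] != "W":
        raise ValueError("w_index must point at the invariant tryptophan")
    tm4_start = w_index - tm4_len + 1
    return segment[:tm4_start], segment[tm4_start:w_index + 1]


row = parse_row("RHO1_homSa ERYVVVCKPMSNFRFGENH AIMGVAFTW 19 C P")
# The stored columns should agree with values recomputed from the loop itself.
print(summarize_loop(row["loop"]) == (row["length"], row["pos7"], row["pos9"]))

segment = row["loop"] + row["tm4"]
print(split_loop_and_tm4(segment, len(segment) - 1))
```

Running the same check over every row is a quick way to flag entries whose stated length differs from the sequence shown.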
Reference collection of GPCRs with determined structures

>RHO1_bosTau cow rod rhodopsin
MNGTEGPNFYVPFSNKTGVVRSPFEAPQYYLAEPWQFSMLAAYMFLLIMLGFPINFLTLYVTVQHKKLRTPLNYILLNLAVADLFMVFGGFTTTLYTSLHGYFVFGPTGCNLEGFFATLGGEIALWSLVVLAI
ERYVVVCKPMSNFRFGENHAIMGVAFTWVMALACAAPPLVGWSRYIPEGMQCSCGIDYYTPHEETNNESFVIYMFVVHFIIPLIVIFFCYGQLVFTVKEAAAQQQESATTQKAEKEVTRMVIIMVIAFLICWL
PYAGVAFYIFTHQGSDFGPIFMTIPAFFAKTSAVYNPVIYIMMNKQFRNCMVTTLCCGKNPLGDDEASTTVSKTETSQVAPA*

>MEL1_todPac Todarodes pacificus (squid) Gq X70498 480 11106382 Mollusca 'squid rhodopsin' 3D: May 2008 Cys 337 palmitoylated
MGRDLRDNETWWYNPSIVVHPHWREFDQVPDAVYYSLGIFIGICGIIGCGGNGIVIYLFTKTKSLQTPANMFIINLAFSDFTFSLVNGFPLMTISCFLKKWIFGFAACKVYGFIGGIFGFMSIMTMAMISI
DRYNVIGRPMAASKKMSHRRAFIMIIFVWLWSVLWAIGPIFGWGAYTLEGVLCNCSFDYISRDSTTRSNILCMFILGFFGPILIIFFCYFNIVMSVSNHEKEMAAMAKRLNAKELRKAQAGANAEMRLAKI
SIVIVSQFLLSWSPYAVVALLAQFGPLEWVTPYAAQLPVMFAKASAIHNPMIYSVSHPKFREAISQTFPWVLTCCQFDDKETEDDKDAETEIPAGESSDAAPSADAAQMKEMMAMMQKMQQQQAAYPPQGY
APPPQGYPPQGYPPQGYPPQGYPPQGYPPPPQGAPPQGAPPAAPPQGVDNQAYQA*

>ADRB1_melGal turkey beta 1 adrenergic receptor with stabilising mutations and bound cyanopindolol
MGAELLSQQWEAGMSLLMALVVLLIVAGNVLVIAAIGSTQRLQTLTNLFITSLACADLVVGLLVVPFGATLVVRGTWLWGSFLCELWTSLDVLCVTASIETLCVIAI
DRYLAITSPFRYQSLMTRARAKVIICTVWAISALVSFLPIMMHWWRDEDPQALKCYQDPGCCDFVTNRAYAIASSIISFYIPLLIMIFVALRVYREA
KEQIRKIDRASKRKRVMLMREHKALKTLGIIMGVFTLCWLPFFLVNIVNVFNRDLVPDWLFVAFNWLGYANSAMNPIIYCRSPDFRKAFKRLLAFPRKADRRLHHHHHH*

>ADRB2_homSap beta 2 adrenergic receptor 365 aa
MGQPGNGSAFLLAPNRSHAPDHDVTQQRDEVWVVGMGIVMSLIVLAIVFGNVLVITAIAKFERLQTVTNYFITSLACADLVMGLAVVPFGAAHILMKMWTFGNFWCEFWTSIDVLCVTASIETLCVIAV
DRYFAITSPFKYQSLLTKNKARVIILMVWIVSGLTSFLPIQMHWYRATHQEAINCYANETCCDFFTNQAYAIASSIVSFYVPLVIMVFVYSRVFQEAKRQLQKIDKSEGRFHVQNLSQVEQDGRTGHGL
RRSSKFCLKEHKALKTLGIIMGTFTLCWLPFFIVNIVHVIQDNLIRKEVYILLNWIGYVNSGFNPLIYCRSPDFRIAFQELLCLRRSSLKAYGNGYSSNGNTGEQSG*

>ADORA2A_homSap adenosine receptor A2A
MPIMGSSVYITVELAIAVLAILGNVLVCWAVWLNSNLQNVTNYFVVSLAAADIAVGVLAIPFAITISTGFCAACHGCLFIACFVLVLTQSSIFSLLAIAI
DRYIAIRIPLRYNGLVTGTRAKGIIAICWVLSFAIGLTPMLGWNNCGQPKEGKNHSQGCGEGQVACLFEDVVPMNYMVYFNFFACVLVPLLLMLGVYLRI
FLAARRQLKQMESQPLPGERARSTLQKEVHAAKSLAIIVGLFALCWLPLHIINCFTFFCPDCSHAPLWLMYLAIVLSHTNSVVNPFIYAYRIREFRQTFR
KIIRSHVLRQQEPFKAAGTSARVLAAHGSDGEQVSLRLNGHPPGVWANGSAPHPERRPNGYALGLVSGGSAQESQGNTGLPDVELLSHELKGVCPEPPGLDDPLAQDGAGVS*

The C2 loop is highly conserved within each orthology class for GPCRs with determined structures. Each list below gives the species code followed by the C2 loop and its continuation into transmembrane helix 4.

RHO1 in vertebrates
homSap ERYVVVCKPMSNFRFGENHAIMGVAFTW
panTro ERYVVVCKPMSNFRFGENHAIMGVAFTW
gorGor ERYVVVCKPMSNFRFGENHAIMGVAFTW
ponAbe ERYVVVCKPMSNFRFGENHAIMGVAFTW
rheMac ERYVVVCKPMSNFRFGENHAIMGVAFTW
calJac ERYVVVCKPMSNFRFGENHAIMGVAFTW
micMur ERYVVVCKPMSNFRFGENHAIMGVVFTW
musMus ERYVVVCKPMSNFRFGENHAIMGVVFTW
ratNor ERYVVVCKPMSNFRFGENHAIMGVAFTW
cavPor ERYVVVCKPMSNFRFGENHAIMGVVFTW
speTri ERYMVVCKPMSNFRFGENHAIMGVIFTW
oryCun ERYVVVCKPMSNFRFGENHAIMGVAFTW
ochPri ERYVVVCKPMSNFRFGENHAIMGVAFTW
bosTau ERYVVVCKPMSNFRFGENHAIMGVAFTW
equCab ERYVVVCKPMSNFRFGENHAIMGVAFTW
felCat ERYVVVCKPMSNFRFGENHAIMGVAFTW
canFam ERYVVVCKPMSNFRFGENHAIMGVAFTW
myoLuc ERYVVVCKPMSNFRFGENHAIMGLAFTW
pteVam ERYVVVCKPMSNFRFGENHAIMGLALTW
eriEur ERYVVVCKPMSNFRFGENHAIMGVAFTW
dasNov ERYVVVCKPMSNFRFGENHAVMGVAFTW
monDom ERYVVVCKPMSNFRFGENHAIIGVAFTW
ornAna ERYIVVCKPMSNFRFGENHAIMGVAFTW
galGal ERYVVVCKPMSNFRFGENHAIMGVAFSW
taeGut ERYVVVCKPMSNFRFGENHAIMGVAFSW
anoCar ERYVVICKPMSNFRFGETHALIGVSCTW
xenTro ERYVVVCKPMANFRFGENHAIMGVVFTW
tetNig ERYIVVCKPVTNFRFGEKHAIAGLAFTW
fugRub ERYIVVCKPMTNFRFGEKHAIAGLVFTW
gasAcu ERYVVVCKPMSNFRFGEKHAIAGLLFTW
oryLat ERYVVVCKPMTNFRFEEKHAIAGLAFSW
danRer ERWMVVCKPVSNFRFGENHAIMGVAFTW
petMar ERYIVICKPMGNFRFGSTHAYMGVAFTW

MEL1 in vertebrates
homSa DRYLVITRPLATFGVASKRRAAFVLLGVW
panTr DRYLVITRPLATFGVASKRRAAFVLLGVW
gorGo DRYLVITRPLATFGVASKRRAAFVLLGVW
ponAb DRYLVITRPLATIGVASKRRAAFVLLGVW
rheMa DRYLVITRPLATIGVASKRRAAFVLLGVW
calJa DRYLVITRPLATIGVASTKRAAFVLLGVW
micMu DRYLVITRPLASVGTASKRRAGLVLLGVW
otoGa DRYLVITRPLTTVGVASKRRAALVLLGVW
musMu DRYLVITRPLATIGRGSKRRTALVLLGVW
ratNo DRYLVITRPLATIGMRSKRRTALVLLGVW
dipOr DRYLVITRPLATIGVTSKRRTAFVLLGVW
cavPo DRYLVITRPLATIGVASKRQAALVLLGVW
speTr DRYLVITRPLATIGMASKKRAAFFLLGVW
oryCu DRYLVITRPLAAVGMVSKKRAGLVLLGVW
ochPr DRYLVITRPLAAVGMVSKRRTGLVLLGVW
bosTa DRYLVITRPLATVGMVSKRRAALVLLGVW
turTr DRYLVITRPLATVGMVSKRRAALVLLGVW
equCa DRYLVITRPLATVGVVSKRWAALVLLGIW
felCa DRYLVITHPLATIGVVSKRRAALVLLGVW
canFa DRYLVITHPLAAVGVVSKRRAALVLLGVW
myoLu DRYLVITRPLA-IGVVSKRRAALVLLGVW
pteVa DRYLVITRPLAAIGVVSKRRAALVLLGVW
eriEu DRYLVITRPLATIGVVSKRRVALVLLGVW
loxAf DRYLVITRPLATIGVVSKRRAALVLLGIW
proCa DRYLVITRPLATIGVVSKRRTALVLLGTW
echTe DRYLVITRPLATIGVVSKRRAALVLLVIW
monDo DRYFVITRPLASIGVISKKKTGFILLGVW
ornAn DRYFVITRPLASIGVISKKRALLILTGVW
anoCa DRYFVITRPLASIGAMSTKKALLILSGVW
galGa DRYFVITKPLASVRVMSKKKALIILVGVW
xenTr DRYFVITRPLTSIGVMSKKRAVLILSGVW
danRe DRYFVITRPLASIGVLSQKRALLILLVAW
takRu DRYFVITRPLTSIGVLSRKRAFVILMTVW
gasAc DRYFVITRPLTSIGMMSRRRALLILMGAW
oryLa DRYFVITRPLTSIGVLSRKRALLILSAAW
calMi DRYFVITRPLASIGVLSHRRAGLIILSLW
petMa DRYLVLTRPLASIGAMSKRRAMYITAAVW

ADRB1 in vertebrates
homSap DRYLAITSPFRYQSLLTRARARGLVCTVW
panTro DRYLAITSPFRYQSLLTRARARGLVCTVW
ponAbe DRYLAITSPFRYQSLLTRARARGLVCTVW
rheMac DRYLAITSPFRYQSLLTRARARGLVCTVW
calJac DRYLAITSPFRYQSLLTRARARGLVCTVW
micMur DRYLAITSPFRYQSLLTRARARALVCTVW
otoGar DRYLAITSPFRYQSLLTRARARPLVCTVW
musMus DRYLAITSPFRYQSLLTRARARALVCTVW
ratNor DRYLAITSPFRYQSLLTRARARALVCTVW
cavPor DRYLAITSPFRYQSLLTRARARVLVCTVW
oryCun DRYLAITSPFRYQSLLTRARARALVCTVW
ochPri DRYLAITSPFRYQSLLTRARARALVCTVW
bosTau DRYLAITSPFRYQSLLTRARARALVCTVW
equCab DRYLAITSPFRYQSLLTRARARALVCTVW
felCat DRYLAITSPFRYQSLLTRARARALVCTVW
canFam DRYLAITAPFRYQSLLTRARARALVCTVW
myoLuc DRYLAITSPFRYQSLLTRARARALVCTVW
pteVam DRYLAITSPFRYQSLLTRARARALVCTVW
echTel DRYLAITSPFRYQSLLTRARARVLVCTVW
choHof DRYLAITSPFRYQSLLTRARARALVCTVW
monDom DRYIAITSPFRYQSLLTRARARALVCTVW
ornAna DRYIAITSPFRYRSLLTRARARGLVCGVW
galGal DRYLAITSPFRYQSLMTRARAKGIICTVW
taeGut DRYLAITSPFRYQSLMTKGRAKGIICTVW
anoCar DRYLAITSPFRYQSLMTKKRAKIIVCVVW
xenTro DRYIAITSPLKYEMLVTKVRARLTVCLVW
tetNig DRYVAITSPFRYQSLLTKARARAMVCAVW
fugRub DRYVAITSPFRYQSLLTKARAKAMVCAVW
gasAcu DRYVAITSPFRYQSLLTKARARTVVCVVW
oryLat DRYVAITSPFRYQSLLTKSRAKAVVCVVW
danRer DRYIAIISPFRYQSLLTKARAKVVVCAVW
petMar DRYIAVARPLRYETLMNKRRARFIIVAVW

ADRB2 orthologs in tetrapods
homSap DRYFAITSPFKYQSLLTKNKARVIILMVW
panTro DRYFAITSPFKYQSLLTKNKARVIILMVW
gorGor DRYFAITSPFKYQSLLTKNKARVIILMVW
ponAbe DRYFAITSPFKYQSLLTKNKARVIILMVW
rheMac DRYFAITSPFKYQSLLTKNKARVIILMVW
calJac DRYFAITSPFKYQSLLTKNKARVIILMVW
micMur DRYFAITSPFKYQSLLTKNKARVVILMVW
otoGar DRYFAITSPFKYQSLLTKNKARVVILMVW
tupBel DRYFAITSPFKYQSLLTKNKARVVILMVW
dipOrd DRYFAITSPFKYQSLLTKNKARVVILMVW
cavPor DRYFAITSPFKYQSLLTKNKARVVILMVW
oryCun DRYFAITSPFKYQSLLTKNKARVVILMVW
ochPri DRYFAITSPFKYQSLLTKNKARVVVLMVW
equCab DRYFAITSPFKYQSLLTKNKARVVILMVW
felCat DRYFAITSPFKYQSLLTKNKARVVILMVW
canFam DRYFAITSPFKYQSLLTKNKARVVILMVW
myoLuc DRYFAITSPFKYQSLLTKNKARVVILLVW
pteVam DRYFAITSPFKYQSLLTKNKARVVILMVW
eriEur DRYFAITSPFKYQSLLTKNKARVVILMVW
sorAra DRYFAITSPFKYQSLLTKNKARGVILMVW
proCap DRYFAITSPFKYQSLLTKNKARVVILMVW
echTel DRYFAITSPFKYQSLLTKNKARVVILMVW
dasNov DRYFAITSPFKYQSLLTKNKARVVILMVW
monDom DRYFAITAPFRYQSMLTKGKARVVILVVW
galGal DRYFAITSPFKYQSLLTKSKARVVILVVW
taeGut DRYFAITSPFKYQSLLTKGKARVVILVVW
anoCar DRYFAITSPFKYQSHLTKNKARVIILLVW
xenTro DRYFAITSPFRYQSLLTKCKARIVILLVW

ADORA2A in teleosts
homSap DRYIAIRIPLRYNGLVTGTRAKGIIAICW
panTro DRYIAIRIPLRYNGLVTGTRAKGIIAICW
gorGor DRYIAIRIPLRYNGLVTGTRAKGIIAICW
ponAbe DRYIAIRIPLRYNGLVTGTRAKGIIAICW
rheMac DRYIAIRIPLRYNGLVTGTRAKGIIAICW
calJac DRYIAIRIPLRYNGLVTGTRAKGIIAICW
micMur DRYIAIRIPLRYNGLVTGTRAKGIIAICW
musMus DRYIAIRIPLRYNGLVTGMRAKGIIAICW
ratNor DRYIAIRIPLRYNGLVTGVRAKGIIAICW
dipOrd DRYIAIRIPLRYNSLVTCTRAKGIIAICW
cavPor DRYIAIRIPLRYNGLVTCTRAKGIIAICW
speTri DRYIAIRIPLRYNGLVTGMRAKGIIAICW
oryCun DRYIAIRIPLRYNGLVTGTRAKGIIAICW
ochPri DRYIAIRIPLRYNGLVTGSRAKGIIAICW
turTru DRYIAIRIPLRYNGLVTGTRAKGIIAVCW
bosTau DRYIAIRIPLRYNGLVTGTRAKGIIAVCW
canFam DRYIAIRIPLRYNGLVTGTRAKGIIAVCW
myoLuc DRYIAIRIPLRYNGLVTGARAKGIIAICW
eriEur DRYIAIRIPLRYNGLVTGQRAKGIIAVCW
loxAfr DRYIAIRIPLRYNGLVTGTRAKGIIAVCW
proCap DRYIAIRIPLRYNGLVTGTRAKGIIAVCW
galGal DRIIAIRIPLRYNGLVTGSRAKGIIAICW
taeGut DRIIAIRIPLRYNGLVTGSRAKGIIAICW
xenTro DRYIAIRIPLRYNSLVTSRRANAIIAVCW
tetNig DRYIAIKLPLRYNGLVTGQRAQAIIAICW
fugRub DRYIAIKLPLRYNSLVTGKRAQGIIAICW
gasAcu DRYIAIKIPLRYNGLVTGQRAQGIIAICW
oryLat DRYIAIKIPLRYNSLVTSQRARGIIAICW
danRer DRYIAIKIPLRYNSLVTGQRARGIIAICW
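One way to put a number on "highly conserved" is to score each alignment column by the share of sequences carrying its most common residue. The sketch below is illustrative only: the helper names are hypothetical, and the four RHO1 entries are copied from the listing above (homSap, micMur, speTri, myoLuc). It assumes the sequences are already aligned to equal length, which holds within each orthology class shown here.

```python
from collections import Counter


def column_conservation(seqs):
    """Per column, the fraction of sequences sharing the most common residue."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    fractions = []
    for column in zip(*seqs):
        top_count = Counter(column).most_common(1)[0][1]
        fractions.append(top_count / len(seqs))
    return fractions


def variable_positions(seqs):
    """1-based positions at which the sequences do not all agree."""
    return [i for i, f in enumerate(column_conservation(seqs), start=1) if f < 1.0]


# Four RHO1 entries taken from the listing above (loop plus helix 4 stub).
rho1 = {
    "homSap": "ERYVVVCKPMSNFRFGENHAIMGVAFTW",
    "micMur": "ERYVVVCKPMSNFRFGENHAIMGVVFTW",
    "speTri": "ERYMVVCKPMSNFRFGENHAIMGVIFTW",
    "myoLuc": "ERYVVVCKPMSNFRFGENHAIMGLAFTW",
}
print(variable_positions(list(rho1.values())))
```

Applied to a whole orthology class, the variable positions returned this way separate the invariant core of the loop from the few sites that drift between lineages.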