Opsin evolution: informative indels

'''See also:''' [[Opsin_evolution|Curated Sequences]] | [[Opsin_evolution:_ancestral_introns|Ancestral Introns]] | [[Opsin_evolution:_Cytoplasmic_face|Cytoplasmic face]] | [[Opsin_evolution:_ancestral_sequences|Ancestral Sequences]] | [[Opsin_evolution:_alignment|Alignment]] | [[Opsin_evolution:_update_blog|Update Blog]]
=== Introduction to indels ===
Nonetheless, [[Pegasoferae%3F|examples of homoplasy]] are easy to come by, especially in repetitive nucleotide regions  encoding runs of compositionally simple amino acids subject to the mutational mechanism of replication slippage. Homoplasy at longer time scales manifests itself by incoherent distribution over a known phylogenetic tree. Convergent evolution can also be driven by selective advantage for altered length.
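Whether a gap pattern is distributed coherently over a known tree can be checked by counting the minimum number of gain or loss events the pattern requires. The following is a minimal sketch using Fitch parsimony on a presence/absence character. The tree topology and gap states are invented toy values, not real data, and the function name <code>fitch_changes</code> is hypothetical. A count of 1 is compatible with a single event; a higher count implies homoplasy (or an incorrect tree).

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   -- nested 2-tuples with leaf names (str) at the tips
    states -- dict mapping leaf name -> state (e.g. 1 = gap present, 0 = absent)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: return its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                         # children agree: no event needed here
            return shared
        changes += 1                       # children disagree: count one event
        return left | right

    visit(tree)
    return changes


# Toy example with hypothetical taxa A-D (not real data).
tree = (("A", "B"), ("C", "D"))
coherent = {"A": 1, "B": 1, "C": 0, "D": 0}    # gap confined to one clade
scattered = {"A": 1, "B": 0, "C": 1, "D": 0}   # gap scattered across clades

print(fitch_changes(tree, coherent))   # 1
print(fitch_changes(tree, scattered))  # 2
```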


Indels occur very unevenly across the length of a given protein homology class. The rate might be high in terminal regions if the amino or carboxy termini are unimportant to the fold or function of matured protein. Within folded regions of soluble proteins, indels are greatly concentrated in loop regions of the 3D structure where a change in length can be accommodated without structural disruption. The distributional occurrence of indels even allows prediction of loop regions.


For integral membrane proteins such as GPCR, deletions are very rarely fixed in the transmembrane helical regions because a shortened length would no longer span the membrane at the same angle, thus pulling in inappropriate non-hydrophobic residues from soluble loops. Insertions too are rare because they push hydrophobic and boundary turn residues out into soluble compartments and distort connecting loops, perhaps altering insertion angles of adjacent transmembrane regions. Such mutations arise frequently enough but are rarely fixed at the population level or hang on as balanced alleles over timescales commensurate with ordinal speciations.  

In massively expanded gene families such as GPCR, a coherently fixed indel in one descendent clade of the gene tree suggests adaptive sub- or neo-functionalization: if the indel were merely tolerated as near-neutral change, over geological timescales homoplasy at that site would occur. A remarkable site in transmembrane helix 2 was [http://www.ncbi.nlm.nih.gov/pubmed/19357801? proposed] in May 2009:


<blockquote>'Class A GPCR constitute a large family of transmembrane receptors. Helical distortions play a major role in the overall fold of these receptors. Most are related to conserved proline residues. However, in transmembrane helix 2, the proline pattern is not conserved, and when present, proline may be located at position TM 2.58, 2.59, or 2.60 yielding a bulged structure in P2.59 and P2.60 receptors or a more typical proline kink in P2.58 receptors. The proline pattern of helix 2 can be used as an evolutionary marker of molecular divergence of class A GPCRs.</blockquote>  

<blockquote>At this site, two independent indel events occurred. One [unresolvable] indel arose very early in GPCR evolution in a bilaterian ancestor before protostome-deuterostome divergence. This indel led to the split between the <font color="green">P2.58 somatostatin/opioid receptors</font> and peptide receptors with the P2.59 pattern. Subfamilies with proline at position 2.59 or no proline expanded earlier, whereas P2.60 receptors remained marginal throughout evolution. P2.58 receptors underwent later rapid expansion in vertebrates with the development of the <font color="brown">chemokine and purinergic receptor subfamilies</font> from somatostatin/opioid-related ancestors. A second indel, resolvable as a deletion, occurred in <font color="blue">insect melanopsins</font>.'</blockquote>

This result refines the classification of Class A GPCR, which might be quite indecisive at certain gene tree nodes from sequence alignment alone. Timing of the insect deletion can be done better (below) because the SwissProt collection used by the authors carries only 20% of the melanopsins actually available. Note the structural significance of length and bulge changes can be examined in available 3D determinations. The functional effect of this shift in TM2 remains obscure but must be important.
=== Indels in ciliary opsins ===  

The tertiary structural integrity requirements of a 7-transmembrane opsin, along with tuned binding of retinal, isomerization cycle conformational shifts and binding to secondary protein contributors to the photoreception cycle, conspire to greatly constrain admissible locations for ciliary opsin indels. Indeed this varies greatly by region, with indels never seen in the transmembrane regions themselves (despite tens of billions of branch length years) and restricted in connecting cytoplasmic and extracellular loops to EC2, IC3 and IC7. Indel incidence is much higher in amino and carboxy terminal tails but is not useful there because of gapping ambiguity issues.

The distribution of fixed indels is quite peculiar: almost all occur in gene family stems (ie shortly after gene duplication in one branch), hardly any occur mid-history. For vertebrate imaging opsins, this means prior to lamprey divergence. In other words, not only had all the classes of imaging opsins emerged post-tunicate/amphioxus pre-lamprey but (neglecting tails) also all their indels. No further indels arose in the subsequent 500 million years in any of these opsins, as if these opsins were already optimized from the length perspective.

Consequently the rate of indel occurrence per billion years of branch length -- and so the frequency of multiple independent events near a given site -- is highly correlated to region, ie each region has a characteristic time scale over which it can be informative: too long and the risk of homoplasy (convergent evolution) is too high. That risk is exacerbated by uncertainty in gap placement within an alignment, which first requires delimitation by flanking invariant residues. Gap length per se is ambiguous: an indel of 3 residues shared by two extant species might have arisen once as a single event in the first species or as two events (one and two residues successively) in the other. Thus any phylogenetic interpretation of indels must be tempered by knowledge of the regional indel susceptibilities and the assumption these remain fairly constant across lineages and time.
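The gap-length ambiguity can be made concrete by enumerating the ordered ways a gap of a given length could have accumulated from successive smaller events. The following is a minimal sketch. The function name <code>event_histories</code> is illustrative, and it assumes all events are adjacent and in the same direction (all deletions or all insertions), which is a simplification.

```python
def event_histories(gap_length):
    """List every ordered series of same-direction events summing to gap_length."""
    if gap_length == 0:
        return [[]]
    histories = []
    for first in range(1, gap_length + 1):          # size of the earliest event
        for rest in event_histories(gap_length - first):
            histories.append([first] + rest)
    return histories


for history in event_histories(3):
    print(history)
# [1, 1, 1]
# [1, 2]
# [2, 1]
# [3]
```

Even a 3-residue gap admits four histories under these assumptions, and the count doubles with each added residue, which is why gap length alone says little about the number of underlying events.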


Informative indels show up as readily apparent columns of gaps in large-scale alignments. If present across a single opsin orthology class, that merely validates prior blast clustering and other rare genomic events in establishing those classes in the first place. Sporadic indels, defined here as indels found within a single opsin gene, may arise from sequencing errors but, if genuine, might be an adaptive specialization. It's very rare to see a ciliary opsin indel restricted to a phylogenetic subclade but examples exist, such as the post-marsupial loss of 5 residues of RHO1 in the distal arrestin binding region.

We're concerned here primarily with non-sporadic indels that span two or more orthology classes and speak to unresolved dating and topological issues in the gene tree. Significant individual indels are visible on the [[Opsin_evolution:_alignment|alignment page]]. These give rise to a table sortable by position along the opsin sequence, indel length, region (eg 3rd cytoplasmic loop), higher taxonomic clade, and phylogenetic depth. Specific goals are dating indel events, characterizing remote opsins in pre-vertebrate deuterostomes, correctly placing cnidarian opsins, disambiguating opsins from non-opsin GPCR, and establishing ancestral lengths.

The [[Opsin_evolution:_Cytoplasmic_face|third cytoplasmic loop]] has variable length distally. Length is constant within orthology classes with parietopsin having full length, parapinopsin one residue shorter, and all others two residues shorter. This is a region of high beta factor in bovine rhodopsin crystals, ie has too much movement to be assigned a conformation. Unsurprisingly no function has been assigned. While the indel pattern supports the conventional gene tree, evidently this indel hotspot has fixed at least three separate events. While that hasn't resulted in overt homoplasy in terms of length, additional events could be masked. This weakens interpretive certainty of indels in this region.


The amino terminus has 4 informative indels, all deletions. The first unites RHO1 and RHO2 to the exclusion of all other opsins (as does the short highly conserved N-terminus with two glycosylation sites). No indel or intron distinguishes them. RHO2 has an odd phylogenetic distribution -- it seems to occur in one species of lamprey but not in genomic lamprey (despite 19 million traces) nor in cartilaginous nor ray-finned fish, but seemingly arises again in lungfish, coelacanth, lizards, and chicken but not frog nor any mammal. Possibly the lamprey RHO2 is a lineage-specific duplication of lamprey RHO1. A later independent duplication in lobe-finned fish persisted until the mammalian nocturnal loss era. It may be missing in frog because of an incomplete genome.
 
=== A restricted SWS1 indel partitions Carnivora ===
 
A Caniformia-restricted deletion in SWS1 splits Caniformia (dogs, bears, seals...) off from Feliformia (cats, civets ...) within Carnivora. Whale and dolphin SWS1 are all recent pseudogenes. Despite this, they are full length but all exhibit an unprecedented N --> D substitution at the N D P C iron triangle. This N is deeply conserved in almost all opsins and even GPCR. This may have been adaptive in some way initially but set the stage for later pseudogenization.
 
SWS1_homSap  LNAMVLVATLRYKKLRQPLNYILVNVSFG G FLLCIFSV F PVFVASCNGYFVFGRHVC human
SWS1_tarSyr  LNAMVLVATLHYRKLRQPLNYILVNVSLG G FLLCIFSV L PVFIASCRGYFVFGRHVC tarsier
SWS1_oryCun  LNAMVLVATLRYKKLRQPLNYILVNISLA G FLACIFSV F NVFVASCYGYFVFGRFVC rabbit
SWS1_ratNor  LNATVLVATLHYKKLRQPLNYILVNVSLG G FLFCIFSV F TVFIASCHGYFLFGRHVC rat
<font color ="blue">SWS1_ailMel  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI<font color ="red">-</font>SV S TVFIASCHGYFIFGRHVC panda
SWS1_canFam  LNGTVLVATLRYKKLRQPLNYILVNVSLG G FLYCI<font color ="red">-</font>SV S TVFIASCQGYFVFGRHVC dog
SWS1_enhLut  LNATVLVATLRYKKLRQPLNYILVNVSLG G FIYCI<font color ="red">-</font>SV S SVFIASCHGYFIFGHHIC otter
SWS1_phoVit  LNASVLVATLRYKKLRQPLNYILVNVSLG G FLYCI<font color ="red">-</font>SV S SVFIASCQGYFIFGRHVC seal
SWS1_ursMar  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI<font color ="red">-</font>SV S TVFIASCHGYFIFGRHVC bear</font>
<font color ="green">SWS1_felCat  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLYCVSSV S IVFITSCHAYFIFGRHVC cat</font>
<font color ="brown">SWS1_hipAmp  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F VVFITSCHGYFVFGRHVC hippo
SWS1_ptePum  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLFCIFSV F TVFIASCQGYFVFGRHVC bat
SWS1_talEur  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLFCIFSV L TVFIASCKGYFIFGRHVC mole
SWS1_sorAra  LNATVLVPTLRYRKLRQPLNYILVNVSLG G FLFCIFSV F TVIIASCKGYFVIGRHVC shrew
SWS1_susScr  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F SVFIASCHGYFVFGRRVC pig
SWS1_bosTau  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F IVFITSCYGYFVFGRHVC cow
SWS1_lamPac  LNATVLIATLRYRKLRQPLNYILVNVSLG G FIYCMFSV F CVFVASCYGYFVFGRRVC lama
SWS1_turTru  L<font color ="magenta">D</font>ATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F VVFITSCHGYFVFGRHVC dolphin</font>
SWS1_echTel  LNAVVLVATLRYRKLRQPLNYILVNVSLA S VLFCVISV F TVFVASCHGYFIFGRHVC tenrec
SWS1_monDom  LNAVVLVATLRYKKLRQPLNYILVNVSLC G FIFCIFAV F TVFISSSQGYFIFGRHVC
SWS1_smiCri  LNGVVLIATLRYKKLRQPLNYILVNISLA G FIFCVFSV F TVFVSSSQGYFVFGRHVC
SWS1_tarRom  LNAVVLIATLRYKKLRQPLNYILVNISLA G FIFCVISV F TVFISSSQGYFIFGRHVC
SWS1_galGal  LNAVVLWVTVRYKRLRQPLNYILVNISAS G FVSCVLSV F VVFVASARGYFVFGKRVC
SWS1_taeGut  LNAIVLIVTIKYKKLRQPLNYILVNISVS G LMCCVFCI F TVFIASSQGYFVFGKHMC
SWS1_anoCar  LNAIILIVTVKYKKLRQPLNYILVNISFA G FLFCTFSV F TVFMASSQGYFFFGRHVC
SWS1_utaSta  LNAIILIVTVKYKKLRQPLNYILVNISFA G FLFCVFSV F TVFLASSQGYFFFGRHIC
SWS1_xenLae  LNFIVLLVTIKYKKLRQPLNYILVNITVG G FLMCIFSI F PVFVSSSQGYFFFGRIAC
SWS1_neoFor  LNAIVLFVTIKYKKLQQPLNYILVNISLA G FIFCFFGV F AVFIASCQGYFIFGKTVC
SWS1_danRer  MNGIVLFVTMKYKKLRQPLNYILVNISLA G FIFDTFSV S QVSVCAARGYYSLGYTLC
SWS1_oryLat  LNFVVLLATAKYKKLRVPLNYILVNITFA G FIFVTFSV S QVFLASVRGYYFFGQTLC
SWS1_petMar  LNAIVLIVTVKCKKLRQPLTYMLVNISAA G LVFCLFSI S TVFLFSTQGYFVFGPTVC
SWS1_geoAus  LNAIVLVVTIKYKKLRQPLNYILVNISAA G LVFCLFSI S TVFVASMQGYFFLGPTIC
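
The partition shown in the alignment above can be recovered mechanically by scanning alignment columns for gap characters. The following is a minimal sketch using the eight-column window around the deletion, copied from a few of the rows above with font markup removed. The function name <code>gap_partitions</code> is illustrative.

```python
def gap_partitions(alignment):
    """Map each gapped column index to the sorted list of sequences carrying the gap."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    partitions = {}
    for name, seq in alignment.items():
        for column, residue in enumerate(seq):
            if residue == "-":
                partitions.setdefault(column, []).append(name)
    return {column: sorted(names) for column, names in sorted(partitions.items())}


# Window around the SWS1 deletion, taken from the alignment rows above.
window = {
    "SWS1_homSap": "FLLCIFSV",
    "SWS1_ailMel": "FVYCI-SV",
    "SWS1_canFam": "FLYCI-SV",
    "SWS1_phoVit": "FLYCI-SV",
    "SWS1_felCat": "FLYCVSSV",
    "SWS1_bosTau": "FIYCIFSV",
}

print(gap_partitions(window))
# {5: ['SWS1_ailMel', 'SWS1_canFam', 'SWS1_phoVit']}
```

Only the panda, dog and seal rows carry the gap in this window; cat, cow and human do not.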
 
[[Image:AlaNonarepeat.jpg]]


=== Indels in melanopsins: TM2 region ===   
The mid-transmembrane helix region preceding the proline in TM2 -- the only opsin transmembrane helix ever to experience an indel in 100 billion years of  branch length evolution -- exhibits various independent insertions and deletions. That would seem to undercut efforts to make the length a definitive fundamental classifying tool among GPCR. The situation can be compounded by separate indels following the proline that, depending on gap placement, might affect the extracellular loop connecting TM2 and TM3.


However, with care, the <font color="blue"> homoplasy is manageable</font>, making the locus quite informative for opsins (though a detailed analysis is necessary to fully exploit it).

An 'iron triangle' provides a fixed upstream frame of reference critical to reliable gapping of indels in this region. This consists of a very conserved Asn55 in TM1 hydrogen bonded to the almost universal charged residue Asp83 internal to TM2 which is further hydrogen bonded [http://www.pnas.org/content/106/21/8555.full via internal H2O] to N of the terminal NPXXY motif and a peptide amide Ala299 in TM7 (bovine rhodopsin numbering). The iron triangle is central to the proper associative bundling and relative orientation of the seven transmembrane helices in the vicinity of the Schiff base K296. No indels occur in any opsin or GPCR between this N and D (meaning [[Opsin_evolution:_Cytoplasmic_face#The_first_cytoplasmic_loop|cytoplasmic loop CL1]] is of fixed length, namely 12 aa). Note from the full alignment that D83 has been replaced by G in all teleost fish RHO2 and all SWS1; it is mixed with N83 in some RHO1, RHO2 and entirely N83 in SWS2 but ancestrally strictly D in basal ciliary opsins.

The effect is not dramatic in terms of angstroms of shift (as can be seen from a [http://www.pnas.org/content/106/21/8555/F1.large.jpg recent 3D alignment] of helix TM2 that compares bovine and squid opsins), yet it follows from comparative genomics that the consequences for absorption spectrum and/or regulation of signaling must be substantial. In other words, gene clade specific retention of proline or specific substituents observed in the massive alignment below holding for billions of years of branch length is only feasible when adaptive.

The 185 ciliary opsins (which include 5 basal cnidarian opsins) in the reference sequence collection are all of the same length in this region (except for odd Apis and Platynereis sequences), as are 65 peropsins, RGR and neuropsins, many melanopsins, and the vast majority of near-opsin GPCR. Consequently this length, denoted P59.2 (for proline in position 59 bovine rhodopsin numbering and 2 residues shorter in the proline-cysteine region than the longest opsins), is <font color="blue">ancestral for melanopsins</font> which themselves vary in length.

Deuterostome melanopsins are all of P.59.2 type, as are LMS and BCR arthropod melanopsins, a subclass of lophotrochozoan melanopsins, and the one known cnidarian melanopsin. The remaining dozen known lophotrochozoan melanopsins are all type P.60.2. This class -- which fortunately includes the structurally determined squid melanopsin -- thus has a one residue insertion whose location appears to be 5 residues after the D and 4 before the P.  


Thus lophotrochozoan melanopsins had ancestral length up to a gene duplication which subsequently acquired this stem insertion in a descendent copy. A single other human GPCR, namely thyrotropin-releasing hormone receptor TRHR, is also P.60.2, demonstrating homoplasy. However given the rarity of transmembrane indel events, the history here can be reliably disambiguated assuming parsimony.

The three classes of ecdysozoan ultraviolet melanopsins (represented by 44 genes) all share a one residue deletion in this same region, approximately at the 4th post-D residue, making them P.58.2 class, homoplasic to within gap placement to moderately abundant GPCR (eg somatostatin receptor). This event, affecting insects, crustaceans and chelicerates, occurred deep within the stem lineage of ecdysozoa. More data from early diverging arthropods are needed to refine the timing. Recall these opsins have a [[Opsin_evolution:_key_critters_%28ecdysozoa%29#Ecdysozoa_.._opsin_repertoire_of_the_last_common_ancestor|peculiar lysine K90]] (sometimes E90) that tunes their absorption into the ultraviolet. The extra residue loss may be required to correctly position the K90 for its blueshift.
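
The length classes discussed here amount to counting residues from the conserved TM2 aspartate to the following proline. The sketch below is minimal and rests on stated assumptions: the aspartate is treated as position 50 of the helix (the convention behind labels such as P2.58 to P2.60 in the quotation earlier on this page), the segment passed in begins at that aspartate, and the first proline after it is the relevant one. The example sequences are synthetic placeholders, not real opsins, and the function name <code>tm2_proline_class</code> is illustrative.

```python
def tm2_proline_class(segment, asp_position=50):
    """Return a label such as 'P59' for a TM2 segment that begins at the conserved Asp.

    Returns None when the segment has no proline after the aspartate.
    """
    if not segment.startswith("D"):
        raise ValueError("segment must start at the conserved aspartate")
    offset = segment.find("P", 1)      # number of positions from the Asp to the first Pro
    if offset == -1:
        return None
    return f"P{asp_position + offset}"


# Synthetic placeholder segments (X = any residue), not real sequences.
print(tm2_proline_class("D" + "X" * 8 + "P"))   # P59
print(tm2_proline_class("D" + "X" * 9 + "P"))   # P60 (one residue inserted)
print(tm2_proline_class("D" + "X" * 7 + "P"))   # P58 (one residue deleted)
```

This only classifies by spacing. It says nothing about where within the D to P stretch the insertion or deletion sits, which still depends on gap placement in the alignment.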


The three molluscan melanopsins of ancestral length share a striking signature aspartate residue two positions preceding the proline, ie at this same K90 position. (Recall G90D and T94I in human RHO1 [http://cat.inist.fr/?aModele=afficheN&cpsidt=14572438 constitutively activate] transducin in absence of chromophore and cause night blindness.) Consequently these three opsins may also have their absorption shifted towards the UV since otherwise G90 is present in lophotrochozoan melanopsins. They should be renamed (ie reclassified) to reflect probable parental character, with P.60.2 lophotrochozoan opsins renamed to MEL2.

The post-proline pre-cysteine region has length variations that represent insertions in various homology classes. They are difficult to gap reliably other than occurring at the distal end of TM2 before the conserved block of extracellular loop EL1. As TM2 (by definition) just reaches the surface, these extra residues can be attributed to lengthened EL1. It emerges that indels outside the D to P region are only moderately informative. They may suffice to define narrow classes of opsins where blast clustering is ambiguous. While pseudo-homoplasic, that is readily resolvable given the sequence cluster isolation:


* Three amphioxus melanopsins (eg MEL6_braFlo) have a 1-residue distal deletion but MELmop_braFlo does not. This event defines an isolated class of sequences.
* RGR opsins all have a 1-residue distal deletion; however two Ciona opsins have seemingly regained a residue. Five Ciona RGR opsins have a deletion preceding the D. However, because the proline anchor is lacking, placement is otherwise uncertain in this isolated opsin class. These same five opsins are unique in having tyrosine in place of the conserved asparagine N in TM1 (that bonds to D).


* Five peropsins have an inserted residue preceding the D. This appears to define PER2 opsins, which are currently restricted to amphioxus and sea urchin. Hemichordates have a peropsin of type PER1 lacking the insert. Lophotrochozoan peropsins also lack it. Thus it appears to be a very restricted early gene expansion that did not persist in vertebrates.


* NEUR4 neuropsins have a large distal insertion of 4 residues. This class of opsins is quite obscure and lacks the proline.
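
The length comparisons in the list above can be sketched in code. The following Python fragment is only a minimal illustration, not the procedure used to build this page. It assumes a class label of the form P.x.y in which x tracks the number of residues between the anchoring D and P (taking 8 residues as the P.59 baseline) and y counts the trailing gap characters of the post-proline segment. Both conventions are inferred from the alignment rows on this page rather than stated by the source, and the function name <code>indel_class</code> and the field positions are hypothetical.

```python
import re

BASELINE_PREFIX = 59    # assumed label prefix for the ancestral D-to-P length
BASELINE_RESIDUES = 8   # assumed residue count between D and P in P.59 rows


def indel_class(row: str) -> str:
    """Return a P.x.y label inferred from one whitespace-delimited alignment row."""
    # Drop wiki colour markup so that it does not split or lengthen fields.
    plain = re.sub(r"</?font[^>]*>", "", row)
    fields = plain.split()
    d_to_p = fields[5]   # assumed position: segment between the anchoring D and P
    post_p = fields[7]   # assumed position: distal TM2 segment before the EL1 block
    residues = len(d_to_p.replace("-", ""))
    # Only gaps at the distal end are counted; internal gaps are ignored.
    trailing_gaps = len(post_p) - len(post_p.rstrip("-"))
    prefix = BASELINE_PREFIX + (residues - BASELINE_RESIDUES)
    return f"P.{prefix}.{trailing_gaps}"


rows = [
    "MEL1_homSa  G N LTVIYTFCRSRSLRTPANMFIINLAVS D FLMS-FTQA P VFFTSSLY-- KQWLFGE TG C",
    "MEL1_todPa  G N GIVIYLFTKTKSLQTPANMFIINLAFS D FTFSLVNGF P LMTISCFL-- KKWIFGF AA C",
    'MEL6_braBe  G N VVALYAFCRTRSLRRPKNYVVANLCLT D MFVC-LVYC P IVVSRSF<font color="magenta">-</font>-- SHGFPSK ES C',
]

for row in rows:
    print(row.split()[0], indel_class(row))
```

Under these assumptions the MEL6_braBe row comes out one trailing gap longer than the MEL1_homSa row, consistent with the 1-residue distal deletion noted for the amphioxus melanopsins.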
[[Image:Opsinh2o.jpg]]


=== Alignment in TM2 region: 420 curated opsins ===


Colored blocks show useful opsin gene tree synapomorphies -- derived states relative to the last common ancestor. The TM2 region is so rich in these that it can resolve many difficult gene classification issues and thus might be called the Rosetta Stone region of opsins -- 10 are highlighted below. Genes altered by an indel are marked in <font color="magenta">magenta</font> -- ie the indel was resolvable as a specific insertion or deletion relative to ancestral. Note the consistency of blocks with gene names derived independently via blast clustering without consideration of introns, indels, and other rare genomic events.


Some [[Opsin_evolution:_LWS_PhyloSNPs|phyloSNPs]] at key amino acids are also shown in <font color="red">red</font>, notably the K90 of insect ultraviolet melanopsins and the sole class of ciliary opsins with ancestral proline surviving at position 59, which distinguishes the ciliary ur-opsin class TMTa from later gene duplications that became other TMT homologs and encephalopsins. SWS1 opsins all have an asparagine in place of the key aspartate; RHO2 in teleost fish all have a glycine; NEUR4 all have a serine here as well as a 4-residue insert.
             --TM1-----><---CL1---><---------------TM2------------><--EC1---><-TM3
  <font color="blue">MEL1_todPa  G N GIVIYLFTKTKSLQTPANMFIINLAFS D FTFS<font color="magenta">L</font>VN<font color="RED">G</font>F P LMTISCFL-- KKWIFGF AA C  P.60.2
  MEL1_sepOf  G N GIVIYLFTKTKSLQTPANMFIINLAFS D FTFS<font color="magenta">L</font>VN<font color="RED">G</font>F P LMTISCFI-- KKWVFGM AA C  P.60.2
  MEL1_entDo  G N GVVIYLFSKTKSLQTPANMFIINLAMS D LSFS<font color="magenta">A</font>IN<font color="RED">G</font>F P LKTISAFM-- KKWIFGK VA C  P.60.2
  MEL1_patYe  G N TTVVYIFSNTKSLRSPSNLFVVNLAVS D LIFS<font color="magenta">A</font>VN<font color="RED">G</font>F P LLTVSSFH-- QKWIFGS LF C  P.60.2
  MEL1_lotGi  G N FVVIYTFSRTKSLRTASNMFVVNLALS D LTFS<font color="magenta">A</font>VN<font color="RED">G</font>F P LFSLSSFS-- HKWIFGR VA C  P.60.2
  MEL1_plaDu  G N LLVVWTFLKTKSLRTAPNMLLVNLAIG D MAFS<font color="magenta">A</font>IN<font color="RED">G</font>F P LLTISSIN-- KRWVWGK LW C  P.60.2
  MEL1_schMe  G N LLVLYIFARAKSLRTPPNMFIMSLAIG D LTFS<font color="magenta">A</font>VN<font color="RED">G</font>F P LLTISSFN-- TRWAWGK LT C  P.60.2
  MEL1_capCa  G N LVVITLFIKTRSLRTPPNMFIINLALS D MGFC<font color="magenta">A</font>TN<font color="RED">G</font>F P LMTVASFQ-- KLWRWGP VA C  P.60.2
  MEL1_schMa  G N SLVITLFLLCKQLRTPPNMLIVSLAIS D FSFA<font color="magenta">L</font>IN<font color="RED">G</font>F P LKTIAAFN-- HRWGWGK LA C  P.60.2
  MEL2_schMa  L N LLVIVFFTMFKSLRTPSNILVVNLAIS D FGFS<font color="magenta">A</font>VI<font color="RED">G</font>F P LKTMAAFN-- NFWPWGK LA C  P.60.2
  MEL3_schMa  T N LLVIFVFLTPKSSISLQCALIINLAIS D FGFS<font color="magenta">A</font>VI<font color="RED">G</font>F P LKTIAAFN-- QYWPWGS VA C  P.60.2
  MEL1_helRo  G N IIVVWVFSRTPSLRTPSNVLVINLAIC D ILFS<font color="magenta">A</font>LI<font color="RED">G</font>F P MSALSCFQ-- RHWIWGN FY C  P.60.2
  MEL2_helRo  . . .........TPILRTHANVLIINLALC D LIFS<font color="magenta">S</font>LI<font color="RED">G</font>F P MTALSCFK-- RHWIWGD LG C  P.60.2</font>
  MEL1_aplCa  G N SLVIITCIRFKDLRTRSNILIINLAVG D LLMC-LIDF P LLAAASFY-- GEWPYGR QV C  P.59.2
  MEL2_lotGi  G N SIVIWAHVRIKSLSTTSNMLILNLCVG C LIMC-IVDF P LYATSSFL-- QKWIFGH KV C  P.59.2
  MEL2_aplCa  . . ........RHSSLRTSSNLLVVNLTVA D LVMS-SLDF P ILAISSYK-- GCWVMGF LG C  P.59.2
  MEL1_dapPu  A N STILYVFSRFKRLRTPANVFIINLTIC D FLA<font color="magenta">-</font>-CCLH P LAVYSAFR-- GRWSFGQ TG C  P.58.2
  MEL1_homSa  G N LTVIYTFCRSRSLRTPANMFIINLAVS D FLMS-FTQA P VFFTSSLY-- KQWLFGE TG C  P.59.2
  MEL1_felCa  G N LMVIYTFCRSRGLRTPANMFIINLAVS D FFMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
  MEL1_ailMe  G N LMVIYTFCRTRGLRTPSNMFIINLAVS D FLMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
  MEL1_canFa  G N LMVIYTFCRTRGLRTPSNMFIINLAVS D FFMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
  MEL1_myoLu  G N LTVIYTFCRSRGLRTPANMFIINLAVS D FLMC-FTQA P VVFASSIY-- KRWLFGE AG C  P.59.2
  MEL1_pteVa  G N LTVIYTFCRSRGLRTPANMFIINLAVS D FLMS-FTQA P VVFISSLY-- KRWLFGQ AG C  P.59.2
  MEL1_smiCr  G N LLVIYTFCRSRSLRTPANMFIINLAIS D FFMS-FTQA P VFFASSLY-- ERWIFGE KG C  P.59.2
  MEL1_monDo  G N FLVIYTFCRSHSLRTPANMFIINLAIS D FFMS-FTQA P VFFASSMY-- KRWIFGE KA C  P.59.2
  MEL1_loxAf  G N LMVIYIFFRSRGLRTPANMFIINLAVS D FLMS-FTQA P VFFASSLY-- KRWLFGE AG C  P.59.2
  MEL1_taeGu  G N FLVFYAFCRSRSLQTPANILIINLAIS D FLMS-ITQS P VFFTSSLY-- KHWIFGE KG C  P.59.2
  MEL1_galGa  G N FLVIYAFCRSRTLQKPANIFIINLAVS D FLMS-ITQS P VFFTNSLH-- KRWIFGE KG C  P.59.2
  MEL1_xenTr  G N FLVIYAFCRSRSLRSPANMFIINLAIT D FLMS-VTQA P VFFATSLH-- KRWIFGE KG C  P.59.2
  MEL1_danRe  G N FLVIYAFSRSRTLRTPANLFIINLAIT D FLMC-ATQA P IFFTTSMH-- KRWIFGE KG C  P.59.2
  MEL1_takRu  G N FLVIYAFCRSRSLRTPANMFIINLAVT D LLMC-VTQT P IFFTTSMY-- KRWIFGE KG C  P.59.2
  MEL1_gasAc  G N VLVIYAFSKSRSLRTPANMFIINLAIT D LLMC-VTQA P IFFTTSMH-- KRWIFGE KG C  P.59.2
  MEL1_oryLa  G N FLVIYAFSRSRSLRTPANMFIINLAIT D LLMC-VTQS P IFFTTSMH-- KRWIFGE KG C  P.59.2
  MEL1_calMi  G N FLVIYAFLRSRSLRTPANTFIINLAAT D FLMS-VTQS P IFFITSIH-- KRWIFGE KG C  P.59.2
  MEL1_petMa  G N VLVIYAFSKSKSLRSPANIFIINLAFA D FFMS-ITQT P IFFVTSLH-- KRWIFGE KG C  P.59.2
  MELmop_bra  G N AVVVYSFIKSKGLRTPANFFIINLALS D FLMN-LTNM P IFAVNSAF-- QRWLLSD FA C  P.59.2
  MEL1_strPu  . . ........WTKSLRTPPNMLIVNLAIS D FGMV-ITNF P LMFASTIY-- NRWLFGD AG C  P.59.2
  MEL2_strPu  G N SLVIYTFLRFKKLHSPINLLIVNLSAS D LLVA-TTGT P LSMVSSFY-- GRWLFGT NA C  P.59.2
  MEL2_galGa  G N LLVLYAFYSNKKLRTPQNFFIMNLAVS D FLMS-ASQA P ICFVNSLH-- REWILGD IG C  P.59.2
  MEL2_xenLa  G N MLVLYAFYRNKKLRTAPNYFIINLAIS D FLMS-ATQA P VCFLSSLH-- REWILGD IG C  P.59.2
  MEL2_anoCa  G N LLVLYAFYSNKRLRTPPNYFIMNLAVS D FLMS-ATQA P ICFLNSMH-- KEWVLGD IG C  P.59.2
  MEL2_tetNi  G N VLVIFAFYSNKKLRSLPNYFIVNLAVS D LLMA-STQS P IFFIN-LY-- KEWMFGE TA C  P.59.2
  MEL2_danRe  G N ALVMFAFYRNKKLRSLPNYFIMNLAVS D FLMA-ITQS P IFFINCLY-- KEWMFGE LG C  P.59.2
  MEL2_gasAc  G N ALVMLAVYSNKKLRNLPNYFIMNLAVS D FLMA-FTQS P IFFINCLY-- KEWAFGE TG C  P.59.2
  <font color="blue">MEL6_braFl  G N AVALYAFCRSRSLRRPKNYLIANLCLT D MVVC-LVYS P IIVTRSL<font color="magenta">-</font>-- SHGLPSK ES C  P.59.3
  MEL6_braBe  G N VVALYAFCRTRSLRRPKNYVVANLCLT D MFVC-LVYC P IVVSRSF<font color="magenta">-</font>-- SHGFPSK ES C  P.59.3
  MELx_braFl  G N AVALYAFCSTRKLRRPKNYVVANLCLT D LIMC-IVYC P VIVISSF<font color="magenta">-</font>-- SGRIPTD GA C  P.59.3</font>
   
   
  LMS1_droMe  G N GVVIYIFATTKSLRTPANLLVINLAIS D FGIM-ITNT P MMGINLYF-- ETWVLGP MM C  P.59.2
  LMS2_droMe  G N GVVVYIFGGTKSLRTPANLLVLNLAFS D FCMM-ASQS P VMIINFYY-- ETWVLGP LW C  P.59.2
  LMS6_droMe  G N FIVMYIFTSSKGLRTPSNMFVVNLAFS D FMMM-FTMF P PVVLNGFY-- GTWIMGP FL C  P.59.2
  LMS_anoGam  G N GMVIYIFSTAKSLRTPSNLFIVNLALS D FLMM-GTNA P TMVYNCWF-- ETWSLGL LM C  P.59.2
  LMS_rhoPro  G N GMVIFIFSSTKTLRTPSNLLVVNLAFS D FLMM-FTMS P PMVINCYN-- ETWVLGP LM C  P.59.2
  LMS_schGre  G N GMVIYIFSTTKSLRTPSNLLVVNLAFS D FLMM-FTMS A PMGINCYY-- ETWVLGP FM C  P.59.2
  LMS_lucCru  G N GMVIYIFSTTKSLRSPSNLLVVNLAFS D FLMM-FTMA P PMVINCYN-- ETWVWGP LF C  P.59.2
  LMS_triCas  G N GMVIYIFSSTKALRTPSNLLVVNLAFS D FLMM-LCMS P AMVINCYN-- ETWVLGP LV C  P.59.2
  LMS_manSex  G N GMVIYIFMSTKSLKTPSNLLVVNLAFS D FLMM-CAMS P AMVVNCYY-- ETWVWGP FA C  P.59.2
  LMS_papXut  G N GMVVYIFTSTKSLKTPSNLLVVNLAFS D FLMM-LCMA P PMLINCYY-- ETWVFGP LA C  P.59.2
  LMS_homCoa  G N GMVVYIFSCTKALRTPSNLLVVNLAFS D FLMM-FTMA P PMVLNCYY-- ETWVLGP FM C  P.59.2
  LMSa_nasVi  G N GMVVYIFASTKSLRTPSNLLVINLAFS D FCMM-FTMS P PMVINCYY-- ETWVFGP LM C  P.59.2
  LMSb_apiMe  G N GMVVYIFLSTKSLRTPSNLFVINLAIS D FLMM-FCMS P PMVINCYY-- ETWVLGP LF C  P.59.2
  LMS_acyPis  G N GMVIYIFTCTKNLRTPSNLLIVNLAFS D FCLM-FTMC P AMVWNCFY-- ETWMFGP FA C  P.59.2
  LMSb_nasVi  G N GMVVYIFLVTPSLRTPSNLLVINLAFS D FVMM-IIMS P PMVVNCWY-- ETWILGP LM C  P.59.2
  LMSa_apiMe  G N GVVVYVFIMTPSLRTPSNLLVVNLAFS D FIMM-GFMC P PMVICCFY-- ETWVLGS LM C  P.59.2
  LMS_meoOer  G N FVVIWVFMNTKALRSPANTLVVSLAVS D FIMM-ACMF P PLVLNCYW-- GTWIFGP LF C  P.59.2
  LMS_limPol  G N GMVIYLMMTTKSLRTPTNLLVVNLAFS D FCMM-AFMM P TMTSNCFA-- ETWILGP FM C  P.59.2
  LMS2_plePa  G N GMVMYLMNTTKSLKTPTNMLIVNLAFS D FCMM-AFMM P TMAANCFA-- ETWILGP FM C  P.59.2
  LMS2_hasAd  G N GMVIYLMSTTKSLKTPTNMLIVNLAFS D FCMM-AFMM P TMAANCFA-- ETWILGP LM C  P.59.2
  LMS_ixoSca  G N SMVIYIMTTSKSLRSPTNMLVVNLAFS D WCMM-AFMM P TMAANCFA-- ETWILGP FM C  P.59.2
  LMS1_plePa  G N SIVIYLMLSVKSLRTPANFLVTSLAVS D GGML-AFMA P TMPINCFA-- QTWVLGP FM C  P.59.2
  LMS1_hasAd  G N GVVMYLMMTVKNLRTPGNFLVLNLALS D FGML-FFMM P TMSINCFA-- ETWVIGP FM C  P.59.2
  <font color="blue">BCRa_hemSa  G N GLVIYLYMKSQALKTPANMLIVNLALS D LIML-TTNF P PFCYNCFG<font color="magenta">S</font>- GRWMFSG TY C  P.59.1
  BCRb_hemSa  G N GLVIYLFNKSAALRTPANILVVNLALS D LIML-TTNV P FFTYNCFG<font color="magenta">S</font>- GVWMFSP QY C  P.59.1
  BCR_porPel  G N GMVIYLFAKCQALRTPANILVVNLALS D LIML-TTNV P FFTYNCFG<font color="magenta">N</font>- GVWMFSA TY C  P.59.1
  BCR_triGra  G N SLVISLFTKTKELRTPANMFVVNLAFS D LCMM-ITQF P MFVYNCFG<font color="magenta">N</font>- GMWLFGP FL C  P.59.1
  BCR2_triLo  G N SLVISLFTKTKELRTPANMFVVNLAFS D LCMM-ITQF P MFVYNCFG<font color="magenta">N</font>- GMWLFGP FL C  P.59.1
  BCR_limPol  G Q SVVLYLFAKTKPLRTPANMLIVNLAFS D FMMM-ITQF P VFIINCLG<font color="magenta">G</font>- GAWQLGP LL C  P.59.1
  BCR2_braKu  G N GLVIWIFLKTKSLRTPSNMLIVNLAIA D FFMM-LTQS P LYIISAFS<font color="magenta">T</font>- RWWIWGH FW C  P.59.1
  BCR3_braKu  G N GLVIKIFLKTKSLRTPSNMLIVNLAIA D FFMM-LTQS P LFIISAFS<font color="magenta">S</font>- RWWIWGH FW C  P.59.1
  BCR1_triGr  G N YLVLRIFTKFQELRRPSNVLVINLALS D MLLM-LTLF P ECVYNFLG<font color="magenta">S</font>- GPWRFGD LG C  P.59.1</font>
  <font color="green">BCR2_triGr  G N VLVLHIFGKHKNLRSPTNTLLMNLAFC D LMIF-IGLY P EMLGNIFM<font color="magenta">ND</font> GTWMWGD VA C  P.59.0
  BCR1_triLo  G N VLVLHIFGKHKNLRSPTNTLLMNLAFC D LMIF-IGLY P EMLGNIFM<font color="magenta">ND</font> GTWMWGD IA C  P.59.0
  BCR3_triGr  G N VLVLYIFGKYKSLRSPTNVLVMNLAFC D LGLF-VGLY P ELLGNIFI<font color="magenta">NN</font> GPWMWGD VA C  P.59.0</font>
   
   
<font color="blue">UV7a_acyPi  G N SLVIFMYFKCRSLQTPANMLIINLAVS D FIM<font color="magenta">-</font>-LA<font color="red">K</font>A S VFIYNSYY-- LG<font color="red">P</font>ALGK LG C  P.58.2
UV7b_acyPi  G N SLVIFMYIKCKSLQTPANVLIMNLAVS D FIM<font color="magenta">-</font>-LA<font color="red">K</font>T P VFIYNSFY-- QG<font color="red">P</font>TLGK LG C  P.58.2
UV7_rhoPro  G N LLVIFMILRFRTLRTSSNILILNLAVS D FLM<font color="magenta">-</font>-VA<font color="red">K</font>M P VFIYNSFY-- FG<font color="red">P</font>VLGE MG C  P.58.2
UV7_anoGam  G N ALVVFMFYRYRSLRTPANYLVINLAVA D FII<font color="magenta">-</font>-MM<font color="BROWN">E</font>A P MFIYNSIH-- QG<font color="red">P</font>ALGS IG C  P.58.2
UV7_aedAeg  G N LLVILMFFRFKSLRTPANYLVINLAIA D FII<font color="magenta">-</font>-ML<font color="BROWN">E</font>A P LFVYNSYH-- QG<font color="red">P</font>ATGN VW C  P.58.2
UV7_culQui  G N VLVIFMFFKFKSLRTPANYLVINLAVA D FLI<font color="magenta">-</font>-ML<font color="BROWN">E</font>A P IFVYNSYH-- LG<font color="red">P</font>AFGN TL C  P.58.2
UV7_droMel  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LI<font color="red">K</font>C P IAIYNNIK-- EG<font color="red">P</font>ALGD IA C  P.58.2
UV7_droYak  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LI<font color="red">K</font>C P IAIYNNIK-- EG<font color="red">P</font>ALGD IA C  P.58.2
UV7_droAna  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LV<font color="red">K</font>C P IAIYNNIK-- EG<font color="red">P</font>ALGD VA C  P.58.2
UV7_droPse  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LV<font color="red">K</font>C P IAIYNNIK-- EG<font color="red">P</font>ALGD AA C  P.58.2
UV7_droWil  G N AFVIFMFSNRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LV<font color="red">K</font>C P IAIYNNIK-- EG<font color="red">P</font>ALGD IA C  P.58.2
UV7_droMoj  G N AFVIFMFGSRKSLRTPANILVMNLAIC D FLM<font color="magenta">-</font>-LV<font color="red">K</font>C P IAIYNNIQ-- EG<font color="red">P</font>ALGD AA C  P.58.2
UV7_pedHum  G N FLIIYLFLRKRSLRTPSNVFIFNLAVS D SLL<font color="magenta">-</font>-LL<font color="red">K</font>M P VFIINSFY-- LG<font color="red">P</font>ALGN LG C  P.58.2
UV7_ixoSca  . . .........RRRIRSQANLLVFNLALS D LLM<font color="magenta">-</font>-VL<font color="BROWN">E</font>I P LLVYNSLK-- LR<font color="red">P</font>ALGV WG C  P.58.2
UV5_plePay  G N AIVMYIFFSAKTLRTPTNMFVIGLAMA D LLM<font color="magenta">-</font>-MS<font color="red">K</font>T P VFIYNCFH-- LG<font color="red">P</font>VFGQ IG C  P.58.2
UV5_hasAda  G N AIVIYIFSVSKSLRTPTNMFVIGLAMA D LLM<font color="magenta">-</font>-MS<font color="red">K</font>T P VFIYNCFH-- LG<font color="red">P</font>VFGQ LG C  P.58.2
UV5_braKug  G N GVVIWVFASAKSLRTPSNLFVINLAVL D FLM<font color="magenta">-</font>-ML<font color="red">K</font>T P VFIVNSFN-- EG<font color="red">P</font>IWGK TG C  P.58.2
UV5_triLon  G N GVVIWIFSSAKSLRTPSNMFVINLAVL D FIM<font color="magenta">-</font>-MM<font color="red">K</font>T P VFIVNSFN-- EG<font color="red">P</font>IWGK FG C  P.58.2
UV5_triGra  G N GVVIWIFSSAKSLRTPSNMFVINLAVL D FIM<font color="magenta">-</font>-MM<font color="red">K</font>T P VFIVNSFN-- EG<font color="red">P</font>IWGK FG C  P.58.2
UV5a_dapPu  G N GVVIWIFTNCKSLRTPSNMLVVNLAIL D MLM<font color="magenta">-</font>-ML<font color="red">K</font>S P VMIINSYN-- EG<font color="red">P</font>IWGK LG C  P.58.2
UV5b_dapPu  G N GIVIYIFSTTKELKTPSNILILNLAIC D FIM<font color="magenta">-</font>-MI<font color="red">K</font>T P IFIVNSFN-- EG<font color="red">P</font>VFGR LG C  P.58.2
UV5_papXut  G N GLVIFIFSASKSLRTPSNLLVVQLAVL D FLM<font color="magenta">-</font>-ML<font color="red">K</font>A P IFIYNSIK-- RGFASGV IG C  P.58.2
UV5_manSex  G N GMVIFIFSTTKSLRTSSNFLVLNLAIL D FIM<font color="magenta">-</font>-MA<font color="red">K</font>A P -FIYNSAM-- RGFAVGT VG C  P.58.2
UV5_apiMel  G N GLVIWIFCAAKSLRTPSNMFVVNLAIC D FFM<font color="magenta">-</font>-MI<font color="red">K</font>T P IFIYNSFN-- TGFALGN LG C  P.58.2
UV5_nasVit  G N GLVIWIFCAAKSLRTPSNMFVVNLAIC D FMM<font color="magenta">-</font>-ML<font color="red">K</font>T P IFIYNSFH-- TGFALGN LG C  P.58.2
UV5_diaNig  G N GLVIWVFSSAKTLRTPSNIFVINLALY D FIM<font color="magenta">-</font>-ML<font color="red">K</font>T P IFIYNSFN-- LGFGLGQ LG C  P.58.2
UV5_lucCru  G N GLVLWIFSTSKSLKTASNMFVVNLAFC D FIM<font color="magenta">-</font>-MM<font color="red">K</font>M P IFVYNSFN-- RGYALGH IG C  P.58.2
UV5_triCas  G N GLVIWIFSTSKSLRTASNMFVVNLAIC D FAM<font color="magenta">-</font>-MI<font color="red">K</font>T P IFIYNSFY-- RGFALGH LG C  P.58.2
UV5_anoGam  G N GLVIWIFIAAKSLRTPSNVFVINLAIC D FFM<font color="magenta">-</font>-MA<font color="red">K</font>T P IFIYNSFT-- KGFTLGN LG C  P.58.2
UV4_droMel  G N GMVIWIFSTSKSLRTPSNMFVLNLAVF D LIM<font color="magenta">-</font>-CL<font color="red">K</font>A P IFIYNSFH-- RGFALGN TW C  P.58.2
UV3_droMel  G N GLVIWVFSAAKSLRTPSNILVINLAFC D FMM<font color="magenta">-</font>-MV<font color="red">K</font>T P IFIYNSFH-- QGYALGH LG C  P.58.2
UV5_rhoPro  G N GLVIWIFSTAKTLRTPSNIFVVNLAIC D FLM<font color="magenta">-</font>-MS<font color="red">K</font>T P IFIYNSFK-- LGYALGH RA C  P.58.2
UV5_pedHum  G N GIVIWIFTTSKNLRTASNVFVVNLAIF D FIM<font color="magenta">-</font>-MA<font color="red">K</font>T P IMIYNSMN-- LGFECGF VW C  P.58.2
UV5_acyPis  G N GLVIWVFCVAKPLRTPSNIFVINLALC D FVM<font color="magenta">-</font>-MA<font color="red">K</font>A P IFILGSIN-- RGY-QGH FL C  P.58.2
UVB_anoGam  G N GIVLWIFGTSKSLRNGSNMFIINLAIF D LLM<font color="magenta">-</font>-MC<font color="red">E</font>M P MFLVNSFS-- ERLVGYG VG C  P.58.2
UVB_diaNig  G N GIVLWIFATTKSLRTPSNMFVVNQALL D LLM<font color="magenta">-</font>-MI<font color="BROWN">E</font>M P MFVLNSLYF- QRPIGWE MG C  P.58.1
UVB_manSex  G N GIVIWIFSTSKSLRSASNMFVINLAVF D LMM<font color="magenta">-</font>-ML<font color="BROWN">E</font>M P LLIMNSFY-- QRLVGYQ LG C  P.58.2
UVB_apiMel  G N CCVIWIFSTSKSLRTPSNMFIVSLAIF D IIM<font color="magenta">-</font>-AF<font color="BROWN">E</font>M P MLVISSFM-- ERMIGWE IG C  P.58.2
UVB_nasVit  G N GCVVWIFSTSKVLRTPSNLFIINLALF D LVM<font color="magenta">-</font>-AL<font color="BROWN">E</font>I P MLIINSFI-- ERMIGWG LG C  P.58.2
UV5B_droMe  G N GLVIWIFSTSKSLRTPSNLLILNLAIF D LFM<font color="magenta">-</font>-CTNM P HYLINATV-- GYIVGGD LG C  P.58.2
UVB_acyPis  G N GLVLWIFCVSKPLRTPSNLFVLNLALC D FSM<font color="magenta">-</font>-VLVL P ILIYDSID-- HKY-PGH LQ C  P.58.2
UVB_megVic  G N GLVLWIFCVSKPLRTPSNLFVLNLALC D FSM<font color="magenta">-</font>-VLVL P ILIYDSID-- HKY-PGH LQ C  P.58.2</font>
  RHO1_bosTa  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTLYTSLH-- GYFVFGP TG C  P.59.2
  RHO1_homSa  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-LGGF T STLYTSLH-- GYFVFGP TG C  P.581
  RHO1_monDo  I N FLTLYVTIQHKKLRTPLNYILLNLAIA D LFMV-FGGF T MTLYTSLH-- GYFVFGP TG C  P.592
  RHO1_monDo  I N FLTLYVTIQHKKLRTPLNYILLNLAIA D LFMV-FGGF T MTLYTSLH-- GYFVFGP TG C  P.59.2
  RHO1_ornAn  I N FLTLYVTIQHKKLRTPLNYILLNLAFA N HFMV-LGGF T TTLYTSLH-- GYFVFGP TG C  P.592
  RHO1_ornAn  I N FLTLYVTIQHKKLRTPLNYILLNLAFA N HFMV-LGGF T TTLYTSLH-- GYFVFGP TG C  P.59.2
  RHO1_galGa  V N FLTLYVTIQHKKLRTPLNYILLNLVVA D LFMV-FGGF T TTMYTSMN-- GYFVFGV TG C  P.592
  RHO1_galGa  V N FLTLYVTIQHKKLRTPLNYILLNLVVA D LFMV-FGGF T TTMYTSMN-- GYFVFGV TG C  P.59.2
  RHO1_anoCa  I N FLTLFVTIQHKKLRTPLNYILLNLAVA N LFMV-LMGF T TTMYTSMN-- GYFIFGT VG C  P.592
  RHO1_anoCa  I N FLTLFVTIQHKKLRTPLNYILLNLAVA N LFMV-LMGF T TTMYTSMN-- GYFIFGT VG C  P.59.2
  RHO1_xenTr  I N FMTLYVTIQHKKLRTPLNYILLNLVFA N HFMV-LCGF T VTMYTSMH-- GYFIFGQ TG C  P.592
  RHO1_xenTr  I N FMTLYVTIQHKKLRTPLNYILLNLVFA N HFMV-LCGF T VTMYTSMH-- GYFIFGQ TG C  P.59.2
  RHO1_neoFo  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTAMN-- GYFVFGV VG C  P.592
  RHO1_neoFo  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTAMN-- GYFVFGV VG C  P.59.2
  RHO1_latCh  I N FLTLFVTIQHKKLRTPLNYILLDLAVA D LCMV-FGGF F VTMYSSMN-- GYFVLGP TG C  P.592
  RHO1_latCh  I N FLTLFVTIQHKKLRTPLNYILLDLAVA D LCMV-FGGF F VTMYSSMN-- GYFVLGP TG C  P.59.2
  RHO1_angAn  V N FLTLYVTIEHKKLRTPLNYILLNLAVA N LFMV-FGGF T TTVYTSMH-- GYFVFGE TG C  P.592
  RHO1_angAn  V N FLTLYVTIEHKKLRTPLNYILLNLAVA N LFMV-FGGF T TTVYTSMH-- GYFVFGE TG C  P.59.2
  RHO1_conMy  I N FLTLYVTIEHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTSMH-- GYFVFGP TG C  P.592
  RHO1_conMy  I N FLTLYVTIEHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTSMH-- GYFVFGP TG C  P.59.2
  RHO1_takRu  V N FLTLFVTVKHKKLRTPLNYVLLNLAVA D LFMV-IGGF T VTLYTALH-- AYFVLGV TG C  P.592
  RHO1_takRu  V N FLTLFVTVKHKKLRTPLNYVLLNLAVA D LFMV-IGGF T VTLYTALH-- AYFVLGV TG C  P.59.2
  RHO1_leuEr  V N FLTLFVTIQHKKLRQPLNYILLNLAVS D LFMV-FGGF T TTIITSMN-- GYFIFGP AG C  P.592
  RHO1_leuEr  V N FLTLFVTIQHKKLRQPLNYILLNLAVS D LFMV-FGGF T TTIITSMN-- GYFIFGP AG C  P.59.2
  RHO1_calMi  V N FLTLYVTFEHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.592
  RHO1_calMi  V N FLTLYVTFEHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.59.2
  RHO1_petMa  V N FLTLFVTVQHKKLRTPLNYILLNLAVA N LFMV-LFGF T LTMYSSMN-- GYFVFGP TM C  P.592
  RHO1_petMa  V N FLTLFVTVQHKKLRTPLNYILLNLAVA N LFMV-LFGF T LTMYSSMN-- GYFVFGP TM C  P.59.2
  RHO1_letJa  V N FLTLFVTVQHKKLRTPLNYILLNLAMA N LFMV-LFGF T VTMYTSMN-- GYFVFGP TM C  P.592
  RHO1_letJa  V N FLTLFVTVQHKKLRTPLNYILLNLAMA N LFMV-LFGF T VTMYTSMN-- GYFVFGP TM C  P.59.2
  RHO1_geoAu  V N FLTLFVTVQHKKLRTPLNYILLNLAVS N LFMI-LFGF T TTMYTSMN-- GYFVFGP TM C  P.592
  RHO1_geoAu  V N FLTLFVTVQHKKLRTPLNYILLNLAVS N LFMI-LFGF T TTMYTSMN-- GYFVFGP TM C  P.59.2
  RHO2_calMi  I N GLTLLVTVKHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.592
  RHO2_calMi  I N GLTLLVTVKHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.59.2
  RHO2_galGa  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFVFGP VG C  P.592
  RHO2_galGa  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFVFGP VG C  P.59.2
  RHO2_taeGu  I N FLTLLVTFKHKKLRQPLNYILVNLAVA D LCMA-CFGF T VTFYTAWN-- GYFVFGP IG C  P.592
  RHO2_taeGu  I N FLTLLVTFKHKKLRQPLNYILVNLAVA D LCMA-CFGF T VTFYTAWN-- GYFVFGP IG C  P.59.2
  RHO2_podSi  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.592
  RHO2_podSi  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.59.2
  RHO2_anoCa  I N ILTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.592
  RHO2_anoCa  I N ILTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.59.2
  RHO2_neoFo  I N LLTLVVTFKHKKLRQPLNYILVNLAVA D LFMV-CFGF T VTFSTAIN-- GYFIFGP RG C  P.592
  RHO2_neoFo  I N LLTLVVTFKHKKLRQPLNYILVNLAVA D LFMV-CFGF T VTFSTAIN-- GYFIFGP RG C  P.59.2
  RHO2_latCh  I N FLTLLVTFKHKKLRQPLNYILVNLAVA S LFMV-VFGF T VTFYSSLN-- GYFVLGP MG C  P.592
  RHO2_latCh  I N FLTLLVTFKHKKLRQPLNYILVNLAVA S LFMV-VFGF T VTFYSSLN-- GYFVLGP MG C  P.59.2
  RHO2_gekGe  L N GLTLFVTFQHKKLRQPLNYILVNLAAA N LVTV-CCGF T VTFYASWY-- AYFVFGP IG C  P.592
  RHO2_gekGe  L N GLTLFVTFQHKKLRQPLNYILVNLAAA N LVTV-CCGF T VTFYASWY-- AYFVFGP IG C  P.59.2
  RHO2_pheMa  L N GLTLFVTFQHKKLRQPLNYILVNLAVA N LLMV-ICGF T VTFYTSWY-- GYFVFGP MG C  P.592
  RHO2_pheMa  L N GLTLFVTFQHKKLRQPLNYILVNLAVA N LLMV-ICGF T VTFYTSWY-- GYFVFGP MG C  P.59.2
  RHO2_geoAu  V N FMTLFVTFKLKKLRQPLNFILVNLCVA D LLMI-MFGF T TTFYTAMN-- GYFVFGP TG C  P.592
  RHO2_geoAu  V N FMTLFVTFKLKKLRQPLNFILVNLCVA D LLMI-MFGF T TTFYTAMN-- GYFVFGP TG C  P.59.2
  <font color="green">RHO2_danRe  I N GLTLLVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYSAIN-- GYFVLGP TG C  P.592
  <font color="BLUE">RHO2_danRe  I N GLTLLVTAQHKKLRQPLNFILVNLAVA <font color="red">G</font> TIMV-CFGF T VTFYSAIN-- GYFVLGP TG C  P.59.2
  RHO2d_danR  I N GLTLLVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.592
  RHO2d_danR  I N GLTLLVTAQHKKLRQPLNFILVNLAVA <font color="red">G</font> TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.59.2
  RHO2c_danR  I N GLTLVVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.592
  RHO2c_danR  I N GLTLVVTAQHKKLRQPLNFILVNLAVA <font color="red">G</font> TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.59.2
  RHO2a_danR  I N VLTLVVTAQHKKLRQPLNYILVNLAFA G TIMV-IFGF T VSFYCSLV-- GYMALGP LG C  P.592
  RHO2a_danR  I N VLTLVVTAQHKKLRQPLNYILVNLAFA <font color="red">G</font> TIMV-IFGF T VSFYCSLV-- GYMALGP LG C  P.59.2
  RHO2b_danR  I N VLTLLVTAQHKKLRQPLNYILVNLAFA G TIMA-FFGF T VTFYCSIN-- GYMALGP TG C  P.592
  RHO2b_danR  I N VLTLLVTAQHKKLRQPLNYILVNLAFA <font color="red">G</font> TIMA-FFGF T VTFYCSIN-- GYMALGP TG C  P.59.2
  RHO2_takRu  I N GLTLLVTAQNKKLRQPLNYILVNLAVA G LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.592
  RHO2_takRu  I N GLTLLVTAQNKKLRQPLNYILVNLAVA <font color="red">G</font> LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.59.2
  RHO2_gasAc  I N GLTLLVTAQNKKLRQPLNYILVNLAVA G LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.592
  RHO2_gasAc  I N GLTLLVTAQNKKLRQPLNYILVNLAVA <font color="red">G</font> LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.59.2
  RHO2_oreNi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA G LIMC-CFGF T ITITSAIN-- GYFVLGT TF C  P.592
  RHO2_oreNi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA <font color="red">G</font> LIMC-CFGF T ITITSAIN-- GYFVLGT TF C  P.59.2
  RHO2_hipHi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA G LIMC-CFGF T ITITSAFN-- GYFILGA TF C  P.592
  RHO2_hipHi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA <font color="red">G</font> LIMC-CFGF T ITITSAFN-- GYFILGA TF C  P.59.2
  RHO2_mulSu  I N GLTLLVTFQNKKLQQPLNYILVNLAVV G LIMC-AFGF T ITITSALN-- GYFILGP TF C  P.592
  RHO2_mulSu  I N GLTLLVTFQNKKLQQPLNYILVNLAVV <font color="red">G</font> LIMC-AFGF T ITITSALN-- GYFILGP TF C  P.59.2
  RHO2_pomMi  I N ALTLLVTFQNKKLRQPLNFILVNLAVA G LIMC-AFGF T ITITSALN-- GYFILGA TF C  P.592
  RHO2_pomMi  I N ALTLLVTFQNKKLRQPLNFILVNLAVA <font color="red">G</font> LIMC-AFGF T ITITSALN-- GYFILGA TF C  P.59.2
  RHO2_oryLa  I N ALTLVVTAQNKKLRQPLNFILVNLAVA G LIMV-CFGF T VCIYSCMV-- GYFSLGP LG C  P.592</font>
  RHO2_oryLa  I N ALTLVVTAQNKKLRQPLNFILVNLAVA <font color="red">G</font> LIMV-CFGF T VCIYSCMV-- GYFSLGP LG C  P.59.2</font>
  <font color="brown">SWS2_ornAn  I N LLTVICTIKYKKLRSHLNYILVNLAVS N MLVV-CVGS A TAFYSFAH-- MYFVLGP TA C  P.592
  SWS2_ornAn  I N LLTVICTIKYKKLR<font color="red">SH</font>LNYILVNLAVS N MLVV-CVGS A TAFYSFAH-- MYFVLGP TA C  P.59.2
  SWS2_anoCa  I N VLTIFCTFKYKKLRSHLNYILVNLSVS N LLVV-CVGS T TAFYSFSN-- MYFSLGP TA C  P.592
  SWS2_anoCa  I N VLTIFCTFKYKKLR<font color="red">SH</font>LNYILVNLSVS N LLVV-CVGS T TAFYSFSN-- MYFSLGP TA C  P.59.2
  SWS2_utaSt  I N VLTIFCTFKYKKLRSHLNYILVNLAVS N LLVV-CIGS T TAFYSFAQ-- MYFSLGP TA C  P.592
  SWS2_utaSt  I N VLTIFCTFKYKKLR<font color="red">SH</font>LNYILVNLAVS N LLVV-CIGS T TAFYSFAQ-- MYFSLGP TA C  P.59.2
  SWS2_taeGu  I N ALTVLCTAKYKKLRSHLNYILVNLAVA N LLVV-CVGS T TAFYSFSQ-- MYFALGP LA C  P.592
  SWS2_taeGu  I N ALTVLCTAKYKKLR<font color="red">SH</font>LNYILVNLAVA N LLVV-CVGS T TAFYSFSQ-- MYFALGP LA C  P.59.2
  SWS2_neoFo  I N VLTIICTFKYKKLRSHLNYILVNLAVA N LIVV-GFGS T TAFYSFSQ-- MYFAWGP LA C  P.592
  SWS2_neoFo  I N VLTIICTFKYKKLR<font color="red">SH</font>LNYILVNLAVA N LIVV-GFGS T TAFYSFSQ-- MYFAWGP LA C  P.59.2
  SWS2_galGa  I N TLTIFCTARFRKLRSHLNYILVNLALA N LLVI-LVGS T TACYSFSQ-- MYFALGP TA C  P.592
  SWS2_galGa  I N TLTIFCTARFRKLR<font color="red">SH</font>LNYILVNLALA N LLVI-LVGS T TACYSFSQ-- MYFALGP TA C  P.59.2
  SWS2_xenTr  L N LLTIICTVKYKKLRSHLNYILVNLAVA N LIVI-CFGS T TAFYSFSQ-- MYFSLGT LA C  P.592
  SWS2_xenTr  L N LLTIICTVKYKKLR<font color="red">SH</font>LNYILVNLAVA N LIVI-CFGS T TAFYSFSQ-- MYFSLGT LA C  P.59.2
  SWS2_geoAu  L N FLTVFVTIKYKKLRSHLNYILVNLAIA N LIVV-CCGS T LAFYSFMH-- KYFILGP LF C  P.592
  SWS2_geoAu  L N FLTVFVTIKYKKLR<font color="red">SH</font>LNYILVNLAIA N LIVV-CCGS T LAFYSFMH-- KYFILGP LF C  P.59.2
  SWS2_takRu  I N VLTIACTIQYKKLRSHLNYILVNLAFS N LLVT-TVGS F TCFCCFFV-- RYMIVGP LG C  P.592
  SWS2_takRu  I N VLTIACTIQYKKLR<font color="red">SH</font>LNYILVNLAFS N LLVT-TVGS F TCFCCFFV-- RYMIVGP LG C  P.59.2
  SWS2_gasAc  I N ALTVACTVQNKKLRSHLNYILVNLAVS N LLVS-GVGA F TAFLSFAA-- RYFVLGT LA C  P.592</font>
  SWS2_gasAc  I N ALTVACTVQNKKLR<font color="red">SH</font>LNYILVNLAVS N LLVS-GVGA F TAFLSFAA-- RYFVLGT LA C  P.59.2
  <font color="green">SWS1_homSa  L N AMVLVATLRYKKLRQPLNYILVNVSFG G FLLC-IFSV F PVFVASCN-- GYFVFGR HV C  P.592
  <font color="blue">SWS1_homSa  L N AMVLVATLRYKKLRQPLNYILVNVSFG <font color="red">G</font> FLLC-IFSV F PVFVASCN-- GYFVFGR HV C  P.59.2
  SWS1_monDo  L N AVVLVATLRYKKLRQPLNYILVNVSLC G FIFC-IFAV F TVFISSSQ-- GYFIFGR HV C  P.592
  SWS1_ailMel L N ATVLVATLRYRKLRQPLNYILVNVSLA <font color="red">G</font> FVYC-I<font color="red">-</font>SV S TVFIASCH-- GYFIFGR HV C  P.59.2 Caniformia-restricted deletion
  SWS1_smiCr  L N GVVLIATLRYKKLRQPLNYILVNISLA G FIFC-VFSV F TVFVSSSQ-- GYFVFGR HV C  P.592
  SWS1_canFam L N GTVLVATLRYKKLRQPLNYILVNVSLG <font color="red">G</font> FLYC-I<font color="red">-</font>SV S TVFIASCQ-- GYFVFGR HV C  P.59.2 Caniformia-restricted deletion
  SWS1_tarRo  L N AVVLIATLRYKKLRQPLNYILVNISLA G FIFC-VISV F TVFISSSQ-- GYFIFGR HV C  P.592
  SWS1_felCat L N ATVLVATLRYRKLRQPLNYILVNVSLG <font color="red">G</font> FLYC-VSSV S IVFITSCH-- AYFIFGR HV C  P.59.2
  SWS1_taeGu L N AIVLIVTIKYKKLRQPLNYILVNISVS G LMCC-VFCI F TVFIASSQ-- GYFVFGK HM C  P.592
  SWS1_monDo L N AVVLVATLRYKKLRQPLNYILVNVSLC <font color="red">G</font> FIFC-IFAV F TVFISSSQ-- GYFIFGR HV C  P.59.2
  SWS1_anoCa L N AIILIVTVKYKKLRQPLNYILVNISFA G FLFC-TFSV F TVFMASSQ-- GYFFFGR HV C  P.592
  SWS1_smiCr L N GVVLIATLRYKKLRQPLNYILVNISLA <font color="red">G</font> FIFC-VFSV F TVFVSSSQ-- GYFVFGR HV C  P.59.2
  SWS1_utaSt L N AIILIVTVKYKKLRQPLNYILVNISFA G FLFC-VFSV F TVFLASSQ-- GYFFFGR HI C  P.592
  SWS1_tarRo L N AVVLIATLRYKKLRQPLNYILVNISLA <font color="red">G</font> FIFC-VISV F TVFISSSQ-- GYFIFGR HV C  P.59.2
  SWS1_neoFo L N AIVLFVTIKYKKLQQPLNYILVNISLA G FIFC-FFGV F AVFIASCQ-- GYFIFGK TV C  P.592
  SWS1_taeGu L N AIVLIVTIKYKKLRQPLNYILVNISVS <font color="red">G</font> LMCC-VFCI F TVFIASSQ-- GYFVFGK HM C  P.59.2
  SWS1_galGa L N AVVLWVTVRYKRLRQPLNYILVNISAS G FVSC-VLSV F VVFVASAR-- GYFVFGK RV C  P.592
  SWS1_anoCa L N AIILIVTVKYKKLRQPLNYILVNISFA <font color="red">G</font> FLFC-TFSV F TVFMASSQ-- GYFFFGR HV C  P.59.2
  SWS1_xenLa L N FIVLLVTIKYKKLRQPLNYILVNITVG G FLMC-IFSI F PVFVSSSQ-- GYFFFGR IA C  P.592
  SWS1_utaSt L N AIILIVTVKYKKLRQPLNYILVNISFA <font color="red">G</font> FLFC-VFSV F TVFLASSQ-- GYFFFGR HI C  P.59.2
  SWS1_petMa L N AIVLIVTVKCKKLRQPLTYMLVNISAA G LVFC-LFSI S TVFLFSTQ-- GYFVFGP TV C  P.592
  SWS1_neoFo L N AIVLFVTIKYKKLQQPLNYILVNISLA <font color="red">G</font> FIFC-FFGV F AVFIASCQ-- GYFIFGK TV C  P.59.2
  SWS1_geoAu L N AIVLVVTIKYKKLRQPLNYILVNISAA G LVFC-LFSI S TVFVASMQ-- GYFFLGP TI C  P.592
  SWS1_galGa L N AVVLWVTVRYKRLRQPLNYILVNISAS <font color="red">G</font> FVSC-VLSV F VVFVASAR-- GYFVFGK RV C  P.59.2
  SWS1_danRe M N GIVLFVTMKYKKLRQPLNYILVNISLA G FIFD-TFSV S QVSVCAAR-- GYYSLGY TL C  P.592
  SWS1_xenLa L N FIVLLVTIKYKKLRQPLNYILVNITVG <font color="red">G</font> FLMC-IFSI F PVFVSSSQ-- GYFFFGR IA C  P.59.2
  SWS1_oryLa L N FVVLLATAKYKKLRVPLNYILVNITFA G FIFV-TFSV S QVFLASVR-- GYYFFGQ TL C  P.592</font>
  SWS1_petMa L N AIVLIVTVKCKKLRQPLTYMLVNISAA <font color="red">G</font> LVFC-LFSI S TVFLFSTQ-- GYFVFGP TV C  P.59.2
LWS_homSap  T N GLVLAATMKFKKLRHPLNWILVNLAVA D LAET-VIAS T ISVVNQVY-- GYFVLGH PM C  P.592
SWS1_geoAu  L N AIVLVVTIKYKKLRQPLNYILVNISAA <font color="red">G</font> LVFC-LFSI S TVFVASMQ-- GYFFLGP TI C  P.59.2
  LWS_monDom T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-VIAS T ISVINQIY-- GYFILGH PL C  P.592
  SWS1_danRe M N GIVLFVTMKYKKLRQPLNYILVNISLA <font color="red">G</font> FIFD-TFSV S QVSVCAAR-- GYYSLGY TL C  P.59.2
  LWS_macEug T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LIAS T ISVINQIY-- GYFILGH PM C  P.592
  SWS1_oryLa L N FVVLLATAKYKKLRVPLNYILVNITFA <font color="red">G</font> FIFV-TFSV S QVFLASVR-- GYYFFGQ TL C  P.59.2</font>
  LWS_smiCra T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-IIAS T ISVINQIY-- GYFILGH PM C  P.592
  LWS_homSap T N GLVLAATMKFKKLRHPLNWILVNLAVA D LAET-VI<font color="RED">A</font>S T ISVVNQVY-- GYFVLG<font color="red">H P</font>M C  P.59.2
  LWS_ornAna T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LIAS T ISVINQIF-- GYFILGH PM C  P.592
  LWS_ailMel T N GLVLAATMRFKKLRHPLNWILVNLAVA D LAET-VI<font color="RED">A</font>S T ISVVNQIY-- GYFVLG<font color="red">H P</font>L C  P.59.2
  LWS_galGal T N GLVLVATWKFKKLRHPLNWILVNLAVA D LGET-VIAS T ISVINQIS-- GYFILGH PM C  P.592
  LWS_monDom T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-VI<font color="RED">A</font>S T ISVINQIY-- GYFILG<font color="red">H P</font>L C  P.59.2
LWS_anoCar  T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VIAS T ISVINQIS-- GYFILGH PM C  P.592
  LWS_macEug T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LI<font color="RED">A</font>S T ISVINQIY-- GYFILG<font color="red">H P</font>M C  P.59.2
  LWS_xenTro T N GLVLVATLKFKKLRHPLNWILVNMAIA D LGET-VIAS T ISVCNQIF-- GYFVLGH PM C  P.592
  LWS_smiCra T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-II<font color="RED">A</font>S T ISVINQIY-- GYFILG<font color="red">H P</font>M C  P.59.2
LWS_petMar  S N GLVLVATVKFKKLRHPLNWIIVNLAIA D ILET-IFAS T ISVCNQVY-- GYFILGH PM C  P.592
  LWS_ornAna T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LI<font color="RED">A</font>S T ISVINQIF-- GYFILG<font color="red">H P</font>M C  P.59.2
  LWS_letJap T N GLVLVATMKFKKLRHPLNWILVNLAIA D ILET-IFAS T ISVCNQVF-- GYFILGH PM C  P.592
  LWS_galGal T N GLVLVATWKFKKLRHPLNWILVNLAVA D LGET-VI<font color="RED">A</font>S T ISVINQIS-- GYFILG<font color="red">H P</font>M C  P.59.2
  LWS_geoAus T N GLVLVATLKFKKLRHPLNWILVNLAIA D IGET-IFAS T VSVVNQIF-- GYFILGH PL C  P.592
  LWS_anoCar T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VI<font color="RED">A</font>S T ISVINQIS-- GYFILG<font color="red">H P</font>M C  P.59.2
LWS_neoFor  T N GLVLMATYKFKKLRHPLNWILVNLAIA D LGET-LIAS T ISVTNQIF-- GYFILGH PM C  P.592
  LWS_xenTro T N GLVLVATLKFKKLRHPLNWILVNMAIA D LGET-VI<font color="RED">A</font>S T ISVCNQIF-- GYFVLG<font color="red">H P</font>M C  P.59.2
  LWS_takRub T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VFAS T ISVCNQFF-- GYFILGH PM C  P.592
  LWS_petMar S N GLVLVATVKFKKLRHPLNWIIVNLAIA D ILET-IF<font color="RED">A</font>S T ISVCNQVY-- GYFILG<font color="red">H P</font>M C  P.59.2
  LWS_gasAcu T N GLVLVATAKFKKLQHPLNWILVNLAIA D LGET-VFAS T ISVCNQFF-- GYFILGH PM C  P.592
  LWS_letJap T N GLVLVATMKFKKLRHPLNWILVNLAIA D ILET-IF<font color="RED">A</font>S T ISVCNQVF-- GYFILG<font color="red">H P</font>M C  P.59.2
  LWS1_calMi T N GLVLVATVRFKKLRHPLNWILVNMALA D LGET-VLAS T VSVANQFF-- GYFILGH PL C  P.592
  LWS_geoAus T N GLVLVATLKFKKLRHPLNWILVNLAIA D IGET-IF<font color="RED">A</font>S T VSVVNQIF-- GYFILG<font color="red">H P</font>L C  P.59.2
  LWS2_calMi T N GLVLVATWKFKKLRHPLNWILVNLAIA D LGET-LFAS T ISICNQVF-- GYFILGH PM C  P.592
  LWS_neoFor T N GLVLMATYKFKKLRHPLNWILVNLAIA D LGET-LI<font color="RED">A</font>S T ISVTNQIF-- GYFILG<font color="red">H P</font>M C  P.59.2
  PIN_galGal V N GLVIVVSICYKKLRSPLNYILVNLAVA D LLVT-LCGS S VSLSNNIN-- GFFVFGR RM C  P.592
  LWS_takRub T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VF<font color="RED">A</font>S T ISVCNQFF-- GYFILG<font color="red">H P</font>M C  P.59.2
  PIN_colLiv V N GLVIVVSIRYKKLRSPLNYILVNLAMA D LLVT-LCGS S VSFSNNIN-- GFFVFGK RL C  P.592
  LWS_gasAcu T N GLVLVATAKFKKLQHPLNWILVNLAIA D LGET-VF<font color="RED">A</font>S T ISVCNQFF-- GYFILG<font color="red">H P</font>M C  P.59.2
  PIN_taeGut L N GLVIVVSVRHKRLRSPLNYILLNLAVA N LLVT-LCGS S VSLSNNIS-- GFFVFGE RL C  P.592
  LWS1_calMi T N GLVLVATVRFKKLRHPLNWILVNMALA D LGET-VL<font color="RED">A</font>S T VSVANQFF-- GYFILG<font color="red">H P</font>L C  P.59.2
  PIN_utaSta V N GLVIVVSIQYKKLRSPLNYILVNLAIA D LLVT-SFGS T LSFANNIY-- GFFVLGQ TA C  P.592
  LWS2_calMi T N GLVLVATWKFKKLRHPLNWILVNLAIA D LGET-LF<font color="RED">A</font>S T ISICNQVF-- GYFILG<font color="red">H P</font>M C  P.59.2
  PIN_podSic V N GLVIVVSVQFKKLRSPLNYVLVNLAVA D LLVT-FFGS T ISFVNNAQ-- GFFIFGQ AT C  P.592
  PIN_galGal V N GLVIVVSICYKKLRSPLNYILVNLAVA D LLVT-LCGS S VSLSNNIN-- GFFVFGR RM C  P.59.2
  PIN_pheMad A N GLVIAVSVRFKRLRSPLNYILVNLATA D LLVT-FFGS I ISFVNNAV-- GFFVFGK TA C  P.592
  PIN_colLiv V N GLVIVVSIRYKKLRSPLNYILVNLAMA D LLVT-LCGS S VSFSNNIN-- GFFVFGK RL C  P.59.2
  PIN_xenTro V N GLVIVVTLKYKKLRSPLNYILVNLAIA N LLVT-IFGS S VSFSNNVV-- GYFFMGK TM C  P.592
  PIN_taeGut L N GLVIVVSVRHKRLRSPLNYILLNLAVA N LLVT-LCGS S VSLSNNIS-- GFFVFGE RL C  P.59.2
  PIN_bufJap V N GMVIVVSLKYKKLRSPLNYILVNLAVA D ILVT-MFGS T VSFHNNIF-- GFFTLGK LV C  P.592
  PIN_utaSta V N GLVIVVSIQYKKLRSPLNYILVNLAIA D LLVT-SFGS T LSFANNIY-- GFFVLGQ TA C  P.59.2
  VAOP_galGa E N LAVILVTFKFKQLRQPVNYVIVNLSVA D FLVS-LTGG T ISFLANLK-- GYFYMGH WA C  P.592
  PIN_podSic V N GLVIVVSVQFKKLRSPLNYVLVNLAVA D LLVT-FFGS T ISFVNNAQ-- GFFIFGQ AT C  P.59.2
  VAOP_taeGu E N LAVILVTFKFKQLRQPINYIIVNLSVA D FLVS-LTGG T ISFLTNLK-- GYFFMGY WA C  P.592
  PIN_pheMad A N GLVIAVSVRFKRLRSPLNYILVNLATA D LLVT-FFGS I ISFVNNAV-- GFFVFGK TA C  P.59.2
  VAOP_anoCa E N FTVILVTIKFKQLRQPLNYVIVNLSVA D FLVS-LIGG T ISFSTNLK-- GYFYMGH WA C  P.593
  PIN_xenTro V N GLVIVVTLKYKKLRSPLNYILVNLAIA N LLVT-IFGS S VSFSNNVV-- GYFFMGK TM C  P.59.2
  VAOP_xenTr E N FIVILVTAKFKQLRQPLNYIIVNLSVA D FLVS-VIGG T ISIATNSR-- GYFYLGS WA C  P.592
  PIN_bufJap V N GMVIVVSLKYKKLRSPLNYILVNLAVA D ILVT-MFGS T VSFHNNIF-- GFFTLGK LV C  P.59.2
  VAOP_danRe E N FTVMLVTFRFQQLRQPLNYIIVNLSLA D FLVS-LTGG S ISFLTNYH-- GYFFLGK WA C  P.592
  VAOP_galGa E N LAVILVTFKFKQLRQPVNYVIVNLSVA D FLVS-LTGG T ISFLANLK-- GYFYMGH WA C  P.59.2
  VAOP_rutRu E N FAVMLVTFRFTQLRKPLNYIIVNLSLA D FLVS-LTGG T ISFLTNYH-- GYFFLGK WA C  P.592
  VAOP_taeGu E N LAVILVTFKFKQLRQPINYIIVNLSVA D FLVS-LTGG T ISFLTNLK-- GYFFMGY WA C  P.59.2
  VAOP_takRu E N FLVMFITFKFKQLRQPLNYIIVNLAIA D FLVS-LTGG L ISFLTNAR-- GYFFLGR WA C  P.592
  VAOP_anoCa E N FTVILVTIKFKQLRQPLNYVIVNLSVA D FLVS-LIGG T ISFSTNLK-- GYFYMGH WA C  P.59.3
  VAOP_petMa E N FAVIVVTARFRQLRQPLNYVLVNLAAA D LLVS-AIGG S VSFFTNIK-- GYFFLGV HA C  P.592
  VAOP_xenTr E N FIVILVTAKFKQLRQPLNYIIVNLSVA D FLVS-VIGG T ISIATNSR-- GYFYLGS WA C  P.59.2
  PPIN_anoCa L N TAVIAITIKYRQLRQPINYSLVNLAIA D LGAA-LLGG S LNVETNAV-- GYYNLGR VG C  P.592
  VAOP_danRe E N FTVMLVTFRFQQLRQPLNYIIVNLSLA D FLVS-LTGG S ISFLTNYH-- GYFFLGK WA C  P.59.2
  PPIN_xenTr L N VTVIVVTFKYRQLRHPINYSLVNLAIA D LGVT-VLGG A LTVETNAV-- GYFNLGR VG C  P.592
  VAOP_rutRu E N FAVMLVTFRFTQLRKPLNYIIVNLSLA D FLVS-LTGG T ISFLTNYH-- GYFFLGK WA C  P.59.2
  PPINa_petM L N STVIIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.592
  VAOP_takRu E N FLVMFITFKFKQLRQPLNYIIVNLAIA D FLVS-LTGG L ISFLTNAR-- GYFFLGR WA C  P.59.2
  PPIN_letJa L N STVVIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.592
  VAOP_petMa E N FAVIVVTARFRQLRQPLNYVLVNLAAA D LLVS-AIGG S VSFFTNIK-- GYFFLGV HA C  P.59.2
  PPIN_danRe L N VTVITVTLKYKQLRQPLNFALVNLAVA D LGCA-VFGG L PTVVTNAM-- GYFSLGR VG C  P.592
  PPIN_anoCa L N TAVIAITIKYRQLRQPINYSLVNLAIA D LGAA-LLGG S LNVETNAV-- GYYNLGR VG C  P.59.2
  PPIN_ictPu L N MVVIIVTVRYKQLRQPLNYALVNLAVA D LGCP-VFGG L LTAVTNAM-- GYFSLGR VG C  P.592
  PPIN_xenTr L N VTVIVVTFKYRQLRHPINYSLVNLAIA D LGVT-VLGG A LTVETNAV-- GYFNLGR VG C  P.59.2
  PPIN_oncMy M N VLVIMVTMRHRKLRQPLNYALVNLAVA D LGCA-LFGG L PTMVTNAM-- GYFSMGR LG C  P.592
  PPINa_petM L N STVIIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.59.2
  PPINb_takR L N VLVIVVTMKHRQLRQPLSYALVNLAIC D LGCA-LFGG I PTTITSAM-- GYFSLGR VG C  P.592
  PPIN_letJa L N STVVIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.59.2
  PPINb_tetN L N VLVIVVTLKHRQLRQPLNYALVNLAIC D LGCA-LFGG I PTTVTSAM-- GYFSLGR LG C  P.592
  PPIN_danRe L N VTVITVTLKYKQLRQPLNFALVNLAVA D LGCA-VFGG L PTVVTNAM-- GYFSLGR VG C  P.59.2
  PPINb_gasA L N ALVIVVTARHRQLRQPLSYALVNLAVC D LGCA-ACGG L PTTVTSAM-- GYFSLGR AG C  P.592
  PPIN_ictPu L N MVVIIVTVRYKQLRQPLNYALVNLAVA D LGCP-VFGG L LTAVTNAM-- GYFSLGR VG C  P.59.2
  PPINa_gasA L N ATVIIVTLMHKQLRQPLNYALVNMALA D LGTA-MTGG V LSVVNNAQ-- GYFSLGR SG C  P.592
  PPIN_oncMy M N VLVIMVTMRHRKLRQPLNYALVNLAVA D LGCA-LFGG L PTMVTNAM-- GYFSMGR LG C  P.59.2
  PPINa_takR L N ATVIIVSLMHKQLRQPLNYALVNMAVA D LGTA-MTGG L LSVVNNAQ-- GYFSLGR TG C  P.592
  PPINb_takR L N VLVIVVTMKHRQLRQPLSYALVNLAIC D LGCA-LFGG I PTTITSAM-- GYFSLGR VG C  P.59.2
  PPINa_tetN L N ATVIIVSLMHKQLRQPLNYALVNMAAA D LGTA-VSGG L LSVVNNAQ-- GHFSLGR TG C  P.592
  PPINb_tetN L N VLVIVVTLKHRQLRQPLNYALVNLAIC D LGCA-LFGG I PTTVTSAM-- GYFSLGR LG C  P.59.2
  PPINa_cioI L N ILVIVATLKNKVLRQPLNYIIVNLAVV D LLSG-FVGG F ISIAANGA-- GYFFWGK TM C  P.592
  PPINb_gasA L N ALVIVVTARHRQLRQPLSYALVNLAVC D LGCA-ACGG L PTTVTSAM-- GYFSLGR AG C  P.59.2
  PPINa_cioS L N ILVITATLKNKVLRQPLNYIIVNLAVV D LLSG-LVGG V ISIFANGA-- GYFFWGK FM C  P.592
  PPINa_gasA L N ATVIIVTLMHKQLRQPLNYALVNMALA D LGTA-MTGG V LSVVNNAQ-- GYFSLGR SG C  P.59.2
  PPINb_cioI L N GFVIIATMKNKKLRQPLNYIIINLSIA D FLSG-LVGG F IGMISNSA-- GYFYFGK TV C  P.592
  PPINa_takR L N ATVIIVSLMHKQLRQPLNYALVNMAVA D LGTA-MTGG L LSVVNNAQ-- GYFSLGR TG C  P.59.2
  PPINb_cioS L N LLVIVATYKNKDLRRPINYIIVNLAVA D LTCS-VVGG L LGVLNNGA-- GYYFLGK SV C  P.592
  PPINa_tetN L N ATVIIVSLMHKQLRQPLNYALVNMAAA D LGTA-VSGG L LSVVNNAQ-- GHFSLGR TG C  P.59.2
  PARIE_utaS N N SLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-LCGT T IVIATNYY-- GYFYLGR KF C  P.592
  PPINa_cioI L N ILVIVATLKNKVLRQPLNYIIVNLAVV D LLSG-FVGG F ISIAANGA-- GYFFWGK TM C  P.59.2
  PARIE_anoC N N FLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-ICGT T IVIATNYH-- GYFYLGR RF C  P.592
  PPINa_cioS L N ILVITATLKNKVLRQPLNYIIVNLAVV D LLSG-LVGG V ISIFANGA-- GYFFWGK FM C  P.59.2
  PARIE_xenT N N AIVILVTLKHPQLRNPINIFILNLSFS D LMMA-LCGT T IVVSTNYH-- GYFYLGK QF C  P.592
  PPINb_cioI L N GFVIIATMKNKKLRQPLNYIIINLSIA D FLSG-LVGG F IGMISNSA-- GYFYFGK TV C  P.59.2
  PARIE_takR N N SLAIAVMLKNPSLLQPINIFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.592
  PPINb_cioS L N LLVIVATYKNKDLRRPINYIIVNLAVA D LTCS-VVGG L LGVLNNGA-- GYYFLGK SV C  P.59.2
  PARIE_tetN N N GLAITVMLKNPALLQPINIFILSLAVS D LMIG-LCGS L VVTITNYQ-- GSFFIGH TA C  P.592
  PARIE_utaS N N SLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-LCGT T IVIATNYY-- GYFYLGR KF C  P.59.2
  PARIE_gasA N N VLVITVLVRNPSLLQPMNVFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.592
  PARIE_anoC N N FLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-ICGT T IVIATNYH-- GYFYLGR RF C  P.59.2
  PARIE_danR N N VLVIAVMVKNLHFLNAMTVIIFSLAVS D LLIA-TCGS A IVTVTNYE-- GSFFLGD AF C  P.592
  PARIE_xenT N N AIVILVTLKHPQLRNPINIFILNLSFS D LMMA-LCGT T IVVSTNYH-- GYFYLGK QF C  P.59.2
  ENCEPH_hom N N LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.592
  PARIE_takR N N SLAIAVMLKNPSLLQPINIFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.59.2
  ENCEPH_oto N N LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.592
  PARIE_tetN N N GLAITVMLKNPALLQPINIFILSLAVS D LMIG-LCGS L VVTITNYQ-- GSFFIGH TA C  P.59.2
  ENCEPH_lox N N LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.592
  PARIE_gasA N N VLVITVLVRNPSLLQPMNVFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.59.2
  ENCEPH_pte N N LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLR-- NGWVWDT VG C  P.592
  PARIE_danR N N VLVIAVMVKNLHFLNAMTVIIFSLAVS D LLIA-TCGS A IVTVTNYE-- GSFFLGD AF C  P.59.2
  ENCEPH_mus G N LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLR-- NGWVWDA VG C  P.592
  <font color="blue">ENCEPH_hom  N N LLVLVLYYKFQRLR<font color="red">T</font>P<font color="red">T</font>HLLLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.59.2
  ENCEPH_can C H FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLR-- NGWVWDS VG C  P.592
  ENCEPH_oto N N LLVLVLYYKFPRLR<font color="red">T</font>P<font color="red">T</font>HLFLVNISLS D LLVS-LFGV T FTFVS<font color="red">C</font>LR-- NGWVWDT VG C  P.59.2
  ENCEPH_mon N N LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLR-- SGWVWDS VG C  P.592
  ENCEPH_lox N N LLVLVLYYKFQRLR<font color="red">T</font>P<font color="red">T</font>HLFLVNISLS D LLVS-LFGV T FTFVS<font color="red">C</font>LR-- NGWVWDT VG C  P.59.2
  ENCEPH_ano  N N LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLR-- HRWVWDA AG C  P.592
  ENCEPH_pte N N LLVLVFYYKFQQVR<font color="red">T</font>PFYLFLVNISFS D LLVS-FFGV T FTFVS<font color="red">C</font>LR-- NGWVWDT VG C  P.59.2
  ENCEPH_gal N N LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLR-- SRWVWDA AG C  P.592
  ENCEPH_mus G N LLVLLLYSKFPRLR<font color="red">T</font>P<font color="red">T</font>HLFLVNLSLG D LLVS-LFGV T FTFAS<font color="red">C</font>LR-- NGWVWDA VG C  P.59.2
  ENCEPH_dan N N IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVK-- RRWVFNS AT C  P.592
  ENCEPH_can C H FCPQKGFLEFQRLR<font color="red">T</font>P<font color="red">T</font>HLLLVNLSLS D LLVS-LFGV T FTFVS<font color="red">C</font>LR-- NGWVWDS VG C  P.59.2
  ENCEPH_tak N N FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQ-- GRWTWTQ AT C  P.592
  ENCEPH_mon N N LLVLVLYYKFQRLR<font color="red">T</font>P<font color="red">T</font>HLFLVNISFN D LLVS-LFGV T FTFVS<font color="red">C</font>LR-- SGWVWDS VG C  P.59.2
  ENCEPH_gas N N VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIR-- GGWTWSR AT C  P.592
  ENCEPH_ano N N LLVLVLYAKFKRLR<font color="red">T</font>P<font color="red">T</font>HLFLVNISLS D LLVS-LFGV S FTFGS<font color="red">C</font>LR-- HRWVWDA AG C  P.59.2
  ENCEPH_ory N N LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVK-- GRWMWSQ AT C  P.592
  ENCEPH_gal N N LLVLVLYYKFKRLR<font color="red">T</font>P<font color="red">T</font>NLFLVNISLS D LLVS-VCGV S LTFMS<font color="red">C</font>LR-- SRWVWDA AG C  P.59.2
  ENCEPH_xen N N LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVR-- GSWAFSV EM C  P.592
  ENCEPH_dan N N IIVIILYSRYKRLR<font color="red">T</font>P<font color="red">T</font>NLLIVNISVS D LLVS-LTGV N FTFVS<font color="red">C</font>VK-- RRWVFNS AT C  P.59.2
ENCEPH4_br  N N FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.592
  ENCEPH_tak N N FVVLALYCRFKRLR<font color="red">T</font>P<font color="red">T</font>NLLLVNISLS D LLVS-LFGI N FTFAA<font color="red">C</font>VQ-- GRWTWTQ AT C  P.59.2
ENCEPH4_br  N N FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQ-- HRWLWGR PG C  P.592
  ENCEPH_gas N N VVVIVLYCKFKRLR<font color="red">T</font>P<font color="red">T</font>NLLVVNISLS D LLVS-VIGI N FTFVS<font color="red">C</font>IR-- GGWTWSR AT C  P.59.2
  ENCEPH_cal N N ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQ-- GRWGWDS AA C  P.592
  ENCEPH_ory N N LLVILLYCKFKRLR<font color="red">T</font>P<font color="red">T</font>SLLLVNISLS D LLVS-VVGI N FTLAS<font color="red">C</font>VK-- GRWMWSQ AT C  P.59.2
  ENCEPH_squ N N LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVK-- GRWVWDS AA C  P.592
  ENCEPH_xen N N LLVLILYCKFKRLQ<font color="red">T</font>P<font color="red">T</font>NLLFFNTSLC H FVFS-LLAI T FTFMS<font color="red">C</font>VR-- GSWAFSV EM C  P.59.2
  ENCEPH_pet N N LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVR-- RRWVWGN GG C  P.592
  ENCEPH_cal N N ILVLLLYYKFKRLR<font color="red">T</font>P<font color="red">T</font>NLLLVNISVS D LLVS-VFGL S FTFVS<font color="red">C</font>TQ-- GRWGWDS AA C  P.59.2
TMT5_braFl  S N GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.592
  ENCEPH_squ N N LLMLVLYCKFKRLR<font color="red">T</font>P<font color="red">T</font>NLFLVNISIS D LLLS-VFGV I FTFVS<font color="red">C</font>VK-- GRWVWDS AA C  P.59.2
  TMT5_braBe S N GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.592
  ENCEPH_pet N N LLLVALFVGFKRLQ<font color="red">T</font>P<font color="red">T</font>NLLLVNISLS D LLVS-VFGN T LTLVS<font color="red">C</font>VR-- RRWVWGN GG C  P.59.2
TMT_monDom  S N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQ-- GRWIGGK HG C  P.592
  ENCEPH4_br N N FVVILLIGCHRQLR<font color="red">T</font>PFNLLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2
  TMT_macEug N N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIR-- GRWIAGY HG C  P.592
  ENCEPH4_br N N FVVILLIGCHRQLR<font color="red">T</font>PFNLLLLNVSVA D LLVS-VCGN T LSFASAVQ-- HRWLWGR PG C  P.59.2</font>
  TMT_galGal N N LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIH-- GKWIGGE HG C  P.592
  <font color="#0066CC">TMT5_braFl  S N GAVVLLFLKFRQLR<font color="red">T</font>PFNMLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2
TMT_taeGut  N N LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIR-- GKWIGGD HA C  P.592
  TMT5_braBe S N GAVVVLFLKFPQLR<font color="red">T</font>PFNLLLLNMAVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2</font>
  TMT_anoCar N N LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIY-- GRWIGGE HG C  P.592
  TMT_monDom S N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQ-- GRWIGGK HG C  P.59.2
  TMT_xenTro N N FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIK-- GKWIGGE YG C  P.592
  TMT_macEug N N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIR-- GRWIAGY HG C  P.59.2
  TMT_ornAna N N LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNIS-- GRWIGGD PG C  P.592
  TMT_galGal N N LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIH-- GKWIGGE HG C  P.59.2
TMT_danRer  N N LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVR-- GRWLLGR HG C  P.592
  TMT_taeGut N N LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIR-- GKWIGGD HA C  P.59.2
  TMT_tetNig N N FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLR-- GRWLLGR SG C  P.592
  TMT_anoCar N N LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIY-- GRWIGGE HG C  P.59.2
TMT_takRub  N N FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR IG C  P.592
  TMT_xenTro N N FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIK-- GKWIGGE YG C  P.59.2
  TMT_gasAcu N N LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLR-- GKWLLGR SG C  P.592
  TMT_ornAna N N LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNIS-- GRWIGGD PG C  P.59.2
  TMT_oryLat N N FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR GG C  P.592
  TMT_danRer N N LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVR-- GRWLLGR HG C  P.59.2
<font color="brown">TMTa1_anoC  N N LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT<font color="magenta"> P </font>FSFAASTQ-- GKWLIGP AG C  P.592
TMT_tetNig  N N FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLR-- GRWLLGR SG C  P.59.2
  TMTa1_xenT  N N LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT<font color="magenta"> P </font>FSFAASTQ-- GHWLIGE IG C  P.592
  TMT_takRub N N FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR IG C  P.59.2
  TMTa1_danR N N LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGD TG C  P.592
  TMT_gasAcu N N LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLR-- GKWLLGR SG C  P.59.2
  TMTb_danRe N N TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGR AG C  P.592
TMT_oryLat  N N FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR GG C  P.59.2
  TMTa_gasAc N N LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGE GG C  P.592
  <font color="blue">TMTa1_anoC N N LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT<font color="magenta"> P </font>FSFAASTQ-- GKWLIGP AG C  P.59.2
  TMTb_gasAc S N FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGR AG C  P.592
  TMTa1_xenT N N LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT<font color="red"> P </font>FSFAASTQ-- GHWLIGE IG C  P.59.2
  TMTa_oryLa N N LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGE GG C  P.592
  TMTa1_danR N N LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGD TG C  P.59.2
  TMTb_oryLa S N LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGR AG C  P.592
  TMTb_danRe N N TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGR AG C  P.59.2
  TMTa_pimPr N N TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGR AG C  P.592
  TMTa_gasAc N N LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGE GG C  P.59.2
  TMTa_takRu N N LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGE AG C  P.592
  TMTb_gasAc S N FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT<font color="red"> P </font>FSFAASTQ-- GRWLIGR AG C  P.59.2
  TMTb_takRu S N FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGR AG C  P.592
  TMTa_oryLa N N LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGE GG C  P.59.2
  TMTa_tetNi S N LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLIGA AG C  P.592
  TMTb_oryLa S N LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGR AG C  P.59.2
  TMTb_tetNi S N LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT<font color="magenta"> P </font>FSFAASTQ-- GRWLLGR AG C  P.592
  TMTa_pimPr N N TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGR AG C  P.59.2
  TMTa_oncMy S N LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT<font color="magenta"> P </font>FSFAASLY-- GRWLIGA HG C  P.592
  TMTa_takRu N N LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGE AG C  P.59.2
  TMTa1_calM N N LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT<font color="magenta"> P </font>FSFAASVQ-- GRWLVGE QG C  P.592
  TMTb_takRu S N FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGR AG C  P.59.2
  TMTa2_calM N N LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT<font color="magenta"> P </font>FSFAASIY-- RRWLIGH KG C  P.592
  TMTa_tetNi S N LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLIGA AG C  P.59.2
  TMTx_braFl N N STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT<font color="magenta"> P </font>FSFVSSLH-- GRWMFGH SG C  P.592
  TMTb_tetNi S N LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT<font color="red"> P </font>FSFAASTQ-- GRWLLGR AG C  P.59.2
  TMTPIN_str N N GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS<font color="magenta"> P </font>FTFASNFA-- GRWLFGD LG C  P.592
  TMTa_oncMy S N LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT<font color="red"> P </font>FSFAASLY-- GRWLIGA HG C  P.59.2
  TMTy_braFl T N LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS<font color="magenta"> P </font>FAVASAVG-- ERWLFGR AV C  P.592
  TMTa1_calM N N LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT<font color="red"> P </font>FSFAASVQ-- GRWLVGE QG C  P.59.2
  ENCEPH_str G N SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI<font color="magenta"> P </font>FVTASIYA-- HRWLFAH VG C  P.592
  TMTa2_calM N N LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT<font color="red"> P </font>FSFAASIY-- RRWLIGH KG C  P.59.2
  TMT_triCys L N GLVIAVLIKYIRTITNTNIIVLSMSCA N ILIP-LLGS<font color="magenta"> P </font>LSATSSLM-- RKWQFGN GG C  P.592
  TMTx_braFl N N STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT<font color="red"> P </font>FSFVSSLH-- GRWMFGH SG C  P.59.2
  CUBOP_carR L N MIVLITFYRLRHKLAFKDALMASMAFS D VVQA-IVGY<font color="magenta"> P </font>LEVFTVVD-- GKWTFGM EL C  P.592
  TMTPIN_str N N GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS<font color="red"> P </font>FTFASNFA-- GRWLFGD LG C  P.59.2
  TMT_apiMel A N LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN<font color="magenta"> P </font>VAMVSAAT-- GGWYWGY KM C  P.592
  TMTy_braFl T N LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS<font color="red"> P </font>FAVASAVG-- ERWLFGR AV C  P.59.2
  TMT1_anoGa L N IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN<font color="magenta"> P </font>LTLTSAIS-- HRWLYGK SI C  P.592
  ENCEPH_str G N SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI<font color="red"> P </font>FVTASIYA-- HRWLFAH VG C  P.59.2
  TMT2_anoGa L N LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN<font color="magenta"> P </font>LTLTSAIS-- HRWIFGR TL C  P.592
TMT_apiMel  A N LLVAIVIVKDAQL<font color="RED">W</font>TPVNVILFNLVFG D FLVS-IFGN<font color="red"> P </font>VAMVSAAT-- GGWYWGY KM C  P.59.2
  TMT_aedAeg L N LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN<font color="magenta"> P </font>FTLTSAIS-- RHWIFGR TV C  P.592
TMT1_anoGa  L N IFVIALMYKDVQL<font color="RED">W</font>TPMNIILFNLVCS D FSVS-IIGN<font color="red"> P </font>LTLTSAIS-- HRWLYGK SI C  P.59.2
  TMT_culPip L N LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN<font color="magenta"> P </font>FTLSSAIS-- HRWLFGR KL C  P.592
TMT2_anoGa  L N LFVIALMCKDMQL<font color="RED">W</font>TPMNIILFNLVCS D FSVS-IIGN<font color="red"> P </font>LTLTSAIS-- HRWIFGR TL C  P.59.2
  TMT_triCas L N LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN<font color="magenta"> P </font>WTFFSAIN-- YGWIFGE TG C  P.592
TMT_aedAeg  L N LFVIALMCKDVQL<font color="RED">W</font>TPINIILFNLVCS D FSVS-IIGN<font color="red"> P </font>FTLTSAIS-- RHWIFGR TV C  P.59.2
  TMT_bomMor . . ............LWTPLNIILFNLVCS D FSVS-VLGN<font color="magenta"> P </font>FTLISALF-- HRWIFGH TM C  P.592
TMT_culPip  L N LFVIALMCKEVQL<font color="RED">W</font>TPMNIILLNLVCS D FSVS-IVGN<font color="red"> P </font>FTLSSAIS-- HRWLFGR KL C  P.59.2
  TMT_rhoPro G N LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN<font color="magenta"> P </font>FTLASAIA-- KRWFFGQ SM C  P.592
TMT_triCas  L N LTVIIFMLKERQL<font color="RED">W</font>SPLNIILFNLVVS D FLVS-VLGN<font color="red"> P </font>WTFFSAIN-- YGWIFGE TG C  P.59.2
  TMTa_dapPu M N IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT<font color="magenta"> P </font>ISAAAALK-- FTWPFSH EL C  P.592
TMT_bomMor  L N LMVILLMFKDRQL<font color="RED">W</font>TPLNIILFNLVCS D FSVS-VLGN<font color="red"> P </font>FTLISALF-- HRWIFGH TM C  P.59.2
  TMTb_dapPu M N VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT<font color="magenta"> P </font>ISTAAALE-- FGWPFSQ EL C  P.592</font>
TMT_helVir  L N LMVILLMFKDRQL<font color="RED">W</font>TPLNIILFNLVCS D FSVS-VLGN<font color="red"> P </font>FTLISALF-- HRWIFGK TM C  P.59.2
  ENCEPHa_ne T N TIVVIIFISSQRLHTTPNLILFSMSVC D WLMA-TMAK S VGIYGNAR-- YWPTVGK VT C  P.592
TMT_rhoPro  G N LIVIIIMCRDKNL<font color="RED">W</font>TPVNFILFNVIVS D FSVA-ALGN<font color="red"> P </font>FTLASAIA-- KRWFFGQ SM C  P.59.2
ENCEPHb_ne  T N TIVVITFIFSKRLHTTPNLILFSMSVC D WLMA-AMAK S VGIYGNAR-- YWPTVGK VT C  P.592
TMT_acyPis  F N TCVIFIMIRDTRL<font color="RED">W</font>TPQNVIIFNLATS D LAVS-VLGN<font color="red"> P </font>VTLAAAIT-- KGWIFGQ TI C  P.59.2
ENCEPHc_ne  L N GIVLIIFLATRSLRTIPNMILLSMAWA D WLMA-CLAD A VGAYANAN-- NWPSMVG GL C  P.592
TMTa_dapPu  M N IVVVVIILNDSQK<font color="BROWN">M</font>TPLNWMLLNLACS D GAIA-GFGT<font color="red"> P </font>ISAAAALK-- FTWPFSH EL C  P.59.2
TMT1_plaDu  S N GVIMYLYFKDKSLRSPMNLLFVNLAMS D FTVA-FFGA M FQFGLTCTR- KYMSPGM AL C  P.591
TMTb_dapPu  M N VVVVIVILNDSQR<font color="BROWN">M</font>TPLNWMLLNLACS D GAIA-GFGT<font color="red"> P </font>ISTAAALE-- FGWPFSQ EL C  P.59.2</font>
  TMT2_plaDu  L N VLVLVLFIKDRKLRSPNNFLYVSLALG D LLVA-VFGT A FKFIITARK- TLLREED GF C  P.591
TMT1_plaDu  S N GVIMYLYFKDKSLRSPMNLLFVNLAMS D FTVA-FFGA M FQFGLTCTR- KYMSPGM AL C  P.59.1
  TMT2_plaDu  L N VLVLVLFIKDRKLRSPNNFLYVSLALG D LLVA-VFGT A FKFIITARK- TLLREED GF C  P.59.1
   
   
  <font color="#0066CC">RGR1_homSa L N TLTIFSFCKTPELRTPCHLLVLSLALA D SGIS-LNAL V AATSSLL--- RRWPYGS DG C  P.593
  <font color="green">TMT_triCys L N GLVIAVLIKYIRTITNTNIIVLSMSCA N ILIP-LLGS<font color="red"> P </font>LSATSSLM-- RKWQFGN GG C  P.59.2
  RGR1_ornAn L N GLTIASFRKIKELRTPSNLLVVSLALA D SGIC-LNAL M AALSSFL--- RHWPYGA EG C  P.593
  CUBOP_carR L N MIVLITFYRLRHKLAFKDALMASMAFS D VVQA-IVGY<font color="red"> P </font>LEVFTVVD-- GKWTFGM EL C  P.59.2
RGR1_galGa  L N GLTIISFRKIKELRTPSNLLVLSIALA D CGIC-INAF I AAFSSFL--- RYWPYGS EG C  P.593
  MEL1_acrMi L N SVVILTFLLDRSLLFPANLIILSIAIS D WLMS-VVPN I MGGVANAS-- NDLPFTD WS C  P.59.2
RGR1_xenTr  L N GLTLLSFYKIRELRTPSNLFIISLAVA D TGLC-LNAF V AAFSSFL--- RYWPYGS EG C  P.593
  ENC_nemVec T N TIVVIIFISSQRLHTTPNLILFSMSVC D WLMA-TMAK S VGIYGNAR-- YWPTVGK VT C  P.59.2
  RGR1_gasAc L N AVTIAAFLKVRELRTPSNFLVFSLAVA D IGIS-MNAT I AAFSSFL--- RYWPYGS DG C  P.593
  ENC_nemVec T N TIVVITFIFSKRLHTTPNLILFSMSVC D WLMA-AMAK S VGIYGNAR-- YWPTVGK VT C  P.59.2
RGR2_danRe  L N AISVLAFLRVREMQTPNNFFIFNLAVA D LSLN-INGL V AAYACYL--- RHWPFGS EG C  P.593
  ENC_nemVec L N GIVLIIFLATRSLRTIPNMILLSMAWA D WLMA-CLAD A VGAYANAN-- NWPSMVG GL C  P.59.2</font>
  RGR2_pimPr L N LISVLAFLRVREIQTPNNFFIFNLAVA D LSLN-INGL V AAYASYL--- RYWPFGS EG C  P.593
RGR2_tetNi  L N AISIVSFLTVKEMRNPSNFFVFNLALA D ISLN-VNGL I AAYASYL--- RYWPFGQ DG C  P.593
  RGR2_gasAc L N AISIASFLRVKEMWNPSNFFVFNLAVA D ICLN-VNGL T AAYASYL--- RYWPFGQ DG C  P.593
RGR2_oryLa  L N AISILAFLRVKEMRSPSSFLVFNLALA D ISLN-INGL T AAYASYL--- RYWPFGQ EG C  P.593
  RGR1_calMi L N GLTLLAFYKIKELRTPSNLLITSLALS D FGIS-MNAF I AAFSSFL--- RYWPYGS EG C  P.593
RGRa_cioIn  G Y SLLFVIFAKRPDLKK-KNKFLLSLATS D LLIT-VHVF A STIAAFA--- PQWPFGD LG C  P.593
RGRa_cioSa  G Y GLLFVIFAKSPDLKK-KNRFLFSLAVS D LLIT-IHVV A SVVASFQ--- SEWPFGS IG C  P.593</font>
RGRb1_cioI  G Y AVYFGAIWRSKTLQT-RHIWLTSLACG D IIMM-VHLI L ESLSSLGM-- GHRPRQN FE C  P.593
RGRb2_cioI  G Y SVYILAIWSSKKLQT-KHIWLTSLACA D LLMM-VHLF M DGLSSFHQ-- GRRPKGI FE C  P.593
RGRb2_cioS  G Y SIYLRAIWSSRKLQT-RHIWLTSLACA D LIMM-VHLF M DGLSSFHQ-- GRRPKGN FE C  P.593
   
   
  PER1_homSa S N IIVLGIFIKYKELRTPTNAIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY AG C  P.592
  <font color="blue">RGR1_homSa L N TLTIFSFCKTPELRTPCHLLVLSLALA D SGIS-LNAL V AATSSLL<font color="magenta">-</font>-- RRWPYGS DG C  P.59.3
  PER1_ornAn S N VIVLGIFVKFEELRTATNAIIINLAVT D IGVS-GIGY P MSAASDLH-- GSWKFGH AG C  P.592
  RGR1_ornAn L N GLTIASFRKIKELRTPSNLLVVSLALA D SGIC-LNAL M AALSSFL<font color="magenta">-</font>-- RHWPYGA EG C  P.59.3
  PER1_monDo S N VIVLGIFVKYKALRTATNTIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY DG C  P.592
  RGR1_galGa L N GLTIISFRKIKELRTPSNLLVLSIALA D CGIC-INAF I AAFSSFL<font color="magenta">-</font>-- RYWPYGS EG C  P.59.3
  PER1_xenTr S N IIVLGIFVKYKELRTATNAIIINLAFT D IGVS-GIGY P MSAASDLH-- GSWKFGY VG C  P.592
  RGR1_xenTr L N GLTLLSFYKIRELRTPSNLFIISLAVA D TGLC-LNAF V AAFSSFL<font color="magenta">-</font>-- RYWPYGS EG C  P.59.3
  PER1_gasAc S N IVVLLMFWKFKELRTATNFIIINLAFT D IGVA-GIGY P MSAASDIH-- GSWKFGY AG C  P.592
  RGR1_gasAc L N AVTIAAFLKVRELRTPSNFLVFSLAVA D IGIS-MNAT I AAFSSFL<font color="magenta">-</font>-- RYWPYGS DG C  P.59.3
  PER1a_sacK L S SVNFRMLLSNPDYCSKAGNFFLSLAVT D LCVC-IFET P FSAFSHHA-- GFWIFGD TA C  P.592
  RGR2_danRe L N AISVLAFLRVREMQTPNNFFIFNLAVA D LSLN-INGL V AAYACYL<font color="magenta">-</font>-- RHWPFGS EG C  P.59.3
  PER1_lotGi L S LLVALTFIREKGLFKYGRAWLHISLAI A NVGV-VGAF P FSGSSSFS-- GRWLYGS GM C  P.592
  RGR2_pimPr L N LISVLAFLRVREIQTPNNFFIFNLAVA D LSLN-INGL V AAYASYL<font color="magenta">-</font>-- RYWPFGS EG C  P.59.3
  PER1_aplCa L N LLTALTFYKDTKLTKGSQPWLHILLAL A NVGV-VAPS P FPASSSFS-- GRWLYGS TM C  P.592
  RGR2_tetNi L N AISIVSFLTVKEMRNPSNFFVFNLALA D ISLN-VNGL I AAYASYL<font color="magenta">-</font>-- RYWPFGQ DG C  P.59.3
  PER1_todPa L C GMCIIFLARQSPKPRRKYAILIHVLIT A MAV--NGGD P AHASSSIV-- GRWLYGS VG C  P.582
  RGR2_gasAc L N AISIASFLRVKEMWNPSNFFVFNLAVA D ICLN-VNGL T AAYASYL<font color="magenta">-</font>-- RYWPFGQ DG C  P.59.3
  PER1b_sacK G N SVVLEMFRRYKELLSPSAILLISLALA D LGLT-IFGM S LSCVSSFA-- GRWLFGK FG C  P.592
  RGR2_oryLa L N AISILAFLRVKEMRSPSSFLVFNLALA D ISLN-INGL T AAYASYL<font color="magenta">-</font>-- RYWPFGQ EG C  P.59.3
<font color="#0066CC">PER1_braFl  G N IFAIIVFLTEKEFRKKEHNSFALNLAIA D LSVCVFAY P SSTISGYA-- GEWMLGD VG C  P.602
  RGR1_calMi L N GLTLLAFYKIKELRTPSNLLITSLALS D FGIS-MNAF I AAFSSFL<font color="magenta">-</font>-- RYWPYGS EG C  P.59.3
  PER1_braBe G N VITITVFLTEKEFRKKQQNGFVLNLAIA D LSVCVFAY P SSAIAGYA-- GRWVLGD VG C  P.602
  RGRa_cioIn G Y SLLFVIFAKRPDLKK<font color="magenta">-</font>KNKFLLSLATS D LLIT-VHVF A STIAAFA<font color="magenta">-</font>-- PQWPFGD LG C  P.59.3
  PER2_braFl G N ATVVLMFMLKWRQLCRKANLLIINLAAV D LCISVFGY P FSASSGFA-- NQWLFSD AI C  P.602
  RGRa_cioSa G Y GLLFVIFAKSPDLKK<font color="magenta">-</font>KNRFLFSLAVS D LLIT-IHVV A SVVASFQ<font color="magenta">-</font>-- SEWPFGS IG C  P.59.3</font>
PER2_braBe  G N ATVVLMFIMKWRQLCRKANLLVINLAAA N LCITIFGY P FSASSGYA-- HQWLFPD AI C  P.602
  <font color="brown">RGRb1_cioI G Y AVYFGAIWRSKTLQT<font color="magenta">-</font>RHIWLTSLACG D IIMM-VHLI L ESLSSLGM-- GHRPRQN FE C  P.59.2
  PER2a_strP G N ITVICVLCRYRTFRKRSINLLLINMAAS D LGVSVAGY P LTTVSGYW-- GRWLFGD VG C  P.602
  RGRb2_cioI G Y SVYILAIWSSKKLQT<font color="magenta">-</font>KHIWLTSLACA D LLMM-VHLF M DGLSSFHQ-- GRRPKGI FE C  P.59.2
PER2b_strP  G N ITVLCVLCRYGTFRKRSVNILLMNMAVS D LGVSVAGY P LTAISGYR-- GRWVFAD IG C  P.602</font>
  RGRb2_cioS G Y SIYLRAIWSSRKLQT<font color="magenta">-</font>RHIWLTSLACA D LIMM-VHLF M DGLSSFHQ-- GRRPKGN FE C  P.59.2</font>
  PER2_patYe G N LLIIIVFAKRRSVRRPINFFVLNLAVS D LIVA-LLGY P MTAASAFS-- NRWIFDN IG C  P.592
  PER3_braFl E N GITLATFTKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.592
PER3_braBe  E N GITLATFSKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.592
  PER3_hadAd G N GLVLVTFLRFRVLVTPTTLLLVNLAVS D LGLI-LFGF P FSASSSLS-- AKWIFGE GG C  P.592
   
   
  NEUR_strPu G N ISVIVISLRKREKLKPIDLLTINLAIA D FLIC-VVSY P LPMISAFR-- HRWSFGK FG C  P.592
  <font color="blue">PER1_lotGi L S LLVALTFIREKGLFKYGRAWLHISLAI <font color="red">A</font> NVGV-VGAF P FSGSSSFS-- GRWLYGS GM C  P.59.2
  NEUR1_homS G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1_aplCa L N LLTALTFYKDTKLTKGSQPWLHILLAL <font color="red">A</font> NVGV-VAPS P FPASSSFS-- GRWLYGS TM C  P.59.2
  NEUR1_calJ G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1_todPa L C GMCIIFLARQSPKPRRKYAILIHVLIT <font color="red">A</font> MAV--NGGD P AHASSSIV-- GRWLYGS VG C  P.58.2</font>
  NEUR1_dasN G N GYVLYMSSKRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1_homSa S N IIVLGIFIKYKELRTPTNAIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY AG C  P.59.2
NEUR1_canF  G N GYVLYMSSRRKKKLRPAEIMTINLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1_ornAn S N VIVLGIFVKFEELRTATNAIIINLAVT D IGVS-GIGY P MSAASDLH-- GSWKFGH AG C  P.59.2
  NEUR1_bosT G N GYVLYMSSRRKKKLRPAEIMTVNLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1_monDo S N VIVLGIFVKYKALRTATNTIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY DG C  P.59.2
NEUR1_musM  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW FG C  P.592
  PER1_xenTr S N IIVLGIFVKYKELRTATNAIIINLAFT D IGVS-GIGY P MSAASDLH-- GSWKFGY VG C  P.59.2
  NEUR1_loxA G N GYVLYMSCRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FVIISCFC-- HRWVFGW IG C  P.592
  PER1_gasAc S N IVVLLMFWKFKELRTATNFIIINLAFT D IGVA-GIGY P MSAASDIH-- GSWKFGY AG C  P.59.2
NEUR1_ochP  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.592
  PER1a_sacK L S SVNFRMLLSNPDYCSKAGNFFLSLAVT D LCVC-IFET P FSAFSHHA-- GFWIFGD TA C  P.59.2
  NEUR1_monD G N GYVIYMSSKRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FTIISCFS-- HRWVFGW VG C  P.592
  PER1b_sacK G N SVVLEMFRRYKELLSPSAILLISLALA D LGLT-IFGM S LSCVSSFA-- GRWLFGK FG C  P.59.2
  NEUR1_ornA G N GYVIYMSSRRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FTIVSCFC-- HRWVFGW MG C  P.592
  PER3_braFl E N GITLATFTKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.59.2
  NEUR1_galG G N GYVIFMSSKRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FSIISFFS-- HRWIFGW MG C  P.592
  PER3_braBe E N GITLATFSKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.59.2
  NEUR1_xenT G N GYVIYMACSRKKKLRPAEIMTINLAVC D LGIS-VTGK P FAIVSCFS-- HRWVFGW NA C  P.592
  <font color="green">PER3_hadAd</font> G N GLVLVTFLRFRVLVTPTTLLLVNLAVS D LGLI-LFGF P FSASSSLS-- AKWIFGE GG C  P.59.2
  NEUR1_danR G N GYVMYMTFKRKTKLKPPEIMTLNLAIF D FGIS-VSGK P FFIVSSFS-- HRWLFGW QG C  P.592
  <font color="green">PER2_patYe</font> G N LLIIIVFAKRRSVRRPINFFVLNLAVS D LIVA-LLGY P MTAASAFS-- NRWIFDN IG C  P.59.2
  NEUR1_calM G N GYVIYLSITQKRKLKPPEILITNLAIS D FGMS-VGGQ P FLIISCFS-- HRWIFGW VG C  P.592
  <font color="blue">PER1_braFl G N IFAIIVFLTEKEFRKKE<font color="magenta">H</font>NSFALNLAIA D LSVCVFAY P SSTISGYA-- GEWMLGD VG C  P.60.2
NEUR1a_bra  G N GRVLWLSYRCRARLRPVEMFVVSLAVA D VGLS-LVGH P FAAASSLM-- GRWSFGS AG C  P.592
  PER1_braBe G N VITITVFLTEKEFRKKQ<font color="magenta">Q</font>NGFVLNLAIA D LSVCVFAY P SSAIAGYA-- GRWVLGD VG C  P.60.2
  NEUR1b_bra G N GRVLWLSYRNWAKLRPVELFVVSLAVT D VGIS-VFGY P FAASSSLL-- GRWSFGS AG C  P.592
  PER2_braFl G N ATVVLMFMLKWRQLCRK<font color="magenta">A</font>NLLIINLAAV D LCISVFGY P FSASSGFA-- NQWLFSD AI C  P.60.2
NEUR2_galG  G N SILLYISYKKKHLLKPAEYFIINLAIS D LAMT-LTLY P LAVTSSLS-- HRWLYGK HI C  P.592
  PER2_braBe G N ATVVLMFIMKWRQLCRK<font color="magenta">A</font>NLLVINLAAA N LCITIFGY P FSASSGYA-- HQWLFPD AI C  P.60.2
  NEUR2_anoC G N SILLYVSYKKKNLLKPAEYFMINLAIS D LGMT-LTLY P LAVTSSLA-- HRWLFGQ QV C  P.592
  PER2a_strP G N ITVICVLCRYRTFRKRS<font color="magenta">I</font>NLLLINMAAS D LGVSVAGY P LTTVSGYW-- GRWLFGD VG C  P.60.2
NEUR2_xenT  G N SMLLLVAYRKRSILKPAEFFIVNLSIS D LGMT-GTLF P LAIPSLFA-- HRWLFDK VT C  P.592
  PER2b_strP G N ITVLCVLCRYGTFRKRS<font color="magenta">V</font>NILLMNMAVS D LGVSVAGY P LTAISGYR-- GRWVFAD IG C  P.60.2</font>
  NEUR2_danR G N GMLLFVAYRKRSSLKPAEFFVVNLSVS D LGMT-LSLF P LAIPSALA-- HRWLFGE IT C  P.592
NEUR2_calM  G N SVLLFVAYRKRQILKPAEYFVANLAVS D ISMT-VTLL P LAISSNFS-- HRWLFVS KP C  P.592
  NEUR3_galG G N SAVLATAVKRSSLLKSPELLTVNLAVA D IGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.592
NEUR3_taeG  G N SAVLATAVKRSSLLKPPELLTVNLAVA D IGMA-LSMY P LAIASAWS-- HAWLGGD AS C  P.592
  NEUR3_xenT G N CAVLATAVKCSSHLKAPDLLSINLAVA D LGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.592
NEUR3_anoC  G N SMVLAVAVKRSSCLRSPELLTVNLAAT D LGMG-LSMY P LAIASAWN-- HAWLGGE AT C  P.592
  NEUR3b_dan G N LMVLVMAYKRSNHMKPPELLSVNLAVT D LGAA-VTMY P LAVASAWN-- HHWIGGD VS C  P.592
NEUR3a_dan  G N AAVLLTAAWRHSVLKAPELLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWIGGD PS C  P.592
NEUR3a_tet  G N ASVLFSASRRLTPLKAPELLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWMGGD TA C  P.592
  NEUR3_petM G N GAVLGVAARRWAKLKAPELLSVNLALT D LGIA-ASIY P LAVASAWN-- HRWLGGQ PV C  P.592
<font color="brown">NEUR4_ornA  G N SMVIFILHRQRGILNPTDYLTFNLAVS D ASVS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.620
NEUR4_galG  G N SVVIFVLYKQRHLLQPTDYLTFNLAVS D ASIS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.620
NEUR4_taeG  G N SIVIFVLYKQRHVLQPTDYLTFNLAVS D ASIS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.620
NEUR4_anoC  G N SIVIFVLYRQRAGLQPTDYLTFNLAVS D ASVS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.620
  NEUR4_xenT G N SIVIFVLYKQRANLLPTDYLTFNLAVS D ASTS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.620
NEUR4_danR  G N SIVIFVLFRQRSTLQPTDYLTLNLAVS D ASIS-VFGY S RGILEIFNIF KDSGYSV WT C  P.620
NEUR4_tetN  G N TVVLFVLVRQRSSLQPTDLLTFNLAVS D ASIS-VFGY S RGIIQIFNVF QDSGFSI WT C  P.620
NEUR4_gasA  G N SLVMFVLYRQRASLQSTDFLTLNLAIS D ASIS-IFGY S RGILEIFNIF NDDGTWI WT C  P.620
NEUR4_calM  G N SIVIFILYRQRLSLQPPDYLTLNLAVS D ASIS-IFGY S RGIIEIFNVF RDDGFSI WT C  P.620</font>
   
   
  <font color="gray">GPR17_homS  G N TLALWLFIRDHKSGTPANVFLMHLAVA D LSCV--LVL P TRLVYHFSG- NHWPFGE IA C  P.581
NEUR1_homS  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  CYSLTR1_ho  G N GFVLYVLIKTYHKKSAFQVYMINLAVA D LLCV--CTL P LRVVYYVHK- GIWLFGD FL C  P.581
NEUR1_calJ  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  P2RY8_homS  G N LFSLWVLCRRMGPRSPSVIFMINLSVT D LMLA--SVL P FQIYYHCNR- HHWVFGV LL C  P.581
NEUR1_dasN  G N GYVLYMSSKRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  BDKRB2_hom  E N IFVLSVFCLHKSSCTVAEIYLGNLAAA D LILA--CGL P FWAITISNN- FDWLFGE TL C  P.581
NEUR1_canF  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTINLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  SSTR1_homS  G N SMVIYVILRYAKMKTATNIYILNLAIA D ELLM--LSV P FLVTSTLL-- RHWPFGA LL C  P.582
NEUR1_bosT  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTVNLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  OPRL1_homS  G N CLVMYVILRHTKMKTATNIYIFNLALA D TLVL--LTL P FQGTDILL-- GFWPFGN AL C  P.582
NEUR1_musM  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW FG C  P.59.2
  OPRM1_homS  G N FLVMYVIVRYTKMKTATNIYIFNLALA D ALAT--STL P FQSVNYLM-- GTWPFGT IL C  P.582
NEUR1_loxA  G N GYVLYMSCRRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FVIISCFC-- HRWVFGW IG C  P.59.2
  CCR4_homSa  G N SVVVLVLFKYKRLRSMTDVYLLNLAIS D LLFV--FSL P FWGYYAA--- DQWVFGL GL C  P.583
NEUR1_ochP  G N GYVLYMSSRRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
  TACR2_homS  G N AIVIWIILAHRRMRTVTNYFIVNLALA D LCMA-AFNA A FNFVYASH-- NIWYFGR AF C  P.592
NEUR1_monD  G N GYVIYMSSKRKKKLRPA<font color="RED">E</font>IMTVNLAVC D LGIS-VVGK P FTIISCFS-- HRWVFGW VG C  P.59.2
  GALR1_homS  G N SLVITVLARSKKPRSTTNLFILNLSIA D LAYL-LFCI P FQATVYAL-- PTWVLGA FI C  P.592
NEUR1_ornA  G N GYVIYMSSRRKKKLRPA<font color="RED">E</font>IMTVNLAVC D LGIS-VVGK P FTIVSCFC-- HRWVFGW MG C  P.59.2
  QRFPR_homS  G N ALVFYVVTRSKAMRTVTNIFICSLALS D LLIT-FFCI P VTMLQNIS-- DNWLGGA FI C  P.592
NEUR1_galG  G N GYVIFMSSKRKKKLRPA<font color="RED">E</font>IMTVNLAVC D LGIS-VVGK P FSIISFFS-- HRWIFGW MG C  P.59.2
  PPYR1_homS  G N LCLMCVTVRQKEKANVTNLLIANLAFS D FLMC-LLCQ P LTAVYTIM-- DYWIFGE TL C  P.592
NEUR1_xenT  G N GYVIYMACSRKKKLRPA<font color="RED">E</font>IMTINLAVC D LGIS-VTGK P FAIVSCFS-- HRWVFGW NA C  P.59.2
  NPY1R_homS  G N LALIIIILKQKEMRNVTNILIVNLSFS D LLVA-IMCL P FTFVYTLM-- DHWVFGE AM C  P.592
NEUR1_danR  G N GYVMYMTFKRKTKLKPP<font color="RED">E</font>IMTLNLAIF D FGIS-VSGK P FFIVSSFS-- HRWLFGW QG C  P.59.2
  GPR19_homS  G N SLVCLVIHRSRRTQSTTNYFVVSMACA D LLIS-VAST P FVLLQFTT-- GRWTLGS AT C  P.592
NEUR1_calM  G N GYVIYLSITQKRKLKPP<font color="RED">E</font>ILITNLAIS D FGMS-VGGQ P FLIISCFS-- HRWIFGW VG C  P.59.2
  HCRTR1_hom  G N TLVCLAVWRNHHMRTVTNYFIVNLSLA D VLVT-AICL P ASLLVDIT-- ESWLFGH AL C  P.592
NEUR1a_bra  G N GRVLWLSYRCRARLRPV<font color="RED">E</font>MFVVSLAVA D VGLS-LVGH P FAAASSLM-- GRWSFGS AG C  P.59.2
  GPR161_hom  G N LVIVVTLYKKSYLLTLSNKFVFSLTLS N FLLS-VLVL P FVVTSSIR-- REWIFGV VW C  P.592
NEUR1b_bra  G N GRVLWLSYRNWAKLRPV<font color="RED">E</font>LFVVSLAVT D VGIS-VFGY P FAASSSLL-- GRWSFGS AG C  P.59.2
  ADRA1D_hom  G N LLVILSVACNRHLQTVTNYFIVNLAVA D LLLS-ATVL P FSATMEVL-- GFWAFGR AF C  P.592
NEUR_strPu  G N ISVIVISLRKREKLKPI<font color="BROWN">D</font>LLTINLAIA D FLIC-VVSY P LPMISAFR-- HRWSFGK FG C  P.59.2
  ADRB2_homS  G N VLVITAIAKFERLQTVTNYFITSLACA D LVMG-LAVV P FGAAHILM-- KMWTFGN FW C  P.592
NEUR2_galG  G N SILLYISYKKKHLLKPA<font color="RED">E</font>YFIINLAIS D LAMT-LTLY P LAVTSSLS-- HRWLYGK HI C  P.59.2
  ADRB1_melG  G N VLVIAAIGSTQRLQTLTNLFITSLACA D LVVG-LLVV P FGATLVVR-- GTWLWGS FL C  P.592
NEUR2_anoC  G N SILLYVSYKKKNLLKPA<font color="RED">E</font>YFMINLAIS D LGMT-LTLY P LAVTSSLA-- HRWLFGQ QV C  P.59.2
  PRLHR_homS  G N CLLVLVIARVRRLHNVTNFLIGNLALS D VLMC-TACV P LTLAYAFEP- RGWVFGG GL C  P.591
NEUR2_xenT  G N SMLLLVAYRKRSILKPA<font color="RED">E</font>FFIVNLSIS D LGMT-GTLF P LAIPSLFA-- HRWLFDK VT C  P.59.2
  NMUR2_homS  G N VLVCLVILQHQAMKTPTNYYLFSLAVS D LLVL-LLGM P LEVYEMWRN- YPFLFGP VG C  P.591
NEUR2_danR  G N GMLLFVAYRKRSSLKPA<font color="RED">E</font>FFVVNLSVS D LGMT-LSLF P LAIPSALA-- HRWLFGE IT C  P.59.2
  ADORA2A_ho  G N VLVCWAVWLNSNLQNVTNYFVVSLAAA D IAVG-VLAI P FAITIS---- TGFCAAC HG C  P.594
NEUR2_calM  G N SVLLFVAYRKRQILKPA<font color="RED">E</font>YFVANLAVS D ISMT-VTLL P LAISSNFS-- HRWLFVS KP C  P.59.2
  TRHR_homSa  G N IMVVLVVMRTKHMRTPTNCYLVSLAVA D LMVLVAAGL P NITDSIY--- GSWVYGY VG C  P.603
NEUR3_galG  G N SAVLATAVKRSSLLKS<font color="RED">PE</font>LLTVNLAVA D IGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.59.2
  NPY2R_homS  G N SLVIHVVIKFKSMRTVTNFFIANLAVA D LLVN-TLCL P FTLTYTLM-- GEWKMGP VL C  P.592</font>
NEUR3_taeG  G N SAVLATAVKRSSLLKP<font color="RED">PE</font>LLTVNLAVA D IGMA-LSMY P LAIASAWS-- HAWLGGD AS C  P.59.2
NEUR3_xenT  G N CAVLATAVKCSSHLKA<font color="RED">P</font><font color="BROWN">D</font>LLSINLAVA D LGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.59.2
NEUR3_anoC  G N SMVLAVAVKRSSCLRS<font color="RED">PE</font>LLTVNLAAT D LGMG-LSMY P LAIASAWN-- HAWLGGE AT C  P.59.2
NEUR3b_dan  G N LMVLVMAYKRSNHMKP<font color="RED">PE</font>LLSVNLAVT D LGAA-VTMY P LAVASAWN-- HHWIGGD VS C  P.59.2
NEUR3a_dan  G N AAVLLTAAWRHSVLKA<font color="RED">PE</font>LLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWIGGD PS C  P.59.2
NEUR3a_tet  G N ASVLFSASRRLTPLKA<font color="RED">PE</font>LLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWMGGD TA C  P.59.2
NEUR3_petM  G N GAVLGVAARRWAKLKA<font color="RED">PE</font>LLSVNLALT D LGIA-ASIY P LAVASAWN-- HRWLGGQ PV C  P.59.2
<font color="blue">NEUR4_ornA  G N SMVIFILHRQRGILNPT<font color="BROWN">D</font>YLTFNLAVS D ASVS-VFGY <font color="red">S</font> RGIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0
NEUR4_galG  G N SVVIFVLYKQRHLLQPT<font color="BROWN">D</font>YLTFNLAVS D ASIS-VFGY <font color="red">S R</font>GIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0
NEUR4_taeG  G N SIVIFVLYKQRHVLQPT<font color="BROWN">D</font>YLTFNLAVS D ASIS-VFGY <font color="red">S R</font>GIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0
NEUR4_anoC  G N SIVIFVLYRQRAGLQPT<font color="BROWN">D</font>YLTFNLAVS D ASVS-VFGY <font color="red">S R</font>GIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0
NEUR4_xenT  G N SIVIFVLYKQRANLLPT<font color="BROWN">D</font>YLTFNLAVS D ASTS-VFGY <font color="red">S R</font>GIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0
NEUR4_danR  G N SIVIFVLFRQRSTLQPT<font color="BROWN">D</font>YLTLNLAVS D ASIS-VFGY <font color="red">S R</font>GILEIFN<font color="red">IF</font> K<font color="red">D</font>SGYSV WT C  P.62.0
NEUR4_tetN  G N TVVLFVLVRQRSSLQPT<font color="BROWN">D</font>LLTFNLAVS D ASIS-VFGY <font color="red">S R</font>GIIQIFN<font color="red">VF</font> Q<font color="red">D</font>SGFSI WT C  P.62.0
NEUR4_gasA  G N SLVMFVLYRQRASLQST<font color="BROWN">D</font>FLTLNLAIS D ASIS-IFGY <font color="red">S R</font>GILEIFN<font color="red">IF</font> N<font color="red">D</font>DGTWI WT C  P.62.0
NEUR4_calM  G N SIVIFILYRQRLSLQPP<font color="BROWN">D</font>YLTLNLAVS D ASIS-IFGY <font color="red">S R</font>GIIEIFN<font color="red">VF</font> R<font color="red">D</font>DGFSI WT C  P.62.0</font>
  <font color="gray">UROPS1_tri  G N MLVFLTFYKHASLRTTSNLFIINLAIT D LLTG-GIKD T LFIYGLTS-- YNWPKSA IL C  P.58.2
UROPS2_tri  G N GAVLLVLRYHHDDIKSASNYFITNLALT D FLLG-VLCM P CILISCLN-- GQWVFGQ TL C  P.58.2
GPR17_homS  G N TLALWLFIRDHKSGTPANVFLMHLAVA D LSCV--LVL P TRLVYHFSG- NHWPFGE IA C  P.58.1
  CYSLTR1_ho  G N GFVLYVLIKTYHKKSAFQVYMINLAVA D LLCV--CTL P LRVVYYVHK- GIWLFGD FL C  P.58.1
  P2RY8_homS  G N LFSLWVLCRRMGPRSPSVIFMINLSVT D LMLA--SVL P FQIYYHCNR- HHWVFGV LL C  P.58.1
  BDKRB2_hom  E N IFVLSVFCLHKSSCTVAEIYLGNLAAA D LILA--CGL P FWAITISNN- FDWLFGE TL C  P.58.1
  SSTR1_homS  G N SMVIYVILRYAKMKTATNIYILNLAIA D ELLM--LSV P FLVTSTLL-- RHWPFGA LL C  P.58.2
  OPRL1_homS  G N CLVMYVILRHTKMKTATNIYIFNLALA D TLVL--LTL P FQGTDILL-- GFWPFGN AL C  P.58.2
  OPRM1_homS  G N FLVMYVIVRYTKMKTATNIYIFNLALA D ALAT--STL P FQSVNYLM-- GTWPFGT IL C  P.58.2
  CCR4_homSa  G N SVVVLVLFKYKRLRSMTDVYLLNLAIS D LLFV--FSL P FWGYYAA--- DQWVFGL GL C  P.58.3
  TACR2_homS  G N AIVIWIILAHRRMRTVTNYFIVNLALA D LCMA-AFNA A FNFVYASH-- NIWYFGR AF C  P.59.2
  GALR1_homS  G N SLVITVLARSKKPRSTTNLFILNLSIA D LAYL-LFCI P FQATVYAL-- PTWVLGA FI C  P.59.2
  QRFPR_homS  G N ALVFYVVTRSKAMRTVTNIFICSLALS D LLIT-FFCI P VTMLQNIS-- DNWLGGA FI C  P.59.2
  PPYR1_homS  G N LCLMCVTVRQKEKANVTNLLIANLAFS D FLMC-LLCQ P LTAVYTIM-- DYWIFGE TL C  P.59.2
  NPY1R_homS  G N LALIIIILKQKEMRNVTNILIVNLSFS D LLVA-IMCL P FTFVYTLM-- DHWVFGE AM C  P.59.2
  GPR19_homS  G N SLVCLVIHRSRRTQSTTNYFVVSMACA D LLIS-VAST P FVLLQFTT-- GRWTLGS AT C  P.59.2
  HCRTR1_hom  G N TLVCLAVWRNHHMRTVTNYFIVNLSLA D VLVT-AICL P ASLLVDIT-- ESWLFGH AL C  P.59.2
  GPR161_hom  G N LVIVVTLYKKSYLLTLSNKFVFSLTLS N FLLS-VLVL P FVVTSSIR-- REWIFGV VW C  P.59.2
  ADRA1D_hom  G N LLVILSVACNRHLQTVTNYFIVNLAVA D LLLS-ATVL P FSATMEVL-- GFWAFGR AF C  P.59.2
  ADRB2_homS  G N VLVITAIAKFERLQTVTNYFITSLACA D LVMG-LAVV P FGAAHILM-- KMWTFGN FW C  P.59.2
  ADRB1_melG  G N VLVIAAIGSTQRLQTLTNLFITSLACA D LVVG-LLVV P FGATLVVR-- GTWLWGS FL C  P.59.2
  PRLHR_homS  G N CLLVLVIARVRRLHNVTNFLIGNLALS D VLMC-TACV P LTLAYAFEP- RGWVFGG GL C  P.59.1
  NMUR2_homS  G N VLVCLVILQHQAMKTPTNYYLFSLAVS D LLVL-LLGM P LEVYEMWRN- YPFLFGP VG C  P.59.1
  ADORA2A_ho  G N VLVCWAVWLNSNLQNVTNYFVVSLAAA D IAVG-VLAI P FAITIS---- TGFCAAC HG C  P.59.4
  TRHR_homSa  G N IMVVLVVMRTKHMRTPTNCYLVSLAVA D LMVLVAAGL P NITDSIY--- GSWVYGY VG C  P.60.3
  NPY2R_homS  G N SLVIHVVIKFKSMRTVTNFFIANLAVA D LLVN-TLCL P FTLTYTLM-- GEWKMGP VL C  P.59.2</font>
 
=== Indels in the TM4-EC2-TM5 region ===
 
The retinal plug loop region EC2 contains the second half of the extracellular disulfide bond, which is very important to the overall [http://www.pnas.org/content/101/19/7246.long structural stability] of the GPCR molecule. After years of back and forth, it was eventually [http://www.pnas.org/content/101/19/7246.long concluded] that 92% of gene family members, including all opsins though not all GPCR, do contain a disulfide linking extracellular domain EC2 to TM3.
 
[[Image:OpsinStability.jpg|left]]
 
Establishing a disulfide requires more than just comparative genomic conservation of the two cysteines because overall conservation can be quite high. Furthermore, it is quite difficult to establish homological correspondences in EC2 because of gapping issues (see alignment below). However, non-existence or non-conservation of the cysteine does establish absence. All of the opsin outgroup GPCR considered here do contain a disulfide. The disulfide thus is not at all diagnostic for opsin-class GPCR, and its absence reliably implies sequencing error.
 
These regions can only be aligned after careful determination of anchor residues on both sides of the disulfide cysteine. Some are subtle, involving small reduced alphabets of similar residues and imperfect invariance rather than absolute conservation. The anchors consist of a universally conserved W in TM4, a proline prior to TM4 exiting the membrane, the GWS.Y..E region unique to opsins, an aromatic residue 3 residues after the C, and finally the reliable P......CY region well within TM5.
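As a minimal sketch of how two of these anchors might be located before alignment, the snippet below searches a sequence for the GWS.Y..E and P......CY motifs using regular expressions. The function name <code>find_anchors</code> and the toy sequence are illustrative assumptions; the toy string is synthetic, not a real opsin, and the subtler reduced-alphabet anchors are not modelled.

```python
import re

# Anchor motifs quoted from the text; '.' stands for any single residue.
ANCHORS = {
    "GWS.Y..E": re.compile(r"GWS.Y..E"),
    "P......CY": re.compile(r"P.{6}CY"),
}

def find_anchors(seq):
    """Return {motif: 0-based start of the first match, or None if absent}."""
    hits = {}
    for name, pattern in ANCHORS.items():
        match = pattern.search(seq)
        hits[name] = match.start() if match else None
    return hits

# Synthetic toy sequence (hypothetical, alanine filler) built to contain both motifs.
toy = "AAA" + "GWSAYAAE" + "AAAACAAAAA" + "PAAAAAACY" + "AAA"
print(find_anchors(toy))
```

The returned offsets could then be used to cut each sequence into TM4, EC2 and TM5 segments so that gaps are placed only between anchors.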
 
This region has been quite permissive with regard to indels over evolutionary time. To account for observed length variations, six indels are needed within opsins and dozens more for the full spectrum of rhodopsin-class GPCR. Unsurprisingly, almost all of these can be assigned to the EC2 extracellular loop rather than to TM4 or TM5. Somehow the disulfide bond has been able to accommodate these abrupt length variations without loss of steric capacity for formation, suggesting the loop is otherwise weakly constrained.
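Length variation of this kind can be tallied directly from gapped alignment rows. The sketch below counts residues per aligned segment and reports the excess over the shortest one. The four segments are copied from rows of the preceding alignment block purely as sample input (they are not the EC2 loop itself), and the helper names are hypothetical. Whether an excess reflects an insertion or a deletion elsewhere still has to be polarized against outgroups.

```python
def residue_count(aligned_segment, gap="-"):
    """Number of residues in an aligned segment, ignoring gap characters."""
    return sum(1 for ch in aligned_segment if ch != gap)

def extra_residues(segments):
    """Map each sequence name to its residue excess over the shortest segment."""
    lengths = {name: residue_count(seg) for name, seg in segments.items()}
    floor = min(lengths.values())
    return {name: n - floor for name, n in lengths.items()}

# Sample gapped segments taken from the same column block of the alignment above.
rows = {
    "TMT_taeGut": "LSFASNIR--",
    "TMT1_plaDu": "FQFGLTCTR-",
    "RGR1_homSa": "AATSSLL---",
    "NEUR4_ornA": "RGIIEIFNVF",
}
print(extra_residues(rows))
```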
 
However, more than the disulfide and membrane boundaries are important here for this twisted β-hairpin. Almost all opsins are further distinguished by four other conserved residues Y..E.....C..DY (in rhodopsin numbering Y178 E181 C187 D190 Y191) which presumably play the same role in both determined opsin structures. The conservation of the final tyrosine (often tryptophan) persists into GPCR such as tachykinin receptor with highest overall opsin blast scores, though it is not sufficiently universal to occur in the other three structurally determined GPCR.
 
Various studies pertain to the phenomenal conservation of these four non-cysteine loop residues. First, interactions with other proteins can be eliminated as these take place on the cytoplasmic side. Proper folding of newly synthesized opsin is a [http://pubs.acs.org/doi/abs/10.1021/ct900145u valid consideration]. Y191 is [http://www3.interscience.wiley.com/journal/118586931/abstract proximal to the 9-methyl group] of 11-cis retinal in the dark state; Y191A changes hydrolysis of the Schiff base, folded structure after photobleaching and chromophore release. R177 [http://www.jbc.org/content/278/19/16982.full forms an ion pair] with D190 but comparative genomics quickly shows this has narrow applicability.
 
E181 has been [http://www.ncbi.nlm.nih.gov/pubmed/12835420 interpreted as the counterion] in Meta I rhodopsin (displacing E113 in the dark state); the reorganization of EC2 then propagates to TM3 via a push from the disulfide. Yet it's hard to see how this could work universally because in LWS the residue is invariably histidine and in VAOP it is always serine (as an SK covarying anomaly). These residues lack the negative charge to offset the Schiff lysine. D and E are also found sporadically in non-opsin GPCR where a counterion makes no apparent sense -- this could be pursued in the structure of ADRB1.
 
Almost all of the indels resolve to insertions despite their relative rarity compared to deletions proteomewide. These insertions evidently arose from some other mechanism than extension of splice donors or acceptors (indicated below by underlining) because the site of insertion is elsewhere, even allowing for gap uncertainties. If some sequence microstructure predisposed certain regions to insertions, little of that may remain 500 myr later. All the key developments in ciliary imaging opsins took place prior to lamprey divergence.
 
Overall, the pattern of insertions but not deletions suggests a structurally determined floor to admissible loop lengths. After discounting derived forms, all opsin classes other than neuropsins have the same minimal length, including cnidarian and ctenophore opsins. Thus the ancestral opsin likely had this length and pattern:
<font color="blue">W.........P..GW</font><font color="brown">S.Y..E.....C..DW........SY</font><font color="blue">...........P.......Y</font>
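
Because the pattern uses dots as single-residue wildcards, it can be read directly as a regular expression. The following is only a minimal Python sketch and was not part of the original analysis: the function name <code>has_ancestral_ec2_pattern</code> is hypothetical, and the test string is synthetic (wildcards filled with alanine), not a real opsin.

```python
import re

# Ancestral TM4-EC2-TM5 pattern quoted above; each '.' matches any one residue.
ANCESTRAL = "W.........P..GW" "S.Y..E.....C..DW........SY" "...........P.......Y"
ANCESTRAL_RE = re.compile(ANCESTRAL)

def has_ancestral_ec2_pattern(protein: str) -> bool:
    """True if the ungapped sequence contains the pattern at ancestral length."""
    return ANCESTRAL_RE.search(protein.upper()) is not None

# Synthetic string (assumption, not a real opsin): wildcards replaced by alanine.
synthetic = ANCESTRAL.replace(".", "A")
print(has_ancestral_ec2_pattern("MK" + synthetic + "LL"))

# One extra residue inside the loop breaks the fixed spacing between anchors.
longer = synthetic[:20] + "G" + synthetic[20:]
print(has_ancestral_ec2_pattern(longer))
```

A sequence that fails this test is a candidate for a derived insertion or deletion, but the gap still has to be placed by alignment against the anchor residues.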
 
This region provides various synapomorphies for opsin classes. Each class is represented by a single proxy sequence that accurately represents its ortholog class. For example, all UV7 arthropod opsins have the same two residue insertion at the end of TM4, not just the Rhodnius prolixus representative shown. Homoplasy can be seen in some instances (eg the five imaging cilopsins and the four dimly related neuropsins both have two extra residues following Y191) but it remains manageable when it occurs in two well-separated regions of what is a very deep and comprehensive gene tree.
 
[[Image:OpsinEC3indels.png|left]]
<br clear="all">


=== Indels in other opsins ===


Informative indels would be very helpful in the peropsin/neuropsin/rgropsin group of opsins because their sequence relationships to ciliary opsins and melanopsins are too weak to determine root topology. Note that [[Opsin_evolution:_ancestral_introns|intron patterns]], another class of even rarer genetic event and so even better suited to deep time scales, have already illuminated branching relationships to a certain extent.


(to be continued)  
'''See also:''' [[Opsin_evolution|Curated Sequences]] | [[Opsin_evolution:_ancestral_introns|Ancestral Introns]] | [[Opsin_evolution:_Cytoplasmic_face|Cytoplasmic face]] | [[Opsin_evolution:_ancestral_sequences|Ancestral Sequences]] | [[Opsin_evolution:_alignment|Alignment]] | [[Opsin_evolution:_update_blog|Update Blog]]


[[Category:Comparative Genomics]]

Latest revision as of 16:24, 23 March 2010

See also: Curated Sequences | Ancestral Introns | Cytoplasmic face | Ancestral Sequences | Alignment | Update Blog

Introduction to indels

Insertions and deletions of amino acids (together called coding indels) are a class of genetic event rarely fixed in conserved protein sequence regions. It is not immediately clear whether a given indel represents an insertion or a deletion. The process of deciding is called indel resolution; it requires a phylogenetic tree allowing determination of ancestral length. If outgroups are consistently short, then by parsimony the ingroup clade with longer length experienced an insertion. Indels are unresolvable when outgroup data is not available. Two or more consistent outgroup nodes establish a period of length stability.
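
The parsimony rule for indel resolution can be sketched in a few lines of Python. This is an illustration only: the function names and the idea of summarizing each clade by a single region length are assumptions made for the example, not a method taken from the opsin analysis itself.

```python
from typing import Sequence

def resolve_indel(ingroup_len: int, outgroup_lens: Sequence[int]) -> str:
    """Resolve a length difference by simple parsimony.

    ingroup_len   -- region length in the clade under study
    outgroup_lens -- region lengths at successive outgroup nodes
    """
    if not outgroup_lens:
        return "unresolvable"        # no outgroup data available
    if len(set(outgroup_lens)) != 1:
        return "unresolvable"        # outgroups disagree on ancestral length
    ancestral = outgroup_lens[0]
    if ingroup_len > ancestral:
        return "insertion"
    if ingroup_len < ancestral:
        return "deletion"
    return "no change"

def length_stable(outgroup_lens: Sequence[int]) -> bool:
    """Two or more consistent outgroup nodes indicate length stability."""
    return len(outgroup_lens) >= 2 and len(set(outgroup_lens)) == 1

print(resolve_indel(9, [8, 8]))   # ingroup longer than consistent outgroups
print(resolve_indel(7, [8, 8]))   # ingroup shorter than consistent outgroups
print(resolve_indel(9, []))       # no outgroups
```

The sketch deliberately treats disagreeing outgroups as unresolvable rather than guessing, in keeping with the preference for simple parsimony over speculative indel models.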

It is implausible -- rarity cubed -- that multiple outgroups plus an ingroup experienced independent deletions of the same length at the same site (though the exact site can be difficult to evaluate if flanking residues were also affected by the original genetic event or subsequently by accelerated compensatory mutation). Advanced statistical methods can provide only illusory gains over simple parsimony because the underlying required models of indel formation are entirely speculative.

Nonetheless, examples of homoplasy are easy to come by, especially in repetitive nucleotide regions encoding runs of compositionally simple amino acids subject to the mutational mechanism of replication slippage. Homoplasy at longer time scales manifests itself by incoherent distribution over a known phylogenetic tree. Convergent evolution can also be driven by selective advantage for altered length.

Indels occur very unevenly across the length of a given protein homology class. The rate might be high in terminal regions if the amino or carboxy termini are unimportant to the fold or function of matured protein. Within folded regions of soluble proteins, indels are greatly concentrated in loop regions of the 3D structure where a change in length can be accommodated without structural disruption. The distributional occurrence of indels even allows prediction of loop regions.

For integral membrane proteins such as GPCR, deletions are very rarely fixed in the transmembrane helical regions because a shortened length would no longer span the membrane at the same angle, thus pulling in inappropriate non-hydrophobic residues from soluble loops. Insertions too are rare because they push hydrophobic and boundary turn residues out into soluble compartments and distort connecting loops, perhaps altering insertion angles of adjacent transmembrane regions. Such mutations arise frequently enough but are rarely fixed at the population level or hang on as balanced alleles over timescales commensurate with ordinal speciations.

In massively expanded gene families such as GPCR, a coherently fixed indel in one descendent clade of the gene tree suggests adaptive sub- or neo-functionalization: if the indel were merely tolerated as near-neutral change, over geological timescales homoplasy at that site would occur. A remarkable site in transmembrane helix 2 was proposed in May 2009:

'Class A GPCR constitute a large family of transmembrane receptors. Helical distortions play a major role in the overall fold of these receptors. Most are related to conserved proline residues. However, in transmembrane helix 2, the proline pattern is not conserved, and when present, proline may be located at position TM 2.58, 2.59, or 2.60 yielding a bulged structure in P2.59 and P2.60 receptors or a more typical proline kink in P2.58 receptors. The proline pattern of helix 2 can be used as an evolutionary marker of molecular divergence of class A GPCRs.

At this site, two independent indel events occurred. One [unresolvable] indel arose very early in GPCR evolution in a bilaterian ancestor before protostome-deuterostome divergence. This indel led to the split between the P2.58 somatostatin/opioid receptors and peptide receptors with the P2.59 pattern. Subfamilies with proline at position 2.59 or no proline expanded earlier, whereas P2.60 receptors remained marginal throughout evolution. P2.58 receptors underwent later rapid expansion in vertebrates with the development of the chemokine and purinergic receptor subfamilies from somatostatin/opioid-related ancestors. A second indel, resolvable as a deletion, occurred in insect melanopsins.'

This result refines the classification of Class A GPCR, which might be quite indecisive at certain gene tree nodes from sequence alignment alone. Timing of the insect deletion can be done better (below) because the SwissProt collection used by the authors carries only 20% of the melanopsins actually available. Note the structural significance of length and bulge changes can be examined in available 3D determinations. The functional effect of this shift in TM2 remains obscure but must be important.

Class   Gene           PDB            Protein                     PubMed    Best human opsin  Next best          Signaling
T.60.1  RHO1_bosTau    1JFP 3C9M 2J4Y bovine rod rhodopsin        17825322  RHO1_homSap 93%   SWS1_homSap   45%  Gt GNAT1 lowers cGMP
P.60.0  MEL1_todPac    2Z73 2ZIY      squid melanopsin            18480818  MEL1_homSap 43%   PER1_homSap   30%  Gq GNAQ? inositol trisphosphate
P.59.3  ADORA2A_homSap 3EML           adenosine receptor 2A       18832607  MEL1_homSap 27%   ENCEPH_homSap 27%  Gs GNAS raises cAMP
P.59.1  ADRB1_melGal   2VT4           beta 1 adrenergic receptor  18594507  MEL1_homSap 29%   ENCEPH_homSap 25%  Gs GNAS raises cAMP
P.59.1  ADRB2_homSap   2R4R           beta 2 adrenergic receptor  17962520  MEL1_homSap 28%   PER1_homSap   29%  Gs GNAS raises cAMP

Thus indels in opsins -- when they occur in a conserved region -- are potentially very informative as rare genetic events not appreciably subject to homoplasy in defining orthology classes and higher order clusterings of them, hopefully corroborating or even refining trees derived from sequence clustering by alignment. While precious, such data is limited because physiological and structural constraints have prevented most regions of opsins from ever accommodating an indel.

Indels in ciliary opsins

The tertiary structural integrity requirements of a 7-transmembrane opsin, along with tuned binding of retinal, isomerization cycle conformational shifts and binding to secondary protein contributors to the photoreception cycle, conspire to greatly constrain admissible locations for ciliary opsin indels. Indeed this varies greatly by region, with indels never seen in the transmembrane regions themselves (despite tens of billions of branch length years) and restricted in connecting cytoplasmic and extracellular loops to EC2 and IC3 and IC7. Indel incidence is much higher in amino and carboxy terminal tails but not useful because of gapping ambiguity issues.

The distribution of fixed indels is quite peculiar: almost all occur in gene family stems (ie shortly after gene duplication in one branch), hardly any occur mid-history. For vertebrate imaging opsins, this means prior to lamprey divergence. In other words, not only had all the classes of imaging opsins emerged post-tunicate/amphioxus pre-lamprey but (neglecting tails) also all their indels. No further indels arose in the subsequent 500 million years in any of these opsins, as if these opsins were already optimized from the length perspective.

Consequently the rate of indel occurrence per billion years of branch length -- and so the frequency of multiple independent events near a given site -- is highly correlated to region, ie each region has a characteristic time scale over which it can be informative: too long and the risk of homoplasy (convergent evolution) is too high. That risk is exacerbated by uncertainty in gap placement within an alignment, which first requires delimitation by flanking invariant residues. Gap length per se is ambiguous: an indel of 3 residues shared by two extant species might have arisen once as a single event in the first species or as two events (one and two residues successively) in the other. Thus any phylogenetic interpretation of indels must be tempered by knowledge of the regional indel susceptibilities and the assumption these remain fairly constant across lineages and time.

Informative indels show up as readily apparent columns of gaps in large-scale alignments. If present across a single opsin orthology class, an indel merely validates the prior blast clustering and other rare genomic events used to establish those classes in the first place. Sporadic indels, defined here as indels found within a single opsin gene, usually arise from sequencing errors; those that do not might represent an adaptive specialization. It's very rare to see a ciliary opsin indel restricted to a phylogenetic subclade but examples exist: the post-marsupial loss of 5 residues of RHO1 in the distal arrestin binding region.
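
Scanning an alignment for gap runs and separating sporadic from shared indels can be sketched as follows. The code is a hedged illustration: the helper names and the toy alignment are invented for the example, and a real analysis would first fix gap placement using flanking invariant residues.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # (first gap column, number of gap columns)

def gap_runs(row: str) -> List[Run]:
    """Locate each maximal run of '-' in one aligned sequence."""
    runs: List[Run] = []
    start = None
    for col, ch in enumerate(row + "#"):  # sentinel closes a trailing run
        if ch == "-":
            if start is None:
                start = col
        elif start is not None:
            runs.append((start, col - start))
            start = None
    return runs

def indel_carriers(alignment: Dict[str, str]) -> Dict[Run, List[str]]:
    """Map each gap run to the sequences that carry exactly that run."""
    carriers: Dict[Run, List[str]] = {}
    for name, row in alignment.items():
        for run in gap_runs(row):
            carriers.setdefault(run, []).append(name)
    return carriers

def label(names: List[str]) -> str:
    """Sporadic = seen in a single gene; shared = seen in two or more."""
    return "sporadic" if len(names) == 1 else "shared"

# Toy alignment (synthetic, for illustration only).
toy = {
    "geneA_sp1": "MKTAYLLV",
    "geneA_sp2": "MKTAYLLV",
    "geneB_sp1": "MK--YLLV",
    "geneB_sp2": "MK--YLLV",
    "geneC_sp1": "MKTAYL-V",
}
for run, names in sorted(indel_carriers(toy).items()):
    print(run, label(names), names)
```

The scan only detects length differences; whether a shared gap is an insertion in the longer sequences or a deletion in the shorter ones still requires outgroup comparison.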

We're concerned here primarily with non-sporadic indels that span two or more orthology classes and that speak to unresolved dating and topological issues in the gene tree. Significant individual indels are visible on the alignment page. These give rise to a table sortable by position along the opsin sequence, indel length, region (eg 3rd cytoplasmic loop), higher taxonomic clade, and phylogenetic depth. Specific goals are dating indel events, characterizing remote opsins in pre-vertebrate deuterostomes, correctly placing cnidarian opsins, disambiguating opsins from non-opsin GPCR, and establishing ancestral lengths.

For deuterostome ciliary opsins, the story is fairly simple up to encephalopsin. None of the transmembrane helices have indels. That holds also for the first two cytoplasmic loops and first and last extracellular loops. Structural constraints can be too rigid, as illustrated by the well-known hydrogen bond chain of extremely conserved residues that holds the transmembrane helices in a fixed relative position: N55 in TM1 hydrogen bonded to D83 in TM2 and on to the peptide amide of A299 in TM7. Indels that altered the position of these residues within the respective helical wheels would cause the whole arrangement to become unglued. The asparagine and aspartate are deeply invariant not only in opsins but also GPCR.

The second extracellular loop has a two residue insert in all rod and cone opsins in a region so far not attributed functional significance; this may have been a near-neutral event in the ancestral stem protein (ie in a gene duplicate of pinopsin). The cytoplasmic side has all the protein-protein interactions but length of the extracellular loops can still be important in tensioning of transmembrane helices that sets their angles of insertion and relative orientation.

The third cytoplasmic loop has variable length distally. Length is constant within orthology classes with parietopsin having full length, parapinopsin one residue shorter, and all others two residues fewer. This is a region of high beta factor in bovine rhodopsin crystals, ie has too much movement to be assigned a conformation. Unsurprisingly no function has been assigned. While the indel pattern supports the conventional gene tree, evidently this indel hotspot has fixed at least three separate events. While that hasn't resulted in overt homoplasy in terms of length, additional events could be masked. This weakens interpretive certainty of indels in this region.

The amino terminus has 4 informative indels, all deletions. The first unites RHO1 and RHO2 to the exclusion of all other opsins (as does the short highly conserved N-terminus with two glycosylation sites). No indel or intron distinguishes them. RHO2 has an odd phylogenetic distribution -- it seems to occur in one species of lamprey but not in genomic lamprey (despite 19 million traces) nor in cartilaginous nor ray-finned fish, but seemingly arises again in lungfish, coelacanth, lizards, and chicken but not frog nor any mammal. Possibly the lamprey RHO2 is a lineage-specific duplication of lamprey RHO1. A later independent duplication in lobe-finned fish persisted until the mammalian nocturnal loss era. It may be missing in frog because of an incomplete genome.

A restricted SWS1 indel partitions Carnivora

A Caniformia-restricted deletion in SWS1 splits Caniformia (dogs, bears, seals...) off from Feliformia (cats, civets ...) within Carnivora. Whale and dolphin SWS1 are all recent pseudogenes. Despite this, they are full length but all exhibit an unprecedented N --> D substitution at the N D P C iron triangle. This N is deeply conserved in almost all opsins and even GPCR. The substitution may have been adaptive in some way initially but set the stage for later pseudogenization.

SWS1_homSap  LNAMVLVATLRYKKLRQPLNYILVNVSFG G FLLCIFSV F PVFVASCNGYFVFGRHVC human
SWS1_tarSyr  LNAMVLVATLHYRKLRQPLNYILVNVSLG G FLLCIFSV L PVFIASCRGYFVFGRHVC tarsier
SWS1_oryCun  LNAMVLVATLRYKKLRQPLNYILVNISLA G FLACIFSV F NVFVASCYGYFVFGRFVC rabbit
SWS1_ratNor  LNATVLVATLHYKKLRQPLNYILVNVSLG G FLFCIFSV F TVFIASCHGYFLFGRHVC rat
SWS1_ailMel  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI-SV S TVFIASCHGYFIFGRHVC panda
SWS1_canFam  LNGTVLVATLRYKKLRQPLNYILVNVSLG G FLYCI-SV S TVFIASCQGYFVFGRHVC dog
SWS1_enhLut  LNATVLVATLRYKKLRQPLNYILVNVSLG G FIYCI-SV S SVFIASCHGYFIFGHHIC otter
SWS1_phoVit  LNASVLVATLRYKKLRQPLNYILVNVSLG G FLYCI-SV S SVFIASCQGYFIFGRHVC seal
SWS1_ursMar  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI-SV S TVFIASCHGYFIFGRHVC bear
SWS1_felCat  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLYCVSSV S IVFITSCHAYFIFGRHVC cat
SWS1_hipAmp  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F VVFITSCHGYFVFGRHVC hippo
SWS1_ptePum  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLFCIFSV F TVFIASCQGYFVFGRHVC bat
SWS1_talEur  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLFCIFSV L TVFIASCKGYFIFGRHVC mole
SWS1_sorAra  LNATVLVPTLRYRKLRQPLNYILVNVSLG G FLFCIFSV F TVIIASCKGYFVIGRHVC shrew
SWS1_susScr  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F SVFIASCHGYFVFGRRVC pig
SWS1_bosTau  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F IVFITSCYGYFVFGRHVC cow
SWS1_lamPac  LNATVLIATLRYRKLRQPLNYILVNVSLG G FIYCMFSV F CVFVASCYGYFVFGRRVC lama
SWS1_turTru  LDATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F VVFITSCHGYFVFGRHVC dolphin
SWS1_echTel  LNAVVLVATLRYRKLRQPLNYILVNVSLA S VLFCVISV F TVFVASCHGYFIFGRHVC tenrec
SWS1_monDom  LNAVVLVATLRYKKLRQPLNYILVNVSLC G FIFCIFAV F TVFISSSQGYFIFGRHVC
SWS1_smiCri  LNGVVLIATLRYKKLRQPLNYILVNISLA G FIFCVFSV F TVFVSSSQGYFVFGRHVC
SWS1_tarRom  LNAVVLIATLRYKKLRQPLNYILVNISLA G FIFCVISV F TVFISSSQGYFIFGRHVC
SWS1_galGal  LNAVVLWVTVRYKRLRQPLNYILVNISAS G FVSCVLSV F VVFVASARGYFVFGKRVC
SWS1_taeGut  LNAIVLIVTIKYKKLRQPLNYILVNISVS G LMCCVFCI F TVFIASSQGYFVFGKHMC
SWS1_anoCar  LNAIILIVTVKYKKLRQPLNYILVNISFA G FLFCTFSV F TVFMASSQGYFFFGRHVC
SWS1_utaSta  LNAIILIVTVKYKKLRQPLNYILVNISFA G FLFCVFSV F TVFLASSQGYFFFGRHIC
SWS1_xenLae  LNFIVLLVTIKYKKLRQPLNYILVNITVG G FLMCIFSI F PVFVSSSQGYFFFGRIAC
SWS1_neoFor  LNAIVLFVTIKYKKLQQPLNYILVNISLA G FIFCFFGV F AVFIASCQGYFIFGKTVC
SWS1_danRer  MNGIVLFVTMKYKKLRQPLNYILVNISLA G FIFDTFSV S QVSVCAARGYYSLGYTLC
SWS1_oryLat  LNFVVLLATAKYKKLRVPLNYILVNITFA G FIFVTFSV S QVFLASVRGYYFFGQTLC
SWS1_petMar  LNAIVLIVTVKCKKLRQPLTYMLVNISAA G LVFCLFSI S TVFLFSTQGYFVFGPTVC
SWS1_geoAus  LNAIVLVVTIKYKKLRQPLNYILVNISAA G LVFCLFSI S TVFVASMQGYFFLGPTIC
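
Whether the gap in the alignment above tracks a clade can be checked mechanically. The sketch below copies eight rows verbatim from the alignment; the parsing function and the Caniformia membership set are assumptions supplied for illustration (the set follows standard taxonomy for the common names used in the last column).

```python
ROWS = """\
SWS1_homSap  LNAMVLVATLRYKKLRQPLNYILVNVSFG G FLLCIFSV F PVFVASCNGYFVFGRHVC human
SWS1_ailMel  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI-SV S TVFIASCHGYFIFGRHVC panda
SWS1_canFam  LNGTVLVATLRYKKLRQPLNYILVNVSLG G FLYCI-SV S TVFIASCQGYFVFGRHVC dog
SWS1_enhLut  LNATVLVATLRYKKLRQPLNYILVNVSLG G FIYCI-SV S SVFIASCHGYFIFGHHIC otter
SWS1_phoVit  LNASVLVATLRYKKLRQPLNYILVNVSLG G FLYCI-SV S SVFIASCQGYFIFGRHVC seal
SWS1_ursMar  LNATVLVATLRYRKLRQPLNYILVNVSLA G FVYCI-SV S TVFIASCHGYFIFGRHVC bear
SWS1_felCat  LNATVLVATLRYRKLRQPLNYILVNVSLG G FLYCVSSV S IVFITSCHAYFIFGRHVC cat
SWS1_bosTau  LNATVLVATLRYRKLRQPLNYILVNVSLG G FIYCIFSV F IVFITSCYGYFVFGRHVC cow
"""

# Assumed clade membership for the common names above (standard taxonomy).
CANIFORMIA = {"panda", "dog", "otter", "seal", "bear"}

def parse(rows: str) -> dict:
    """Return {common_name: aligned_sequence}, joining the spaced blocks."""
    out = {}
    for line in rows.splitlines():
        fields = line.split()
        out[fields[-1]] = "".join(fields[1:-1])
    return out

aligned = parse(ROWS)
carriers = {name for name, seq in aligned.items() if "-" in seq}
gap_columns = {seq.index("-") for seq in aligned.values() if "-" in seq}

print("gap carriers:", sorted(carriers))
print("carriers equal Caniformia sample:", carriers == CANIFORMIA)
print("single shared gap column:", len(gap_columns) == 1)
```

Agreement between the carrier set and the clade, at one shared column, is what makes this indel informative rather than sporadic; with cat, cow and human as full-length outgroups, parsimony resolves it as a deletion.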

AlaNonarepeat.jpg

Indels in melanopsins: TM2 region

The mid-transmembrane helix region preceding the proline in TM2 -- the only opsin transmembrane helix ever to experience an indel in 100 billion years of branch length evolution -- exhibits various independent insertions and deletions. That would seem to undercut efforts to make the length a definitive fundamental classifying tool among GPCR. The situation can be compounded by separate indels following the proline that, depending on gap placement, might affect the extracellular loop connecting TM2 and TM3.

However, with care, the homoplasy is manageable, making the locus quite informative for opsins (though a detailed analysis is necessary to fully exploit it).

An 'iron triangle' provides a fixed upstream frame of reference critical to reliable gapping of indels in this region. This consists of a very conserved Asn55 in TM1 hydrogen bonded to the almost universal charged residue Asp83 internal to TM2, which is further hydrogen bonded via internal H2O to the N of the terminal NPXXY motif and to the peptide amide of Ala299 in TM7 (bovine rhodopsin numbering). The iron triangle is central to the proper associative bundling and relative orientation of the seven transmembrane helices in the vicinity of the Schiff base K296. No indels occur in any opsin or GPCR between this N and D (meaning cytoplasmic loop CL1 is of fixed length, namely 12 aa). Note from the full alignment that D83 has been replaced by G in all teleost fish RHO2 and all SWS1; it is mixed with N83 in some RHO1, RHO2 and entirely N83 in SWS2 but ancestrally strictly D in basal ciliary opsins.

Downstream, the reference frame is augmented by the first cysteine C110 of the universal GPCR disulfide linking TM3 to EC2. This is preceded by an easily recognized ancient motif WIFG (squid melanopsin; human G106R causes retinitis pigmentosa), which forces all gaps to be placed between the iron triangle D, the proline P and WIFGFAAC (FVFGPTGC in bovine rhodopsin). Thus post-proline gapping is quite constrained by reliable anchors.

Proline, as an imino acid incapable of alpha helix participation, plays a special role in GPCR transmembrane helices, kinking or bulging them. Shifting the position of a proline one residue forward or back relative to the 3.6 residues per turn helical wheel (view down axis) alters both the angle of resumption of the helix and its membrane-exiting residue position, perhaps somewhat torquing the connection to the following transmembrane helix TM3.
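
The geometric effect of moving a proline by one position can be estimated from ideal alpha helix parameters. The short Python sketch below assumes the canonical values of 3.6 residues per turn and 1.5 angstroms of rise per residue; real transmembrane helices with kinks or bulges depart from these, so the numbers are only a first approximation, and the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6   # ideal alpha helix (assumption)
RISE_PER_RESIDUE = 1.5    # angstroms along the helix axis (assumption)

def wheel_angle(shift: int, residues_per_turn: float = RESIDUES_PER_TURN) -> float:
    """Rotation around the helical wheel, in degrees (0-360), for a
    residue moved `shift` positions along the sequence."""
    return (shift * 360.0 / residues_per_turn) % 360.0

def axial_shift(shift: int, rise: float = RISE_PER_RESIDUE) -> float:
    """Displacement along the helix axis, in angstroms."""
    return shift * rise

for shift in (-1, 1, 2):
    print(shift, round(wheel_angle(shift), 1), axial_shift(shift))
```

On these assumptions a one residue shift turns the proline side chain by roughly 100 degrees around the helix axis while moving it only 1.5 angstroms along it, which is consistent with a small displacement having a large effect on the direction in which the helix resumes.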

The effect is not dramatic in terms of angstroms of shift (as can be seen from a recent 3D alignment of helix TM2 that compares bovine and squid opsins), yet it follows from comparative genomics that the consequences for absorption spectrum and/or regulation of signaling must be substantial. In other words, gene clade specific retention of proline or specific substituents observed in the massive alignment below, holding for billions of years of branch length, is only feasible when adaptive.

The 185 ciliary opsins (which includes 5 basal cnidarian opsins) in the reference sequence collection are all of the same length in this region (except for odd Apis and Platynereis sequences), as are 65 peropsins, RGR and neuropsins, many melanopsins, and the vast majority of near-opsin GPCR. Consequently this length, denoted P.59.2 (for proline in position 59 in bovine rhodopsin numbering and 2 residues shorter in the proline-cysteine region than the longest opsins), is ancestral for melanopsins, which themselves vary in length.

Deuterostome melanopsins are all of P.59.2 type, as are LMS and BCR arthropod melanopsins, a subclass of lophotrochozoan melanopsins, and the one known cnidarian melanopsin. The remaining dozen known lophotrochozoan melanopsins are all type P.60.2. This class -- which fortunately includes the structurally determined squid melanopsin -- thus has a one residue insertion whose location appears to be 5 residues after the D and 4 before the P.

Thus lophotrochozoan melanopsins had ancestral length up to a gene duplication which subsequently acquired this stem insertion in a descendent copy. A single other human GPCR, namely thyrotropin-releasing hormone receptor TRHR, is also P.60.2, demonstrating homoplasy. However given the rarity of transmembrane indel events, the history here can be reliably disambiguated assuming parsimony.

The three classes of ecdysozoan ultraviolet melanopsins (represented by 44 genes) all share a one residue deletion in this same region, approximately at the 4th post-D residue, making them P.58.2 class, homoplasic (to within gap placement) with moderately abundant GPCR (eg somatostatin receptor). This event, affecting insects, crustaceans and chelicerates, occurred deep within the stem lineage of ecdysozoa. More data from early diverging arthropods is needed to refine the timing. Recall these opsins have a peculiar lysine K90 (sometimes E90) that tunes their absorption into the ultraviolet. The extra residue loss may be required to correctly position the K90 for its blueshift.

The three molluscan melanopsins of ancestral length share a striking signature aspartate residue two positions preceding the proline, ie at this same K90 position. (Recall G90D and T94I in human RHO1 constitutively activate transducin in absence of chromophore and cause night blindness.) Consequently these three opsins may also have their absorption shifted towards the UV since otherwise G90 is present in lophotrochozoan melanopsins. They should be renamed (ie reclassified) to reflect probable parental character, with P.60.2 lophotrochozoan opsins renamed to MEL2.

The post-proline pre-cysteine region has length variations that represent insertions in various homology classes. They are difficult to gap reliably other than occurring at the distal end of TM2 before the conserved block of extracellular loop EL1. As TM2 (by definition) just reaches the surface, these extra residues can be attributed to lengthened EL1. It emerges that indels outside the D to P region are only moderately informative. They may suffice to define narrow classes of opsins where blast clustering is ambiguous. While pseudo-homoplasic, that is readily resolvable given the sequence cluster isolation:

  • Three amphioxus melanopsins (eg MEL6_braFlo) have a 1 residue distal deletion but MELmop_braFlo does not. This event constitutes an isolated class of sequences.
  • Nine melanopsins from Branchiopoda have a 1 residue distal insertion. Three other melanopsins from this group have a further 1 residue insertion. This group of melanopsins has other odd properties; these could possibly have deeper ancestral roots but data is lacking from earlier branching arthropods.
  • RGR opsins all have a 1 residue distal deletion; however two Ciona opsins have seemingly regained a residue. Five Ciona RGR have a deletion preceding the D. However because the proline anchor is lacking, placement is otherwise uncertain in this isolated opsin class. These same five opsins are unique in having tyrosine in place of the conserved asparagine N in TM1 (that bonds to D).
  • Five peropsins have an inserted residue preceding the D. This appears to define PER2 opsins which are currently restricted to amphioxus and sea urchin. Hemichordates have a peropsin of type PER1 lacking the insert. Lophotrochozoan peropsins also lack it. Thus it appears to be a very restricted early gene expansion that did not persist in vertebrates.
  • NEUR4 neuropsins have a large distal insertion of 4 residues. This class of opsins is quite obscure and lacks the proline.

Departures from the conserved N D P C format are uncommon. RGR is Y/N D VMITAL C and NEUR4 neuropsins are consistently N D S C. Ciliary opsins are the only major group departing from this pattern. Most provocatively, the very earliest TMT opsins from deuterostomes, ecdysozoa and cnidaria have the standard pattern, establishing it as unquestionably ancestral for ciliary opsins.

These opsins should be renamed to reflect this classificatory principle because they provide the ciliary ur-opsin form and quite possibly function. They cannot be successfully modeled in TM2 using bovine rhodopsin structure because it lacks the proline and its induced kink.

Using known fish ciliary ur-opsins as probes and the N D P C (especially P) as extra criterion, it emerges that both frog and lizard have a ciliary ur-opsin in syntenic location. However chicken, platypus, marsupial, and placental mammal do not. Gene order is preserved in chicken but no pseudogene remains at this site. This is a familiar story in opsins ... an old gene fades out mid-amniote but otherwise continues on for 310 million years (Wall hypothesis).

No transcript data or reference gene information is available for frog or lizard ciliary ur-opsin, meaning nothing is known about site of expression. However this opsin has been specifically studied in fish, amphioxus and sea urchin. Testis is one site of expression.

Alignment of TM2 proline region in lophotrochozoan melanopsins with included representative outgroup sequences.
Numbers in parentheses indicate total number of reference sequences represented by the proxy sequence:

MEL1_todPac GNGIVIYLFTKTKSLQTPANMFIINLAFSDFTFSLVNGFPLMTISCFLKKWIFGFAAC P.60.2   (1)
MEL1_sepOff GNGIVIYLFTKTKSLQTPANMFIINLAFSDFTFSLVNGFPLMTISCFIKKWVFGMAAC P.60.2   (1)
MEL1_entDof GNGVVIYLFSKTKSLQTPANMFIINLAMSDLSFSAINGFPLKTISAFMKKWIFGKVAC P.60.2   (1)
MEL1_patYes GNTTVVYIFSNTKSLRSPSNLFVVNLAVSDLIFSAVNGFPLLTVSSFHQKWIFGSLFC P.60.2   (1)
MEL1_lotGit GNFVVIYTFSRTKSLRTASNMFVVNLALSDLTFSAVNGFPLFSLSSFSHKWIFGRVAC P.60.2   (1)
MEL1_plaDum GNLLVVWTFLKTKSLRTAPNMLLVNLAIGDMAFSAINGFPLLTISSINKRWVWGKLWc P.60.2   (1)
MEL1_schMed GNLLVLYIFARAKSLRTPPNMFIMSLAIGDLTFSAVNGFPLLTISSFNTRWAWGKLTC P.60.2   (1)
MEL1_capCap GNLVVITLFIKTRSLRTPPNMFIINLALSDMGFCATNGFPLMTVASFQKLWRWGPVAC P.60.2   (1)
MEL1_schMan GNSLVITLFLLCKQLRTPPNMLIVSLAISDFSFALINGFPLKTIAAFNHRWGWGKLAC P.60.2   (1)
MEL2_schMan LNLLVIVFFTMFKSLRTPSNILVVNLAISDFGFSAVIGFPLKTMAAFNNFWPWGKLAC P.60.2   (1)
MEL3_schMan TNLLVIFVFLTPKSSISLQCALIINLAISDFGFSAVIGFPLKTIAAFNQYWPWGSVAC P.60.2   (1)
MEL1_helRob GNIIVVWVFSRTPSLRTPSNVLVINLAICDILFSALIGFPMSALSCFQRHWIWGNFYC P.60.2   (1)
MEL2_helRob            TPILRTHANVLIINLALCDLIFSSLIGFPMTALSCFKRHWIWGDLGC P.60.2   (1)
MEL1_aplCal GNSLVIITCIRFKDLRTRSNILIINLAVGDLLMC-LIDFPLLAAASFYGEWPYGRQVC P.59.2   (1)
MEL2_lotGig GNSIVIWAHVRIKSLSTTSNMLILNLCVGCLIMC-IVDFPLYATSSFLQKWIFGHKVC P.59.2   (1)
MEL2_aplCal           RHSSLRTSSNLLVVNLTVADLVMS-SLDFPILAISSYKGCWVMGFLGC P.59.2   (1)
LMS1_droMel GNGVVIYIFATTKSLRTPANLLVINLAISDFGIM-ITNTPMMGINLYFETWVLGPMMC P.59.2  (23)
MEL1_homSap GNLTVIYTFCRSRSLRTPANMFIINLAVSDFLMS-FTQAPVFFTSSLYKQWLFGETGC P.59.2  (20)
MEL2_strPur GNSLVIYTFLRFKKLHSPINLLIVNLSASDLLVA-TTGTPLSMVSSFYGRWLFGTNAC P.59.2  (10)
TMTa1_danRe NNLLVLVLFGRYKVLRSPINFLLVNICLSDLLVC-VLGTPFSFAASTQGRWLIGDTGC P.59.2 (185)
PER1_homSap SNIIVLGIFIKYKELRTPTNAIIINLAVTDIGVS-SIGYPMSAASDLYGSWKFGYAGC P.59.2  (33)
NEUR1_homSa GNGYVLYMSSRRKKKLRPAEIMTINLAVCDLGIS-VVGKPFTIISCFCHRWVFGWIGC P.59.2  (30)
UV7_droMel  GNAFVIFMFANRKSLRTPANILVMNLAICDFLM--LIKCPIAIYNNIKEGPALGDIAC P.58.2  (14)
UV5_apiMel  GNGLVIWIFCAAKSLRTPSNMFVVNLAICDFFM--MIKTPIFIYNSFNTGFALGNLGC P.58.2  (20)
UVB_nasVit  GNGCVVWIFSTSKVLRTPSNLFIINLALFDLVM--ALEIPMLIINSFIERMIGWGLGC P.58.2   (8)
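
The proline class in the right-hand column can be recomputed from the aligned rows themselves. The sketch below is illustrative only: it assumes the conserved aspartate sits at reference position 50 of TM2 (so that a proline nine residues later is P.60), reads the column indices 29 and 39 off this particular alignment, and covers only the first number of the class label, not the trailing length code.

```python
D_COL = 29  # 0-based alignment column of the iron triangle aspartate (read off the rows above)
P_COL = 39  # 0-based alignment column of the TM2 proline (read off the rows above)
D_REFERENCE = 50  # assumed reference index of the aspartate within TM2

def proline_class(aligned_row: str) -> str:
    """Return 'P.58', 'P.59' or 'P.60' from one full-length aligned row."""
    if aligned_row[P_COL] != "P":
        raise ValueError("no proline at the expected alignment column")
    # Count real residues (not gap characters) strictly between D and P.
    between = aligned_row[D_COL + 1:P_COL].replace("-", "")
    return f"P.{D_REFERENCE + len(between) + 1}"

# Three rows copied verbatim from the alignment above.
rows = {
    "MEL1_todPac": "GNGIVIYLFTKTKSLQTPANMFIINLAFSDFTFSLVNGFPLMTISCFLKKWIFGFAAC",
    "MEL1_homSap": "GNLTVIYTFCRSRSLRTPANMFIINLAVSDFLMS-FTQAPVFFTSSLYKQWLFGETGC",
    "UV7_droMel":  "GNAFVIFMFANRKSLRTPANILVMNLAICDFLM--LIKCPIAIYNNIKEGPALGDIAC",
}
for name, row in rows.items():
    print(name, proline_class(row))
```

Fixed column indices only work for rows that keep their leading alignment padding; the truncated sequences in the block above would need the padding restored, or anchoring on conserved residues instead.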

Opsinh2o.jpg

Alignment in TM2 region: 420 curated opsins

Colored blocks show useful opsin gene tree synapomorphies -- derived states relative to last common ancestor. The TM2 region is so rich in these that it can resolve many difficult gene classification issues and thus might be called the Rosetta Stone region of opsins -- 10 are highlighted below. Genes altered by an indel are colored magenta, ie the indel was resolvable as a specific insertion or deletion relative to ancestral. Note the consistency of blocks with gene names derived independently via blast clustering without consideration of introns, indels, and other rare genomic events.

Some phyloSNPs at key amino acids are also shown in red, notably the K90 of insect ultraviolet melanopsins and the sole class of ciliary opsins with ancestral proline surviving at position 59, which distinguishes the ciliary ur-opsin class TMTa from later gene duplications that became other TMT homologs and encephalopsins. SWS1 opsins all have an asparagine in place of the key aspartate; RHO2 in teleost fish all have a glycine; NEUR4 all have a serine here as well as a 4 residue insert.

            --TM1-----><---CL1---><---------------TM2------------><--EC1---><-TM3
MEL1_todPa  G N GIVIYLFTKTKSLQTPANMFIINLAFS D FTFSLVNGF P LMTISCFL-- KKWIFGF AA C  P.60.2
MEL1_sepOf  G N GIVIYLFTKTKSLQTPANMFIINLAFS D FTFSLVNGF P LMTISCFI-- KKWVFGM AA C  P.60.2
MEL1_entDo  G N GVVIYLFSKTKSLQTPANMFIINLAMS D LSFSAINGF P LKTISAFM-- KKWIFGK VA C  P.60.2
MEL1_patYe  G N TTVVYIFSNTKSLRSPSNLFVVNLAVS D LIFSAVNGF P LLTVSSFH-- QKWIFGS LF C  P.60.2
MEL1_lotGi  G N FVVIYTFSRTKSLRTASNMFVVNLALS D LTFSAVNGF P LFSLSSFS-- HKWIFGR VA C  P.60.2
MEL1_plaDu  G N LLVVWTFLKTKSLRTAPNMLLVNLAIG D MAFSAINGF P LLTISSIN-- KRWVWGK LW C  P.60.2
MEL1_schMe  G N LLVLYIFARAKSLRTPPNMFIMSLAIG D LTFSAVNGF P LLTISSFN-- TRWAWGK LT C  P.60.2
MEL1_capCa  G N LVVITLFIKTRSLRTPPNMFIINLALS D MGFCATNGF P LMTVASFQ-- KLWRWGP VA C  P.60.2
MEL1_schMa  G N SLVITLFLLCKQLRTPPNMLIVSLAIS D FSFALINGF P LKTIAAFN-- HRWGWGK LA C  P.60.2
MEL2_schMa  L N LLVIVFFTMFKSLRTPSNILVVNLAIS D FGFSAVIGF P LKTMAAFN-- NFWPWGK LA C  P.60.2
MEL3_schMa  T N LLVIFVFLTPKSSISLQCALIINLAIS D FGFSAVIGF P LKTIAAFN-- QYWPWGS VA C  P.60.2
MEL1_helRo  G N IIVVWVFSRTPSLRTPSNVLVINLAIC D ILFSALIGF P MSALSCFQ-- RHWIWGN FY C  P.60.2
MEL2_helRo  . . .........TPILRTHANVLIINLALC D LIFSSLIGF P MTALSCFK-- RHWIWGD LG C  P.60.2
MEL1_aplCa  G N SLVIITCIRFKDLRTRSNILIINLAVG D LLMC-LIDF P LLAAASFY-- GEWPYGR QV C  P.59.2
MEL2_lotGi  G N SIVIWAHVRIKSLSTTSNMLILNLCVG C LIMC-IVDF P LYATSSFL-- QKWIFGH KV C  P.59.2
MEL2_aplCa  . . ........RHSSLRTSSNLLVVNLTVA D LVMS-SLDF P ILAISSYK-- GCWVMGF LG C  P.59.2
MEL1_dapPu  A N STILYVFSRFKRLRTPANVFIINLTIC D FLA--CCLH P LAVYSAFR-- GRWSFGQ TG C  P.58.2
MEL1_homSa  G N LTVIYTFCRSRSLRTPANMFIINLAVS D FLMS-FTQA P VFFTSSLY-- KQWLFGE TG C  P.59.2
MEL1_felCa  G N LMVIYTFCRSRGLRTPANMFIINLAVS D FFMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
MEL1_ailMe  G N LMVIYTFCRTRGLRTPSNMFIINLAVS D FLMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
MEL1_canFa  G N LMVIYTFCRTRGLRTPSNMFIINLAVS D FFMS-FTQA P VFFASSLH-- KRWLFGE AG C  P.59.2
MEL1_myoLu  G N LTVIYTFCRSRGLRTPANMFIINLAVS D FLMC-FTQA P VVFASSIY-- KRWLFGE AG C  P.59.2
MEL1_pteVa  G N LTVIYTFCRSRGLRTPANMFIINLAVS D FLMS-FTQA P VVFISSLY-- KRWLFGQ AG C  P.59.2
MEL1_smiCr  G N LLVIYTFCRSRSLRTPANMFIINLAIS D FFMS-FTQA P VFFASSLY-- ERWIFGE KG C  P.59.2
MEL1_monDo  G N FLVIYTFCRSHSLRTPANMFIINLAIS D FFMS-FTQA P VFFASSMY-- KRWIFGE KA C  P.59.2
MEL1_loxAf  G N LMVIYIFFRSRGLRTPANMFIINLAVS D FLMS-FTQA P VFFASSLY-- KRWLFGE AG C  P.59.2
MEL1_taeGu  G N FLVFYAFCRSRSLQTPANILIINLAIS D FLMS-ITQS P VFFTSSLY-- KHWIFGE KG C  P.59.2
MEL1_galGa  G N FLVIYAFCRSRTLQKPANIFIINLAVS D FLMS-ITQS P VFFTNSLH-- KRWIFGE KG C  P.59.2
MEL1_xenTr  G N FLVIYAFCRSRSLRSPANMFIINLAIT D FLMS-VTQA P VFFATSLH-- KRWIFGE KG C  P.59.2
MEL1_danRe  G N FLVIYAFSRSRTLRTPANLFIINLAIT D FLMC-ATQA P IFFTTSMH-- KRWIFGE KG C  P.59.2
MEL1_takRu  G N FLVIYAFCRSRSLRTPANMFIINLAVT D LLMC-VTQT P IFFTTSMY-- KRWIFGE KG C  P.59.2
MEL1_gasAc  G N VLVIYAFSKSRSLRTPANMFIINLAIT D LLMC-VTQA P IFFTTSMH-- KRWIFGE KG C  P.59.2
MEL1_oryLa  G N FLVIYAFSRSRSLRTPANMFIINLAIT D LLMC-VTQS P IFFTTSMH-- KRWIFGE KG C  P.59.2
MEL1_calMi  G N FLVIYAFLRSRSLRTPANTFIINLAAT D FLMS-VTQS P IFFITSIH-- KRWIFGE KG C  P.59.2
MEL1_petMa  G N VLVIYAFSKSKSLRSPANIFIINLAFA D FFMS-ITQT P IFFVTSLH-- KRWIFGE KG C  P.59.2
MELmop_bra  G N AVVVYSFIKSKGLRTPANFFIINLALS D FLMN-LTNM P IFAVNSAF-- QRWLLSD FA C  P.59.2
MELmop_bra  G N AVVVYSFIKSKGLRTPANFFIINLALS D FLMN-LTNM P IFAVNSAF-- QRWLLSD FA C  P.59.2
MEL1_strPu  . . ........WTKSLRTPPNMLIVNLAIS D FGMV-ITNF P LMFASTIY-- NRWLFGD AG C  P.59.2
MEL2_strPu  G N SLVIYTFLRFKKLHSPINLLIVNLSAS D LLVA-TTGT P LSMVSSFY-- GRWLFGT NA C  P.59.2
MEL2_galGa  G N LLVLYAFYSNKKLRTPQNFFIMNLAVS D FLMS-ASQA P ICFVNSLH-- REWILGD IG C  P.59.2
MEL2_xenLa  G N MLVLYAFYRNKKLRTAPNYFIINLAIS D FLMS-ATQA P VCFLSSLH-- REWILGD IG C  P.59.2
MEL2_anoCa  G N LLVLYAFYSNKRLRTPPNYFIMNLAVS D FLMS-ATQA P ICFLNSMH-- KEWVLGD IG C  P.59.2
MEL2_tetNi  G N VLVIFAFYSNKKLRSLPNYFIVNLAVS D LLMA-STQS P IFFIN-LY-- KEWMFGE TA C  P.59.2
MEL2_danRe  G N ALVMFAFYRNKKLRSLPNYFIMNLAVS D FLMA-ITQS P IFFINCLY-- KEWMFGE LG C  P.59.2
MEL2_gasAc  G N ALVMLAVYSNKKLRNLPNYFIMNLAVS D FLMA-FTQS P IFFINCLY-- KEWAFGE TG C  P.59.2
MEL6_braFl  G N AVALYAFCRSRSLRRPKNYLIANLCLT D MVVC-LVYS P IIVTRSL--- SHGLPSK ES C  P.59.3
MEL6_braBe  G N VVALYAFCRTRSLRRPKNYVVANLCLT D MFVC-LVYC P IVVSRSF--- SHGFPSK ES C  P.59.3
MELx_braFl  G N AVALYAFCSTRKLRRPKNYVVANLCLT D LIMC-IVYC P VIVISSF--- SGRIPTD GA C  P.59.3

LMS1_droMe  G N GVVIYIFATTKSLRTPANLLVINLAIS D FGIM-ITNT P MMGINLYF-- ETWVLGP MM C  P.59.2
LMS2_droMe  G N GVVVYIFGGTKSLRTPANLLVLNLAFS D FCMM-ASQS P VMIINFYY-- ETWVLGP LW C  P.59.2
LMS6_droMe  G N FIVMYIFTSSKGLRTPSNMFVVNLAFS D FMMM-FTMF P PVVLNGFY-- GTWIMGP FL C  P.59.2
LMS_anoGam  G N GMVIYIFSTAKSLRTPSNLFIVNLALS D FLMM-GTNA P TMVYNCWF-- ETWSLGL LM C  P.59.2
LMS_rhoPro  G N GMVIFIFSSTKTLRTPSNLLVVNLAFS D FLMM-FTMS P PMVINCYN-- ETWVLGP LM C  P.59.2
LMS_schGre  G N GMVIYIFSTTKSLRTPSNLLVVNLAFS D FLMM-FTMS A PMGINCYY-- ETWVLGP FM C  P.59.2
LMS_lucCru  G N GMVIYIFSTTKSLRSPSNLLVVNLAFS D FLMM-FTMA P PMVINCYN-- ETWVWGP LF C  P.59.2
LMS_triCas  G N GMVIYIFSSTKALRTPSNLLVVNLAFS D FLMM-LCMS P AMVINCYN-- ETWVLGP LV C  P.59.2
LMS_manSex  G N GMVIYIFMSTKSLKTPSNLLVVNLAFS D FLMM-CAMS P AMVVNCYY-- ETWVWGP FA C  P.59.2
LMS_papXut  G N GMVVYIFTSTKSLKTPSNLLVVNLAFS D FLMM-LCMA P PMLINCYY-- ETWVFGP LA C  P.59.2
LMS_homCoa  G N GMVVYIFSCTKALRTPSNLLVVNLAFS D FLMM-FTMA P PMVLNCYY-- ETWVLGP FM C  P.59.2
LMSa_nasVi  G N GMVVYIFASTKSLRTPSNLLVINLAFS D FCMM-FTMS P PMVINCYY-- ETWVFGP LM C  P.59.2
LMSb_apiMe  G N GMVVYIFLSTKSLRTPSNLFVINLAIS D FLMM-FCMS P PMVINCYY-- ETWVLGP LF C  P.59.2
LMS_acyPis  G N GMVIYIFTCTKNLRTPSNLLIVNLAFS D FCLM-FTMC P AMVWNCFY-- ETWMFGP FA C  P.59.2
LMSb_nasVi  G N GMVVYIFLVTPSLRTPSNLLVINLAFS D FVMM-IIMS P PMVVNCWY-- ETWILGP LM C  P.59.2
LMSa_apiMe  G N GVVVYVFIMTPSLRTPSNLLVVNLAFS D FIMM-GFMC P PMVICCFY-- ETWVLGS LM C  P.59.2
LMS_meoOer  G N FVVIWVFMNTKALRSPANTLVVSLAVS D FIMM-ACMF P PLVLNCYW-- GTWIFGP LF C  P.59.2
LMS_limPol  G N GMVIYLMMTTKSLRTPTNLLVVNLAFS D FCMM-AFMM P TMTSNCFA-- ETWILGP FM C  P.59.2
LMS2_plePa  G N GMVMYLMNTTKSLKTPTNMLIVNLAFS D FCMM-AFMM P TMAANCFA-- ETWILGP FM C  P.59.2
LMS2_hasAd  G N GMVIYLMSTTKSLKTPTNMLIVNLAFS D FCMM-AFMM P TMAANCFA-- ETWILGP LM C  P.59.2
LMS_ixoSca  G N SMVIYIMTTSKSLRSPTNMLVVNLAFS D WCMM-AFMM P TMAANCFA-- ETWILGP FM C  P.59.2
LMS1_plePa  G N SIVIYLMLSVKSLRTPANFLVTSLAVS D GGML-AFMA P TMPINCFA-- QTWVLGP FM C  P.59.2
LMS1_hasAd  G N GVVMYLMMTVKNLRTPGNFLVLNLALS D FGML-FFMM P TMSINCFA-- ETWVIGP FM C  P.59.2
BCRa_hemSa  G N GLVIYLYMKSQALKTPANMLIVNLALS D LIML-TTNF P PFCYNCFGS- GRWMFSG TY C  P.59.1
BCRb_hemSa  G N GLVIYLFNKSAALRTPANILVVNLALS D LIML-TTNV P FFTYNCFGS- GVWMFSP QY C  P.59.1
BCR_porPel  G N GMVIYLFAKCQALRTPANILVVNLALS D LIML-TTNV P FFTYNCFGN- GVWMFSA TY C  P.59.1
BCR_triGra  G N SLVISLFTKTKELRTPANMFVVNLAFS D LCMM-ITQF P MFVYNCFGN- GMWLFGP FL C  P.59.1
BCR2_triLo  G N SLVISLFTKTKELRTPANMFVVNLAFS D LCMM-ITQF P MFVYNCFGN- GMWLFGP FL C  P.59.1
BCR_limPol  G Q SVVLYLFAKTKPLRTPANMLIVNLAFS D FMMM-ITQF P VFIINCLGG- GAWQLGP LL C  P.59.1
BCR2_braKu  G N GLVIWIFLKTKSLRTPSNMLIVNLAIA D FFMM-LTQS P LYIISAFST- RWWIWGH FW C  P.59.1
BCR3_braKu  G N GLVIKIFLKTKSLRTPSNMLIVNLAIA D FFMM-LTQS P LFIISAFSS- RWWIWGH FW C  P.59.1
BCR1_triGr  G N YLVLRIFTKFQELRRPSNVLVINLALS D MLLM-LTLF P ECVYNFLGS- GPWRFGD LG C  P.59.1
BCR2_triGr  G N VLVLHIFGKHKNLRSPTNTLLMNLAFC D LMIF-IGLY P EMLGNIFMND GTWMWGD VA C  P.59.0
BCR1_triLo  G N VLVLHIFGKHKNLRSPTNTLLMNLAFC D LMIF-IGLY P EMLGNIFMND GTWMWGD IA C  P.59.0
BCR3_triGr  G N VLVLYIFGKYKSLRSPTNVLVMNLAFC D LGLF-VGLY P ELLGNIFINN GPWMWGD VA C  P.59.0

UV7a_acyPi  G N SLVIFMYFKCRSLQTPANMLIINLAVS D FIM--LAKA S VFIYNSYY-- LGPALGK LG C  P.58.2
UV7b_acyPi  G N SLVIFMYIKCKSLQTPANVLIMNLAVS D FIM--LAKT P VFIYNSFY-- QGPTLGK LG C  P.58.2
UV7_rhoPro  G N LLVIFMILRFRTLRTSSNILILNLAVS D FLM--VAKM P VFIYNSFY-- FGPVLGE MG C  P.58.2
UV7_anoGam  G N ALVVFMFYRYRSLRTPANYLVINLAVA D FII--MMEA P MFIYNSIH-- QGPALGS IG C  P.58.2
UV7_aedAeg  G N LLVILMFFRFKSLRTPANYLVINLAIA D FII--MLEA P LFVYNSYH-- QGPATGN VW C  P.58.2
UV7_culQui  G N VLVIFMFFKFKSLRTPANYLVINLAVA D FLI--MLEA P IFVYNSYH-- LGPAFGN TL C  P.58.2
UV7_droMel  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM--LIKC P IAIYNNIK-- EGPALGD IA C  P.58.2
UV7_droYak  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM--LIKC P IAIYNNIK-- EGPALGD IA C  P.58.2
UV7_droAna  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM--LVKC P IAIYNNIK-- EGPALGD VA C  P.58.2
UV7_droPse  G N AFVIFMFANRKSLRTPANILVMNLAIC D FLM--LVKC P IAIYNNIK-- EGPALGD AA C  P.58.2
UV7_droWil  G N AFVIFMFSNRKSLRTPANILVMNLAIC D FLM--LVKC P IAIYNNIK-- EGPALGD IA C  P.58.2
UV7_droMoj  G N AFVIFMFGSRKSLRTPANILVMNLAIC D FLM--LVKC P IAIYNNIQ-- EGPALGD AA C  P.58.2
UV7_pedHum  G N FLIIYLFLRKRSLRTPSNVFIFNLAVS D SLL--LLKM P VFIINSFY-- LGPALGN LG C  P.58.2
UV7_ixoSca  . . .........RRRIRSQANLLVFNLALS D LLM--VLEI P LLVYNSLK-- LRPALGV WG C  P.58.2
UV5_plePay  G N AIVMYIFFSAKTLRTPTNMFVIGLAMA D LLM--MSKT P VFIYNCFH-- LGPVFGQ IG C  P.58.2
UV5_hasAda  G N AIVIYIFSVSKSLRTPTNMFVIGLAMA D LLM--MSKT P VFIYNCFH-- LGPVFGQ LG C  P.58.2
UV5_braKug  G N GVVIWVFASAKSLRTPSNLFVINLAVL D FLM--MLKT P VFIVNSFN-- EGPIWGK TG C  P.58.2
UV5_triLon  G N GVVIWIFSSAKSLRTPSNMFVINLAVL D FIM--MMKT P VFIVNSFN-- EGPIWGK FG C  P.58.2
UV5_triGra  G N GVVIWIFSSAKSLRTPSNMFVINLAVL D FIM--MMKT P VFIVNSFN-- EGPIWGK FG C  P.58.2
UV5a_dapPu  G N GVVIWIFTNCKSLRTPSNMLVVNLAIL D MLM--MLKS P VMIINSYN-- EGPIWGK LG C  P.58.2
UV5b_dapPu  G N GIVIYIFSTTKELKTPSNILILNLAIC D FIM--MIKT P IFIVNSFN-- EGPVFGR LG C  P.58.2
UV5_papXut  G N GLVIFIFSASKSLRTPSNLLVVQLAVL D FLM--MLKA P IFIYNSIK-- RGFASGV IG C  P.58.2
UV5_manSex  G N GMVIFIFSTTKSLRTSSNFLVLNLAIL D FIM--MAKA P -FIYNSAM-- RGFAVGT VG C  P.58.2
UV5_apiMel  G N GLVIWIFCAAKSLRTPSNMFVVNLAIC D FFM--MIKT P IFIYNSFN-- TGFALGN LG C  P.58.2
UV5_nasVit  G N GLVIWIFCAAKSLRTPSNMFVVNLAIC D FMM--MLKT P IFIYNSFH-- TGFALGN LG C  P.58.2
UV5_diaNig  G N GLVIWVFSSAKTLRTPSNIFVINLALY D FIM--MLKT P IFIYNSFN-- LGFGLGQ LG C  P.58.2
UV5_lucCru  G N GLVLWIFSTSKSLKTASNMFVVNLAFC D FIM--MMKM P IFVYNSFN-- RGYALGH IG C  P.58.2
UV5_triCas  G N GLVIWIFSTSKSLRTASNMFVVNLAIC D FAM--MIKT P IFIYNSFY-- RGFALGH LG C  P.58.2
UV5_anoGam  G N GLVIWIFIAAKSLRTPSNVFVINLAIC D FFM--MAKT P IFIYNSFT-- KGFTLGN LG C  P.58.2
UV4_droMel  G N GMVIWIFSTSKSLRTPSNMFVLNLAVF D LIM--CLKA P IFIYNSFH-- RGFALGN TW C  P.58.2
UV3_droMel  G N GLVIWVFSAAKSLRTPSNILVINLAFC D FMM--MVKT P IFIYNSFH-- QGYALGH LG C  P.58.2
UV5_rhoPro  G N GLVIWIFSTAKTLRTPSNIFVVNLAIC D FLM--MSKT P IFIYNSFK-- LGYALGH RA C  P.58.2
UV5_pedHum  G N GIVIWIFTTSKNLRTASNVFVVNLAIF D FIM--MAKT P IMIYNSMN-- LGFECGF VW C  P.58.2
UV5_acyPis  G N GLVIWVFCVAKPLRTPSNIFVINLALC D FVM--MAKA P IFILGSIN-- RGY-QGH FL C  P.58.2
UVB_anoGam  G N GIVLWIFGTSKSLRNGSNMFIINLAIF D LLM--MCEM P MFLVNSFS-- ERLVGYG VG C  P.58.2
UVB_diaNig  G N GIVLWIFATTKSLRTPSNMFVVNQALL D LLM--MIEM P MFVLNSLYF- QRPIGWE MG C  P.58.1
UVB_manSex  G N GIVIWIFSTSKSLRSASNMFVINLAVF D LMM--MLEM P LLIMNSFY-- QRLVGYQ LG C  P.58.2
UVB_apiMel  G N CCVIWIFSTSKSLRTPSNMFIVSLAIF D IIM--AFEM P MLVISSFM-- ERMIGWE IG C  P.58.2
UVB_nasVit  G N GCVVWIFSTSKVLRTPSNLFIINLALF D LVM--ALEI P MLIINSFI-- ERMIGWG LG C  P.58.2
UV5B_droMe  G N GLVIWIFSTSKSLRTPSNLLILNLAIF D LFM--CTNM P HYLINATV-- GYIVGGD LG C  P.58.2
UVB_acyPis  G N GLVLWIFCVSKPLRTPSNLFVLNLALC D FSM--VLVL P ILIYDSID-- HKY-PGH LQ C  P.58.2
UVB_megVic  G N GLVLWIFCVSKPLRTPSNLFVLNLALC D FSM--VLVL P ILIYDSID-- HKY-PGH LQ C  P.58.2

RHO1_bosTa  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTLYTSLH-- GYFVFGP TG C  P.59.2
RHO1_homSa  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-LGGF T STLYTSLH-- GYFVFGP TG C  P.59.2
RHO1_monDo  I N FLTLYVTIQHKKLRTPLNYILLNLAIA D LFMV-FGGF T MTLYTSLH-- GYFVFGP TG C  P.59.2
RHO1_ornAn  I N FLTLYVTIQHKKLRTPLNYILLNLAFA N HFMV-LGGF T TTLYTSLH-- GYFVFGP TG C  P.59.2
RHO1_galGa  V N FLTLYVTIQHKKLRTPLNYILLNLVVA D LFMV-FGGF T TTMYTSMN-- GYFVFGV TG C  P.59.2
RHO1_anoCa  I N FLTLFVTIQHKKLRTPLNYILLNLAVA N LFMV-LMGF T TTMYTSMN-- GYFIFGT VG C  P.59.2
RHO1_xenTr  I N FMTLYVTIQHKKLRTPLNYILLNLVFA N HFMV-LCGF T VTMYTSMH-- GYFIFGQ TG C  P.59.2
RHO1_neoFo  I N FLTLYVTVQHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTAMN-- GYFVFGV VG C  P.59.2
RHO1_latCh  I N FLTLFVTIQHKKLRTPLNYILLDLAVA D LCMV-FGGF F VTMYSSMN-- GYFVLGP TG C  P.59.2
RHO1_angAn  V N FLTLYVTIEHKKLRTPLNYILLNLAVA N LFMV-FGGF T TTVYTSMH-- GYFVFGE TG C  P.59.2
RHO1_conMy  I N FLTLYVTIEHKKLRTPLNYILLNLAVA D LFMV-FGGF T TTMYTSMH-- GYFVFGP TG C  P.59.2
RHO1_takRu  V N FLTLFVTVKHKKLRTPLNYVLLNLAVA D LFMV-IGGF T VTLYTALH-- AYFVLGV TG C  P.59.2
RHO1_leuEr  V N FLTLFVTIQHKKLRQPLNYILLNLAVS D LFMV-FGGF T TTIITSMN-- GYFIFGP AG C  P.59.2
RHO1_calMi  V N FLTLYVTFEHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.59.2
RHO1_petMa  V N FLTLFVTVQHKKLRTPLNYILLNLAVA N LFMV-LFGF T LTMYSSMN-- GYFVFGP TM C  P.59.2
RHO1_letJa  V N FLTLFVTVQHKKLRTPLNYILLNLAMA N LFMV-LFGF T VTMYTSMN-- GYFVFGP TM C  P.59.2
RHO1_geoAu  V N FLTLFVTVQHKKLRTPLNYILLNLAVS N LFMI-LFGF T TTMYTSMN-- GYFVFGP TM C  P.59.2
RHO2_calMi  I N GLTLLVTVKHKKLRQPLNFILLNLAVA D LFMV-FGGF F ITVYTSLH-- GYFVFGV TG C  P.59.2
RHO2_galGa  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFVFGP VG C  P.59.2
RHO2_taeGu  I N FLTLLVTFKHKKLRQPLNYILVNLAVA D LCMA-CFGF T VTFYTAWN-- GYFVFGP IG C  P.59.2
RHO2_podSi  I N LLTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.59.2
RHO2_anoCa  I N ILTLLVTFKHKKLRQPLNYILVNLAVA D LFMA-CFGF T VTFYTAWN-- GYFIFGP IG C  P.59.2
RHO2_neoFo  I N LLTLVVTFKHKKLRQPLNYILVNLAVA D LFMV-CFGF T VTFSTAIN-- GYFIFGP RG C  P.59.2
RHO2_latCh  I N FLTLLVTFKHKKLRQPLNYILVNLAVA S LFMV-VFGF T VTFYSSLN-- GYFVLGP MG C  P.59.2
RHO2_gekGe  L N GLTLFVTFQHKKLRQPLNYILVNLAAA N LVTV-CCGF T VTFYASWY-- AYFVFGP IG C  P.59.2
RHO2_pheMa  L N GLTLFVTFQHKKLRQPLNYILVNLAVA N LLMV-ICGF T VTFYTSWY-- GYFVFGP MG C  P.59.2
RHO2_geoAu  V N FMTLFVTFKLKKLRQPLNFILVNLCVA D LLMI-MFGF T TTFYTAMN-- GYFVFGP TG C  P.59.2
RHO2_danRe  I N GLTLLVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYSAIN-- GYFVLGP TG C  P.59.2
RHO2d_danR  I N GLTLLVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.59.2
RHO2c_danR  I N GLTLVVTAQHKKLRQPLNFILVNLAVA G TIMV-CFGF T VTFYTAIN-- GYFVLGP TG C  P.59.2
RHO2a_danR  I N VLTLVVTAQHKKLRQPLNYILVNLAFA G TIMV-IFGF T VSFYCSLV-- GYMALGP LG C  P.59.2
RHO2b_danR  I N VLTLLVTAQHKKLRQPLNYILVNLAFA G TIMA-FFGF T VTFYCSIN-- GYMALGP TG C  P.59.2
RHO2_takRu  I N GLTLLVTAQNKKLRQPLNYILVNLAVA G LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.59.2
RHO2_gasAc  I N GLTLLVTAQNKKLRQPLNYILVNLAVA G LIMC-AFGF T ITITSAVN-- GYFILGA TA C  P.59.2
RHO2_oreNi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA G LIMC-CFGF T ITITSAIN-- GYFVLGT TF C  P.59.2
RHO2_hipHi  I N GLTLFVTAQNKKLRQPLNYILVNLAVA G LIMC-CFGF T ITITSAFN-- GYFILGA TF C  P.59.2
RHO2_mulSu  I N GLTLLVTFQNKKLQQPLNYILVNLAVV G LIMC-AFGF T ITITSALN-- GYFILGP TF C  P.59.2
RHO2_pomMi  I N ALTLLVTFQNKKLRQPLNFILVNLAVA G LIMC-AFGF T ITITSALN-- GYFILGA TF C  P.59.2
RHO2_oryLa  I N ALTLVVTAQNKKLRQPLNFILVNLAVA G LIMV-CFGF T VCIYSCMV-- GYFSLGP LG C  P.59.2
SWS2_ornAn  I N LLTVICTIKYKKLRSHLNYILVNLAVS N MLVV-CVGS A TAFYSFAH-- MYFVLGP TA C  P.59.2
SWS2_anoCa  I N VLTIFCTFKYKKLRSHLNYILVNLSVS N LLVV-CVGS T TAFYSFSN-- MYFSLGP TA C  P.59.2
SWS2_utaSt  I N VLTIFCTFKYKKLRSHLNYILVNLAVS N LLVV-CIGS T TAFYSFAQ-- MYFSLGP TA C  P.59.2
SWS2_taeGu  I N ALTVLCTAKYKKLRSHLNYILVNLAVA N LLVV-CVGS T TAFYSFSQ-- MYFALGP LA C  P.59.2
SWS2_neoFo  I N VLTIICTFKYKKLRSHLNYILVNLAVA N LIVV-GFGS T TAFYSFSQ-- MYFAWGP LA C  P.59.2
SWS2_galGa  I N TLTIFCTARFRKLRSHLNYILVNLALA N LLVI-LVGS T TACYSFSQ-- MYFALGP TA C  P.59.2
SWS2_xenTr  L N LLTIICTVKYKKLRSHLNYILVNLAVA N LIVI-CFGS T TAFYSFSQ-- MYFSLGT LA C  P.59.2
SWS2_geoAu  L N FLTVFVTIKYKKLRSHLNYILVNLAIA N LIVV-CCGS T LAFYSFMH-- KYFILGP LF C  P.59.2
SWS2_takRu  I N VLTIACTIQYKKLRSHLNYILVNLAFS N LLVT-TVGS F TCFCCFFV-- RYMIVGP LG C  P.59.2
SWS2_gasAc  I N ALTVACTVQNKKLRSHLNYILVNLAVS N LLVS-GVGA F TAFLSFAA-- RYFVLGT LA C  P.59.2
SWS1_homSa  L N AMVLVATLRYKKLRQPLNYILVNVSFG G FLLC-IFSV F PVFVASCN-- GYFVFGR HV C  P.59.2
SWS1_ailMel L N ATVLVATLRYRKLRQPLNYILVNVSLA G FVYC-I-SV S TVFIASCH-- GYFIFGR HV C  P.59.2 Caniformia-restricted deletion
SWS1_canFam L N GTVLVATLRYKKLRQPLNYILVNVSLG G FLYC-I-SV S TVFIASCQ-- GYFVFGR HV C  P.59.2 Caniformia-restricted deletion
SWS1_felCat L N ATVLVATLRYRKLRQPLNYILVNVSLG G FLYC-VSSV S IVFITSCH-- AYFIFGR HV C  P.59.2
SWS1_monDo  L N AVVLVATLRYKKLRQPLNYILVNVSLC G FIFC-IFAV F TVFISSSQ-- GYFIFGR HV C  P.59.2
SWS1_smiCr  L N GVVLIATLRYKKLRQPLNYILVNISLA G FIFC-VFSV F TVFVSSSQ-- GYFVFGR HV C  P.59.2
SWS1_tarRo  L N AVVLIATLRYKKLRQPLNYILVNISLA G FIFC-VISV F TVFISSSQ-- GYFIFGR HV C  P.59.2
SWS1_taeGu  L N AIVLIVTIKYKKLRQPLNYILVNISVS G LMCC-VFCI F TVFIASSQ-- GYFVFGK HM C  P.59.2
SWS1_anoCa  L N AIILIVTVKYKKLRQPLNYILVNISFA G FLFC-TFSV F TVFMASSQ-- GYFFFGR HV C  P.59.2
SWS1_utaSt  L N AIILIVTVKYKKLRQPLNYILVNISFA G FLFC-VFSV F TVFLASSQ-- GYFFFGR HI C  P.59.2
SWS1_neoFo  L N AIVLFVTIKYKKLQQPLNYILVNISLA G FIFC-FFGV F AVFIASCQ-- GYFIFGK TV C  P.59.2
SWS1_galGa  L N AVVLWVTVRYKRLRQPLNYILVNISAS G FVSC-VLSV F VVFVASAR-- GYFVFGK RV C  P.59.2
SWS1_xenLa  L N FIVLLVTIKYKKLRQPLNYILVNITVG G FLMC-IFSI F PVFVSSSQ-- GYFFFGR IA C  P.59.2
SWS1_petMa  L N AIVLIVTVKCKKLRQPLTYMLVNISAA G LVFC-LFSI S TVFLFSTQ-- GYFVFGP TV C  P.59.2
SWS1_geoAu  L N AIVLVVTIKYKKLRQPLNYILVNISAA G LVFC-LFSI S TVFVASMQ-- GYFFLGP TI C  P.59.2
SWS1_danRe  M N GIVLFVTMKYKKLRQPLNYILVNISLA G FIFD-TFSV S QVSVCAAR-- GYYSLGY TL C  P.59.2
SWS1_oryLa  L N FVVLLATAKYKKLRVPLNYILVNITFA G FIFV-TFSV S QVFLASVR-- GYYFFGQ TL C  P.59.2
LWS_homSap  T N GLVLAATMKFKKLRHPLNWILVNLAVA D LAET-VIAS T ISVVNQVY-- GYFVLGH PM C  P.59.2
LWS_ailMel  T N GLVLAATMRFKKLRHPLNWILVNLAVA D LAET-VIAS T ISVVNQIY-- GYFVLGH PL C  P.59.2
LWS_monDom  T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-VIAS T ISVINQIY-- GYFILGH PL C  P.59.2
LWS_macEug  T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LIAS T ISVINQIY-- GYFILGH PM C  P.59.2
LWS_smiCra  T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-IIAS T ISVINQIY-- GYFILGH PM C  P.59.2
LWS_ornAna  T N GLVLVATMKFKKLRHPLNWILVNLAVA D LGET-LIAS T ISVINQIF-- GYFILGH PM C  P.59.2
LWS_galGal  T N GLVLVATWKFKKLRHPLNWILVNLAVA D LGET-VIAS T ISVINQIS-- GYFILGH PM C  P.59.2
LWS_anoCar  T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VIAS T ISVINQIS-- GYFILGH PM C  P.59.2
LWS_xenTro  T N GLVLVATLKFKKLRHPLNWILVNMAIA D LGET-VIAS T ISVCNQIF-- GYFVLGH PM C  P.59.2
LWS_petMar  S N GLVLVATVKFKKLRHPLNWIIVNLAIA D ILET-IFAS T ISVCNQVY-- GYFILGH PM C  P.59.2
LWS_letJap  T N GLVLVATMKFKKLRHPLNWILVNLAIA D ILET-IFAS T ISVCNQVF-- GYFILGH PM C  P.59.2
LWS_geoAus  T N GLVLVATLKFKKLRHPLNWILVNLAIA D IGET-IFAS T VSVVNQIF-- GYFILGH PL C  P.59.2
LWS_neoFor  T N GLVLMATYKFKKLRHPLNWILVNLAIA D LGET-LIAS T ISVTNQIF-- GYFILGH PM C  P.59.2
LWS_takRub  T N GLVLVATAKFKKLRHPLNWILVNLAIA D LGET-VFAS T ISVCNQFF-- GYFILGH PM C  P.59.2
LWS_gasAcu  T N GLVLVATAKFKKLQHPLNWILVNLAIA D LGET-VFAS T ISVCNQFF-- GYFILGH PM C  P.59.2
LWS1_calMi  T N GLVLVATVRFKKLRHPLNWILVNMALA D LGET-VLAS T VSVANQFF-- GYFILGH PL C  P.59.2
LWS2_calMi  T N GLVLVATWKFKKLRHPLNWILVNLAIA D LGET-LFAS T ISICNQVF-- GYFILGH PM C  P.59.2
PIN_galGal  V N GLVIVVSICYKKLRSPLNYILVNLAVA D LLVT-LCGS S VSLSNNIN-- GFFVFGR RM C  P.59.2
PIN_colLiv  V N GLVIVVSIRYKKLRSPLNYILVNLAMA D LLVT-LCGS S VSFSNNIN-- GFFVFGK RL C  P.59.2
PIN_taeGut  L N GLVIVVSVRHKRLRSPLNYILLNLAVA N LLVT-LCGS S VSLSNNIS-- GFFVFGE RL C  P.59.2
PIN_utaSta  V N GLVIVVSIQYKKLRSPLNYILVNLAIA D LLVT-SFGS T LSFANNIY-- GFFVLGQ TA C  P.59.2
PIN_podSic  V N GLVIVVSVQFKKLRSPLNYVLVNLAVA D LLVT-FFGS T ISFVNNAQ-- GFFIFGQ AT C  P.59.2
PIN_pheMad  A N GLVIAVSVRFKRLRSPLNYILVNLATA D LLVT-FFGS I ISFVNNAV-- GFFVFGK TA C  P.59.2
PIN_xenTro  V N GLVIVVTLKYKKLRSPLNYILVNLAIA N LLVT-IFGS S VSFSNNVV-- GYFFMGK TM C  P.59.2
PIN_bufJap  V N GMVIVVSLKYKKLRSPLNYILVNLAVA D ILVT-MFGS T VSFHNNIF-- GFFTLGK LV C  P.59.2
VAOP_galGa  E N LAVILVTFKFKQLRQPVNYVIVNLSVA D FLVS-LTGG T ISFLANLK-- GYFYMGH WA C  P.59.2
VAOP_taeGu  E N LAVILVTFKFKQLRQPINYIIVNLSVA D FLVS-LTGG T ISFLTNLK-- GYFFMGY WA C  P.59.2
VAOP_anoCa  E N FTVILVTIKFKQLRQPLNYVIVNLSVA D FLVS-LIGG T ISFSTNLK-- GYFYMGH WA C  P.59.3
VAOP_xenTr  E N FIVILVTAKFKQLRQPLNYIIVNLSVA D FLVS-VIGG T ISIATNSR-- GYFYLGS WA C  P.59.2
VAOP_danRe  E N FTVMLVTFRFQQLRQPLNYIIVNLSLA D FLVS-LTGG S ISFLTNYH-- GYFFLGK WA C  P.59.2
VAOP_rutRu  E N FAVMLVTFRFTQLRKPLNYIIVNLSLA D FLVS-LTGG T ISFLTNYH-- GYFFLGK WA C  P.59.2
VAOP_takRu  E N FLVMFITFKFKQLRQPLNYIIVNLAIA D FLVS-LTGG L ISFLTNAR-- GYFFLGR WA C  P.59.2
VAOP_petMa  E N FAVIVVTARFRQLRQPLNYVLVNLAAA D LLVS-AIGG S VSFFTNIK-- GYFFLGV HA C  P.59.2
PPIN_anoCa  L N TAVIAITIKYRQLRQPINYSLVNLAIA D LGAA-LLGG S LNVETNAV-- GYYNLGR VG C  P.59.2
PPIN_xenTr  L N VTVIVVTFKYRQLRHPINYSLVNLAIA D LGVT-VLGG A LTVETNAV-- GYFNLGR VG C  P.59.2
PPINa_petM  L N STVIIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.59.2
PPIN_letJa  L N STVVIVTLRHRQLRHPLNFSLVNLAVA D LGVT-VFGA S LVVETNAV-- GYFNLGR VG C  P.59.2
PPIN_danRe  L N VTVITVTLKYKQLRQPLNFALVNLAVA D LGCA-VFGG L PTVVTNAM-- GYFSLGR VG C  P.59.2
PPIN_ictPu  L N MVVIIVTVRYKQLRQPLNYALVNLAVA D LGCP-VFGG L LTAVTNAM-- GYFSLGR VG C  P.59.2
PPIN_oncMy  M N VLVIMVTMRHRKLRQPLNYALVNLAVA D LGCA-LFGG L PTMVTNAM-- GYFSMGR LG C  P.59.2
PPINb_takR  L N VLVIVVTMKHRQLRQPLSYALVNLAIC D LGCA-LFGG I PTTITSAM-- GYFSLGR VG C  P.59.2
PPINb_tetN  L N VLVIVVTLKHRQLRQPLNYALVNLAIC D LGCA-LFGG I PTTVTSAM-- GYFSLGR LG C  P.59.2
PPINb_gasA  L N ALVIVVTARHRQLRQPLSYALVNLAVC D LGCA-ACGG L PTTVTSAM-- GYFSLGR AG C  P.59.2
PPINa_gasA  L N ATVIIVTLMHKQLRQPLNYALVNMALA D LGTA-MTGG V LSVVNNAQ-- GYFSLGR SG C  P.59.2
PPINa_takR  L N ATVIIVSLMHKQLRQPLNYALVNMAVA D LGTA-MTGG L LSVVNNAQ-- GYFSLGR TG C  P.59.2
PPINa_tetN  L N ATVIIVSLMHKQLRQPLNYALVNMAAA D LGTA-VSGG L LSVVNNAQ-- GHFSLGR TG C  P.59.2
PPINa_cioI  L N ILVIVATLKNKVLRQPLNYIIVNLAVV D LLSG-FVGG F ISIAANGA-- GYFFWGK TM C  P.59.2
PPINa_cioS  L N ILVITATLKNKVLRQPLNYIIVNLAVV D LLSG-LVGG V ISIFANGA-- GYFFWGK FM C  P.59.2
PPINb_cioI  L N GFVIIATMKNKKLRQPLNYIIINLSIA D FLSG-LVGG F IGMISNSA-- GYFYFGK TV C  P.59.2
PPINb_cioS  L N LLVIVATYKNKDLRRPINYIIVNLAVA D LTCS-VVGG L LGVLNNGA-- GYYFLGK SV C  P.59.2
PARIE_utaS  N N SLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-LCGT T IVIATNYY-- GYFYLGR KF C  P.59.2
PARIE_anoC  N N FLVIAVTLKNPQLRNPINIFILNLSFS D LMMS-ICGT T IVIATNYH-- GYFYLGR RF C  P.59.2
PARIE_xenT  N N AIVILVTLKHPQLRNPINIFILNLSFS D LMMA-LCGT T IVVSTNYH-- GYFYLGK QF C  P.59.2
PARIE_takR  N N SLAIAVMLKNPSLLQPINIFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.59.2
PARIE_tetN  N N GLAITVMLKNPALLQPINIFILSLAVS D LMIG-LCGS L VVTITNYQ-- GSFFIGH TA C  P.59.2
PARIE_gasA  N N VLVITVLVRNPSLLQPMNVFILSLAVS D LMIG-LCGS L VVTITNYH-- GSFFIGH TA C  P.59.2
PARIE_danR  N N VLVIAVMVKNLHFLNAMTVIIFSLAVS D LLIA-TCGS A IVTVTNYE-- GSFFLGD AF C  P.59.2
ENCEPH_hom  N N LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.59.2
ENCEPH_oto  N N LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.59.2
ENCEPH_lox  N N LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLR-- NGWVWDT VG C  P.59.2
ENCEPH_pte  N N LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLR-- NGWVWDT VG C  P.59.2
ENCEPH_mus  G N LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLR-- NGWVWDA VG C  P.59.2
ENCEPH_can  C H FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLR-- NGWVWDS VG C  P.59.2
ENCEPH_mon  N N LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLR-- SGWVWDS VG C  P.59.2
ENCEPH_ano  N N LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLR-- HRWVWDA AG C  P.59.2
ENCEPH_gal  N N LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLR-- SRWVWDA AG C  P.59.2
ENCEPH_dan  N N IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVK-- RRWVFNS AT C  P.59.2
ENCEPH_tak  N N FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQ-- GRWTWTQ AT C  P.59.2
ENCEPH_gas  N N VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIR-- GGWTWSR AT C  P.59.2
ENCEPH_ory  N N LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVK-- GRWMWSQ AT C  P.59.2
ENCEPH_xen  N N LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVR-- GSWAFSV EM C  P.59.2
ENCEPH_cal  N N ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQ-- GRWGWDS AA C  P.59.2
ENCEPH_squ  N N LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVK-- GRWVWDS AA C  P.59.2
ENCEPH_pet  N N LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVR-- RRWVWGN GG C  P.59.2
ENCEPH4_br  N N FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2
ENCEPH4_br  N N FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQ-- HRWLWGR PG C  P.59.2
TMT5_braFl  S N GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2
TMT5_braBe  S N GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVR-- HRWLWGR PG C  P.59.2
TMT_monDom  S N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQ-- GRWIGGK HG C  P.59.2
TMT_macEug  N N FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIR-- GRWIAGY HG C  P.59.2
TMT_galGal  N N LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIH-- GKWIGGE HG C  P.59.2
TMT_taeGut  N N LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIR-- GKWIGGD HA C  P.59.2
TMT_anoCar  N N LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIY-- GRWIGGE HG C  P.59.2
TMT_xenTro  N N FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIK-- GKWIGGE YG C  P.59.2
TMT_ornAna  N N LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNIS-- GRWIGGD PG C  P.59.2
TMT_danRer  N N LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVR-- GRWLLGR HG C  P.59.2
TMT_tetNig  N N FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLR-- GRWLLGR SG C  P.59.2
TMT_takRub  N N FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR IG C  P.59.2
TMT_gasAcu  N N LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLR-- GKWLLGR SG C  P.59.2
TMT_oryLat  N N FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIR-- GRWLLGR GG C  P.59.2
TMTa1_anoC  N N LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT P FSFAASTQ-- GKWLIGP AG C  P.59.2
TMTa1_xenT  N N LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQ-- GHWLIGE IG C  P.59.2
TMTa1_danR  N N LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQ-- GRWLIGD TG C  P.59.2
TMTb_danRe  N N TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQ-- GRWLIGR AG C  P.59.2
TMTa_gasAc  N N LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT P FSFAASTQ-- GRWLIGE GG C  P.59.2
TMTb_gasAc  S N FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT P FSFAASTQ-- GRWLIGR AG C  P.59.2
TMTa_oryLa  N N LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQ-- GRWLIGE GG C  P.59.2
TMTb_oryLa  S N LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQ-- GRWLIGR AG C  P.59.2
TMTa_pimPr  N N TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT P FSFAASTQ-- GRWLIGR AG C  P.59.2
TMTa_takRu  N N LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQ-- GRWLIGE AG C  P.59.2
TMTb_takRu  S N FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQ-- GRWLIGR AG C  P.59.2
TMTa_tetNi  S N LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT P FSFAASTQ-- GRWLIGA AG C  P.59.2
TMTb_tetNi  S N LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQ-- GRWLLGR AG C  P.59.2
TMTa_oncMy  S N LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT P FSFAASLY-- GRWLIGA HG C  P.59.2
TMTa1_calM  N N LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQ-- GRWLVGE QG C  P.59.2
TMTa2_calM  N N LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT P FSFAASIY-- RRWLIGH KG C  P.59.2
TMTx_braFl  N N STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT P FSFVSSLH-- GRWMFGH SG C  P.59.2
TMTPIN_str  N N GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS P FTFASNFA-- GRWLFGD LG C  P.59.2
TMTy_braFl  T N LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS P FAVASAVG-- ERWLFGR AV C  P.59.2
ENCEPH_str  G N SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI P FVTASIYA-- HRWLFAH VG C  P.59.2
TMT_apiMel  A N LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAAT-- GGWYWGY KM C  P.59.2
TMT1_anoGa  L N IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAIS-- HRWLYGK SI C  P.59.2
TMT2_anoGa  L N LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAIS-- HRWIFGR TL C  P.59.2
TMT_aedAeg  L N LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN P FTLTSAIS-- RHWIFGR TV C  P.59.2
TMT_culPip  L N LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN P FTLSSAIS-- HRWLFGR KL C  P.59.2
TMT_triCas  L N LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAIN-- YGWIFGE TG C  P.59.2
TMT_bomMor  L N LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALF-- HRWIFGH TM C  P.59.2
TMT_helVir  L N LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALF-- HRWIFGK TM C  P.59.2
TMT_rhoPro  G N LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN P FTLASAIA-- KRWFFGQ SM C  P.59.2
TMT_acyPis  F N TCVIFIMIRDTRLWTPQNVIIFNLATS D LAVS-VLGN P VTLAAAIT-- KGWIFGQ TI C  P.59.2
TMTa_dapPu  M N IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT P ISAAAALK-- FTWPFSH EL C  P.59.2
TMTb_dapPu  M N VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT P ISTAAALE-- FGWPFSQ EL C  P.59.2
TMT1_plaDu  S N GVIMYLYFKDKSLRSPMNLLFVNLAMS D FTVA-FFGA M FQFGLTCTR- KYMSPGM AL C  P.59.1
TMT2_plaDu  L N VLVLVLFIKDRKLRSPNNFLYVSLALG D LLVA-VFGT A FKFIITARK- TLLREED GF C  P.59.1

TMT_triCys  L N GLVIAVLIKYIRTITNTNIIVLSMSCA N ILIP-LLGS P LSATSSLM-- RKWQFGN GG C  P.59.2
CUBOP_carR  L N MIVLITFYRLRHKLAFKDALMASMAFS D VVQA-IVGY P LEVFTVVD-- GKWTFGM EL C  P.59.2
MEL1_acrMi  L N SVVILTFLLDRSLLFPANLIILSIAIS D WLMS-VVPN I MGGVANAS-- NDLPFTD WS C  P.59.2
ENC_nemVec  T N TIVVIIFISSQRLHTTPNLILFSMSVC D WLMA-TMAK S VGIYGNAR-- YWPTVGK VT C  P.59.2
ENC_nemVec  T N TIVVITFIFSKRLHTTPNLILFSMSVC D WLMA-AMAK S VGIYGNAR-- YWPTVGK VT C  P.59.2
ENC_nemVec  L N GIVLIIFLATRSLRTIPNMILLSMAWA D WLMA-CLAD A VGAYANAN-- NWPSMVG GL C  P.59.2

RGR1_homSa  L N TLTIFSFCKTPELRTPCHLLVLSLALA D SGIS-LNAL V AATSSLL--- RRWPYGS DG C  P.59.3
RGR1_ornAn  L N GLTIASFRKIKELRTPSNLLVVSLALA D SGIC-LNAL M AALSSFL--- RHWPYGA EG C  P.59.3
RGR1_galGa  L N GLTIISFRKIKELRTPSNLLVLSIALA D CGIC-INAF I AAFSSFL--- RYWPYGS EG C  P.59.3
RGR1_xenTr  L N GLTLLSFYKIRELRTPSNLFIISLAVA D TGLC-LNAF V AAFSSFL--- RYWPYGS EG C  P.59.3
RGR1_gasAc  L N AVTIAAFLKVRELRTPSNFLVFSLAVA D IGIS-MNAT I AAFSSFL--- RYWPYGS DG C  P.59.3
RGR2_danRe  L N AISVLAFLRVREMQTPNNFFIFNLAVA D LSLN-INGL V AAYACYL--- RHWPFGS EG C  P.59.3
RGR2_pimPr  L N LISVLAFLRVREIQTPNNFFIFNLAVA D LSLN-INGL V AAYASYL--- RYWPFGS EG C  P.59.3
RGR2_tetNi  L N AISIVSFLTVKEMRNPSNFFVFNLALA D ISLN-VNGL I AAYASYL--- RYWPFGQ DG C  P.59.3
RGR2_gasAc  L N AISIASFLRVKEMWNPSNFFVFNLAVA D ICLN-VNGL T AAYASYL--- RYWPFGQ DG C  P.59.3
RGR2_oryLa  L N AISILAFLRVKEMRSPSSFLVFNLALA D ISLN-INGL T AAYASYL--- RYWPFGQ EG C  P.59.3
RGR1_calMi  L N GLTLLAFYKIKELRTPSNLLITSLALS D FGIS-MNAF I AAFSSFL--- RYWPYGS EG C  P.59.3
RGRa_cioIn  G Y SLLFVIFAKRPDLKK-KNKFLLSLATS D LLIT-VHVF A STIAAFA--- PQWPFGD LG C  P.59.3
RGRa_cioSa  G Y GLLFVIFAKSPDLKK-KNRFLFSLAVS D LLIT-IHVV A SVVASFQ--- SEWPFGS IG C  P.59.3
RGRb1_cioI  G Y AVYFGAIWRSKTLQT-RHIWLTSLACG D IIMM-VHLI L ESLSSLGM-- GHRPRQN FE C  P.59.2
RGRb2_cioI  G Y SVYILAIWSSKKLQT-KHIWLTSLACA D LLMM-VHLF M DGLSSFHQ-- GRRPKGI FE C  P.59.2
RGRb2_cioS  G Y SIYLRAIWSSRKLQT-RHIWLTSLACA D LIMM-VHLF M DGLSSFHQ-- GRRPKGN FE C  P.59.2

PER1_lotGi  L S LLVALTFIREKGLFKYGRAWLHISLAI A NVGV-VGAF P FSGSSSFS-- GRWLYGS GM C  P.59.2
PER1_aplCa  L N LLTALTFYKDTKLTKGSQPWLHILLAL A NVGV-VAPS P FPASSSFS-- GRWLYGS TM C  P.59.2
PER1_todPa  L C GMCIIFLARQSPKPRRKYAILIHVLIT A MAV--NGGD P AHASSSIV-- GRWLYGS VG C  P.58.2
PER1_homSa  S N IIVLGIFIKYKELRTPTNAIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY AG C  P.59.2
PER1_ornAn  S N VIVLGIFVKFEELRTATNAIIINLAVT D IGVS-GIGY P MSAASDLH-- GSWKFGH AG C  P.59.2
PER1_monDo  S N VIVLGIFVKYKALRTATNTIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY DG C  P.59.2
PER1_xenTr  S N IIVLGIFVKYKELRTATNAIIINLAFT D IGVS-GIGY P MSAASDLH-- GSWKFGY VG C  P.59.2
PER1_gasAc  S N IVVLLMFWKFKELRTATNFIIINLAFT D IGVA-GIGY P MSAASDIH-- GSWKFGY AG C  P.59.2
PER1a_sacK  L S SVNFRMLLSNPDYCSKAGNFFLSLAVT D LCVC-IFET P FSAFSHHA-- GFWIFGD TA C  P.59.2
PER1b_sacK  G N SVVLEMFRRYKELLSPSAILLISLALA D LGLT-IFGM S LSCVSSFA-- GRWLFGK FG C  P.59.2
PER3_braFl  E N GITLATFTKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.59.2
PER3_braBe  E N GITLATFSKFRSLRSPTTMLLVHLAIA D LGIC-IFGY P FSGASSLR-- SHWLFGG VG C  P.59.2
PER3_hadAd  G N GLVLVTFLRFRVLVTPTTLLLVNLAVS D LGLI-LFGF P FSASSSLS-- AKWIFGE GG C  P.59.2
PER2_patYe  G N LLIIIVFAKRRSVRRPINFFVLNLAVS D LIVA-LLGY P MTAASAFS-- NRWIFDN IG C  P.59.2
PER1_braFl  G N IFAIIVFLTEKEFRKKEHNSFALNLAIA D LSVCVFAY P SSTISGYA-- GEWMLGD VG C  P.60.2
PER1_braBe  G N VITITVFLTEKEFRKKQQNGFVLNLAIA D LSVCVFAY P SSAIAGYA-- GRWVLGD VG C  P.60.2
PER2_braFl  G N ATVVLMFMLKWRQLCRKANLLIINLAAV D LCISVFGY P FSASSGFA-- NQWLFSD AI C  P.60.2
PER2_braBe  G N ATVVLMFIMKWRQLCRKANLLVINLAAA N LCITIFGY P FSASSGYA-- HQWLFPD AI C  P.60.2
PER2a_strP  G N ITVICVLCRYRTFRKRSINLLLINMAAS D LGVSVAGY P LTTVSGYW-- GRWLFGD VG C  P.60.2
PER2b_strP  G N ITVLCVLCRYGTFRKRSVNILLMNMAVS D LGVSVAGY P LTAISGYR-- GRWVFAD IG C  P.60.2

NEUR1_homS  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_calJ  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_dasN  G N GYVLYMSSKRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_canF  G N GYVLYMSSRRKKKLRPAEIMTINLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_bosT  G N GYVLYMSSRRKKKLRPAEIMTVNLAIC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_musM  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW FG C  P.59.2
NEUR1_loxA  G N GYVLYMSCRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FVIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_ochP  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR1_monD  G N GYVIYMSSKRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FTIISCFS-- HRWVFGW VG C  P.59.2
NEUR1_ornA  G N GYVIYMSSRRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FTIVSCFC-- HRWVFGW MG C  P.59.2
NEUR1_galG  G N GYVIFMSSKRKKKLRPAEIMTVNLAVC D LGIS-VVGK P FSIISFFS-- HRWIFGW MG C  P.59.2
NEUR1_xenT  G N GYVIYMACSRKKKLRPAEIMTINLAVC D LGIS-VTGK P FAIVSCFS-- HRWVFGW NA C  P.59.2
NEUR1_danR  G N GYVMYMTFKRKTKLKPPEIMTLNLAIF D FGIS-VSGK P FFIVSSFS-- HRWLFGW QG C  P.59.2
NEUR1_calM  G N GYVIYLSITQKRKLKPPEILITNLAIS D FGMS-VGGQ P FLIISCFS-- HRWIFGW VG C  P.59.2
NEUR1a_bra  G N GRVLWLSYRCRARLRPVEMFVVSLAVA D VGLS-LVGH P FAAASSLM-- GRWSFGS AG C  P.59.2
NEUR1b_bra  G N GRVLWLSYRNWAKLRPVELFVVSLAVT D VGIS-VFGY P FAASSSLL-- GRWSFGS AG C  P.59.2
NEUR_strPu  G N ISVIVISLRKREKLKPIDLLTINLAIA D FLIC-VVSY P LPMISAFR-- HRWSFGK FG C  P.59.2
NEUR2_galG  G N SILLYISYKKKHLLKPAEYFIINLAIS D LAMT-LTLY P LAVTSSLS-- HRWLYGK HI C  P.59.2
NEUR2_anoC  G N SILLYVSYKKKNLLKPAEYFMINLAIS D LGMT-LTLY P LAVTSSLA-- HRWLFGQ QV C  P.59.2
NEUR2_xenT  G N SMLLLVAYRKRSILKPAEFFIVNLSIS D LGMT-GTLF P LAIPSLFA-- HRWLFDK VT C  P.59.2
NEUR2_danR  G N GMLLFVAYRKRSSLKPAEFFVVNLSVS D LGMT-LSLF P LAIPSALA-- HRWLFGE IT C  P.59.2
NEUR2_calM  G N SVLLFVAYRKRQILKPAEYFVANLAVS D ISMT-VTLL P LAISSNFS-- HRWLFVS KP C  P.59.2
NEUR3_galG  G N SAVLATAVKRSSLLKSPELLTVNLAVA D IGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.59.2
NEUR3_taeG  G N SAVLATAVKRSSLLKPPELLTVNLAVA D IGMA-LSMY P LAIASAWS-- HAWLGGD AS C  P.59.2
NEUR3_xenT  G N CAVLATAVKCSSHLKAPDLLSINLAVA D LGMA-ISMY P LAIASAWN-- HAWLGGD AS C  P.59.2
NEUR3_anoC  G N SMVLAVAVKRSSCLRSPELLTVNLAAT D LGMG-LSMY P LAIASAWN-- HAWLGGE AT C  P.59.2
NEUR3b_dan  G N LMVLVMAYKRSNHMKPPELLSVNLAVT D LGAA-VTMY P LAVASAWN-- HHWIGGD VS C  P.59.2
NEUR3a_dan  G N AAVLLTAAWRHSVLKAPELLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWIGGD PS C  P.59.2
NEUR3a_tet  G N ASVLFSASRRLTPLKAPELLTVNLAVT D IGMA-LSMY P LSIASAFN-- HAWMGGD TA C  P.59.2
NEUR3_petM  G N GAVLGVAARRWAKLKAPELLSVNLALT D LGIA-ASIY P LAVASAWN-- HRWLGGQ PV C  P.59.2
NEUR4_ornA  G N SMVIFILHRQRGILNPTDYLTFNLAVS D ASVS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_galG  G N SVVIFVLYKQRHLLQPTDYLTFNLAVS D ASIS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_taeG  G N SIVIFVLYKQRHVLQPTDYLTFNLAVS D ASIS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_anoc  G N SIVIFVLYRQRAGLQPTDYLTFNLAVS D ASVS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_xenT  G N SIVIFVLYKQRANLLPTDYLTFNLAVS D ASTS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_danR  G N SIVIFVLFRQRSTLQPTDYLTLNLAVS D ASIS-VFGY S RGILEIFNIF KDSGYSV WT C  P.62.0
NEUR4_tetN  G N TVVLFVLVRQRSSLQPTDLLTFNLAVS D ASIS-VFGY S RGIIQIFNVF QDSGFSI WT C  P.62.0
NEUR4_gasA  G N SLVMFVLYRQRASLQSTDFLTLNLAIS D ASIS-IFGY S RGILEIFNIF NDDGTWI WT C  P.62.0
NEUR4_calM  G N SIVIFILYRQRLSLQPPDYLTLNLAVS D ASIS-IFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0

UROPS1_tri  G N MLVFLTFYKHASLRTTSNLFIINLAIT D LLTG-GIKD T LFIYGLTS-- YNWPKSA IL C  P.58.2
UROPS2_tri  G N GAVLLVLRYHHDDIKSASNYFITNLALTD FLLG-VLCM P CILISCLN-- GQWVFGQ TL C  P.58.2
GPR17_homS  G N TLALWLFIRDHKSGTPANVFLMHLAVA D LSCV--LVL P TRLVYHFSG- NHWPFGE IA C  P.58.1
CYSLTR1_ho  G N GFVLYVLIKTYHKKSAFQVYMINLAVA D LLCV--CTL P LRVVYYVHK- GIWLFGD FL C  P.58.1
P2RY8_homS  G N LFSLWVLCRRMGPRSPSVIFMINLSVT D LMLA--SVL P FQIYYHCNR- HHWVFGV LL C  P.58.1
BDKRB2_hom  E N IFVLSVFCLHKSSCTVAEIYLGNLAAA D LILA--CGL P FWAITISNN- FDWLFGE TL C  P.58.1
SSTR1_homS  G N SMVIYVILRYAKMKTATNIYILNLAIA D ELLM--LSV P FLVTSTLL-- RHWPFGA LL C  P.58.2
OPRL1_homS  G N CLVMYVILRHTKMKTATNIYIFNLALA D TLVL--LTL P FQGTDILL-- GFWPFGN AL C  P.58.2
OPRM1_homS  G N FLVMYVIVRYTKMKTATNIYIFNLALA D ALAT--STL P FQSVNYLM-- GTWPFGT IL C  P.58.2
CCR4_homSa  G N SVVVLVLFKYKRLRSMTDVYLLNLAIS D LLFV--FSL P FWGYYAA--- DQWVFGL GL C  P.58.3
TACR2_homS  G N AIVIWIILAHRRMRTVTNYFIVNLALA D LCMA-AFNA A FNFVYASH-- NIWYFGR AF C  P.59.2
GALR1_homS  G N SLVITVLARSKKPRSTTNLFILNLSIA D LAYL-LFCI P FQATVYAL-- PTWVLGA FI C  P.59.2
QRFPR_homS  G N ALVFYVVTRSKAMRTVTNIFICSLALS D LLIT-FFCI P VTMLQNIS-- DNWLGGA FI C  P.59.2
PPYR1_homS  G N LCLMCVTVRQKEKANVTNLLIANLAFS D FLMC-LLCQ P LTAVYTIM-- DYWIFGE TL C  P.59.2
NPY1R_homS  G N LALIIIILKQKEMRNVTNILIVNLSFS D LLVA-IMCL P FTFVYTLM-- DHWVFGE AM C  P.59.2
GPR19_homS  G N SLVCLVIHRSRRTQSTTNYFVVSMACA D LLIS-VAST P FVLLQFTT-- GRWTLGS AT C  P.59.2
HCRTR1_hom  G N TLVCLAVWRNHHMRTVTNYFIVNLSLA D VLVT-AICL P ASLLVDIT-- ESWLFGH AL C  P.59.2
GPR161_hom  G N LVIVVTLYKKSYLLTLSNKFVFSLTLS N FLLS-VLVL P FVVTSSIR-- REWIFGV VW C  P.59.2
ADRA1D_hom  G N LLVILSVACNRHLQTVTNYFIVNLAVA D LLLS-ATVL P FSATMEVL-- GFWAFGR AF C  P.59.2
ADRB2_homS  G N VLVITAIAKFERLQTVTNYFITSLACA D LVMG-LAVV P FGAAHILM-- KMWTFGN FW C  P.59.2
ADRB1_melG  G N VLVIAAIGSTQRLQTLTNLFITSLACA D LVVG-LLVV P FGATLVVR-- GTWLWGS FL C  P.59.2
PRLHR_homS  G N CLLVLVIARVRRLHNVTNFLIGNLALS D VLMC-TACV P LTLAYAFEP- RGWVFGG GL C  P.59.1
NMUR2_homS  G N VLVCLVILQHQAMKTPTNYYLFSLAVS D LLVL-LLGM P LEVYEMWRN- YPFLFGP VG C  P.59.1
ADORA2A_ho  G N VLVCWAVWLNSNLQNVTNYFVVSLAAA D IAVG-VLAI P FAITIS---- TGFCAAC HG C  P.59.4
TRHR_homSa  G N IMVVLVVMRTKHMRTPTNCYLVSLAVA D LMVLVAAGL P NITDSIY--- GSWVYGY VG C  P.60.3
NPY2R_homS  G N SLVIHVVIKFKSMRTVTNFFIANLAVA D LLVN-TLCL P FTLTYTLM-- GEWKMGP VL C  P.59.2
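
A gap pattern like the one in the rows above can be tabulated per sequence class to see whether an indel state is shared coherently within a class. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the class is assumed to be the part of the row name before the underscore, and the trailing `P.59.2`-style label is simply ignored because its meaning is not assumed here.

```python
from collections import defaultdict

# A few rows copied verbatim from the alignment above.
ROWS = """\
PER1_homSa  S N IIVLGIFIKYKELRTPTNAIIINLAVT D IGVS-SIGY P MSAASDLY-- GSWKFGY AG C  P.59.2
PER1_xenTr  S N IIVLGIFVKYKELRTATNAIIINLAFT D IGVS-GIGY P MSAASDLH-- GSWKFGY VG C  P.59.2
NEUR1_homS  G N GYVLYMSSRRKKKLRPAEIMTINLAVC D LGIS-VVGK P FTIISCFC-- HRWVFGW IG C  P.59.2
NEUR4_galG  G N SVVIFVLYKQRHLLQPTDYLTFNLAVS D ASIS-VFGY S RGIIEIFNVF RDDGFSI WT C  P.62.0
NEUR4_danR  G N SIVIFVLFRQRSTLQPTDYLTLNLAVS D ASIS-VFGY S RGILEIFNIF KDSGYSV WT C  P.62.0
"""

def ungapped_length(row):
    """Count residues (letters) in one row, skipping the name and the trailing label."""
    tokens = row.split()
    aligned = "".join(tokens[1:-1])  # drop first token (name) and last token (label)
    return sum(ch.isalpha() for ch in aligned)  # '-' gap characters are not counted

def lengths_by_class(rows):
    """Map each class prefix (assumed: text before '_') to the set of lengths seen."""
    states = defaultdict(set)
    for row in rows.strip().splitlines():
        name = row.split()[0]
        states[name.split("_")[0]].add(ungapped_length(row))
    return dict(states)

if __name__ == "__main__":
    for cls, lengths in lengths_by_class(ROWS).items():
        print(cls, sorted(lengths))
```

A class whose set holds a single length is internally consistent for this window; a class with several lengths would need a closer look for homoplasy or alignment error. On these five rows, both NEUR4 sequences are two residues longer in this window than the PER1 and NEUR1 sequences.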

Indels in the TM4-EC2-TM5 region

The retinal plug loop region EC2 contains the second half of the extracellular disulfide bond, which is very important to the overall structural stability of the GPCR molecule. After years of back and forth, it was eventually concluded that 92% of gene family members, including all opsins though not all GPCR, do contain a disulfide linking extracellular domain EC2 to TM3.

OpsinStability.jpg

Establishing a disulfide requires more than just comparative genomic conservation of the two cysteines, because overall conservation can be quite high. Furthermore, it is quite difficult to establish homologous correspondences in EC2 because of gapping issues (see alignment below). However, non-existence or non-conservation of the cysteine does establish absence. All of the opsin outgroup GPCR considered here do contain a disulfide. The disulfide is thus not at all diagnostic for opsin-class GPCR, and its absence from an opsin sequence reliably implies sequencing error.

These regions can only be aligned after careful determination of anchor residues on both sides of the disulfide cysteine. Some are subtle, involving small reduced alphabets of similar residues and imperfect invariance rather than absolute conservation. The anchors consist of a universally conserved W in TM4, a proline prior to TM4 exiting the membrane, the GWS.Y..E region unique to opsins, an aromatic residue 3 residues after the C, and finally the reliable P......CY region well within TM5.
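
Two of the anchors named above are specific enough to search for directly in an unaligned sequence. The sketch below is an illustration under stated assumptions rather than the method used for the alignment: it takes the `GWS.Y..E` and `P......CY` patterns literally as regular expressions, the function name `anchor_spacing` is hypothetical, and the test sequences are synthetic filler, not real opsins. Comparing the spacing between anchors in two sequences gives their net length difference in the intervening loop without committing to where the gaps fall.

```python
import re

# Anchor patterns quoted from the text; everything between them is unconstrained.
OPSIN_ANCHOR = re.compile(r"GWS.Y..E")  # EC2 motif described as unique to opsins
TM5_ANCHOR = re.compile(r"P.{6}CY")     # "P......CY" within TM5

def anchor_spacing(seq):
    """Distance from the E of GWS.Y..E to the P of P......CY, or None if either is missing."""
    first = OPSIN_ANCHOR.search(seq)
    if first is None:
        return None
    second = TM5_ANCHOR.search(seq, first.end())  # only look downstream of the first anchor
    if second is None:
        return None
    glu_index = first.end() - 1   # last residue of the first anchor is the E
    pro_index = second.start()    # first residue of the second anchor is the P
    return pro_index - glu_index

if __name__ == "__main__":
    # Synthetic sequences: identical anchors, poly-A filler of different lengths.
    base = "GWSAYAAE" + "A" * 20 + "PAAAAAACY"
    longer = "GWSAYAAE" + "A" * 22 + "PAAAAAACY"
    print(anchor_spacing(base), anchor_spacing(longer))
```

Because several anchors are described as imperfectly conserved, a literal pattern will miss some genuine family members; a `None` result here means only that this strict form of the anchor was not found.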

This region has been quite permissive with regard to indels over evolutionary time. To account for observed length variations, six indels are needed within opsins and dozens more for the full spectrum of rhodopsin-class GPCR. Unsurprisingly, almost all of these can be assigned to the EC2 extracellular loop rather than to TM4 or TM5. Somehow the disulfide bond has been able to accommodate these abrupt length variations without loss of steric capacity for formation, suggesting the loop is otherwise weakly constrained.

However, more than the disulfide and membrane boundaries is important here for this twisted β-hairpin. Almost all opsins are further distinguished by four other conserved residues, Y..E.....C..DY (in rhodopsin numbering Y178 E181 C187 D190 Y191), which presumably play the same role in both determined opsin structures. The conservation of the final tyrosine (often tryptophan) persists into GPCR such as the tachykinin receptor that have the highest overall BLAST scores against opsins, though it is not sufficiently universal to occur in the other three structurally determined GPCR.

Various studies pertain to the phenomenal conservation of these four non-cysteine loop residues. First, interactions with other proteins can be eliminated, as these take place on the cytoplasmic side. Proper folding of newly synthesized opsin is a valid consideration. Y191 is proximal to the 9-methyl group of 11-cis retinal in the dark state; Y191A changes hydrolysis of the Schiff base, folded structure after photobleaching, and chromophore release. R177 forms an ion pair with D190, but comparative genomics quickly shows this has narrow applicability.

E181 has been interpreted as the counterion in Meta I rhodopsin (displacing E113 in the dark state); the reorganization of EC2 then propagates to TM3 via a push from the disulfide. Yet it is hard to see how this could work universally, because in LWS the residue is invariably histidine and in VAOP it is always serine (as an SK covarying anomaly). These residues lack the negative charge to offset the Schiff lysine. D and E are also found sporadically in non-opsin GPCR where a counterion makes no apparent sense; this could be pursued in the structure of ADRB1.

Almost all of the indels resolve to insertions despite their relative rarity compared to deletions proteome-wide. These insertions evidently arose from some mechanism other than extension of splice donors or acceptors (indicated below by underlining) because the site of insertion is elsewhere, even allowing for gap uncertainties. If some sequence microstructure predisposed certain regions to insertions, little of that may remain 500 myr later. All the key developments in ciliary imaging opsins took place prior to lamprey divergence.

Overall, the pattern of insertions but not deletions suggests a structurally determined floor to admissible loop lengths. After discounting derived forms, all opsin classes other than neuropsins have the same minimal length, including cnidarian and ctenophore opsins. Thus the ancestral opsin likely had this length and pattern:

W.........P..GWS.Y..E.....C..DW........SY...........P.......Y
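
The dotted pattern above is already a valid regular expression, and its fixed residues can be numbered by anchoring one of them to a known position. The sketch below assumes the E in the pattern is E181, following the rhodopsin numbering given earlier; the function names are hypothetical and the code is illustrative only.

```python
import re

# Ancestral pattern copied verbatim from the text; '.' stands for any residue.
MOTIF = "W.........P..GWS.Y..E.....C..DW........SY...........P.......Y"

def numbered_positions(motif, anchor="E", anchor_number=181):
    """Number each fixed residue, assuming the first `anchor` residue sits at `anchor_number`."""
    offset = anchor_number - motif.index(anchor)
    return [(aa, i + offset) for i, aa in enumerate(motif) if aa != "."]

def matches_ancestral(segment):
    """True if an ungapped segment has exactly the ancestral length and fixed residues."""
    return re.fullmatch(MOTIF, segment) is not None

if __name__ == "__main__":
    for aa, number in numbered_positions(MOTIF):
        print(f"{aa}{number}")
```

With this anchoring the function reproduces the Y178, E181, C187, and D190 numbering, with the pattern's W at the position of rhodopsin Y191. Positions after 191 are in ancestral-length coordinates and should not be read as rhodopsin numbers for any class carrying extra residues in this stretch.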

This region provides various synapomorphies for opsin classes. Each of these is represented by a single proxy sequence that accurately represents its ortholog class. For example, all UV7 arthropod opsins have the same two-residue insertion at the end of TM4, not just the Rhodnius prolixus representative shown. Homoplasy can be seen in some instances (e.g. the five imaging cilopsins and the four dimly related neuropsins both have two extra residues following Y191), but it remains manageable when it occurs in two well-separated regions of what is a very deep and comprehensive gene tree.

OpsinEC3indels.png


Indels in other opsins

Informative indels would be very helpful in the peropsin/neuropsin/rgropsin group of opsins because their sequence relationships to ciliary opsins and melanopsins are too weak to determine root topology. Note that intron patterns, another class of even rarer genetic event and so even better suited to deep time scales, have already illuminated branching relationships to a certain extent.

(to be continued)
