Opsin evolution: key critters: Difference between revisions

From genomewiki
No edit summary
m (add category tag)
 
(168 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Some species such as Drosophila have lost all ciliary opsins [ref] -- clearly they are not essential for a successful visually complex flying insect with 5-color vision and circadian rhythm. Bees, annelids, and mammals retain ciliary opsins so we know this must be the ancestral bilateran state. This predicts ciliary opsins in cnidaria and indeed one was [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2013938 just found] in cnidaria. One sees the importance of complete genomes here (versus transcripts or immunostained sections): absence of ciliary opsin evidence in a genome is truly evidence of ciliary opsin absence.
This page and its updates have [[Opsin_evolution:_key_critters_%28protostomes%29|moved here]] to improve content organization.  


When the eye is reduced to a single pigment cell backing a single photoreceptor cell, the opsin of that species will be expressed only in one cell of the entire body. In this situation, the opsin may never show up in transcript collections, even with subtraction of common ones.
== Key Critters: introduction to genome projects opsins ==
Some species such as Drosophila have lost all ciliary opsins -- clearly this class of genes is not essential for a successful visually complex flying insect with 5-color vision and circadian rhythm (as one might have assumed from vertebrates). Other protostome lineages such as nematodes (eg Caenorhabditis elegans) function successfully without any vision at all, making this 'model organism' completely irrelevant to the evolutionary study of vision.


Vertebrates could never have evolved ciliary opsin vision had the bilateran ancestor possessed the limited opsin repertoire of fruit fly. Thus the most pressing question is -- assuming rhabdomeric opsins were thoroughly entrenched in the earliest imaging eyes and photoreception systems -- what kept ciliary opsins around in early bilatera (and even cnidaria) so that they could later be co-opted for ciliary opsin-based vision? We could also ask why vertebrates did not stay on the rhabdomeric track of early deuterostomes but instead underwent this profound switch to the 'untested' ciliary track. It is not at all clear what advantages ciliary offers over rhabdomeric -- ever miss swatting a fly?
However bees, annelids, and mammals retain ciliary opsins so it follows -- pervasive, detailed convergence at the molecular level being impossible -- this must be the ancestral bilateran state. In turn that suggests ciliary opsins in cnidaria and indeed that has been recently established in the lensing eye.


Fossil DNA does not go back nearly this far, the nearly non-existent fossil record of soft body parts is unhelpful, and transcripts plus genomes of living species are 450-550 million years removed from the crucial ancestral chain of events. For example, transcript-labelled thin sections of photoreceptor systems in modern amphioxus only speak to the current situation, as does extraction of opsin genes from the new assembly, and speak to the history only by inference. The situation today may be seriously different from the ancestor in terms of both innovations and losses. And that history is not necessarily the most parsimonious (even though we will often assume that).
When the eye is reduced to a single pigment cell backing a single photoreceptor cell, the opsin of that species may be expressed only in one cell of the entire body. In this situation, the opsin may never show up in transcript collections, even with subtraction of common ones. One sees the importance of complete genomes here (versus transcripts or immunostained sections alone): absence of ciliary opsin evidence in a genome is truly evidence of ciliary opsin absence.


It's worth expanding on this perpetual source of confusion by emphasizing [http://genomewiki.ucsc.edu/index.php/Opsin_evolution:_trichromatic_ancestral_mammal here too] that contemporary tunicates, lancelets, and lamprey are not ancient, ancestral, antiquated, archaic, character-retaining, dead-end, failed experiments, frozen in time, genetically stationary, living fossil, primitive, primordial, relic, or retro species. They're fully modern -- the tree of life is right-justified. Indeed their genes, regulatory signalling systems, and enzymes may be more finely honed than mammals because of more rapid evolution attributable to larger effective population sizes, reproductive mode, generation time, and marine selective predatory pressures.
Vertebrates could never have evolved ciliary opsin vision had the bilateran ancestor possessed the limited opsin repertoire of fruit fly. Thus the most pressing question is -- assuming rhabdomeric opsins were thoroughly entrenched in the earliest bilateran imaging eyes and photoreception systems -- what kept ciliary opsins around in early bilatera? Recall early diverging deuterostomes (xenoturbellids, urchins, acorn worms, tunicates, and lancelets) lack imaging vision -- that emerged in full modern form on the lamprey stem.  


However we can hope that ancestral character traits will still be reflected to some extent in these earlier diverging species and that with enough complete opsin repertoires from taxonomically appropriate species, the ancestral genes and even visual systems can be reconstructed at key nodes on the phylogenetic tree. The story describing the evolution of the human eye then amounts to describing the status at these successive nodes and perhaps interpolating between them. There are definitely limits to knowledge here as extant metazoans provide only 35 nodes between sponge and human -- gaps between nodes may average 30 million years but can seriously exceed that. This is offset by the occasional proposal of new deuterostome branches (Xenoturbella, Convoluta).
Conversely, assuming cnidaria use ciliary opsins, what kept rhabdomeric opsins around so that they could later be co-opted by protostomes for their form of opsin-based vision? Evolution is strictly 'use it or lose it' over these time frames. Here cnidaria, or at least their larvae, may also use rhabdomeric opsins. It seems that both classes of opsins have retained roles in most species, but very different classes were promoted to the imaging role in different branches of Bilatera. In fly, ciliary opsins have winked out; in nematode, both ciliary and rhabdomeric opsins are gone. While irrevocable, these losses would scarcely receive comment in non-model organisms.


The ideal set of genomes needed to study the evolution of the metazoan eye is only partly completed, underway, or even proposed. In some cases, the genome size of clade representatives is so large (eg lungfish at 25x human) the species may never be sequenced, though opsin transcripts could still be obtained. In others, the rate of evolution has been so fast so long that very little information about photoreception at ancestral nodes can remain (eg Oikopleura). Hagfish opsins, which would conveniently break up the crucial lamprey long branch, are not available at GenBank but here the animal has adopted a deep water (dark) habitat, meaning that its cone opsin genomic repertoire will be highly reduced, if not gone entirely, in its markedly degenerated eyes. (Its other opsins could still be informative.)
It's important to understand contemporary representatives of early diverging species (relative to the sequence of divergence nodes leading to human) are not archaic failed experiments nor primitive living fossils frozen in evolutionary time. Quite the contrary, all surviving extant species are equally successful and fully modern -- the tree of life is right-justified. Indeed their genes, regulatory signalling systems, and enzymes may be more finely honed than slowly evolving mammals because of more rapid evolution attributable to larger effective population sizes, reproductive mode, short generation time, and marine selective predatory pressures.


Model organism choices do not always coincide with genome sequenceability, transcriptome projects, nor (worst of all) with slow-evolving less derived species. Finally, most sequencing speaks to narrow anthropocentric interests, whereas the more broadly conceived sequencing need is greatest farther back (to break up branches). The evolution of the eye needs a rather different portfolio of genomes than a typical disease gene because of the earlier intrinsic timing of the innovative events. In fact, one product of the investigation here is to spell out these needed genomes. Of course one obvious genome choice is the cubomedusan jellyfish Carybdea marsupialis with its 24 eyes of 6 types.
However we can still hope that ancestral character traits will be reflected to some extent in these earlier diverging species and that with enough complete opsin repertoires from taxonomically appropriate species, ancestral genes and even whole visual systems can be reconstructed at key ancestral nodes on the phylogenetic tree. The story describing the evolution of the human eye then amounts to describing its status at these successive nodes with perhaps interpolative speculation between them. Limits to knowledge definitely exist because living metazoans provide only 35 nodes between sponge and human -- gaps between nodes may average 30 million years but can greatly exceed that (eg 135 myr between bird and platypus). This is offset by the occasional proposal for new deuterostome branches (Xenoturbella, Convoluta) or basal metazoans (Ctenophores).


It's worth reviewing genome status and recent experimental literature on key species. While abstracts are readily available at PubMed, access to free full text is unpredictable, so those links are collected when available. It suffices to reference only recent articles because they cite the earlier literature, and later citations of those papers are in turn collected by Google Scholar (or AbstractsPlus at PubMed). Most opsin sequences in the [[Opsin evolution]] reference sequence collection have a PubMed accession as a field in their fasta header; those can simply be [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12435605,15936279,11591373,9427550,17463225,14981504,15096614,9256070,17961206,11874910,15514158,16311335,16291092,11318381,11976887,9427550 compiled] to an active link that opens all of them in one PubMed window.
The ideal set of genomes needed to study the evolution of the metazoan eye is only partly completed, underway, or not even proposed yet. In some cases, the genome size of clade representatives is so large (eg lungfish at 25x human) the species may never be sequenced, though satisfactory opsin transcripts could still be obtained. In others, the rate of evolution has been so fast so long that very little information about photoreception at ancestral nodes has been retained (eg the tunicate Oikopleura). Hagfish opsins, which would conveniently break up the crucial lamprey long branch, are not available at GenBank but here the animal has adopted a deep water habitat, meaning that its cone opsin genomic repertoire will be highly reduced, if not gone entirely, in its markedly degenerated eyes, though whatever remains of its opsins could still be informative.


[[Image:OpsineyePhylo.png|OpsineyePhylo.png]]


<tt>Figure adapted from: [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=17684563 Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics] Hervé Philippe et al PLoS ONE. 2007 Aug 8</tt>
[[Image:MoreBilatGenes.png|left|]]


* <span style="color: #990099;">Callorhinchus (elephantshark) chondrichthyes opsins</span>
The impact of adding more genomes is to uncover more genes of the common bilateran ancestor that were masked by lineage-specific losses. Recall the beetle genome Tribolium uncovered 126 additional genes absent in other insect genomes but nonetheless present in human. Humans themselves of course have lost hundreds of genes even relative to the first land animal, so here too we need to pool mammalian and amniote gene pools to reconstruct that ancestor.


Five ray-finned fish genomes are available but these have major lineage-specific expansions and are quite derived. Some sequences are available from lobe-finned fish and a coelacanth genome has been proposed. This makes the [http://www.sciencemag.org/cgi/content/full/314/5807/1892 preliminary genome assembly] of the much earlier diverging Callorhinchus (oft-misspelled) and skate transcripts very special because they are the "last stop" before lamprey.
Model organism choices do not always coincide with genome sequenceability, transcriptome projects, nor (worst of all) with more slow-evolving and less derived species. Finally, most sequencing speaks to narrow anthropocentric interests, whereas the sequencing need more broadly conceived is greatest farther back (to break up long branches). The evolution of the eye needs a rather different portfolio of genomes than a typical human disease gene because of the earlier intrinsic timing of the innovative events. In fact, one product of the investigation here is to spell out these needed genomes. Of course one obvious genome choice is a cubomedusan jellyfish with its 24 eyes of 6 types.


This [http://www.flmnh.ufl.edu/fish/Gallery/Descript/GhostShark/GhostShark.html large-eyed cartilaginous fish] lives at depths to 200 m on the continental shelf of southern Australia and New Zealand but migrates into coastal estuaries to lay egg cases (lower image) in sand and muddy substrates. The distinctively-shaped egg cases are sometimes found washed ashore after storms. They are up to 25 cm long, 10 cm wide, and take up to eight months to hatch. The one studied member of the genus has a vitamin A1-based photopigment with maximum absorbance at 499 nm presumably adapted to its overall photic environment.
It's worth reviewing genome status and recent experimental literature on key species. While abstracts are readily available at PubMed, access to free full text is unpredictable, so those links are collected when available. It suffices to reference only recent articles because those in turn cite the earlier literature, and later citations of those papers are in turn collected by Google Scholar (or AbstractsPlus at PubMed). Most opsin sequences in the Opsin evolution reference sequence collection have a PubMed accession as a field in their fasta header; those can simply be compiled to an active link that opens all of them in one PubMed window.
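Compiling such a link can be sketched in a few lines of Python. This is only an illustration: the base URL and parameters are copied from the compiled link on this page, but the 'PMID:' header field is a hypothetical layout (the actual fasta header format of the reference collection may differ) and the function names are invented here.

```python
import re
from urllib.parse import urlencode

# Base URL and parameters as they appear in the compiled link on this page.
PUBMED_RETRIEVE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"


def pmids_from_headers(headers):
    """Collect PubMed IDs from fasta headers, keeping first-seen order.

    Assumes a hypothetical 'PMID:<digits>' field in each header.
    """
    found = []
    for header in headers:
        for pmid in re.findall(r"PMID:(\d+)", header):
            if pmid not in found:  # drop repeats so each abstract opens once
                found.append(pmid)
    return found


def compile_pubmed_link(pmids):
    """Build one link that opens all the given abstracts in a single window."""
    unique = list(dict.fromkeys(pmids))  # de-duplicate, preserve order
    query = {
        "cmd": "Retrieve",
        "db": "pubmed",
        "dopt": "Abstract",
        "list_uids": ",".join(unique),
    }
    # safe="," keeps the comma-separated ID list readable in the URL
    return PUBMED_RETRIEVE + "?" + urlencode(query, safe=",")
```

De-duplicating matters in practice because the same paper often describes several opsins in the collection.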
 
<br clear = "all">
 
[[Image:OpsineyePhylo.png|left]]<br clear = "all">
Figure adapted from: Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics (H Philippe et al PLoS ONE. 2007 Aug 8)
 
<br clear = "all">
 
=== Deuterostomes moved to separate article ===
 
The key critter article has been broken down into 3 smaller articles -- deuterostomes are now [[Opsin_evolution:_key_critters_(deuterostomes)|here.]]
 
Chondrichthyes: Callorhinchus milii (elephantshark)        13 opsins
Agnatha:        Petromyzon marinus (lamprey)                9 opsins
Agnatha:        Eptatretus burgeri (hagfish)                0 opsins
Urochordata:    Ciona intestinalis (tunicate)                4 opsins
Echinodermata:  Strongylocentrotus purpuratus (sea urchin)  6 opsins
Hemichordata:  Saccoglossus kowalevskii (acornworm)        1 opsin
Deuterostomia:  Xenoturbella bocki + Convoluta pulchra      0 opsins
 
=== Cnidaria and Porifera moved to separate article ===
The key critter article has gotten too large -- cnidaria are now [[Opsin_evolution:_key_critters_(cnidaria)|here.]]
 
Cubozoa: Tripedalia cystophora .. 1 ciliary opsin
Cubozoa: Carybdea marsupialis (jellyfish) .. probable opsins
Anthozoa: Nematostella vectensis (sea anemone) .. claimed opsins
Hydrozoa: Hydra magnipapillata (hydra) .. claimed opsins
Hydrozoa: Cladonema radiatum (jellyfish) .. claimed opsins
Porifera, Placozoa, Choanoflagellates .. 0 opsins
 
== <span style="color: #990099;">Lophotrochozoa: 13 opsins</span> ==
[[Image:Opsin_lopho_larvae.png|left|]]
 
This is a monophyletic group (in the mind of evo-devo practitioners) of bilaterans reflecting a basal split [http://kentsimmons.uwinnipeg.ca/16cm05/1116/16anim5.htm deep within protostomes.] The classification is based both on molecular considerations and a shared larval form with ciliated wheel, in contrast to characters of adult animals such as segmentation.
 
Lophotrochozoa is not recognized at GenBank so blast searches cannot be restricted to Lophotrochozoa. However Entrez and PubMed searches can be so restricted using boolean queries. In terms of genome projects Lophotrochozoa currently consists of 7 species of flatworms, molluscs, and annelids. However, it [http://www.ucmp.berkeley.edu/phyla/lophotrochozoa.html also contains] Brachiopoda, Bryozoa, Entoprocta, Nemertea, Sipuncula, etc, which collectively account for fewer than 3,000 of the 5.7 million nucleotide sequences at GenBank and no annotated opsins.
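One way to assemble such a boolean query is to OR together the member phyla and AND the result with a search term. The sketch below is an assumption-laden illustration: it uses the Entrez [Organism] field tag for sequence databases, the phylum list is simply the one named in this section (not a complete taxonomy), and the function name is made up.

```python
# Member groups named in this section; not a complete list of the clade.
LOPHOTROCHOZOA = [
    "Platyhelminthes", "Mollusca", "Annelida", "Brachiopoda",
    "Bryozoa", "Entoprocta", "Nemertea", "Sipuncula",
]


def entrez_clade_query(taxa, term):
    """Restrict a search term to several taxa with a boolean OR block.

    Assumes the [Organism] field tag; PubMed itself would need a
    different restriction (eg MeSH terms).
    """
    clade = " OR ".join(f"{taxon}[Organism]" for taxon in taxa)
    return f"({clade}) AND {term}"


query = entrez_clade_query(LOPHOTROCHOZOA, "opsin")
```

The resulting string can be pasted into an Entrez search box.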
 
The Lophotrochozoa have not been surveyed as a whole for those that might be 'living fossils' in terms of opsins and photoreceptor structures. Even those would not necessarily make good genome projects because of genome size and compositional issues. However Annelida has been thoroughly considered by Purschke, Arendt et al in a recent offline, off-Pubmed review (Arthropod Structure & Development 35(2006) 211-230).
 
<br clear="all" />
 
 
=== <span style="color: #990099;">Annelida: Platynereis dumerilii (ragworm) .. 3 opsins</span> ===
 
This small annelid may be an emerging model organism, though plans for genome sequencing in France have apparently collapsed. (Indeed all metazoan genomic sequencing in Europe has ceased.) Three [http://www.ncbi.nlm.nih.gov/pubmed/16311335,15514158,11874910 recent papers] have established that Platynereis qualifies as a living fossil, at least with respect to ancestral anatomy and development, slowly evolving protein sequences and retention of genes and ancestral introns, and further has retained ciliary opsins.
 
That is to say, fruit fly and nematode have proven unfortunate choices because of so many lost genes and signalling pathways, rapid evolution and highly derived characters. Lophotrochozoa may thus give us very significant insight into the bilateran ancestor that had appeared lost from consideration of just Arthropoda. It should be noted though that not all insects should be written off; Anopheles has also retained ciliary opsins.
 
[[Image:Opsin_platynereis.png|left|]]
 
Platynereis develops various pairs of eyes going by localization of opsin expression: inverse larval eyes used in phototaxis (just one pigment cell and one photoreceptor cell) and two pairs of everse adult eyes needed for adult vision. These originate from an initially unsplit single anlage. These eyes use exclusively rhabdomeric photoreceptor cells and corresponding rhabdomeric-class opsins as expected from phylogenetic position. However two paired structures in the developing median brain dorsal to the apical organ express an opsin that unambiguously classifies as ciliary. Further, a retinal homeobox (specific to ciliary pineal eyes) and circadian rhythm regulator bmal are also expressed at this location in Platynereis. However the pigment cells necessary for directional photoreception are missing. This all fits with a role for the ciliary opsin as the primary receptor underlying circadian rhythm which does not require directionality.
The [http://www.sciencemag.org/cgi/content/full/306/5697/869#REF5 emerging picture] is Urbilatera having both ciliary and rhabdomeric structures. The latter specialized structure was lost but the photoreceptor component retained in vertebrates in the form of melanopsins expressed in retinal ganglion cells.
 
Remarkably, Platynereis contains a '''second ciliary opsin next to alpha tubulin''': using the initial ciliary opsin (a transcript with unknown intronation) as probe at various GenBank databases, a genomeWiki contributor found a good match in the unannotated contig CT030681, a 171,779 bp survey sequence in the high throughput genomic sequence (HTGS) division (meaning it would be overlooked by a Blast of the nucleotide division). It was submitted 05-DEC-2005 by Genoscope as 6 ordered contigs (the last of which proves reverse-complemented).
 
This second opsin, being genomic, after difficult recovery of the full length gene from a moderate match, could be intronated (unlike the original transcript) assuming GT-AG splice junctions (like 99% of all genes and 100% of all known opsins). These introns had positions and phases identical to ciliary -- but not Go or Gq -- deuterostome opsins. Assuming the first opsin is not derived as a processed retrogene from the second, it can be intronated via homology alignment. These are stored in the <span style="color: #990099;">Opsin Classifier</span> as CILI1_plaDum and CILI2_plaDum, resp.
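The two checks used here, GT-AG splice junctions and intron phase, are easy to state in code. The following Python sketch is illustrative only: the function names are invented and the example values are toy numbers, not the actual Platynereis exons. Phase is taken in the usual sense, the number of coding bases upstream of the intron modulo 3 (phase 0 falls between codons).

```python
def has_gt_ag(intron):
    """True if an intron starts with the GT donor and ends with the AG acceptor."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def intron_phases(coding_exon_lengths):
    """Phase of each intron, given coding exon lengths in transcript order.

    Lengths must count coding bases only (first exon measured from the
    start codon), otherwise the phases are shifted.
    """
    phases = []
    upstream = 0
    for length in coding_exon_lengths[:-1]:  # no intron after the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases


# Toy example: three coding exons of 90, 100 and 50 bases.
toy_phases = intron_phases([90, 100, 50])
```

Two opsins whose aligned intron positions and phases agree are candidates for sharing an ancestral gene structure, which is the comparison made against deuterostome ciliary opsins above.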
 
 
[[Image:opsin_parallels.png|left|]]
 
Using the second opsin as blastp query against our phylogenetically dispersed collection of 225 hand-curated Eumetazoan opsins (including new cnidarian ciliary opsins), it classifies in the encephalopsin-to-pinopsin area in accord with independent classification by intron pattern and close homology with the experimentally characterized Platynereis first opsin. The percent identity to deuterostome opsins is not only quite high (considering the immense round-trip time since common ancestor) but also overwhelmingly concentrated on invariant and near-invariant amino acids characteristic of ciliary opsins. Thus this second Platynereis opsin cannot be a pseudogene (unless that happened yesterday or so).
For purposes of conserved synteny (eg establishing orthology to related opsins in other lophotrochozoan genomes), other coding genes on this contig (found using blastx vs metazoan proteins) can be considered. The only other gene is alpha-tubulin, at positions 124517-122811, downstream from the second ciliary opsin at 46848-87956 using original contig ordering.
 
Recall the Arendt group used antibody to acetylated alpha-tubulin as a marker for stabilized microtubules in cilia and axons. They needed the sequence for that. Probably the larger contig was then sequenced as part of the genome feasibility survey. There was no particular reason to look at this contig for opsins at that time, which would be hard to distinguish from abundant non-photoreceptor rhodopsin-superfamily genes or generic GPCR.
Supposing Platynereis has 15,000 coding genes, this is quite a coincidence to have two genes adjacent that might be critical to the same photoreceptor structure. If these two genes are transcribed divergently (lie on different strands) after fixing (reverse-complementing) the last contig piece, then a symmetric transcriptional regulatory element (DNA that reads the same whichever strand) could mean the second opsin is tethered to alpha-tubulin production in terms of co-expression in some cell types. Transcription in the same direction is less attractive as operons are rare in eukaryotes, though read-through is not unheard of and that too could be developmentally regulated in extent.
Re-assembly of CT030681 using multi-exon bridging is possible. It turned out pieces 1 and 2 were irrelevant, piece 3 had exons 1,2,3 of the opsin on the plus strand, piece 4 had opsin exons 4 and 5 on the minus strand to piece-coordinate 41,899 for the stop codon. This piece also contains the first three exons of alpha tubulin also on the minus strand beginning at 36,767. Its initial methionine is stranded as a solitary phase 0 codon on the end of 5' UTR, 36,707-05. The remaining two exons of alpha tubulin are on the minus strand of piece 5.
 
Joining piece 3 with reverse-complemented pieces 4 and 5 then fixes orientations to the plus strand and establishes intron sizes subject to the two strings of Ns. This results in parallel gene order CILI2_plaDum+ TUBA_plaDum+, that is tubulin downstream of the opsin with an intergenic gap of 5,132 bp. If there is any coordination of expression by read-through, on the upstream end it would have to involve the regulatory regions of the opsin.
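The re-orientation step can be sketched as follows. This is a minimal illustration with invented helper names and toy sequences; the 100-N spacer length is an assumption, since the real gap sizes between contig pieces are unknown. The only real numbers used are the two piece-4 coordinates quoted above.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the opposite strand read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_coordinate(pos, piece_length):
    """Map a 1-based position on a piece to its position after flipping."""
    return piece_length - pos + 1


def join_pieces(pieces, flipped, gap="N" * 100):
    """Join contig pieces in order, reverse-complementing those in `flipped`.

    `flipped` holds 0-based indices into `pieces`; `gap` stands in for
    the unsequenced stretch between adjacent pieces.
    """
    oriented = [
        reverse_complement(seq) if index in flipped else seq
        for index, seq in enumerate(pieces)
    ]
    return gap.join(oriented)


# Piece-4 coordinates quoted in the text (both genes on the minus strand).
OPSIN_STOP = 41899
TUBULIN_START = 36767
intergenic_gap = OPSIN_STOP - TUBULIN_START
```

Because both genes lie on the minus strand of the same piece, the intergenic distance is the same before and after flipping; only the reading direction changes.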
 
The fifth exon of CILI2_plaDum has too weak a match with that of CILI1_plaDum to be found by conventional searching. However the dna where it has to be located is squeezed between exon 4 and the start of tubulin, reducing query size. Blastx of that dna against the full-blown set of opsins turns up a consistent match candidate in frog and skate opsins. Looking at the intron phasing validates the match: the splice acceptor AG is 1 of 16 dinucleotides, the phase 0 required by exon 4 (and ancestral ciliary phase) is 1 of 3 possible phases, and the required strand is 1 of 2, which together have a 1 in 96 chance of random occurrence, more than sufficient in conjunction with the blast expectation of 1.1e-06.
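The 1 in 96 figure is simply the product of the three independent constraints, treating each dinucleotide, phase and strand as equally likely (an idealization that ignores base composition). A short check in Python, with variable names chosen here for illustration:

```python
from fractions import Fraction

acceptor_ag = Fraction(1, 16)   # AG is one of 4 x 4 possible dinucleotides
phase_zero = Fraction(1, 3)     # one of three possible intron phases
right_strand = Fraction(1, 2)   # one of two strands

# Independent constraints multiply.
chance_by_accident = acceptor_ag * phase_zero * right_strand
```

This structural evidence is separate from the blast expectation value and adds to it rather than replacing it.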
 
This opsin if co-expressed with CILI1_plaDum would amount to 'circadian rhythm color vision'. Alternatively it might be expressed at a different developmental stage or in an unsuspected auxiliary photoreceptor.
 
=== <span style="color: #990099;">Annelida: Capitella sp (marine worm) .. 2 opsins</span> ===
 
Capitella is a small segmented benthic marine worm most closely related, in the genome project sense, to its fellow annelid Platynereis. The taxonomy of the genus Capitella was thoroughly muddled by a quaint 1976 starch gel electrophoresis allozyme [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=1257794 study]; Linnean nomenclature has never been developed for the 6 alleged species defined there. The isolate used in the JGI genome project is called Capitella sp. I ES-2005 instead of Capitella capitata.
 
The last of 3,709,316 trace reads were taken in Nov 2005. As with Lottia, a multi-year lag ensued in release of the assembly, deposition in GenBank, and publication of central paper. As of Dec 2008, the only access to the genome is through [http://genome.jgi-psf.org/cgi-bin/runAlignment?db=Capca1&advanced=1 JGI Blast.]  The genome is small at 240 Mb and distributed across 10 chromosomes.
 
This is a subsurface deposit feeder associated with organic-rich mud, seemingly not conducive to an extensive visual system. However an extensive 1993 study of both larval and adult eyes was [http://www3.interscience.wiley.com/cgi-bin/abstract/109920591/ABSTRACT published] in the now-defunct Journal of Morphology (online access $25). Developing larvae have a pair of eyespots consisting of one sensory cell, one pigment cell, and one support cell. The photoreceptor cell has an array of parallel microvilli with cisternae. It is surrounded by a diaphragm formed by a pigment cell ring of microvilli-like structures. These last but a few days because at metamorphosis the larval eyespots are greatly reduced. Adults have one pair of eyes built of 2-3 pigment cells and one sensory cell in juveniles, increased by 2-3 more in adults.
 
Unusual morphological aspects of Capitella eyes can be placed within the overall context of photoreceptor cells and eyes in Annelida, whose ultrastructural issues were carefully reviewed by Purschke in an off-PubMed journal "Arthropod Structure & Development" v35:4, 2006 (viewing issue full text costs $175). In addition to rhabdomeric and ciliary types, less-known [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=8440775,8642076,4183638 phaosomous] photosensory cells are discussed. Phaosomes (Greek: phaos = light, soma = body) were first described in the earthworm dermal photoreceptors as a central intracellular cavity (phaosome) filled with microvilli but may represent a derived form. They occur at various extraocular sites such as dermis and genitalia (in butterflies). Multiple types of photoreceptors thus provide a potential role for the diversity of opsins observed in the genome.
 
It's clear from Purschke's review that understanding photoreceptors requires a combination of ultrastructure, transcript expression mapping, and genomics. In other words, it's necessary to account for all the opsins found in the genome. Many photoreceptors have been overlooked entirely, notably the undirected type (no pigment cell backing); many others have stalled out in controversy for lack of gene availability.
 
I found a number of related opsin fragments in Capitella using various queries but surprisingly no counterpart to Platynereis ciliary opsins. One, stored as MEL1_capCap, clusters consistently with melanopsins and shares two exon breaks. It may be an ortholog of the rhabdomeric Platynereis opsin. The second MEL2_capCap is more distantly related. Reliable full length genes will require a cdna program which so far is totally lacking.
   
   
I made an exhaustive search of the WGS and Trace divisions of GenBank on 5 Nov 2007, recovering many complete exons but most fragmentary genes. The opsin classifier can easily place these fragments. Overall, Callorhinchus appears to have a full complement of vertebrate opsin genes. The exceptions are RHO2, SWS1, SWS2 (oddly also missing in skate and dogfish ESTs) apparently leaving elephantshark with only RHO1 and LWS rod/cone pigments. Parietopsin was also missing. Two encephalopsin- and two melanopsin-class opsins were found. The RGR, peropsin, and neuropsin genes will prove important in better determining their overall gene tree placement (which an October 2007 [http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001054 opsin phylogeny paper] placed deeply within rhabdopsins).
[[Image:Opsin_capitella.png|left|]]
<br clear="all" />
 
=== <span style="color: #990099;">Annelida: Helobdella robusta (leech) .. 2 opsins</span> ===
 
The JGI genome project for the leech Helobdella robusta is well along with 3,168,749 traces, a very recent  assembly to [http://genome.jgi-psf.org/cgi-bin/runAlignment?db=Helro1&advanced=1 blast], but no cdna. The genome is fairly small at 300 Mb but does not appear reduced in terms of gene count. Fifteen unannotated 100 kbp contigs are available at the HTG division of GenBank; these do not contain opsins but might otherwise suggest gene and retroposon densities and extent of synteny retention. The genome had not been submitted to GenBank by Dec 07.
 
[[Image:Opsin_helobdella.png|left|]]
 
Helobdella could be considered a promising [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12888005 emerging] experimental system because techniques such as large-scale whole-mount in situ hybridization screening, RNA interference, and morpholino knock-down are established. It's not clear however that leech retains the same degree of ancestral characters as nereid polychaetes. Until a cdna program is established, it will prove very difficult to annotate complete coding genes. The nearest species with a transcript program is the earthworm Lumbricus rubellus with 19,934 ests (but no opsins).
 
Helobdella is a rhynchobdellid, which is to say (ελεο marsh, ῥύγχος snout, βδελλα leech) a California marsh leech with a muscular straw-like proboscis in a retractable sheath for puncturing prey. Thus it is not closely related to the medicinal leech, Hirudo medicinalis. The anatomy of the closely-spaced single pair of eyes was [http://jcs.biologists.org/cgi/reprint/2/3/341.pdf intensively studied] 40 years ago. An eye in this group consists of 30-100 photoreceptive cells in a deep pigment cup providing [http://www-personal.umich.edu/~davegins/leech.html directional vision.] Larvae are not free swimming but stay in the albuminous fluid of a cocoon. The 88 PubMed articles include many on body plan gene expression but only [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17508218,9518525 two on eyes] and these tangentially. We can only hope the genome project will stimulate additional studies of leech photoreceptors. It seems that every lab uses a different strain if not a [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17073933 different species.]
 
I recovered two Helobdella opsin genes on 4 Dec 07 from the erratic JGI server (if no matches, close and restart with a fresh window). The full length gene, stored as MEL2_helRob, has 2 conserved introns characteristic of melanopsins and finds its best matches there. It is likely an ortholog of a similar gene in Schistosoma, Schmidtea, Capitata, and Platynereis.  The 231 aa fragment stored as MEL2_helRob has best match to octopus and chordate melanopsins and shares the first (and possibly second) intron position and phase with them. The parent scaffold 39 may contain tandem opsins or alternatively represent a misassembly. No counterpart to the ciliary opsin of Platynereis emerged. That gene -- which must have been present in the common ancestor with annelid -- could have been lost or is simply missing from the current assembly.
 
<br clear="all" />
 
=== <span style="color: #990099;">Mollusca: Aplysia californica (sea hare).. 2 opsins</span> ===
 
Aplysia has a pair of cephalic dorsal pit eyes just anterior to the rhinophores. The eyes are quite small at 600 microns diameter, with a spherical lens and a tiny one square millimeter retina with approximately 7000 rhabdomeric photoreceptors. Despite a [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=6520253,430406,1686464,912043,83262,2059192,3001240,17498918 fair number of studies] of eyes and rhinophores involved in vision, circadian rhythm and phototactic head-waving, the opsins have not been characterized beyond immunoblot (positive for retinal photoreceptors, rhinophores, cerebral ganglia and ventral abdominal ganglia giant cell R2). There is evidence for G protein alpha subunits of the Gq, Gi, and Go families, phospholipase C, and an inositol 1,4,5-trisphosphate receptor in the rhinophore but this may be for chemoreception.
 
The sea hare genome has recently been sequenced by the Broad Institute. Sizeable assembled contigs are now open to tblastn at the "wgs" division of GenBank (which allows the exon pattern to be extracted). Despite the assembly, sequencing continues: 212,159 new traces were added in the last week of Nov 07. This illustrates the need to always check the primary data repository when a gene seems missing -- millions of traces might not be used in the assembly. However a close-in query is needed to get a match.
 
I located the first known Aplysia opsin in the 20874 bp contig AASC01108363 on 2 Dec 2007. It had a significant expectation value (e-60) but the best match percent identity within the opsin reference collection (to fellow mollusks) was only 118/319 (36%). Otherwise the best matches are consistently vertebrate melanopsins. This gene is a strong candidate for an invertebrate melanopsin ortholog. It is stored as MOLL_MEL_aplCal.
 
Indeed, there are four exons but precise boundaries are difficult to locate at this low percent identity without cdna or reliably intronated guide sequence from a closely related species. However 2 introns clearly have identical position and phase to vertebrate melanopsins and a 3rd quite likely; otherwise there has been intron loss in Aplysia. The contig unfortunately does not contain any information (according to blastx) on adjacent genes (synteny) despite 10 kbp still available 3'. No counterpart to the Platynereis ciliary opsin could be found.
 
On 28 Dec 07 I located a full length peropsin PER_aplCal, a likely ortholog (from exon breaks and best-blast) to squid retinochrome, which has an excellent [http://www.pnas.org/cgi/content/full/97/26/14263 structural model and counterion study.] The Aplysia peropsin is well-represented with 11 transcripts from pedal-pleural ganglia, CNS (adult and juvenile 1), metacerebral cells, and MCC metacerebral neurons but only terminal exons are found in the assembly. However the cdna provide a window to the trace archives which allows accurate intronation of the full gene.
 
It is not at all clear what relationship these lophotrochozoan peropsins have to deuterostome peropsins, nor why they seem missing altogether in ecdysozoa, nor what their ancestral status is. The 3 molluscan peropsins cluster cleanly enough with vertebrate peropsins but overlap only partially in intron placement. That could result from relatively recent intron gain and loss or reflect a much deeper ancestral splitting of peropsin classes. Representatives of these may survive more completely in echinoderms, hemichordates, and cephalochordates. Peropsin may very well be capable of ciliary opsin type signaling with trans-retinal as agonist.
 
At this point, Aplysia is not a Rosetta stone for opsin evolution. It is however the first mollusk with a genome assembly. This may eventually allow confident transfer of orthology validated by synteny, intron pattern, and indels. The eyes appear homologous in many aspects to those of Arthropoda, supporting the common ancestor of Protostomia having rhabdomeric lensing eyes, though true across-the-board homology of all eye components is a very complex subject.


* <span style="color: #990099;">Petromyzon (lamprey) agnathan opsins</span>
[[Image:Opsin_aplysia.png|left|]]
<br clear="all" />


Lamprey are the favored outgroup to jawed vertebrates. Geotria and Petromyzon split 280-220 myr ago (helpful in breaking a long branch in half) whereas Lethenteron-Petromyzon was much later 20 myr (unhelpful, too close). Their photoreceptor systems have been studied in [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=AbstractPlus&list_uids=9427550,17463225,15683562,15312025,11774340,9210581,9082962,2776863,2834011 considerable depth].  
=== <span style="color: #990099;">Mollusca: Lottia gigantea (limpet) .. 2 opsin</span> ===
The limpet Lottia gigantea was intended to be the first lophotrochozoan for whole genome sequencing but that goal slipped. It has ancestral-like spiral cleavage and trochophore larva. The genome is small relative to other molluscs at 500 mbp. Some 5.3 million traces were sequenced by May 2005.  In Jan 2007 the sequencing center presented the genome at a meeting talk. However by Dec 2007 no paper had appeared. Recently JGI enabled [http://genome.jgi-psf.org/cgi-bin/runAlignment?db=Lotgi1&advanced=1 blast of the assembly] and display on their funky browser. However nothing was submitted to Genbank. JGI predicts 4 rhodopsins for its KOG gene collection; however none are recognized by the Opsin Classifier. No transcripts are available, though other molluscs have numerous ests. A German group suggests that the genome sequenced was in fact Lottia scutum.


Geotria australis has a full complement of imaging opsins LWS, SWS1, SWS2, RHO2, and RHO1 as well as the other major opsin classes. This implies that the ancestral vertebrate possessed photopic (bright light) cone-based vision with the potential for pentachromacy, circadian rhythm, pupillary reflex, and pineal and parapineal functions. Photoreceptor morphology and spectral sensitivity can change during various phases of the lamprey lifecycle, a lineage-specific complication that does not concern us.
Under these circumstances, I annotated two Lottia melanopsins in Dec 07, MEL1_lotGig and MEL2_lotGig. Their best match is to other Gq-coupled molluscan opsins, with the first probably an ortholog. Both genes have 3 exons with the two splice positions and phases identical to those of melanopsin (which in vertebrates has numerous other introns). A long run-on carboxy terminus is also seen here. It needs to be established whether these introns are ancestral generic GPCR introns or diagnostic and informative of melanopsins as a gene class. No counterpart to the ciliary opsin of ragworm was immediately apparent.


The lamprey genome project has stalled, despite accruing an impressive 19 million traces. It seems a 16 kbp retroposon has expanded enormously, making assembly all but impossible. However new sequencing technologies make an immense collection of cDNA affordable. This would permit a pseudo-assembly of exons flanked by at least some genomic dna. That is, transcripts are aligned into the trace archives to obtain non-coding context for exons. This allows some topics to be studied (intron retention, invariant non-coding) but not others (upstream regulatory, chromosomal gene order). Labelling techniques such as FISH could provide some linkages but not gene-level order; perhaps this could be done at the level of BACs using exons from all possible gene pairs. Alternatively, the genome of Geotria might present fewer issues.
On 28 Dec 07, I recovered a peropsin, PER_lotGig, very likely orthologous to a peropsin in squid (called retinochrome there) and Aplysia (PER_todPac, PER_aplCal). Extensive structural and experimental evidence is [http://www.pnas.org/cgi/content/full/97/26/14263 available for squid] which likely transfers over, notably the  Glu181 counterion [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=14981504 proposed ancestral]. The Lottia and Aplysia peropsins are intronated identically and by inference the squid. However these differ in some respects from chordate peropsins, suggesting either intron gain or loss or alternately a small 'cloud' of ancient peropsins that were intronated slightly differently in early metazoa.  


The high lamprey trace coverage is very helpful to this project. Because most of the opsin sequences are from non-genomic species, the intron structures were not known. I've mapped all GenBank opsin data for Geotria and Lethenteron into Petromyzon using these traces, not only obtaining a nearly full spectrum of new sequences but also sequences parsed for intron breaks and phase. Further, opsin classes not available in other lamprey were collected  using chondrichthyes genes as query. In all cases, intron placement remains perfectly conserved from lamprey to mammal, indeed to amphioxus with some complications. These could help validate orthology candidates in earlier species, especially to clarify the presumably ancient but still cryptic origin of imaging opsins. Intron gain and loss is rare in many lineages -- the vast majority of human introns are conserved back to Urbilatera, indeed to sponge.
Lottia is not emerging as a model organism. There are only a handful of studies at PubMed and none on vision. The adult limpet has a pair of eyespots at the base of its cephalic tentacles that likely house a rhabdomeric opsin, perhaps the one annotated here. There may be a second role for paired eyespots in the free-swimming larva for those five days ([http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=1088535&blobtype=pdf thoroughly reviewed] for chiton trochophores by Arendt and Wittbrodt but not Lottia specifically). Circadian rhythm might involve an additional opsin. The adult is an algal gardener that clears and defends intertidal areas -- raiding limpets are sensed (visually?) and driven off. The opsin sequence found here, stored as MOLL_MEL_lotGig, suggests rapid divergence rather than living fossil character. However patellogastropods such as Lottia with symmetrical non-coiled, conical shells are sometimes taken as ancestral form.


* <span style="color: #990099;"> Eptatretus (hagfish) agnathan opsins</span>


Hagfish, after decades of back-and-forth, are now [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=AbstractPlus&list_uids=9866205,12927130,17276090,11820840,17301331,1129982,17377535,15288047,7566650,17051155 sistered] with lamprey, news not accepted yet by the Taxonomy division of GenBank. However including rogue taxa such as Oikopleura can severely skew results in molecular phylogeny studies. Monophyly of Cyclostoma is unfortunate if true -- an earlier node would very much help in understanding the origin of the eye. It would be better to put aside tree topology until rare genomic events can be developed and simply proceed with opsin sequencing.
[[Image:Opsin_lottia.png|left|]]
<br clear="all" />


Jawless fish first appeared in the Ordovician. Hagfish and lamprey split well after the Cambrian, roughly 430 myr ago according to molecular clocks. That's a time span comparable to divergence of human from shark. The oldest fossil hagfishes are Late Carboniferous (330 myr). The two extant hagfish groups split some 75 myr ago (human from mouse). Only recently has it been possible to obtain hagfish eggs and embryos and revisit the neural crest issue. Hagfish experienced an extra round of HOX gene expansion, undercutting both HOX cluster copy number as a hallmark of supposed vertebrate body plan innovation and premature speculation on 1R and 2R whole genome duplication.  
=== <span style="color: #990099;">Platyhelminthes:  Schmidtea mediterranea (planaria) .. 1 opsin</span> ===


Hagfish are nocturnal in aquaria and deep-sea in their natural habitat -- a new Eptatretus species was even captured at a hydrothermal vent. This lifestyle is not conducive to a well-preserved imaging opsin portfolio, though hagfish still have circadian rhythm (based in the preoptic nucleus) and dermal photoreceptors though no pineal gland.  The non-imaging paired eyes lack cornea, lens, vitreous body, and extrinsic eye muscle but nonetheless the retina and optic nerve react with opsin antibody. The eyes are larger in Eptatretus than in Myxine, where they are partly covered by the trunk musculature. However 1.3 mm is still quite small for an eye. After comparison of all extant genera, Fernholm and Holmberg concluded in 1975 that the hagfish eyes are secondarily degenerated from more conventional eyes adapted for shallow water (for example an early lens placode disappears). The comparative anatomy of hagfish eyes has an excellent discussion in The Biology of Hagfishes (JM Jorgensen), pages 542-543 available by google book search.
The common planaria Schmidtea mediterranea has an 865 Mb genome very recently assembled from 17 million traces to 10x and placed in the wgs division of GenBank, after an initial impasse attributable to high AT (69%), repeat content (46%) and high clonal heterozygosity. The genome project is described in a [http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/PlanarianSEQ.pdf white paper] and has a dedicated site [http://planaria.neuro.utah.ed SmedDb.] It has a strong EST collection as well.
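
The coverage figure can be sanity-checked with the usual relation coverage = traces x read length / genome size. The sketch below is illustrative only: it assumes every trace went into the assembly and that all traces have one uniform usable length, neither of which is stated above, and the function names are made up for this example.

```python
def implied_read_length(genome_bp, n_traces, coverage):
    """Mean usable read length (bp) implied by coverage = n_traces * read_len / genome_bp."""
    return coverage * genome_bp / n_traces


def expected_coverage(genome_bp, n_traces, read_len):
    """Fold coverage expected if every trace contributes read_len usable bases."""
    return n_traces * read_len / genome_bp


# Figures quoted in the text for Schmidtea: 865 Mb genome, 17 million traces, 10x.
print(round(implied_read_length(865e6, 17e6, 10)))     # about 509 bp per trace
# Assumption for illustration: 500 bp of usable sequence per trace.
print(round(expected_coverage(865e6, 17e6, 500), 1))   # about 9.8x
```

An implied read length near 500 bp is in the range expected for capillary traces, so the quoted numbers are mutually consistent under these assumptions.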
 
The planarian central nervous system consists of a bilobed brain and two longitudinal ventral nerve tracts connected by commissural neurons. When planarians are decapitated they can completely regenerate a new brain, including new eyes, a boon to [http://www.pnas.org/cgi/pmidlookup?view=long&pmid=10781056 opsin research.] The structure of the eye had already been described by 1915. Regeneration of the nervous system is a [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids17390146,10220416,17881371,16439195 very active] research area.
 
I began with various fragmentary opsins and ESTs and recovered a nearly complete melanopsin (including all introns) from trace archives. It is stored at the Opsin Classifier as RHAB_schMed and discussed in the Schistosoma section as a likely ortholog. Since the site of expression is known from hybridization and no other Schmidtea opsins are apparent, this is likely the principal photoreceptor both here and in Schistosoma. No counterpart to the Platynereis ciliary opsins can be found in the current assembly, indicating (since they could hardly have been invented in Platynereis) that their loss in Platyhelminthes is a derived condition.
   
   
No hagfish opsins have been sequenced; no genome project is scheduled.  Even if hagfish imaging opsins are mostly gone, other ciliary and rhabdomeric opsins could be quite informative. Hagfish may have information about a critical era in deuterostome imaging opsin evolution.
 
[[Image:Opsin_planaria.png]]
<br clear="all" />


=== <span style="color: #990099;">Platyhelminthes: Schistosoma mansoni (trematode) .. 3 opsins</span> ===
 
The blood fluke Schistosoma mansoni is a major agent of [http://en.wikipedia.org/wiki/Schistosoma_mansoni schistosomiasis (bilharziasis),] infecting more than 200 million people worldwide, with the fresh water snail (Biomphalaria glabrata -- a large EST project) as intermediate host. As an endoparasite residing deep inside lungs, hepatoportal circulation, and mesenteric veins, it would not seem a promising species for eyespots or even circadian rhythm opsins. However at least two life stages are affected by light: the hatching of the miracidium from the egg and emergence of cercaria from the snail. These swim upwards to the surface of the water and are also affected by shadows and turbulence.
 
GPCR proteins are the target of approximately half of all pharmaceuticals. For that reason, a Schistosoma opsin came to be [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=11166392 studied.] That gene is expressed in the [http://www.path.cam.ac.uk/~schisto/SchistoLife/Cercaria.html miracidia and cercaria stages] but down-regulated in the adult. Expression is localized to sub-tegumental structures at the front end of cercariae. Full text of the 2001 article remains locked behind a sick commercial firewall, as does a 1975 electron microscopy [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=1117374 study] of photoreceptor lamellae seen as extensions of modified cilia.
 
Version 4.0 of the genome is [http://www.genedb.org/genedb/smansoni/ readily available] for blast, though it is missing from GenBank, as are two million of the 3.8 million total traces (7x), despite NIAID funding. It's unclear whether the [http://bioinfo.iq.usp.br/schisto/ extensive EST set] of 31000 assembled sequences is available there.  The Schistosoma genome is approximately 270 Mb with low GC content (34%), moderate retroposon levels, and an estimated 15-20,000 coding genes.
 
I determined the intron structure of the published opsin gene (called MEL1_schMan in the opsin classifier) which classifies with melanopsins. Using this as probe, a second full length paralogous opsin MEL2_schMan was annotatable. While percent identity was only 46%, the intron structure and alignment classification were identical. Possibly this second gene has a role in the miracidium, though the first gene is expressed in both stages, more compatibly with "two color non-imaging" eyes. MEL3_schMan is similarly intronated and fairly diverged.
 
The first opsin is more closely related in sequence to the sole known opsin in Schmidtea, RHAB_schMed, where it possibly plays a homologous role. As queries, these proteins turn up closest matches at GenBank EST in other platyhelminthes. These observations do not support the notion of horizontal gene transfer of opsins from the host snail, another Lophotrochozoan which by itself might favor sequence clustering. It would be feasible to explore synteny in both platyhelminthes.
 
I investigated conservation of intron position and phase using the reliably intronated match with either MEL1_gasAcu of stickleback minnow (or equally MEL1a_braFlo of amphioxus). Here the percent identity is fairly low (39%)  but enough patches of good matching suffice to reliably anchor the alignment. There is perfect agreement of the first three intron positions and phases, below.
 
This is strong evidence for a very deep connection, that is, vertical descent of these genes from a common ancestor (ie orthology), because these introns are highly specific to melanopsin within the opsin superfamily, ie are not generic GPCR introns, as seen from the total mismatch to Ixodes, Apis, and vertebrate ciliary opsins. These same introns are predicted for opsins from transcript species such as LOPH_RHO_plaDum (Platynereis dumerilii) and MOLL_MEL_patYes (scallop). It remains to be demonstrated that all these melanopsins play a conserved consistent homologous role.
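
For readers who want to repeat this kind of comparison, here is a minimal sketch of how intron position and phase can be read off a gene model and compared across two genes. It is not the procedure used for the annotations above. The function names, exon lengths, and alignment map are all hypothetical toy values, and phase is taken in the usual sense of the number of bases of a split codon lying upstream of the intron.

```python
from itertools import accumulate


def intron_sites(coding_exon_lengths):
    """Return (codon_index, phase) for each intron of one gene.

    codon_index counts complete codons upstream of the intron;
    phase is the number of bases of a split codon left upstream (0, 1 or 2).
    """
    exon_ends = list(accumulate(coding_exon_lengths))[:-1]  # no intron after the last exon
    return [(end // 3, end % 3) for end in exon_ends]


def conserved_introns(sites_a, sites_b, codon_map):
    """Introns of gene A that match an intron of gene B in aligned position and phase.

    codon_map maps a codon index in A to the aligned codon index in B,
    as read off a protein alignment.
    """
    b = set(sites_b)
    return [(pos, phase) for pos, phase in sites_a
            if (codon_map.get(pos), phase) in b]


# Hypothetical coding exon lengths in bp, NOT the real opsin gene models.
gene_a = intron_sites([100, 203, 150])   # [(33, 1), (101, 0)]
gene_b = intron_sites([112, 203, 90])    # [(37, 1), (105, 0)]
# Hypothetical alignment: gene B carries 4 extra residues at its amino terminus.
codon_map = {33: 37, 101: 105}
print(conserved_introns(gene_a, gene_b, codon_map))  # [(33, 1), (101, 0)]
```

Requiring agreement in both aligned position and phase is the stricter test; a match in position alone is easier to obtain by chance when percent identity is low and the alignment is loosely anchored.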
 
 
 
[[image:Opsin_loph_mel_introns.png|left|]]
<br clear="all" />
 
== <span style="color: #990099;">Ecdysozoa  .. 5-42 opsins</span> ==
 
This group, which includes insects, other arthropods, and species immediately basal to them, is taken here as the other wing of Protostomia, ie as sister group to Lophotrochozoa. The focus here is on new genomes which have not been so extensively explored as say Drosophila, especially on species that might contribute to reconstruction of the last common ancestor to Ecdysozoa (resp. Protostomia and Urbilatera). Opsins in genomic species have the advantages of determinable exon breaks and flanking syntenic genes, and so better prospects for establishing accurate homological relationships.
 
 
=== <span style="color: #990099;">Panarthropoda: Hypsibius dujardini (water bear) tardigrade 0 opsins</span> ===


A 5x genome project for Hypsibius dujardini, a member of a phylum of microscopic ecdysozoans, was approved in July 2007 but Broad has not yet begun trace reads on the small 70 mbp genome (suggesting densely spaced genes with small introns as this is not likely highly derived). It could prove very useful for opsins as [http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/EcdysozoaProposalFinalPDF.pdf tardigrades] are basal to all of Arthropoda and so shed light on that last common ancestor. In fact with accompanying centipede, horseshoe crab, amphipod, and priapulid genomes, the whole ecdysozoan ancestor will be accessible.
Nothing is currently known about photoreception or opsins in tardigrades -- or even whether they have eyes. It looks like we can expect at minimum some rhabdomeric opsins in front of these pigment cups. However the current GSS and EST collections (about 6000 sequences) do not contain any convincing matches using various rhabdomeric and ciliary opsins as tblastn queries. Tardigrade photos and movies provided by [http://www.bio.unc.edu/faculty/goldstein/lab/willow.html Goldstein Lab]


[[Image:tardi.png|left|]]
<br clear="all" />


 
=== <span style="color: #990099;">Chelicerata: Ixodes scapularis (tick) 1 opsin</span> ===


The genome project was completed long ago but has experienced a multi-year bottleneck in assembly release and publication. However contigs built from 19.4 million traces should become available to tblastn of the GenBank "wgs" division by late 2007. Ixodes has a very conservative genome (regrettably 2.1 gbp in size), seemingly far less derived than drosophilids in matters such as intron and gene retention and protein sequence conservation. This, in conjunction with the helpful phylogenetic position of chelicerate outgroup to the many insect genomes, has improved prospects for reconstructing the ancestral opsin repertoire of Arthropoda and eventually Protostomia and UrBilatera.


A large collection of annotated Ixodes ESTs is available at the [http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gireport.pl?gudb=i_scapularis DFCI Gene Index] of which 3 are marked up (2 wrongly) as opsins. Using the [[Opsin_evolution|Opsin Classifier]], I recovered the full length gene for the first of these, TC19272, on 24 Nov 07, intronated the transcript at the Trace Archives (4 introns, superb coverage), and added it to the [[Opsin_evolution|classifier fasta collection]] as RHAB1_ixoSca. It classifies with rhabdomeric opsins (ie with deuterostome melanopsins) with a very respectable 57% maximal percent protein identity. The second and third intron have classical ancestral position (following GWSR and LAK) and phase (2 and 0). Synteny awaits assembly of large contigs -- adjacent exons are not spanned by single traces. An apparent ciliary opsin fragment in Ixodes was located using that of Platynereis dumerilii as probe; it is stored as CILI_ixoSca but needs further analysis.


=== <span style="color: #990099;">Crustacea: Daphnia pulex (water flea) .. 1-37 opsins</span> ===


An 8.7x genome assembly was released in July 2007 at [http://www.jgi.doe.gov/Daphnia/ JGI] with further support at [http://wfleabase.org/ wFleaBase]. This crustacean provides a potentially [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=17612412 important outgroup] to insects (together forming Pancrustacea). However the opsin story, summarized in a [https://dgc.cgb.indiana.edu/display/daphnia/Carla+Caceres meeting abstract], is an embarrassment of riches, not conducive to deducing ancestral arthropod genome content. The total number of opsin genes came in at 37, comprising 22 rhabdomeric opsins (mostly long wavelength), 7 ciliary opsins (pteropsins), and 8 in a novel family without close affiliates. This seems excessive but Daphnia has ommatidia (compound eyes), circadian rhythms, and a need to assess water turbidity and depth. Planned in situ hybridization studies may illuminate biological roles of these opsins. The pteropsins are probably of most interest here.
[[Image:Opsin_daphniaJGI.png|left]]


Gene models have not been submitted yet to GenBank but are likely extractable by text query at wFleaBase. What is needed here however is not the clutter of 37 sequences but their collapse into UV, blue, long, pteropsin, and novel ancestral representatives. This would remove 'noise' from lineage-specific expansions. The intron structure could provide very important support to classification schemes.  


The expansions may have arisen through retroprocessing (rather than segmental duplication) of a few master exonic genes, which would then be the orthologs to other arthropod opsins. Indeed the intronation pattern -- typically far more [[Ancestral_introns:_SGSH|deeply conserved]] than protein sequence -- could link pteropsins more convincingly to lophotrochozoan and deuterostome opsins than alignments with percent identities in the 20's.


This blast twilight zone is especially dangerous for photoreceptor opsins because they are embedded in a much larger gene family of generic rhodopsin and GPCR which share many structural and signaling properties. A slowly evolving generic rhodopsin might well score higher than fast evolving photoreceptor opsins. Gene expansions are noted for markedly enhanced rates as copies neo- or subfunctionalize. The generic rhodopsin might also share diagnostic residues through convergence, at least at the level of statistical significance ambiguity. Consequently intron location/phase and synteny can provide important backup.
The synteny circle surviving at this phylogenetic depth will be local (optimistically Pancrustacean). That is, the blue opsin of Daphnia might be in synteny with Drosophila (ie establish orthology) but not with Platynereis ciliary opsin, much less any vertebrate opsin (eg encephalopsin). This could be remedied to some extent by ancestral gene order reconstruction. The degree to which synteny can contribute to validating orthology relations within opsins is not currently known.


* <span style="color: #990099;">Carybdea marsupialis (jellyfish) cnidarian no opsins yet</span>
I found one ciliary opsin for Daphnia in the process of expanding the known one in Anopheles to crustacea. Stored at the Opsin Classifier as ENCEPH_dapPul, it is potentially an ortholog, as are new ciliary opsins from Culex, Aedes, Tribolium, and Bombyx. However this gene and potentially its photoreceptor structure are missing in Drosophila, Nasonia, and other genomes.
Cnidarians are the earliest diverging invertebrates with multicellular light-detecting organs, called ocelli. Photodetectors include simple eyespots, pigment cups, complex pigment cups with lenses, and camera-type eyes with a cornea, lens, and retina. These remarkable eyes are located on sensory clubs called rhopalia, with four lining the bell. Each houses six eyes: a pair of pit ocelli, a pair of slit ocelli, and two unpaired lens eyes with counterparts to cornea, cellular lens and retina of ciliated photoreceptors. Anatomically, ocelli have bipolar sensory photoreceptor cells interspersed among nonsensory pigment cells, with the apical end making the light-receptor and the basal end forming an axon that synapses with second-order neurons to form what amounts to ocular nerves.
<br clear="all" />


The spectral sensitivity of neritic (near-shore) lens eyes of a box jellyfish, Tripedalia cystophora [http://jeb.biologists.org/cgi/content/full/209/19/3758#REF17 recently considered] by MM Coates et al was interpreted as a single vitamin A-1 based opsin with peak sensitivity near 500 nm (blue-green). However nothing was sequenced.
=== <span style="color: #990099;">Panarthropoda: Tribolium castaneum (flour beetle) .. 3 opsins</span> ===


The most striking jellyfish from the perspective of a complex set of eyes is Carybdea marsupialis, as [http://www.biology.appstate.edu/faculty/martinvj.htm reviewed by VJ Martin]. Antibody studies based on vertebrate cone/rod opsins are doubtful because of cross reactivity to generic GPCR proteins; again no opsins have been sequenced yet. This would make a great genome to study provided the retroposon and base composition are not unwieldy. Nematostella and Hydra, whatever their other genomic merits, sit in the Anthozoa and Hydrozoa respectively, types of cnidarian lacking elaborate visual systems.
[[Image:TriCasEyes.png|left|]]


Vision has roles in the reproduction and feeding of cubomedusae, which can find each other and chase, catch, and eat teleost fish. A patch of Pelagia noctiluca 10 square miles in extent and 35 feet deep recently destroyed a salmon farm off Northern Ireland.
The red flour beetle, which is highly dark-adapted in lifestyle, has lost its blue opsin according to both the [http://www.nature.com/nature/journal/v452/n7190/full/nature06784.html newly published genome project] and specialized experimental querying, retaining the other two ancestral color vision opsins and encephalopsin (which is called pteropsin in insects though likely a strict ortholog). The Tribolium genome article [http://www.nature.com/nature/journal/v452/n7190/extref/nature06784-s1.pdf 110 page supplemental] contains an excellent Table S14 of all known genes involved in insect eye development.


* <span style="color: #990099;">Amphimedon (sponge) and earlier metazoan lack opsins</span>
Insect opsins are expressed non-uniformly across individual eye units (ommatidia) within compound eyes. In Drosophila, six peripheral photoreceptor cells R1-R6 express LW opsin which detect brightness, projecting into the upper optic neuropil (lamina). Central photoreceptors R7 and R8 provide color vision via UV, blue, and LW opsins that project into the second (medulla). The dorsal rim area ommatidia are modified to detect polarized light.
Sponges lie at the base of multicellular animals. They are not noted for eyes. However demosponge larvae do exhibit phototaxis (shadow seeking under coral rubble) but the action spectrum is [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=11976887 reportedly] a better fit to a flavin or carotenoid chromophore. The genome of Amphimedon queenslandica has been available for years at the Trace Archives but never assembled. The species was formerly called Reniera spp. and it is still carried under that name at [http://www.jgi.doe.gov/sequencing/why/CSP2005/reniera.html JGI Genome]. Consequently tblastn of contigs is not available without do-it-yourself assembly.


The Oakley group reported [http://www.plosone.org/article/info:doi/10.1371/journal.pone.0001054 searching] for sponge opsins but finding only "non-opsin, rhodopsin-class GPCR genes" from Amphimedon. Similarly, no opsins were located in the even earlier diverging placozoan Trichoplax, choanoflagellate Monosiga, and fungal genomes. This fits a picture of photoreceptor opsins first appearing subsequent to sponge in eumetazoa cnidarians. However these were not de novo genes but rather evolved out of the already-rich cauldron of GPCR gene copies.  
The [http://www.frontiersinzoology.com/content/4/1/24 comparative genomics of ommatidia number and opsin utilization] is indicated in the figure. Opsin gene loss raises different issues, namely replacement, from the more familiar gene gain issues (differential rewiring). After discussing various sequential mutational scenarios and the necessity of each step being adaptive or at least near-neutral, Jackowska et al settle upon expansion of LW opsin expression into all photoreceptor cells, resulting in co-expression with blue opsin in some R8 cells and UV-opsin in R7 cells. This is followed by loss of expression or pseudogenization of blue opsin. Although co-expression defeats the purpose (via spectral summation) of separate opsins that enable color vision, there are precedents in butterflies and (typically nocturnal) vertebrates.


Some later diverging species such as the model organism C. elegans lost all of their opsin genes, making them useless in Urbilateran ancestor reconstruction. This argues for much more intensive genomic sampling so as to sidestep the widespread problem of gene loss in model organisms.
[[Image:Opsins_tribolium.png|left|]]
 
It's also known how Apis and Manduca (also genome project species) end up with nine photoreceptor cells per ommatidium instead of eight -- it's due to duplication of R7 cell fate (across all ommatidia). That raises the interesting question of whether such cell duplication simply results in duplication of opsin expression at the molecular level. That's not quite the case today because the two central R7-like cells exhibit differential opsin expression. It's not known whether additional mutations were needed to attain this.
 
In summary, insect genomes are fairly straightforward in terms of their contribution to establishing the ancestral arthropod visual system, but their real value lies in the extensive comparative data available within Insecta, ecological studies of adaptive vision, and the experimental genetic opportunities within Drosophila (eg a recent article exploring [http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0060097 deviations] from ommatidia expressing but a single opsin). However no single insect genome can serve all purposes because of gene loss (eg ciliary opsins in Drosophila).


* <span style="color: #990099;">Ciona (tunicate) urochordate opsins</span>
That's also the case for non-opsin GPCR, which have gained a new importance given the possible paraphyly of the opsin gene tree (ie some opsin gene duplicates may have given up retinal to signal via other agonists). Here we are fortunate to have a [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=18054377 genome-wide inventory] of neurohormone GPCRs in Tribolium. This turns up 20 biogenic amine GPCR (21 in Drosophila, 19 in honey bee), 48 neuropeptide GPCR (45 in Drosophila, 35 in honey bee), and 4 protein hormone GPCRs (4 in Drosophila, 2 in honey bee), with likely ligands for 45 of the 72 Tribolium GPCR. The flour beetle retains an ancestral vasopressin GPCR and cognate peptide, unlike other studied insects which are not adapted to such an extremely dry environment. On the other hand, Tribolium lacks allatostatin-A, kinin, and corazonin. This covers comparative genomics of 340 million years of insect GPCR evolution -- it is very common for new agonist/receptor couples to arise and old ones to disappear. Again we see genome sampling density will need to be high to sort out Urbilatera.
<br clear="all" />


* <span style="color: #990099;">Branchiostoma (amphioxus) cephalochordate opsins</span>
=== <span style="color: #990099;">Panarthropoda: Pediculus humanus (louse) .. 3 opsins</span> ===


* <span style="color: #990099;">Strongylocentrotus (sea urchin) echinoderm opsins</span>
[[Image:Opsin_louse.png|left|]]


* <span style="color: #990099;">Saccoglossus (acorn worm) hemichordate opsins</span>
The body louse genome, being favorably small at 108 Mbp, is well along with 2.2 million traces and a contig assembly hopefully disentangled from its [http://aem.asm.org/cgi/content/full/73/5/1659 endosymbiont bacterium.] Sequencing is [http://www.vectorbase.org/sections/Docs/org_docs/phumanus/BodyLouseGenomeWhitePaper.pdf medically motivated.] The lifestyle of this hemimetabolous (nymph-like adult, no pupal stage) insect does not suggest a full spectrum of metazoan photoreceptors; indeed we shall find but 3 opsins. Even that seems a lot for a [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=3174177 single lateral ocellus of 130 rhabdomeric photoreceptor cells] lacking Semper and dedicated pigment cells. The broader interest here is intronation and synteny of these opsins (hence orthology), not available in many insects with opsin studies. It requires quite dense sampling to get ancestral introns for each arthropod opsin class because high rates of intron gain and loss can occur.


* <span style="color: #990099;">Xenoturbella plus Convoluta no opsins yet</span>
I reconstructed 3 multi-exon louse opsin genes on 24 Dec 07 by tblastn of numerous queries against GenBank wgs database division. These apparent rhabdomeric imaging opsins are stored in the Opsin Classifier as INSE_LWS_pedHum, INSE_UVV1_pedHum, and INSE_UVV2_pedHum. Louse otherwise seems a gene loss story in terms of relic ciliary opsins or even melanopsins so not especially favorable for retention of ancestral characters. The new opsins potentially provide trichromatic color vision to the louse in the short, blue, and long wavelength photoreception regimes, though lambda max awaits experimentation as the second ultraviolet opsin could be either re-tuned or co-opted for some other function, [http://jeb.biologists.org/cgi/content/full/208/12/2347 as in bumblebee] where a UV opsin is expressed in proximal lamina rim, antennal lobe, central complex and protocerebrum clusters. That seems likely because INSE_UVV2_pedHum is back to ancestral tyrosine in (bovine rhodopsin) position E113 whereas true ultraviolet insect opsins all specify phenylalanine here (which relaxes lambda max into the ultraviolet, ie closer to that of free retinal).


* <span style="color: #990099;">Schmidtea (planaria) opsins</span>
CA Hill of the louse genome annotation team discussed 3 opsins back in a June 2007 email session, calling PHUM001073 perhaps an ultraviolet opsin while rejecting a fourth PHUM000074. These gene models are not released to GenBank nor is that terminology used in the meagre search capabilities of [http://phumanus.vectorbase.org/SequenceData/Genome/ P. humanus VectorBase.] Upon whole proteome file download, PHUM001073-RA turns out to be an unintronated dna fragment matching residue 44 to stop codon of INSE_UVV1_pedHum. PHUM000074-RA has nothing to do with opsins. PHUM005795-RA is missing the first 49 residues of INSE_LWS_pedHum but otherwise identical. PHUM001044-RA is a fragment beginning at residue 55 of INSE_UVV2_pedHum. In short, it's hard to find full length genes without benefit of the Opsin Classifier, cdna, or ab initio gene predictor.


* <span style="color: #990099;">Platynereis (polychaete) lophotrochozoan opsins</span>
[[Category:Comparative Genomics]]


* <span style="color: #990099;">Capitella (annelid) lophotrochozoan opsins</span>
=== <span style="color: #990099;">Panarthropoda: Rhodnius prolixus (kissing bug) .. 2 opsins</span> ===


* <span style="color: #990099;">Lottia and Aplysia lophotrochozoan opsins</span>


* <span style="color: #990099;">Nematostella (anemone) cnidarian  opsins</span>
Yet another genome project completed long ago at the trace level but sitting around unassembled. In August 2008 some 6,879,098 trace reads and 16,284 EST sequences were available. This number of traces is more than adequate for a good assembly but for now, opsins must be fished out exon by exon using blastn at the trace archives.


* <span style="color: #990099;">Hydra (hydra) cnidarian opsins</span>
This delay is unfortunate: Rhodnius prolixus is a large blood-sucking hemipteran insect that carries a parasitic protozoan (Trypanosoma cruzi) responsible for Chagas disease, transmitted through bites around the eyes and mouth. Chagas disease is a currently incurable tropical disease that damages the heart and nervous system. Rhodnius is nocturnal, becoming active at night, with possible implications for its opsin repertoire. It is found in South and Central America, primarily in domesticated rural areas; the disease currently affects 16-18 million people and kills around 20,000 annually. Darwin is [http://en.wikipedia.org/wiki/Charles_Darwin's_illness sometimes claimed] to have suffered from Chagas disease as a result of a bite (implausibly in northern Argentina) reported in the Voyage of the Beagle diaries.


Rhodnius clearly has two distinct opsins but no apparent ciliary pteropsin or melanopsin. The first is a long wavelength sensitive gene most closely related (84% identity) to Tribolium but whose intronation pattern is closest to Apis (a phase 00 intron is missing in Rhodnius). The second Rhodnius opsin classifies with insect UV opsins, most closely to that of Apis.
<pre>
>INSE_LWS_rhoPro Rhodnius prolixus (kissing_bug) frag missing first and last exons
0  2
1 YQFPPLNPLWHGILGFVIGVLGIISIVGNGMVIFIFSSTKTLRTPSNLLVVNLAFSDFLMMFTMSPPMVINCYNETWVL 1
2 GPLMCELYGMLGSLFGCASIWTMTMIALDRYNVIVK 0
0 GISAKPMTNKTAMLRILLVWAFSIMWTVFPFFGWNR 2
1 YVPEGNMTACGTDYLTKNWVSRSYILVYSVFVYFLPLFTIIYSYFFILQ 0
0 AVSAHEKQMREQAKKMNVASLRSAEAANTSAEAKLAK 0^0 VALMTISLWFMAWTPYLVINYSGIFETISISPLFTIWGSLFAKANAVYNPIVYAIR 2
1  *0


>INSE_UVV_rhoPro Rhodnius prolixus (kissing_bug) fragment, novel introns, 67% identity Apis
0  0
1 ASTSGNIRTLGWNLSPEDLKHIPEHWLSYPEPEPILNYALGVLYIFFMLIALIGNGLVIWIFST 2
1 AKTLRTPSNIFVVNLAICDFLMMSKTPIFIYNSFKLGYALGHRACQIFALLGSFSGIGASATNAVIAYDRYR 2
1 ERFSTKCTFDYLTPTSEIRNFV  MSLIIYFYSQIVSHVIIHEHNLREQ 0
0 AKKMNVESLRSNANMHTQSAEIRIAKAAITICFLFVASWTPYAVLALIGAYGNQ 2
1 DLLTPAVTMIPACACKAVACVDPYVYAISHPRYR 2
1 QELSKKFPWLDIKEAPAPSSVDANSTATEMTLPTQTSPAEA* 0
</pre>
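The exon-per-line records above can be read mechanically. The following is only a sketch under stated assumptions, not the Opsin Classifier's own parser: it assumes each line holds a leading phase digit, the exon's peptide, and a trailing phase digit (optionally marked with `*` at the stop codon), that tokens such as `0^0` are internal intron markers, and that the trailing phase of one exon and the leading phase of the next sum to 0 or 3. The function names are hypothetical.

```python
def parse_exon_line(line):
    """Split one exon line into (leading_phase, peptide, trailing_phase).

    Assumed layout (inferred from the records, not documented):
    '<phase> <peptide chunks and markers like 0^0> <phase>'.
    """
    tokens = line.split()
    lead = int(tokens[0])
    trail = int(tokens[-1].lstrip("*"))
    # Keep only amino-acid chunks; skip markers such as '0^0'.
    chunks = [t for t in tokens[1:-1] if t.replace("*", "").isalpha()]
    return lead, "".join(chunks), trail


def assemble_protein(lines):
    """Concatenate the exon peptides of one record, in order."""
    return "".join(parse_exon_line(l)[1] for l in lines if l.strip())


def phases_consistent(lines):
    """Check the assumed rule: trailing + next leading phase is 0 or 3."""
    exons = [parse_exon_line(l) for l in lines if l.strip()]
    return all((a[2] + b[0]) % 3 == 0 for a, b in zip(exons, exons[1:]))


# Toy record built from the start of the INSE_LWS_rhoPro lines.
record = ["0  2", "1 YQFPPL 1", "2 GPLMCE 0", "0 GISAKP 2", "1  *0"]
print(assemble_protein(record))
print(phases_consistent(record))
```

A record that fails the phase check is not necessarily wrong; it may simply be a fragment with missing exons.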

Latest revision as of 19:15, 20 January 2010

This page and its updates have moved here to improve content organization.

Key Critters: introduction to genome projects opsins

Some species such as Drosophila have lost all ciliary opsins -- clearly this class of genes is not essential for a successful visually complex flying insect with 5-color vision and circadian rhythm (as one might have assumed from vertebrates). Other protostome lineages such as Nematoda (eg Caenorhabditis elegans) function successfully without any vision at all, making this 'model organism' completely irrelevant to the evolutionary study of vision.

However bees, annelids, and mammals retain ciliary opsins so it follows -- pervasive, detailed convergence at the molecular level being impossible -- that this must be the ancestral bilateran state. In turn that suggests ciliary opsins in cnidaria, and indeed that has been recently established in the lensing eye.

When the eye is reduced to a single pigment cell backing a single photoreceptor cell, the opsin of that species may be expressed only in one cell of the entire body. In this situation, the opsin may never show up in transcript collections, even with subtraction of common ones. One sees the importance of complete genomes here (versus transcripts or immunostained sections alone): absence of ciliary opsin evidence in a genome is truly evidence of ciliary opsin absence.

Vertebrates could never have evolved ciliary opsin vision had the bilateran ancestor possessed the limited opsin repertoire of fruit fly. Thus the most pressing question is -- assuming rhabdomeric opsins were thoroughly entrenched in the earliest bilateran imaging eyes and photoreception systems -- what kept ciliary opsins around in early bilatera? Recall early diverging deuterostomes (xenoturbellids, urchins, acorn worms, tunicates, and lancelets) lack imaging vision -- that emerged in full modern form on the lamprey stem.

Conversely, assuming cnidaria use ciliary opsins, what kept rhabdomeric opsins around so that they could later be co-opted by protostomes for their form of opsin-based vision? Evolution is strictly 'use it or lose it' over these time frames. Here cnidaria, or at least their larvae, may also use rhabdomeric opsins. It seems that both classes of opsins have retained roles in most species, but very different classes were promoted to the imaging role in different branches of Bilatera. In fly, ciliary opsins have winked out; in nematode, both ciliary and rhabdomeric opsins are gone. While irrevocable, these losses would scarcely receive comment in non-model organisms.

It's important to understand contemporary representatives of early diverging species (relative to the sequence of divergence nodes leading to human) are not archaic failed experiments nor primitive living fossils frozen in evolutionary time. Quite the contrary, all surviving extant species are equally successful and fully modern -- the tree of life is right-justified. Indeed their genes, regulatory signalling systems, and enzymes may be more finely honed than slowly evolving mammals because of more rapid evolution attributable to larger effective population sizes, reproductive mode, short generation time, and marine selective predatory pressures.

However we can still hope that ancestral character traits will be reflected to some extent in these earlier diverging species and that with enough complete opsin repertoires from taxonomically appropriate species, ancestral genes and even whole visual systems can be reconstructed at key ancestral nodes on the phylogenetic tree. The story describing the evolution of the human eye then amounts to describing its status at these successive nodes with perhaps interpolative speculation between them. Limits to knowledge definitely exist because living metazoans provide only 35 nodes between sponge and human -- gaps between nodes may average 30 million years but can greatly exceed that (eg 135 myr between bird and platypus). This is offset by the occasional proposal for new deuterostome branches (Xenoturbella, Convoluta) or basal metazoans (Ctenophores).

The ideal set of genomes needed to study the evolution of the metazoan eye is only partly completed, underway, or not even proposed yet. In some cases, the genome size of clade representatives is so large (eg lungfish at 25x human) the species may never be sequenced, though satisfactory opsin transcripts could still be obtained. In others, the rate of evolution has been so fast so long that very little information about photoreception at ancestral nodes has been retained (eg the tunicate Oikopleura). Hagfish opsins, which would conveniently break up the crucial lamprey long branch, are not available at GenBank but here the animal has adopted a deep water habitat, meaning that its cone opsin genomic repertoire will be highly reduced, if not gone entirely, in its markedly degenerated eyes, though whatever remains of its opsins could still be informative.


MoreBilatGenes.png

The impact of adding more genomes is to uncover more genes of the common bilateran ancestor that were masked by lineage-specific losses. Recall the beetle genome Tribolium uncovered 126 additional genes absent in other insect genomes but nonetheless present in human. Humans themselves of course have lost hundreds of genes even relative to the first land animal, so here too we need to pool mammalian and amniote gene pools to reconstruct that ancestor.

Model organism choices do not always coincide with genome sequenceability, transcriptome projects, nor (worst of all) with more slow-evolving and less derived species. Finally, most sequencing speaks to narrow anthropocentric interests, whereas the sequencing need more broadly conceived is greatest farther back (to break up long branches). The evolution of the eye needs a rather different portfolio of genomes than a typical human disease gene because of the earlier intrinsic timing of the innovative events. In fact, one product of the investigation here is to spell out these needed genomes. Of course one obvious genome choice is a cubomedusan jellyfish with its 24 eyes of 6 types.

It's worth reviewing genome status and recent experimental literature on key species. While abstracts are readily available at PubMed, access to free full text is unpredictable, so those links are collected when available. It suffices to reference only recent articles because those in turn cite the earlier literature, and later citations of those papers are collected by Google Scholar (or AbstractsPlus at PubMed). Most opsin sequences in the Opsin evolution reference sequence collection have a PubMed accession as a field in their fasta header; those can simply be compiled into an active link that opens all of them in one PubMed window.
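Compiling the PubMed accessions into one link might look like the sketch below. The header layout is an assumption: the collection's actual field format is not given here, so a hypothetical `PMID:<digits>` token is used, and the sequence names in the example are made up. The URL prefix is the Entrez pattern already used for links on this page; that it accepts a comma-separated `list_uids` value is likewise assumed rather than verified.

```python
import re

# Entrez URL pattern as used elsewhere on this page; multi-ID use is assumed.
PUBMED_BASE = ("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
               "?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=")


def pubmed_link(fasta_text):
    """Collect unique PubMed IDs from fasta header lines into one URL.

    Assumes a hypothetical 'PMID:<digits>' field in each header.
    """
    ids = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence lines carry no accession
        for pmid in re.findall(r"PMID:(\d+)", line):
            if pmid not in ids:  # keep first-seen order, drop repeats
                ids.append(pmid)
    return PUBMED_BASE + ",".join(ids)


# Hypothetical headers; the PubMed IDs are ones linked on this page.
example = """>OPSIN_A_hypothetical PMID:18054377
MNGTEG
>OPSIN_B_hypothetical PMID:3174177
MESGNV
>OPSIN_C_hypothetical PMID:18054377
MAQQWS
"""
print(pubmed_link(example))
```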


OpsineyePhylo.png


Figure adapted from: Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics (H Philippe et al PLoS ONE. 2007 Aug 8)


Deuterostomes moved to separate article

The key critter article has been broken down into 3 smaller articles -- deuterostomes are now here.

Chondrichthyes: Callorhinchus milii (elephantshark)         13 opsins
Agnatha:        Petromyzon marinus (lamprey)                 9 opsins
Agnatha:        Eptatretus burgeri (hagfish)                 0 opsins
Urochordata:    Ciona intestinalis (tunicate)                4 opsins
Echinodermata:  Strongylocentrotus purpuratus (sea urchin)   6 opsins
Hemichordata:   Saccoglossus kowalevskii (acornworm)         1 opsin
Deuterostomia:  Xenoturbella bocki + Convoluta pulchra       0 opsins

Cnidaria and Porifera moved to separate article

The key critter article has gotten too large -- cnidaria are now here.

Cubozoa: Tripedalia cystophora .. 1 ciliary opsin
Cubozoa: Carybdea marsupialis (jellyfish) .. probable opsins
Anthozoa: Nematostella vectensis (sea anemone) .. claimed opsins
Hydrozoa: Hydra magnipapillata (hydra) .. claimed opsins
Hydrozoa: Cladonema radiatum (jellyfish) .. claimed opsins

Porifera, Placozoa, Choanoflagellates .. 0 opsins

Lophotrochozoa: 13 opsins

Opsin lopho larvae.png

This is a monophyletic group (in the mind of evo-devo practitioners) of bilaterans reflecting a basal split deep within protostomes. The classification is based both on molecular considerations and a shared larval form with ciliated wheel, in contrast to characters of adult animals such as segmentation.

Lophotrochozoa is not recognized at GenBank so blast searches cannot be restricted to Lophotrochozoa. However Entrez and PubMed searches can be so restricted using boolean queries. In terms of genome projects Lophotrochozoa currently consists of 7 species of flatworms, molluscs, and annelids. However, it also contains Brachiopoda, Bryozoa, Entoprocta, Nemertea, Sipuncula, etc which collectively account for less than 3,000 of the 5.7 million nucleotide sequences at GenBank and no annotated opsins.

The Lophotrochozoa have not been surveyed as a whole for those that might be 'living fossils' in terms of opsins and photoreceptor structures. Even those would not necessarily make good genome projects because of genome size and compositional issues. However Annelida has been thoroughly considered by Purschke, Arendt et al in a recent offline, off-Pubmed review (Arthropod Structure & Development 35(2006) 211-230).



Annelida: Platynereis dumerilii (ragworm) .. 3 opsins

This small annelid may be an emerging model organism, though plans for genome sequencing in France have apparently collapsed. (Indeed all metazoan genomic sequencing in Europe has ceased.) Three recent papers have established that Platynereis qualifies as a living fossil, at least with respect to ancestral anatomy and development, slowly evolving protein sequences and retention of genes and ancestral introns, and further has retained ciliary opsins.

That is to say, fruit fly and nematode have proven unfortunate choices because of so many lost genes and signalling pathways, rapid evolution and highly derived characters. Lophotrochozoa may thus give us very significant insight into the bilateran ancestor that had appeared lost from consideration of just Arthropoda. It should be noted though that not all insects should be written off: Anopheles has also retained ciliary opsins.

Opsin platynereis.png

Platynereis develops various pairs of eyes going by localization of opsin expression: inverse larval eyes used in phototaxis (just one pigment cell and one photoreceptor cell) and two pairs of everse adult eyes needed for adult vision. These originate from an initially unsplit single anlage. These eyes use exclusively rhabdomeric photoreceptor cells and corresponding rhabdomeric-class opsins as expected from phylogenetic position. However two paired structures in the developing median brain dorsal to the apical organ express an opsin that unambiguously classifies as ciliary. Further, a retinal homeobox (specific to ciliary pineal eyes) and circadian rhythm regulator bmal are also expressed at this location in Platynereis. However the pigment cells necessary for directional photoreception are missing. This all fits with a role for the ciliary opsin as the primary receptor underlying circadian rhythm which does not require directionality.

The emerging picture is Urbilatera having both ciliary and rhabdomeric structures. The latter specialized structure was lost but the photoreceptor component retained in vertebrates in the form of melanopsins expressed in retinal ganglion cells.

Remarkably, Platynereis contains a second ciliary opsin next to alpha tubulin. Using the initial ciliary opsin (a transcript with unknown intronation) as probe at various GenBank databases, a genomeWiki contributor found a good match in the unannotated contig CT030681, a 171,779 bp survey sequence in the high throughput genomic sequence (HTGS) division (meaning it would be overlooked using Blast of the nucleotide division), submitted 05-DEC-2005 by Genoscope as 6 ordered contigs (the last of which proves reverse-complemented).

This second opsin, being genomic, could be intronated (unlike the original transcript) after difficult recovery of the full length gene from a moderate match, assuming GT-AG splice junctions (like 99% of all genes and 100% of all known opsins). These introns had positions and phases identical to ciliary -- but not Go or Gq -- deuterostome opsins. Assuming the first opsin is not derived as a processed retrogene from the second, it can be intronated via homological alignment. These are stored in the Opsin Classifier as CILI1_plaDum and CILI2_plaDum, resp.
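The GT-AG assumption and the notion of intron phase can be made concrete with a small sketch. This is illustrative only and not the procedure actually used to intronate the gene; the function names and the toy intron are hypothetical. Phase is taken in the usual sense: the number of coding bases before the intron, modulo 3.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with donor GT and ends with acceptor AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def intron_phase(coding_bases_before):
    """Phase 0 falls between codons; phase 1 or 2 splits a codon."""
    return coding_bases_before % 3


# Toy intron, not taken from contig CT030681.
candidate = "gtaagtttttctttcag"
print(has_canonical_splice_sites(candidate))
print(intron_phase(300), intron_phase(301))
```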


Opsin parallels.png

Using the second opsin as blastp query against our phylogenetically dispersed collection of 225 hand-curated Eumetazoan opsins (including new cnidarian ciliary opsins), it classifies in the encephalopsin-to-pinopsin area in accord with independent classification by intron pattern and close homology with the experimentally characterized Platynereis first opsin. The percent identity to deuterostome opsins is not only quite high (considering the immense round-trip time since common ancestor) but also overwhelmingly concentrated on invariant and near-invariant amino acids characteristic of ciliary opsins. Thus this second Platynereis opsin cannot be a pseudogene (unless that happened yesterday or so).

For purposes of conserved synteny (eg establishing orthology to related opsins in other lophotrochozoan genomes), other coding genes on this contig (found using blastx vs metazoan proteins) can be considered. The only other gene is alpha-tubulin, at positions 124517-122811, downstream from the second ciliary opsin at 46848-87956 using original contig ordering.

Recall the Arendt group used antibody to acetylated alpha-tubulin as a marker for stabilized microtubules in cilia and axons. They needed the sequence for that. Probably the larger contig was then sequenced as part of the genome feasibility survey. There was no particular reason to look at this contig for opsins at that time, which would be hard to distinguish from abundant non-photoreceptor rhodopsin-superfamily genes or generic GPCR.

Supposing Platynereis has 15,000 coding genes, it is quite a coincidence to have two adjacent genes that might be critical to the same photoreceptor structure. If these two genes prove to be transcribed divergently (lie on different strands) after fixing (reverse-complementing) the last contig piece, then a shared symmetric transcriptional regulatory element (DNA that reads the same on whichever strand) could mean the second opsin is tethered to alpha-tubulin production in terms of co-expression in some cell types. Transcription in the same direction is less attractive as operons are rare in eukaryotes, though read-through is not unheard of and that too could be developmentally regulated in extent.

Re-assembly of CT030681 using multi-exon bridging is possible. It turned out pieces 1 and 2 were irrelevant, piece 3 had exons 1, 2, 3 of the opsin on the plus strand, and piece 4 had opsin exons 4 and 5 on the minus strand to piece-coordinate 41,899 for the stop codon. This piece also contains the first three exons of alpha tubulin, also on the minus strand, beginning at 36,767. Its initial methionine is stranded as a solitary phase 0 codon on the end of 5' UTR, 36,707-05. The remaining two exons of alpha tubulin are on the minus strand of piece 5.

Joining piece 3 with reverse-complemented pieces 4 and 5 then fixes orientations to the plus strand and establishes intron sizes subject to the two strings of Ns. This results in parallel gene order CILI2_plaDum+ TUBA_plaDum+, that is tubulin downstream of the opsin with an intergenic gap of 5,132 bp. If there is any coordination of expression by read-thru, on the upstream end it would have to involve the regulatory regions of the opsin.
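The orientation fix can be sketched as follows. This is a minimal illustration with toy sequences, not the actual re-assembly of CT030681: the function names are hypothetical, the flipped pieces are simply concatenated in the order given, and any spacer of Ns between pieces is left to the caller.

```python
# Translation table for complementing DNA; N stays N, case is preserved.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the opposite strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def join_pieces(pieces, flip, gap=""):
    """Concatenate contig pieces, reverse-complementing those indexed in flip.

    'gap' is an optional spacer (eg a run of Ns) placed between pieces.
    """
    oriented = [reverse_complement(p) if i in flip else p
                for i, p in enumerate(pieces)]
    return gap.join(oriented)


# Toy stand-ins for pieces 3, 4 and 5; the last two are flipped.
toy = ["AAG", "CCT", "GGA"]
print(join_pieces(toy, flip={1, 2}))
```

After such a join, minus-strand coordinates on a flipped piece map to plus-strand coordinates on the joined sequence, which is what allows gene order and intergenic distance to be read off directly.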

The fifth exon of CILI2_plaDum has too weak a match with that of CILI1_plaDum to be found by conventional searching. However the dna where it has to be located is squeezed between exon 4 and the start of tubulin, reducing query size. Blastx of that dna against the full-blown set of opsins turns up a consistent match candidate in frog and skate opsins. Looking at the intron phasing validates the match: the splice acceptor AG is 1 of 16 dinucleotides, the phase 0 required by exon 4 (and ancestral ciliary phase) is 1 of 3 possible phases, and the strand requirement is 1 of 2, which together have a 1 in 96 chance of random occurrence, more than sufficient in conjunction with the blast expectation of 1.1e-06.
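The 1 in 96 figure is simply the product of the three constraints, treated as independent and equiprobable (a simplification, since real dinucleotide frequencies are not uniform). A short check, with variable names chosen here for illustration:

```python
from fractions import Fraction

# Chance constraints named in the text, assumed independent.
p_acceptor = Fraction(1, 16)  # AG is 1 of 16 possible dinucleotides
p_phase = Fraction(1, 3)      # phase 0 is 1 of 3 possible intron phases
p_strand = Fraction(1, 2)     # the required strand is 1 of 2

p_random = p_acceptor * p_phase * p_strand
print(p_random)
```

The blast expectation is a separate kind of quantity (an expected number of chance hits), so it is not multiplied in here; the two lines of evidence are simply considered together.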

This opsin, if co-expressed with CILI1_plaDum, would amount to 'circadian rhythm color vision'. Alternatively it might be expressed at a different developmental stage or in an unsuspected auxiliary photoreceptor.

Annelida: Capitella sp (marine worm) .. 2 opsins

Capitella is a small segmented benthic marine worm most closely related, in the genome project sense, to its fellow annelid Platynereis. The taxonomy of the genus Capitella was thoroughly muddled by a quaint 1976 starch gel electrophoresis allozyme study; Linnean nomenclature has never been developed for the 6 alleged species defined there. The isolate used in the JGI genome project is called Capitella sp. I ES-2005 instead of Capitella capitata.

The last of 3,709,316 trace reads was taken in Nov 2005. As with Lottia, a multi-year lag ensued in release of the assembly, deposition in GenBank, and publication of a central paper. As of Dec 2008, the only access to the genome is through JGI Blast. The genome is small at 240 Mb and distributed across 10 chromosomes.

This is a subsurface deposit feeder associated with organic-rich mud, seemingly not conducive to an extensive visual system. However an extensive 1993 study of both larval and adult eyes was published in the now-defunct Journal of Morphology (online access $25). Developing larvae have a pair of eyespots consisting of one sensory cell, one pigment cell, and one support cell. The photoreceptor cell has an array of parallel microvilli with cisternae. It is surrounded by a diaphragm formed by a pigment cell ring of microvilli-like structures. These last but a few days because at metamorphosis the larval eyespots are greatly reduced. Adults have one pair of eyes, built of 2-3 pigment cells and one sensory cell in juveniles and increased by 2-3 more in adults.

Unusual morphological aspects of Capitella eyes can be placed within the overall context of photoreceptor cells and eyes in Annelida, whose ultrastructural issues were carefully reviewed by Purschke in an off-PubMed journal "Arthropod Structure & Development" v35:4, 2006 (viewing issue full text costs $175). In addition to rhabdomeric and ciliary types, less-known phaosomous photosensory cells are discussed. Phaosomes (Greek: phaos = light, soma = body) were first described in the earthworm dermal photoreceptors as a central intracellular cavity (phaosome) filled with microvilli but may represent a derived form. They occur at various extraocular sites such as dermis and genitalia (in butterflies). Multiple types of photoreceptors thus provide a potential role for the diversity of opsins observed in the genome.

It's clear from Purschke's review that understanding photoreceptors requires a combination of ultrastructure, transcript expression mapping, and genomics. In other words, it's necessary to account for all the opsins found in the genome. Many photoreceptors have been overlooked entirely, notably the undirected type (no pigment cell backing); the study of many others has stalled out in controversy for lack of gene availability.

I found a number of related opsin fragments in Capitella using various queries but surprisingly no counterpart to Platynereis ciliary opsins. One, stored as MEL1_capCap, clusters consistently with melanopsins and shares two exon breaks. It may be an ortholog of the rhabdomeric Platynereis opsin. The second MEL2_capCap is more distantly related. Reliable full length genes will require a cdna program which so far is totally lacking.

Opsin capitella.png


Annelida: Helobdella robusta (leech) .. 2 opsins

The JGI genome project for the leech Helobdella robusta is well along with 3,168,749 traces, a very recent assembly to blast, but no cdna. The genome is fairly small at 300 Mb but does not appear reduced in terms of gene count. Fifteen unannotated 100 kbp contigs are available at the HTG division of GenBank; these do not contain opsins but might otherwise suggest gene and retroposon densities and extent of synteny retention. The genome had not been submitted to GenBank by Dec 07.

Opsin helobdella.png

Helobdella could be considered a promising emerging experimental system because techniques such as large-scale whole-mount in situ hybridization screening, RNA interference, and morpholino knock-down are established. It's not clear however that leech retains the same degree of ancestral characters as nereid polychaetes. Until a cdna program is established, it will prove very difficult to annotate complete coding genes. The nearest species with a transcript program is the earthworm Lumbricus rubellus with 19,934 ests (but no opsins).

Helobdella is a rhynchobdellid, which is to say (ελεο marsh, ῥύγχος snout, βδελλα leech) a California marsh leech with a muscular straw-like proboscis in a retractable sheath for puncturing prey. Thus it is not closely related to the medicinal leech, Hirudo medicinalis. The anatomy of the closely-spaced single pair of eyes was intensively studied 40 years ago. An eye in this group consists of 30-100 photoreceptive cells in a deep pigment cup providing directional vision. Larvae are not free swimming but stay in the albuminous fluid of a cocoon. The 88 PubMed articles include many on body plan gene expression but only two on eyes and these tangentially. We can only hope the genome project will stimulate additional studies of leech photoreceptors. It seems that every lab uses a different strain if not a different species.

I recovered two Helobdella opsin genes on 4 Dec 07 from the erratic JGI server (if no matches, close and restart with a fresh window). The full length gene, stored as MEL2_helRob, has 2 conserved introns characteristic of melanopsins and finds its best matches there. It is likely an ortholog of a similar gene in Schistosoma, Schmidtea, Capitella, and Platynereis. The 231 aa fragment stored as MEL2_helRob has best match to octopus and chordate melanopsins and shares the first (and possibly second) intron position and phase with them. The parent scaffold 39 may contain tandem opsins or alternatively represent a misassembly. No counterpart to the ciliary opsin of Platynereis emerged. That gene -- which must have been present in the common ancestor of annelids -- could have been lost or is simply missing from the current assembly.


Mollusca: Aplysia californica (sea hare) .. 2 opsins

Aplysia has a pair of cephalic dorsal pit eyes just anterior to the rhinophores. The eyes are quite small at 600 microns diameter, with a spherical lens and a tiny one square millimeter retina with approximately 7000 rhabdomeric photoreceptors. Despite a fair number of studies of eyes and rhinophores involved in vision, circadian rhythm and phototactic head-waving, the opsins have not been characterized beyond immunoblot (positive for retinal photoreceptors, rhinophores, cerebral ganglia and ventral abdominal ganglia giant cell R2). There is evidence for G protein alpha subunits of the Gq, Gi, and Go families, phospholipase C, and an inositol 1,4,5-trisphosphate receptor in the rhinophore but this may be for chemoreception.

The sea hare genome has recently been sequenced by the Broad Institute. Sizeable assembled contigs are now open to tblastn at the "wgs" division of GenBank (which allows the exon pattern to be extracted). Despite the assembly, sequencing continues: 212,159 new traces were added in the last week of Nov 07. This illustrates the need to always check the primary data repository when a gene seems missing -- millions of traces might not be used in the assembly. However a close-in query is needed to get a match.

I located the first known Aplysia opsin in the 20874 bp contig AASC01108363 on 2 Dec 2007. It had a significant expectation value (e-60) but the best match percent identity within the opsin reference collection (to fellow mollusks) was only 118/319 (36%). Otherwise the best matches are consistently vertebrate melanopsins. This gene is a strong candidate for an invertebrate melanopsin ortholog. It is stored as MOLL_MEL_aplCal.

Indeed, there are four exons but precise boundaries are difficult to locate at this low percent identity without cdna or a reliably intronated guide sequence from a closely related species. However 2 introns clearly have identical position and phase to vertebrate melanopsins and a 3rd quite likely does; otherwise there has been intron loss in Aplysia. The contig unfortunately does not contain any information (according to blastx) on adjacent genes (synteny) despite 10 kbp still available 3'. No counterpart to the Platynereis ciliary opsin could be found.

On 28 Dec 07 I located a full length peropsin PER_aplCal, a likely ortholog (from exon breaks and best-blast) to squid retinochrome, which has an excellent structural model and counterion study. The Aplysia peropsin is well-represented with 11 transcripts from pedal-pleural ganglia, CNS (adult and juvenile 1), metacerebral cells, and MCC metacerebral neurons but only terminal exons are found in the assembly. However the cdnas provide a window to the trace archives which allows accurate intronation of the full gene.

It is not at all clear what relationship these lophotrochozoan peropsins have to deuterostome peropsins, nor why they seem missing altogether in ecdysozoa, nor what their ancestral status is. The 3 molluscan peropsins cluster cleanly enough with vertebrate peropsins but overlap only partially in intron placement. That could result from relatively recent intron gain and loss or reflect a much deeper ancestral splitting of peropsin classes. Representatives of these may survive more completely in echinoderms, hemichordates, and cephalochordates. Peropsin may very well be capable of ciliary opsin type signaling with trans-retinal as agonist.

At this point, Aplysia is not a Rosetta stone for opsin evolution. It is however the first mollusk with a genome assembly. This may eventually allow confident transfer of orthology validated by synteny, intron pattern, and indels. The eyes appear homologous in many aspects to those of Arthropoda, supporting the idea that the common ancestor of Protostomia had rhabdomeric lensing eyes, though true across-the-board homology of all eye components is a very complex subject.

Opsin aplysia.png


Mollusca: Lottia gigantea (limpet) .. 2 opsins

The limpet Lottia gigantea was intended to be the first lophotrochozoan for whole genome sequencing but that goal slipped. It has ancestral-like spiral cleavage and trochophore larva. The genome is small relative to other molluscs at 500 Mbp. Some 5.3 million traces were sequenced by May 2005. In Jan 2007 the sequencing center presented the genome at a meeting talk. However by Dec 2007 no paper had appeared. Recently JGI enabled blast of the assembly and display on their funky browser. However nothing was submitted to GenBank. JGI predicts 4 rhodopsins for its KOG gene collection; however none are recognized by the Opsin Classifier. No transcripts are available, though other molluscs have numerous ests. A German group suggests that the genome sequenced was in fact Lottia scutum.

Under these circumstances, I annotated two Lottia melanopsins in Dec 07, MEL1_lotGig and MEL2_lotGig. Their best match is to other Gq-coupled molluscan opsins, with the first probably an ortholog. Both genes have 3 exons with the two splice positions and phases identical to those of melanopsin (which in vertebrates has numerous other introns). A long run-on carboxy terminus is also seen here. It needs to be established whether these introns are ancestral generic GPCR introns or diagnostic and informative of melanopsins as a gene class. No counterpart to the ciliary opsin of ragworm was immediately apparent.

On 28 Dec 07, I recovered a peropsin, PER_lotGig, very likely orthologous to a peropsin in squid (called retinochrome there) and Aplysia (PER_todPac, PER_aplCal). Extensive structural and experimental evidence is available for squid which likely transfers over, notably the Glu181 counterion proposed ancestral. The Lottia and Aplysia peropsins are intronated identically and by inference the squid. However these differ in some respects from chordate peropsins, suggesting either intron gain or loss or alternately a small 'cloud' of ancient peropsins that were intronated slightly differently in early metazoa.

Lottia is not emerging as a model organism. There are only a handful of studies at PubMed and none on vision. The adult limpet has a pair of eyespots at the base of its cephalic tentacles that likely house a rhabdomeric opsin, perhaps the one annotated here. There may be a second role for paired eyespots in the free-swimming larva for those five days (thoroughly reviewed for chiton trochophores by Arendt and Wittbrodt but not Lottia specifically). Circadian rhythm might involve an additional opsin. The adult is an algal gardener that clears and defends intertidal areas -- raiding limpets are sensed (visually?) and driven off. The opsin sequence found here, stored as MOLL_MEL_lotGig, suggests rapid divergence rather than living fossil character. However patellogastropods such as Lottia with symmetrical non-coiled, conical shells are sometimes taken as ancestral form.


Opsin lottia.png


Platyhelminthes: Schmidtea mediterranea (planaria) .. 1 opsin

The common planaria Schmidtea mediterranea has an 865 Mb genome very recently assembled from 17 million traces to 10x and placed in the wgs division of GenBank, after an initial impasse attributable to high AT (69%), repeat content (46%) and high clonal heterozygosity. The genome project is described in a white paper and has a dedicated site SmedDb. It has a strong EST collection as well.

The planarian central nervous system consists of a bilobed brain and two longitudinal ventral nerve tracts connected by commissural neurons. When planarians are decapitated they can completely regenerate a new brain, including new eyes, a boon to opsin research. The structure of the eye had already been described by 1915. Regeneration of the nervous system is a very active research area.

I began with various fragmentary opsins and ESTs and recovered a nearly complete melanopsin (including all introns) from trace archives. It is stored at the Opsin Classifier as RHAB_schMed and discussed in the Schistosoma section as a likely ortholog. Since the site of expression is known from hybridization and no other Schmidtea opsins are apparent, this is likely the principal photoreceptor both here and in Schistosoma. No counterpart to the Platynereis ciliary opsins can be found in the current assembly, indicating (since they could hardly have been invented in Platynereis) that their loss in Platyhelminthes is a derived condition.


Opsin planaria.png

Platyhelminthes: Schistosoma mansoni (trematode) .. 3 opsins

The blood fluke Schistosoma mansoni is a major agent of schistosomiasis (bilharziasis), infecting more than 200 million people worldwide, with the fresh water snail (Biomphalaria glabrata -- a large EST project) as intermediate host. As an endoparasite residing deep inside lungs, hepatoportal circulation, and mesenteric veins, it would not seem a promising species for eyespots or even circadian rhythm opsins. However at least two life stages are affected by light: the hatching of the miracidium from the egg and emergence of cercaria from the snail. These swim upwards to the surface of the water and are also affected by shadows and turbulence.

GPCR proteins are the target of approximately half of all pharmaceuticals. For that reason, a Schistosoma opsin came to be studied. That gene is expressed in the miracidia and cercaria stages but down-regulated in the adult. Expression is localized to sub-tegumental structures at the front end of cercariae. Full text of the 2001 article remains locked behind a sick commercial firewall, as does a 1975 electron microscopy study of photoreceptor lamellae seen as extensions of modified cilia.

Version 4.0 of the genome is readily available for blast though it is missing from GenBank, as are two million of the 3.8 million total traces (7x), despite NIAID funding. It's unclear whether the extensive EST set of 31000 assembled sequences is available there. The Schistosoma genome is approximately 270 Mb with low GC content (34%), moderate retroposon levels, and an estimated 15-20,000 coding genes.
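The quoted coverage, trace count, and genome size can be cross-checked with the usual back-of-envelope relation coverage = reads x mean read length / genome size. The sketch below is only that: it assumes every trace contributes uniformly, and the mean read length is not stated anywhere on this page, so it is solved for rather than supplied. Function names are hypothetical.

```python
def fold_coverage(n_reads, mean_read_bp, genome_bp):
    """Expected fold coverage if all reads are used and spread uniformly."""
    return n_reads * mean_read_bp / genome_bp


def implied_read_length(coverage, genome_bp, n_reads):
    """Mean read length (bp) implied by a quoted coverage and read count."""
    return coverage * genome_bp / n_reads


if __name__ == "__main__":
    # Schistosoma figures from the text: 3.8 million traces, 7x, about 270 Mb.
    length = implied_read_length(7, 270_000_000, 3_800_000)
    print(round(length))  # about 497 bp per trace

    # Running the relation forward with an assumed 500 bp read length.
    print(round(fold_coverage(3_800_000, 500, 270_000_000), 2))
```

An implied read length near 500 bp suggests the three quoted numbers are at least mutually consistent.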

I determined the intron structure of the published opsin gene (called MEL1_schMan in the opsin classifier) which classifies with melanopsins. Using this as probe, a second full length paralogous opsin MEL2_schMan was annotatable. While percent identity was only 46%, the intron structure and alignment classification were identical. Possibly this second gene has a role in the miracidium, though the first gene is expressed in both stages, more compatibly with "two color non-imaging" eyes. MEL3_schMan is similarly intronated and fairly diverged.

The first opsin is more closely related in sequence to the sole known opsin in Schmidtea, RHAB_schMed, where it possibly plays a homologous role. As queries, these proteins turn up closest matches at GenBank EST in other platyhelminthes. These observations do not support the notion of horizontal gene transfer of opsins from the host snail, another Lophotrochozoan, which by itself might favor sequence clustering. It would be feasible to explore synteny in both platyhelminthes.

I investigated conservation of intron position and phase using the reliably intronated match with either MEL1_gasAcu of stickleback minnow (or equally MEL1a_braFlo of amphioxus). Here the percent identity is fairly low (39%) but enough patches of good matching suffice to reliably anchor the alignment. There is perfect agreement of the first three intron positions and phases, below.

This is strong evidence for a very deep connection, namely vertical descent of these genes from a common ancestor (eg, orthology), because these introns are highly specific to melanopsin within the opsin superfamily, ie are not generic GPCR introns, as seen from the total mismatch to Ixodes, Apis, and vertebrate ciliary opsins. These same introns are predicted for opsins from transcript species such as LOPH_RHO_plaDum (Platynereis dumerilii) and MOLL_MEL_patYes (scallop). It remains to be demonstrated that all these melanopsins play a conserved consistent homologous role.


Opsin loph mel introns.png


Ecdysozoa .. 5-42 opsins

This group, which includes insects, other arthropods, and species immediately basal to them, is taken here as the other wing of Protostomia, ie as sister group to Lophotrochozoa. The focus here is on new genomes which have not been so extensively explored as say Drosophila, especially on species that might contribute to reconstruction of the last common ancestor to Ecdysozoa (resp. Protostomia and Urbilatera). Opsins in genomic species have the advantages of determinable exon breaks and flanking syntenic genes, and so better prospects for establishing accurate homological relationships.


Panarthropoda: Hypsibius dujardini (water bear, tardigrade) .. 0 opsins

A 5x genome project for Hypsibius dujardini, a member of a phylum of microscopic ecdysozoans, was approved in July 2007 but Broad has not yet begun trace reads on the small 70 Mbp genome (suggesting densely spaced genes with small introns as this is not likely highly derived). It could prove very useful for opsins as tardigrades are basal to all of Arthropoda and so shed light on that last common ancestor. In fact with accompanying centipede, horseshoe crab, amphipod, and priapulid genomes, the whole ecdysozoan ancestor will be accessible.

The only known fossil specimens are found in Siberian mid-Cambrian deposits and much later amber. The older fossils have three pairs of legs rather than four, a simplified head morphology, and no posterior head appendages and probably represent a stem group of extant tardigrades. Aysheaia from the Burgess Shale might be related to tardigrades.

Nothing is currently known about photoreception or opsins in tardigrades -- or even if they have eyes. However it looks like we can expect some rhabdomeric opsins at the minimum in front of these pigment cups. The current GSS and EST collections (about 6000 sequences) do not, however, contain any convincing matches using various rhabdomeric and ciliary opsins as tblastn queries. Tardigrade photos and movies provided by Goldstein Lab.

Tardi.png


Chelicerata: Ixodes scapularis (tick) .. 1 opsin

The genome project was completed long ago but has experienced a multi-year bottleneck in assembly release and publication. However contigs built from 19.4 million traces should become available to tblastn of the GenBank "wgs" division by late 2007. Ixodes has a very conservative genome (regrettably 2.1 Gbp in size), seemingly far less derived than drosophilids in matters such as intron and gene retention and protein sequence conservation. This, in conjunction with the helpful phylogenetic position of chelicerates as outgroup to the many insect genomes, has improved prospects for reconstructing the ancestral opsin repertoire of Arthropoda and eventually Protostomia and UrBilatera.

A large collection of annotated Ixodes ESTs is available at the DFCI Gene Index, of which 3 are marked up (2 wrongly) as opsins. Using the Opsin Classifier, I recovered the full length gene for the first of these, TC19272, on 24 Nov 07, intronated the transcript at the Trace Archives (4 introns, superb coverage), and added it to the classifier fasta collection as RHAB1_ixoSca. It classifies with rhabdomeric opsins (ie with deuterostome melanopsins) with a very respectable 57% maximal percent protein identity. The second and third introns have classical ancestral position (following GWSR and LAK) and phase (2 and 0). Synteny awaits assembly of large contigs -- adjacent exons are not spanned by single traces. An apparent ciliary opsin fragment in Ixodes was located using that of Platynereis dumerilii as probe; it is stored as CILI_ixoSca but needs further analysis.

Crustacea: Daphnia pulex (water flea) .. 1-37 opsins

An 8.7x genome assembly was released in July 2007 at JGI with further support at wFleaBase. This crustacean provides a potentially important outgroup to insects (together forming Pancrustacea). However the opsin story, summarized in a meeting abstract, is an embarrassment of riches, not conducive to deducing ancestral arthropod genome content. The total number of opsin genes came in at 37, comprised of 22 rhabdomeric opsins (mostly long wavelength), 7 ciliary opsins (pteropsins), and 8 in a novel family without close affiliates. This seems excessive but Daphnia has ommatidia (compound eyes), circadian rhythms, and a need to assess water turbidity and depth. Planned in situ hybridization studies may illuminate biological roles of these opsins. The pteropsins are probably of most interest here.

Opsin daphniaJGI.png

Gene models have not been submitted yet to GenBank but are likely extractable by text query at wFleaBase. What is needed here however is not the clutter of 37 sequences but their collapse into UV, blue, long, pteropsin, and novel ancestral representatives. This would remove 'noise' from lineage-specific expansions. The intron structure could provide very important support to classification schemes.

The expansions may have arisen through retroprocessing (rather than segmental duplication) of a few master exonic genes, which would then be the orthologs to other arthropod opsins. Indeed the intronation pattern -- typically far more deeply conserved than protein sequence -- could link pteropsins more convincingly to lophotrochozoan and deuterostome opsins than alignments with percent identities in the 20's.

This blast twilight zone is especially dangerous for photoreceptor opsins because they are embedded in a much larger gene family of generic rhodopsins and GPCRs which share many structural and signaling properties. A slowly evolving generic rhodopsin might well score higher than fast evolving photoreceptor opsins. Gene expansions are noted for markedly enhanced rates as copies neo- or subfunctionalize. The generic rhodopsin might also share diagnostic residues through convergence, at least at the level of statistical significance ambiguity. Consequently intron location/phase and synteny can provide important backup.

The synteny circle surviving at this phylogenetic depth will be local (optimistically Pancrustacean). That is, the blue opsin of Daphnia might be in synteny with that of Drosophila (ie establish orthology) but not with Platynereis ciliary opsin, much less any vertebrate opsin (eg encephalopsin). This could be remedied to some extent by ancestral gene order reconstruction. The degree to which synteny can contribute to validating orthology relations within opsins is not currently known.

I found one ciliary opsin for Daphnia in the process of expanding the known one in Anopheles to crustacea. Stored at the Opsin Classifier as ENCEPH_dapPul, it is potentially an ortholog, as are new ciliary opsins from Culex, Aedes, Tribolium, and Bombyx. However this gene and potentially its photoreceptor structure are missing in Drosophila, Nasonia, and other genomes.

Panarthropoda: Tribolium castaneum (flour beetle) .. 3 opsins

TriCasEyes.png

The red flour beetle, which is highly dark-adapted in lifestyle, has lost its blue opsin according to both the newly published genome project and specialized experimental querying, retaining the other two ancestral color vision opsins and encephalopsin (which is called pteropsin in insects though likely a strict ortholog). The 110-page supplement to the Tribolium genome article contains an excellent Table S14 of all known genes involved in insect eye development.

Insect opsins are expressed non-uniformly across individual eye units (ommatidia) within compound eyes. In Drosophila, six peripheral photoreceptor cells R1-R6 express LW opsin, which detects brightness, and project into the upper optic neuropil (lamina). Central photoreceptors R7 and R8 provide color vision via UV, blue, and LW opsins and project into the second optic neuropil (medulla). The dorsal rim area ommatidia are modified to detect polarized light.

The comparative genomics of ommatidia number and opsin utilization is indicated in the figure. Opsin gene loss raises different issues, namely replacement, from the more familiar gene gain issues (differential rewiring). After discussing various sequential mutational scenarios and the necessity of each step being adaptive or at least near-neutral, Jackowska et al settle upon expansion of LW opsin expression into all photoreceptor cells, resulting in co-expression with blue opsin in some R8 cells and UV-opsin in R7 cells. This is followed by loss of expression or pseudogenization of blue opsin. Although co-expression defeats the purpose (via spectral summation) of separate opsins that enable color vision, there are precedents in butterflies and (typically nocturnal) vertebrates.

Opsins tribolium.png

It's also known how Apis and Manduca (also genome project species) end up with nine photoreceptor cells per ommatidium instead of eight -- it's due to duplication of R7 cell fate (across all ommatidia). That raises the interesting question of whether such cell duplication simply results in duplication of opsin expression at the molecular level. That's not quite the case today because the two central R7-like cells exhibit differential opsin expression. It's not known whether additional mutations were needed to attain this.

In summary, insect genomes are fairly straightforward in terms of their contribution to establishing the ancestral arthropod visual system, but their real value lies in the extensive comparative data available within Insecta, ecological studies of adaptive vision, and the experimental genetic opportunities within Drosophila (eg a recent article exploring deviations from ommatidia expressing but a single opsin). However no single insect genome can serve all purposes because of gene loss (eg ciliary opsins in Drosophila).

That's also the case for non-opsin GPCR which have gained a new importance given the possible paraphyly of the opsin gene tree (ie some opsin gene duplicates may have given up retinal to signal via other agonists). Here we are fortunate to have a genome-wide inventory of neurohormone GPCRs in Tribolium. This turns up 20 biogenic amine GPCR (21 in Drosophila, 19 in bee), 48 neuropeptide GPCR (45 in Drosophila, 35 in honey bee), and 4 protein hormone GPCRs (4 in Drosophila, 2 in bee) with likely ligands for 45 of the 72 Tribolium GPCR. The flour beetle retains an ancestral vasopressin GPCR and cognate peptide unlike other studied insects which are not adapted to such an extremely dry environment. On the other hand, Tribolium lacks allatostatin-A, kinin, and corazonin. This covers comparative genomics of 340 million years of insect GPCR evolution -- it is very common for new agonist/receptor couples to arise and old ones to disappear. Again we see genome density sampling will need to be high to sort out Urbilatera.

Panarthropoda: Pediculus humanus (louse) .. 3 opsins

Opsin louse.png

The body louse genome, being favorably small at 108 Mbp, is well along with 2.2 million traces and a contig assembly hopefully disentangled from its endosymbiont bacterium. Sequencing is medically motivated. The lifestyle of this hemimetabolous (nymph-like adult, no pupal stage) insect does not suggest a full spectrum of metazoan photoreceptors; indeed we shall find but 3 opsins. Even that seems a lot for a single lateral ocellus of 130 rhabdomeric photoreceptor cells lacking Semper and dedicated pigment cells. The broader interest here is intronation and synteny of these opsins (hence orthology), not available in many insects with opsin studies. It requires quite dense sampling to get ancestral introns for each arthropod opsin class because high rates of intron gain and loss can occur.

I reconstructed 3 multi-exon louse opsin genes on 24 Dec 07 by tblastn of numerous queries against GenBank wgs database division. These apparent rhabdomeric imaging opsins are stored in the Opsin Classifier as INSE_LWS_pedHum, INSE_UVV1_pedHum, and INSE_UVV2_pedHum. Louse otherwise seems a gene loss story in terms of relic ciliary opsins or even melanopsins so not especially favorable for retention of ancestral characters. The new opsins potentially provide trichromatic color vision to the louse in the short, blue, and long wavelength photoreception regimes, though lambda max awaits experimentation as the second ultraviolet opsin could be either re-tuned or co-opted for some other function, as in bumblebee where a UV opsin is expressed in proximal lamina rim, antennal lobe, central complex and protocerebrum clusters. That seems likely because INSE_UVV2_pedHum is back to ancestral tyrosine in (bovine rhodopsin) position E113 whereas true ultraviolet insect opsins all specify phenylalanine here (which relaxes lambda max into the ultraviolet, ie closer to that of free retinal).

CA Hill of the louse genome annotation team discussed 3 opsins back in a June 2007 email session, calling PHUM001073 perhaps an ultraviolet opsin while rejecting a fourth PHUM000074. These gene models are not released to GenBank nor is that terminology used in the meagre search capabilities of P. humanus VectorBase. Upon whole proteome file download, PHUM001073-RA turns out to be an unintronated dna fragment matching residue 44 to stop codon of INSE_UVV1_pedHum. PHUM000074-RA has nothing to do with opsins. PHUM005795-RA is missing the first 49 residues of INSE_LWS_pedHum but otherwise identical. PHUM001044-RA is a fragment beginning at residue 55 of INSE_UVV2_pedHum. In short, it's hard to find full length genes without benefit of the Opsin Classifier, cdna, or ab initio gene predictor.

Panarthropoda: Rhodnius prolixus (kissing bug) .. 2 opsins

Yet another genome project completed long ago at the trace level but sitting around unassembled. In August 2008 some 6,879,098 trace reads and 16,284 EST sequences were available. This number of traces is more than adequate for a good assembly but for now, opsins must be fished out exon by exon using blastn at the trace archives.

This delay is unfortunate: Rhodnius prolixus is a large blood-sucking hemipteran insect that is a carrier for a parasitic protozoan (Trypanosoma cruzi) responsible for Chagas disease through bites around the eyes and mouth. Chagas disease is a currently incurable tropical disease that damages the heart and nervous system. Rhodnius is nocturnal, becoming active at night, with possible implications for its opsin repertoire. It is found in South and Central America, primarily in domesticated rural areas; the disease currently affects 16-18 million people and kills around 20,000 people annually. Darwin is sometimes claimed to have suffered from Chagas disease as a result of a bite (implausibly in northern Argentina) reported in Voyage of the Beagle diaries.

Rhodnius clearly has two distinct opsins but no apparent ciliary pteropsin or melanopsin. The first is a long wavelength sensitive gene most closely related (84% identity) to Tribolium but whose intronation pattern is closest to Apis (a phase 00 intron is missing in Rhodnius). The second Rhodnius opsin classifies with insect UV opsins, most closely to that of Apis.

>INSE_LWS_rhoPro Rhodnius prolixus (kissing_bug) frag missing first and last exons
0  2
1 YQFPPLNPLWHGILGFVIGVLGIISIVGNGMVIFIFSSTKTLRTPSNLLVVNLAFSDFLMMFTMSPPMVINCYNETWVL 1
2 GPLMCELYGMLGSLFGCASIWTMTMIALDRYNVIVK 0
0 GISAKPMTNKTAMLRILLVWAFSIMWTVFPFFGWNR 2
1 YVPEGNMTACGTDYLTKNWVSRSYILVYSVFVYFLPLFTIIYSYFFILQ 0
0 AVSAHEKQMREQAKKMNVASLRSAEAANTSAEAKLAK 0^0 VALMTISLWFMAWTPYLVINYSGIFETISISPLFTIWGSLFAKANAVYNPIVYAIR 2
1  *0

>INSE_UVV_rhoPro Rhodnius prolixus (kissing_bug) fragment, novel introns, 67% identity Apis
0  0
1 ASTSGNIRTLGWNLSPEDLKHIPEHWLSYPEPEPILNYALGVLYIFFMLIALIGNGLVIWIFST 2
1 AKTLRTPSNIFVVNLAICDFLMMSKTPIFIYNSFKLGYALGHRACQIFALLGSFSGIGASATNAVIAYDRYR 2
1 ERFSTKCTFDYLTPTSEIRNFV   MSLIIYFYSQIVSHVIIHEHNLREQ 0
0 AKKMNVESLRSNANMHTQSAEIRIAKAAITICFLFVASWTPYAVLALIGAYGNQ 2
1 DLLTPAVTMIPACACKAVACVDPYVYAISHPRYR 2
1 QELSKKFPWLDIKEAPAPSSVDANSTATEMTLPTQTSPAEA* 0
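The digits flanking each exon line in the records above can be checked mechanically. The sketch below rests on one reading of the notation, which is an assumption rather than a documented format: the trailing digit is the number of codon bases left hanging at the exon's 3' end (the intron phase) and the leading digit is the number of bases the exon supplies to complete that codon, so across any intron the two digits should sum to 0 or 3. The function names are hypothetical, and the parser handles only simple "digit sequence digit" lines, not placeholder lines for missing exons or the inline 0^0 marker.

```python
import re

# One exon per line: leading digit, amino acid string, trailing digit.
EXON_LINE = re.compile(r"^\s*([012])\s+([A-Z*]+)\s+([012])\s*$")


def parse_exon_line(line):
    """Split a simple exon line into (start_digit, peptide, end_digit)."""
    match = EXON_LINE.match(line)
    if match is None:
        raise ValueError("not a simple exon line: %r" % line)
    return int(match.group(1)), match.group(2), int(match.group(3))


def phase_mismatches(exons):
    """Return indices i where exon i and exon i+1 fail to complete a codon."""
    bad = []
    for i in range(len(exons) - 1):
        end_digit = exons[i][2]
        next_start_digit = exons[i + 1][0]
        if (end_digit + next_start_digit) % 3 != 0:
            bad.append(i)
    return bad


if __name__ == "__main__":
    # Three consecutive exon lines copied from INSE_LWS_rhoPro above.
    lines = [
        "2 GPLMCELYGMLGSLFGCASIWTMTMIALDRYNVIVK 0",
        "0 GISAKPMTNKTAMLRILLVWAFSIMWTVFPFFGWNR 2",
        "1 YVPEGNMTACGTDYLTKNWVSRSYILVYSVFVYFLPLFTIIYSYFFILQ 0",
    ]
    exons = [parse_exon_line(line) for line in lines]
    print(phase_mismatches(exons))  # an empty list means the phases agree
```

A check of this kind flags transcription slips in hand-curated records, but a mismatch next to a missing exon may simply reflect the placeholder rather than an error.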