Opsin evolution: key critters: Difference between revisions

From genomewiki
The sea anemone, an anthozoan within Cnidaria having epithelial cells, neurons, stem cells, complex extra-cellular matrix, muscle fibers, and symmetry axis, is emerging as a high-profile evo-devo model species to elucidate the emergence and deployment of genes that determine animal body plans. However those plans don't seem to include eyes or overt photoreceptor structures such as pigment cells -- for that cubomedusae would be far better. PAX6 and RX are especially relevant to photoreceptor structures; their expression has been thoroughly studied in Nematostella without uncovering any sensory system, though they contribute to patterning specific components of the ectodermal nerve net.
   
   
Four putative opsins have been proposed by the Oakley lab. The supporting gene models are identified in the JGI protein ID system (non-GenBank) as Nematostella1 219988, Nematostella2 85309, Nematostella3 130042, and Nematostella4 108738; alternatively, fragments in the alignment graphic allow recovery of the respective cDNAs by tblastn of GenBank WGS. As noted in the Hydra section, multiple lines of evidence are necessary to establish the first bona fide opsins in cnidarians.


There appear to be some Nematostella opsin-like cDNAs at GenBank that cannot be found in the genome assembly or trace archives. While genes can be missing from first assemblies, it is bizarre for 4 to be missing considering coverage is 6x. Upon back-blast to GenBank nr or the Opsin Classifier, very strong matches are seen consistently within Crustacea. Thus it appears that these are contaminants from another species, possibly a brine shrimp widely used in aquarium food. It is not unusual to see transcript (at issue here) and genome projects contaminated with DNA from other species such as commensals, parasites, and food sources -- this is reminiscent of [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12931184 Xenoturbella] being confused with a mollusc in its diet.


A third group has taken a serious look at photoreception in Nematostella. No paper or dissertation has emerged as yet; no cnidarian opsins have been posted to GenBank.  

Revision as of 23:15, 12 December 2007

Key Critters: opsins from genome projects

Some species such as Drosophila have lost all ciliary opsins -- clearly they are not essential for a successful visually complex flying insect with 5-color vision and circadian rhythm. Bees, annelids, and mammals retain ciliary opsins so we know this must be the ancestral bilaterian state. This predicts ciliary opsins in Cnidaria and indeed one was just found there. One sees the importance of complete genomes here (versus transcripts or immunostained sections): absence of ciliary opsin evidence in a complete genome is truly evidence of ciliary opsin absence.

When the eye is reduced to a single pigment cell backing a single photoreceptor cell, the opsin of that species will be expressed only in one cell of the entire body. In this situation, the opsin may never show up in transcript collections, even with subtraction of common ones.

Vertebrates could never have evolved ciliary opsin vision had the bilaterian ancestor possessed the limited opsin repertoire of fruit fly. Thus the most pressing question is -- assuming rhabdomeric opsins were thoroughly entrenched in the earliest imaging eyes and photoreception systems -- what kept ciliary opsins around in early Bilateria (and even Cnidaria) so that they could later be co-opted for ciliary opsin-based vision? We emphasize [http://genomewiki.ucsc.edu/index.php/Opsin_evolution:_trichromatic_ancestral_mammal here too] that contemporary tunicates, lancelets, and lamprey are not ancient, ancestral, antiquated, archaic, character-retaining, dead-end, failed experiments, frozen in time, genetically stationary, living fossil, primitive, primordial, relic, or retro species. They're fully modern -- the tree of life is right-justified. Indeed their genes, regulatory signalling systems, and enzymes may be more finely honed than those of mammals because of more rapid evolution attributable to larger effective population sizes, reproductive mode, generation time, and marine selective predatory pressures.

However we can hope that ancestral character traits will still be reflected to some extent in these earlier diverging species and that with enough complete opsin repertoires from taxonomically appropriate species, the ancestral genes and even visual systems can be reconstructed at key nodes on the phylogenetic tree. The story describing the evolution of the human eye then amounts to describing the status at these successive nodes and perhaps interpolating between them. There are definitely limits to knowledge here as extant metazoans provide only 35 nodes between sponge and human -- gaps between nodes may average 30 million years but can seriously exceed that. This is offset by the occasional proposal of new deuterostome branches (Xenoturbella, Convoluta).

The ideal set of genomes needed to study the evolution of the metazoan eye is only partly completed, underway, or even proposed. In some cases, the genome size of clade representatives is so large (eg lungfish at 25x human) the species may never be sequenced, though opsin transcripts could still be obtained. In others, the rate of evolution has been so fast so long that very little information about photoreception at ancestral nodes can remain (eg Oikopleura). Hagfish opsins, which would conveniently break up the crucial lamprey long branch, are not available at GenBank but here the animal has adopted a deep water (dark) habitat, meaning that its cone opsin genomic repertoire will be highly reduced, if not gone entirely, in its markedly degenerated eyes. (Its other opsins could still be informative.)

Model organism choices do not always coincide with genome sequenceability, transcriptome projects, nor (worst of all) with slow-evolving less derived species. Finally, most sequencing speaks to narrow anthropocentric interests, whereas the more broadly conceived sequencing need is greatest farther back (to break up branches). The evolution of the eye needs a rather different portfolio of genomes than a typical disease gene because of the earlier intrinsic timing of the innovative events. In fact, one product of the investigation here is to spell out these needed genomes. Of course one obvious genome choice is the cubomedusan jellyfish Carybdea marsupialis with its 24 eyes of 6 types.

It's worth reviewing genome status and recent experimental literature on key species. While abstracts are readily available at PubMed, access to free full text is unpredictable, so those links are collected when available. It suffices to reference only recent articles because they cite the earlier literature, and later citations of those papers are in turn collected by Google Scholar (or AbstractsPlus at PubMed). Most opsin sequences in the Opsin evolution reference sequence collection have a PubMed accession as a field in their fasta header; those can simply be compiled into an active link that opens all of them in one PubMed window.
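As a minimal sketch of that compilation step, suppose (hypothetically) that each fasta header carries its PubMed ID in a field written as `PMID:12345678`. The header layout, the function name, and the placeholder IDs and sequences below are illustrative assumptions, not the actual format of the reference collection. The URL follows the Entrez link style already used on this page, and is assumed to accept a comma-separated `list_uids`.

```python
import re

# Assumed (hypothetical) header layout: '>NAME free text PMID:12345678'
PMID_FIELD = re.compile(r"PMID:(\d+)")

# Entrez link style as used elsewhere on this page; a comma-separated
# list_uids value is an assumption of this sketch.
ENTREZ = ("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
          "?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=")


def pubmed_link(fasta_text):
    """Build one link from the unique PubMed IDs found in fasta headers.

    IDs are kept in order of first appearance; sequence lines are ignored.
    """
    seen = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # not a header line
        for pmid in PMID_FIELD.findall(line):
            if pmid not in seen:
                seen.append(pmid)
    return ENTREZ + ",".join(seen)


# Placeholder records: the IDs and residues are made up for illustration.
example = """>PPINa_cioInt Ciona intestinalis PMID:11111111
MNGTEG
>RGR1_cioInt Ciona intestinalis PMID:22222222
MAESGT
>PPINb_cioInt Ciona intestinalis PMID:11111111
MSSNAT
"""

link = pubmed_link(example)
print(link)
```

Duplicated IDs are collapsed so that a paper describing several opsins appears once in the resulting window.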

OpsineyePhylo.png

Figure adapted from: Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics Hervé Philippe et al PLoS ONE. 2007 Aug 8

Chondrichthyes: Callorhinchus milii (elephantshark) .. 13 opsins

Five ray-finned fish genomes are available but these have major lineage-specific expansions and are quite derived. Some sequences are available from lobe-finned fish and a coelacanth genome has been proposed. This makes the preliminary genome assembly of the much earlier diverging Callorhinchus (oft-misspelled) and the skate transcripts very special because they are the "last stop" before lamprey.

This large-eyed cartilaginous fish lives at depths down to 200 m on the continental shelf of southern Australia and New Zealand but migrates into coastal estuaries to lay egg cases (lower image) in sand and muddy substrates. The distinctively-shaped egg cases are sometimes found washed ashore after storms. They are up to 25 cm long and 10 cm wide, and take up to eight months to hatch. The one studied member of the genus has a vitamin A1-based photopigment with maximum absorbance at 499 nm, presumably adapted to its overall photic environment.

I made an exhaustive search of the WGS and Trace divisions of GenBank on 5 Nov 2007, recovering many complete exons but mostly fragmentary genes. The Opsin Classifier can easily place these fragments. Overall, Callorhinchus appears to have a full complement of vertebrate opsin genes. The exceptions are RHO2, SWS1, SWS2 (oddly also missing in skate and dogfish ESTs), apparently leaving elephantshark with only RHO1 and LWS rod/cone pigments. Parietopsin was also missing. Two encephalopsin- and two melanopsin-class opsins were found. The RGR, peropsin, and neuropsin genes will prove important in better determining their overall gene tree placement (which an October 2007 opsin phylogeny paper placed deeply within rhabdopsins).

Agnatha: Petromyzon marinus (lamprey) .. 7 opsins

Lamprey are the favored outgroup to jawed vertebrates. Geotria and Petromyzon split 280-220 myr ago (helpful in breaking a long branch in half) whereas Lethenteron-Petromyzon was much later 20 myr (unhelpful, too close). Their photoreceptor systems have been studied in considerable depth.

Geotria australis has a full complement of imaging opsins LWS, SWS1, SWS2, RHO2, and RHO1 as well as the other major opsin classes. This implies that the ancestral vertebrate possessed photopic (bright light) cone-based vision with the potential for pentachromacy, circadian rhythm, pupillary reflex, and pineal and parapineal functions. Photoreceptor morphology and spectral sensitivity can change during various phases of the lamprey lifecycle, a lineage-specific complication that does not concern us. The larvae are reported blind for a period of 3 to 4 years, during which time they live in burrows or hide under stones; non-ocular opsins might still be operative.

The lamprey genome project has stalled, despite accruing an impressive 19 million traces. It seems a 16 kbp retroposon has expanded enormously, which in conjunction with very high AT content and heterozygosity makes assembly of the 2.4 Gbp genome all but impossible. However the blast page at WUSTL offers a Petromyzon "3.0 supercontig" option, though it was not working on 7 Dec 07.

New sequencing technologies make an immense collection of cDNA affordable. This would permit a pseudo-assembly of exons flanked by at least some genomic DNA. That is, transcripts are aligned into the trace archives to obtain non-coding context for exons. This allows some topics to be studied (intron retention, invariant non-coding) but not others (upstream regulatory, chromosomal gene order). Labelling techniques such as FISH could provide some linkages but not gene-level order; perhaps this could be done at the level of BACs using exons from all possible gene pairs. Alternatively, the genome of Geotria might present fewer issues.

The high lamprey trace coverage is very helpful to this project. Because most of the opsin sequences are from non-genomic species, the intron structures were not known. I've mapped all GenBank opsin data for Geotria and Lethenteron into Petromyzon using these traces, not only obtaining a nearly full spectrum of new sequences but also sequences parsed for intron breaks and phase. Further, opsin classes not available in other lamprey were collected using chondrichthyes genes as query. In all cases, intron placement remains perfectly conserved from lamprey to mammal, indeed to amphioxus with some complications. These could help validate orthology candidates in earlier species, especially to clarify the presumably ancient but still cryptic origin of imaging opsins. Intron gain and loss is rare in many lineages -- the vast majority of human introns are conserved back to Urbilatera, indeed to sponge.

Agnatha: Eptatretus burgeri (hagfish) .. 0 opsins

Hagfish, after decades of back-and-forth, are now sistered with lamprey, news not accepted yet by the Taxonomy division of GenBank. However including rogue taxa such as Oikopleura can severely skew results in molecular phylogeny studies. Monophyly of Cyclostomata is unfortunate if true -- an earlier node would very much help in understanding the origin of the eye. It would be better to put aside tree topology until rare genomic events can be developed and simply proceed with opsin sequencing.

Jawless fish first appeared in the Ordovician. Hagfish and lamprey split well after the Cambrian, roughly 430 myr ago according to molecular clocks. That's a time span comparable to divergence of human from shark. The oldest fossil hagfishes are Late Carboniferous (330 myr). The two extant hagfish groups split some 75 myr ago (human from mouse). Only recently has it been possible to obtain hagfish eggs and embryos and revisit the neural crest issue. Hagfish experienced an extra round of HOX gene expansion, undercutting both HOX cluster copy number as a hallmark of supposed vertebrate body plan innovation and premature speculation on 1R and 2R whole genome duplication.

Hagfish are nocturnal in aquaria and deep-sea in their natural habitat -- a new Eptatretus species was even captured at a hydrothermal vent. This lifestyle is not conducive to a well-preserved imaging opsin portfolio, though hagfish still have circadian rhythm (based in the preoptic nucleus) and dermal photoreceptors though no pineal gland. The non-imaging paired eyes lack cornea, lens, vitreous body, and extrinsic eye muscle but nonetheless the retina and optic nerve react with opsin antibody. The eyes are larger in Eptatretus than in Myxine, where they are partly covered by the trunk musculature. However 1.3 mm is still quite small for an eye. After comparison of all extant genera, Fernholm and Holmberg concluded in 1975 that the hagfish eyes are secondarily degenerated from more conventional eyes adapted for shallow water (for example an early lens placode disappears). The comparative anatomy of hagfish eyes has an excellent discussion in The Biology of Hagfishes (JM Jorgensen), pages 542-543 available by google book search.

No hagfish opsins have been sequenced; no genome project is scheduled. Even if hagfish imaging opsins are mostly gone, other ciliary and rhabdomeric opsins could be quite informative. Hagfish may have information about a critical era in deuterostome imaging opsin evolution.

Urochordata: Ciona intestinalis (tunicate) .. 4 opsins

Tunicates occupy the strategic urochordate position in the phylogenetic tree. Three tunicate genomes have been sequenced. These proved disappointing for comparative genomics due to their derived nature, which adversely impacts coding sequence divergence, gain and loss of genes, overwriting of ancestral introns, almost total loss of gene order, and high positional heterozygosity. It may not be possible to find more conservatively evolving tunicates if rapid generation time and free spawning are characteristics of all extant urochordates. Yet in other aspects, such as reconstructing the evolutionary trajectory of the vertebrate eye, contemporary tunicates may have retained critical information.

The most useful of these rogue genomes is Ciona intestinalis; Oikopleura dioica and Ciona savignyi have been all but abandoned as model organisms. Halocynthia roretzi has many cDNAs but has not been evaluated for genomic characters, whereas Ciona intestinalis has been developed extensively as an experimental system; its massive cDNA coverage allows recovery of complete coding gene models which would be nearly impossible from mere homology alignments.

Fortunately, both larval and adult photoreception have been thoroughly studied. Like the cephalochordate Branchiostoma, Ciona lacks imaging eyes and thus any counterparts to rod and cone opsins. The relative topology of these two with respect to the vertebrates has tilted in recent years towards amphioxus as outgroup -- we'll check later if rare genomic events in opsins support that picture.

Opsins cii larval eye.png


The tadpole larva CNS contains 335 cells of 13 types. These include 30 retinal photoreceptor cells in an unpaired ocellus and 5 accessory cells -- 3 for an ocellus lens-like structure, 1 for the pigment cup, and 1 pigment cell in the otolith (inconsistent with a hydrostatic sensing role for its 19 receptors). The pigment cells of the ocellus and otolith form an equipotent developmental equivalence group -- a bilateral pair of cells in the blastula gives rise to the otolith and ocellus melanocytes whereas the retina arises from both left and right cell lineages. The observed genomic complement of opsins may largely come into play in the larva because the adult is sessile with little resemblance to vertebrates. The larvae are non-feeding, which scarcely fits a super-predator role for early deuterostome opsins.


An evidently ciliary opsin called Ci-opsin1 is expressed in the larval ocellus (stored here as PPINa_cioInt). The Opsin Classifier places this in the PPIN/PIN/VAOP group with best match 44% identity, quite respectable given a billion years of roundtrip evolutionary time. As noted initially by Kusakabe et al in 2001, this opsin shares 3 identical introns with the vertebrate group.

Today there are 25x as many opsin sequences available with much greater phylogenetic dispersion. It appears Ci-opsin has 2 new intron insertions relative to the ancestral Gt ciliary opsin 4-intron pattern 0.2.0.0. This pattern is specific (not shared by Gq melanopsins nor Go retinal isomerases) and diagnostic (disregarding a few lineage-specific gains and losses) -- see documentation.
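A comparison of this kind can be sketched by treating each intron as a (position, phase) pair and asking which ancestral introns survive in a given gene. In the sketch below only the phase string 0.2.0.0 comes from the text; the alignment positions, the second phase string, and the function names are hypothetical values chosen for illustration.

```python
def parse_introns(positions, phase_pattern):
    """Pair aligned codon positions with phases from a dotted pattern like '0.2.0.0'."""
    phases = [int(p) for p in phase_pattern.split(".")]
    if len(phases) != len(positions):
        raise ValueError("one phase is needed per intron position")
    return set(zip(positions, phases))


def compare_introns(ancestral, observed):
    """Return (shared, gained, lost) intron sets relative to the ancestral pattern."""
    shared = ancestral & observed
    gained = observed - ancestral
    lost = ancestral - observed
    return shared, gained, lost


# Hypothetical alignment coordinates, for illustration only.
ancestral = parse_introns([120, 177, 232, 312], "0.2.0.0")
observed = parse_introns([60, 120, 177, 232, 280, 312], "1.0.2.0.2.0")

shared, gained, lost = compare_introns(ancestral, observed)
print(len(shared), sorted(gained), sorted(lost))
```

An intron counts as shared only when both position and phase agree, which is the sense in which the pattern is used as a diagnostic here.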

Three ancient ciliary introns were already established at the time of amphioxus and tunicate encephalopsin divergences. Indeed they already occur in sea urchin, ragworm, mosquito, moth, and beetle ciliary opsins. Consequently they were present in the parent ciliary opsin of Urbilatera and no doubt Cnidaria. There's nothing surprising about this because the vast majority of (human) coding introns originated far earlier in unicellular eukaryotes and have been conserved ever since. Outside of rogue lineages such as Drosophila, nematode, and tunicates, event rates for intron gain and loss are perhaps 1-2 per five billion years of branch length. Convergence is not favored because 333 aa sites x 3 intron phases gives roughly 1000 distinct possibilities in an opsin-sized protein -- for an already very rare event to happen twice at the same site in the available branch length requires predisposing factors.
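The arithmetic behind that argument can be made explicit with a rough null model. The sketch below assumes insertions are uniformly distributed over site and phase, independent between lineages, and Poisson-distributed in time; none of these assumptions comes from the text, which in fact argues that real convergence needs predisposing factors. The 500 Myr branch length is a hypothetical value.

```python
import math

AA_SITES = 333   # opsin-sized protein, from the text
PHASES = 3       # intron phases 0, 1, 2
SLOTS = AA_SITES * PHASES  # distinct (site, phase) possibilities


def expected_gains(events_per_5_byr, branch_myr):
    """Expected intron gains on a branch, given a rate per five billion years."""
    return events_per_5_byr * branch_myr / 5000.0


def convergence_chance(events_per_5_byr, branch_myr, slots=SLOTS):
    """Chance that two lineages each gain an intron and both land in the same slot.

    Null model only: Poisson gains, uniform and independent slot choice.
    """
    lam = expected_gains(events_per_5_byr, branch_myr)
    p_gain = 1.0 - math.exp(-lam)   # at least one gain on a branch
    return p_gain * p_gain / slots


# Upper rate from the text (2 per five billion years); hypothetical 500 Myr branches.
lam = expected_gains(2, 500)
p = convergence_chance(2, 500)
print(SLOTS, lam, p)
```

Under these assumptions the chance of coincidence is tiny, which is why a shared position and phase is read as common descent rather than convergence.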

We will use these deep intron characters later to supplement -- and even trump -- maximum likelihood inference from primary sequence divergence, which captures the broad picture but fails to resolve the issues of most interest. With opsins, alignment (at these time depths and rates of change) hits the wall of generic rhodopsin superfamily and indeed generic GPCR proteins, which numbered many hundreds at the time of Urbilatera. There are already many constraints on proteins which must have seven transmembrane helical segments, covalently bind retinal with a lysine and counterion, and interact with a heterotrimeric signalling protein.

With the genome in hand, we can see Ci-opsin1 has an unstudied paralog (here called PPINb_cioInt) of 58% identity and identical introns (other than a new phase 21 intron breaking exon 4). There is no expression data for the paralog in the UCSC browser track but it cannot plausibly be a pseudogene due to the conserved nature of amino acid replacements, so we wonder about subfunctionalization. The hybridization experiment will have to be repeated at various life stages. Paralog lambda max might be computable or measurable in a construct. The 1999 experiment (which measured speed-up in swimming after light decreased, reminiscent of the pineal-mediated frog tadpole response) deduced a lambda max of 505 nm -- perhaps that was a composite action spectrum. The new paralog in fact conserves the key lysine and counterion.

We can hope that photoreception in Ciona retains ancestral characteristics that descended intact, at the same time knowing evolution of protein sequences and development have not stood still for 600 million years. Ciona photoreception may have both degenerative and innovative aspects. It is premature to homologize ocellus with pineal (or amacrine or horizontal retinal cells etc) until all the opsins in the Ciona genome have assigned roles (not to mention dozens of other genes). I suggest from evo-devo equipotency that the paralog opsin functions as a photoreceptor in the otolith.

Here neural integration of hydrostatic pressure signaling with brightness directionality could advantageously inform the larva of its position and orientation even in a murky water column and help with dispersal and settlement. A pigment cell is hardly needed for hydrostatic pressure sensing -- what functionality would maintain it over evolutionary timescales? The function of pigment cells is blocking light from the back, here so the larva knows up from down. Curiously, a crystallin of definite homology to refracting vertebrate lens crystallins is expressed in the otolith but not ocellus lens-like accessory cells. The statocyte itself is sprung by its footpiece and two fibrous structures, all synaptotagmin-positive. Movement of the statocyte would be detected by these three structures and thus sense gravitational orientation.

We're left wondering if the speculative otolith/photoreception connection in urochordates has any connection to the balance sensory system (vestibular apparatus) in the vertebrate brain. The otolithic organs (utricle and saccule) detect inertial movement using tiny calcium stones (otoconia) coupled to hair cells. The Allen Brain Atlas could be explored on vestibular sections for extremely detailed expression of most opsin genes. The vestibular system coordinates extensively with the visual system via the vestibulo-ocular reflex. If true, this could radically affect homologization.

Opsins cii paralogs.png


Possibly this ciliary paralog pair descended from a gene duplication already present in the last common ancestor, leading after still more gene duplications to the current portfolio of vertebrate ciliary opsins. This would account for its ambivalent behavior in the Opsin Classifier with respect to the PPIN/PIN/VAOP group. Alternatively the pair might represent a tunicate-specific duplication of secondary interest. Ciona savignyi has a clear ortholog (88% identity) to PPINa_cioInt but a lesser match at 59% to PPINb_cioInt, in both cases with identical introns (not an unusual pattern in gene duplications assuming PPINa_cioInt continues the original function). C. savignyi -- which is only in the same genus from a severe anthropocentric perspective -- helps gauge the rate of evolution of C. intestinalis opsins.
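Percent identities such as the 88% and 59% quoted here depend on how gaps are counted, and tools differ in the denominator they use. As a minimal sketch, the function below counts identity only over alignment columns where both sequences have a residue; that convention is an assumption of this example rather than the method used for the figures above, and the two short aligned strings are made up.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a, aln_b):
        if a == gap or b == gap:
            continue  # skip gapped columns entirely
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up aligned fragments, for illustration only.
identity = percent_identity("MNGTEGPNFY", "MNGSEG-NFY")
print(round(identity, 1))
```

Counting gapped columns in the denominator instead would lower the value, so comparisons between paralogs should use one convention throughout.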

Photoreception in the adult ascidian, which might seem gratuitous in a sessile filter-feeder, has not been studied in quite such detail. However several non-opsin expression studies suggest that adult photoreceptors may develop about pigmented spots around oral and atrial siphons, epithelial cells of sperm duct and cerebral ganglia, involving behaviors such as siphon contraction, phototropism, and gamete release. The anterior photoreceptor of the oral siphon has even been homologized to vertebrate lateral eyes.

We'll see below that exactly the same problem as above (undocumented paralog) may affect interpretation of a comprehensive experimental study of Ciona Ci-opsin3 (RGR1_cioInt at the Opsin Classifier). Here too I was able to recover a related second gene in both C. intestinalis and C. savignyi. This illustrates the power of genomics -- provided coverage is complete, a full complement of bioinformatically extracted opsins can guide experimental design from the beginning. A full set of opsin classes should be sought in the genome, even if their degree of sequence divergence and lack of transcripts makes this difficult.

Kusakabe, Tsuda and coworkers have studied the overall visual cycle -- a much better approach than considering opsins in isolation for purposes of homologization. Recall that incident photon absorption by rhodopsin isomerizes 11-cis-retinal to all-trans. Without recycling or fresh cis-retinal, this would soon exhaust vision. In mammals replenishment of the visual cycle (retinal isomerase, RGR) takes place in retinal pigment epithelial cells which are distinct from the photoreceptor cells, unlike Lophotrochozoa where the cycle is completed within the photoreceptor cell. What about Ciona? We might expect a mixed system since the deuterostome divide preceded the deuterostome photoreception divide with Ciona occupying a strategic phylogenetic position.

If life were simple, Ciona would have strict 1:1 orthologs to the 4 protein components of the mammalian visual cycle: RGR (Ci-opsin3), cellular retinaldehyde-binding protein CRALBP, β-carotene monooxygenase BCO, and retinal pigment epithelium protein RPE65. At this phylogenetic depth, we can expect a certain degree of non-parallelism between photoreceptor systems and complications from lineage-specific duplication and subfunctionalization, not to mention lack of exact mammalian counterparts to Ciona larval and adult stages.

It turns out (using closest homologs) that Ci-BCO is predominantly expressed in larval ocellus photoreceptor cells, whereas Ci-RPE65 is not significantly expressed there nor in larval brain vesicle but rather in photoreceptor cells of the neural complex (a photoreceptor organ of the adult) right along with Ci-opsin3 and Ci-CRALBP (ie, like cephalopod). It appears the larval visual cycle uses Ci-opsin3 as restorative photoisomerase whereas the adult visual cycle uses Ci-RPE65. The remote paralog RGR2_cioInt was not studied and its role remains speculative. Given its degree of divergence yet persistence in a second ascidian, it is an old gene duplication maintained somewhat by selective pressure.

What about rhabdomeric opsins in Ciona? We know that melanopsin persisted into vertebrates so it must have been present at the common ancestor with ascidians. Rhabdomeres themselves as a subcellular opsin housing specialization did not persist so their apparent absence in Ciona does not imply the absence of melanopsin.

A Ciona melanopsin could be very diverged. The best possible search involves tblastn of the Ciona assembly and GenBank est_others with a variety of queries (since the best query is not known in advance; after the fact it is provided by the Opsin Classifier). Reconstructed ancestral melanopsins can improve on specific species queries by eliminating half of the roundtrip divergence.

However overly sensitive queries have the risk of merely returning generic rhodopsin-superfamily members (notably ADRA1A adrenergic receptor). While these won't receive clean approval from the Opsin Classifier, any putative melanopsin must be secondarily validated by retention of intron pattern, synteny with vertebrate melanopsins (unlikely in Ciona), and internal amino acid signatures of authentic melanopsin-type photoreceptors.

I evaluated various candidates using a wide variety of probes (such as echinoderm melanopsin) on 11 Dec 07 but none led to a convincing urochordate melanopsin. Absence of evidence is not evidence of absence but it appears that Ciona did not retain a melanopsin.

Echinodermata: Strongylocentrotus purpuratus (sea urchin) .. 6 opsins

The sea urchin genome carried a big surprise: the previously dismissed echinoderm has a large set of genes for sensory and signalling capability (comparable in number to human). These include at least six opsins relevant to our purposes. Adult sea urchins exhibit a variety of responses to light intensity: shelter seeking, covering reactions, diurnal migrations, and spine defense reaction to shadowing. Various pedicellariae (jaw-like appendages around the base of spines) keep the body surface clear of encrusting organisms and aid in food capture. Larvae don't have evident eyes but do express an opsin in the post-oral arm, suggesting some capabilities.

Opsin urchin expr.png

Because sea urchins are seriously diverged, it is difficult to recover accurate full-length sequences by homology, especially in poorly conserved termini, without transcript evidence. At this point, only one of six urchin opsins has any cDNA support -- and that from a different species of urchin! That melanopsin interestingly consists of a single exon -- evidently retroposed but still functional -- for which no parent gene can be located. It is not unusual for a descendant gene to supplant the multi-exonic parent, perhaps by accident, perhaps because of transcriptional efficiency considerations.

The two peropsin-class Go urchin sequences are adjacent in parallel tandem configuration with identical intron pattern but have only 64% amino acid identity, consistent with a moderately old tandem duplication. Despite additional weak members of this group, care must be taken not to drift off-topic into the greater rhodopsin superfamily of GPCRs (979 genes in 70 families annotated in urchin).

The sea urchin genome contains one very clear ciliary opsin (called PIN_stoPur in the sequence storage area). Here the GLEAN3_05569 prediction from Baylor appears entirely accurate whereas GenScan and GNOMON XM_778209 and XM_001177470 are impossibly flawed. The Opsin Classifier classifies this somewhat ambiguously within pinopsin-encephalopsin, suggesting it might seed a new ciliary opsin class. The intron pattern 0.2.2.0.0 is a perfect match in position and phase to pinopsins. Indels will be considered when the global alignment is revisited. It has no detectable counterpart in the Saccoglossus genome.

There appears to be a second ciliary opsin (stored as ENCEPH_strPur). It is best fit to Branchiostoma and Platynereis ciliary opsins but only at 33% identity and not that distant to certain melanopsins. This opsin too is likely to be involved in some aspect of photoreception, though that won't be as closely related to vertebrate imaging as PIN_stoPur.

The two remaining opsins classify as rhabdomeric melanopsins. One of them, MEL2_strPur, has a GenBank transcript DQ285097 alluding to an unpublished expression study concerning tube feet photoreceptors. The other melanopsin is expressed in the post-oral arm of two-week-old larvae.

Hemichordata: Saccoglossus kowalevskii (acornworm) .. 1 opsin

This surviving member of early branching deuterostomes has excellent genomic and transcript coverage, with diverse full length multi-exon genes often recoverable. Be aware that transcript data has been misplaced by GenBank to reside under Saccoglossus 'other' at the trace archives. However acorn worm may not illuminate photoreception at its ancestral node with echinoderms (Ambulacraria). Acorn worms have isolated photoreceptive cells scattered through the epidermis but no eyes or eye spots even as planktonic larvae, as befits an animal that settles in its burrow on day two. Light striking epidermal photoreceptors elicits burrowing behavior.

In searching for opsins that might underlie epidermal (or other) photoreception, the best queries (for detecting diverged sequences) are likely sea urchin opsins against Saccoglossus 'other' because, being transcripts rather than short exons, these give longer matches. However opsins expressed in scattered cells may not be represented there if rarely transcribed. Promising traces must be back-blastxed against the Opsin Classifier because the initial query choice may have been sub-optimal. Good matches can then be intronated using the exact-matching Saccoglossus probe against genomic reads. Intron patterns are critical adjuncts to low sequence alignments in establishing opsin orthology classes because surviving synteny at this time depth is rare.
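The back-blast step can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used: it assumes the back-blast results were saved in the standard 12-column BLAST tabular layout, and that the reference set contains both opsins and generic GPCRs. The read names, the `GPCR_generic1` label, the e-value cutoff and all scores are hypothetical.

```python
# Column names of the standard 12-column BLAST tabular layout (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_tabular(text):
    """Turn BLAST tabular text into a list of dicts with numeric scores."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(FIELDS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        hits.append(row)
    return hits


def best_hits(hits):
    """Keep only the highest-scoring subject for each query."""
    best = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


def passes_back_blast(back_blast_text, opsin_ids, max_evalue=1e-5):
    """Return candidate reads whose best back-blast hit is a reference opsin."""
    kept = {}
    for query, hit in best_hits(parse_tabular(back_blast_text)).items():
        if hit["sseqid"] in opsin_ids and hit["evalue"] <= max_evalue:
            kept[query] = hit["sseqid"]
    return kept


# Hypothetical back-blast output: read_A scores best against an opsin,
# read_B scores best against a generic GPCR and is therefore dropped.
demo = "\n".join([
    "read_A\tPERa_braFlo\t39.4\t246\t140\t3\t1\t738\t20\t265\t2e-49\t195",
    "read_A\tGPCR_generic1\t28.0\t200\t130\t5\t10\t600\t15\t210\t1e-12\t70",
    "read_B\tGPCR_generic1\t35.0\t220\t130\t4\t1\t660\t30\t250\t1e-30\t120",
    "read_B\tPERa_braFlo\t25.0\t180\t120\t6\t5\t540\t40\t215\t1e-08\t55",
])
reference_opsins = {"PERa_braFlo"}
print(passes_back_blast(demo, reference_opsins))
```

Judging each read by its single best hit, rather than by any hit to an opsin, is what guards against a poorly chosen initial query: a read that matches an opsin only weakly but a generic GPCR strongly is set aside.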

I report here the very first hemichordate opsin. The process for obtaining accurate gene models in such a remote species is a difficult exercise in bioinformatics and use of the Opsin Classifier -- a detailed procedure is given in a separate note. That Saccoglossus opsin classifies unambiguously to the peropsin/neuropsin group by similarity and intron structure and thus is not a strong candidate for the epidermal photoreceptor -- but it suggests the story is not that simple. A complete set of opsins for these species best awaits release of assembled contigs by the Baylor sequencing center.

>PER_sacKol Saccoglossus kowalevskii Expect = 2.0e-49 PERa_braFlo Identities = 97/246 (39%)
IIYYFFLLSTGLTIFGMSLSCVSSF GRWLFGKFGCYFHGFAGMLFGLGSIGNLTVISIDRYIITCKRSL 1
2 WSYRHYYALLAVAWSNALFWSMMPLFGWSSYALEPEGTSCTIDWMNNDNQYISYVSCVTVTCFILPCAVMTYDYLAAYMKMVKAGYTLSEETEKPNND 0
0 MCIALVAAFLLSWFPSATVFLWAAFGNPGNIPLSFTGVADAFTKIPAVFNPVIYVALNPEFRKYFGKTIGCRRKRKKPIAVRLNGSEQNVENTI* 0
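Records in this exon-per-line layout can be read mechanically. The sketch below assumes that the digits flanking each exon give the number of codon bases overhanging that boundary, so that the trailing digit of one exon and the leading digit of the next together complete whole codons (as with the 1 / 2 and 0 / 0 pairs above). That reading of the notation is an assumption, and the toy record is hypothetical.

```python
def parse_exon_record(text):
    """Split an exon-per-line record into (lead_digit, sequence, trail_digit) tuples."""
    header, exons = None, []
    for line in text.strip().splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            continue
        tokens = line.split()
        if not tokens:
            continue
        # A leading or trailing token made only of digits is a boundary marker.
        lead = int(tokens.pop(0)) if tokens[0].isdigit() else None
        trail = int(tokens.pop()) if tokens and tokens[-1].isdigit() else None
        # Remaining tokens are sequence; internal spaces are simply closed up.
        exons.append((lead, "".join(tokens), trail))
    return header, exons


def boundary_pairs(exons):
    """Pair the trailing digit of each exon with the leading digit of the next."""
    return [(a[2], b[0]) for a, b in zip(exons, exons[1:])]


def consistent(pair):
    """Under the overhang assumption, the two digits must sum to a multiple of 3."""
    up, down = pair
    return up is not None and down is not None and (up + down) % 3 == 0


# Hypothetical toy record in the same layout as the entry above.
toy = """>TOY_opsin hypothetical record
MKTAYIAK QRQISFVK 1
2 SHFSRQLEERLGLIEV 0
0 QAPILSRVGDGTQDNLSGAEKAV* 0
"""
header, exons = parse_exon_record(toy)
pairs = boundary_pairs(exons)
print(header, pairs, all(consistent(p) for p in pairs))
```

A pair that fails the check usually points to a misplaced splice boundary or a copying error in the record.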

Deuterostomia: Xenoturbella bocki + Convoluta pulchra .. 0 opsins

These two taxa have recently been put forward as new phyla of basal deuterostomes, the former as outgroup to echinoderms plus hemichordates, the latter acoel flatworms as more basal still. However sequence data is extremely sparse with 3,127 sequences for all of Acoelomorpha, and Convoluta pulchra evolving far too fast for practical use, with its tree position controversial.

No genome or major transcript studies are under consideration. A quick check via tblastn of sea urchin opsins against available transcripts does not turn up good opsin candidates as of 28 Nov 2007 (other than a weak melanopsin match in Convoluta, EV602614, that might instead be generic GPCR). No information about photoreception in these species is readily available. While the above two taxa might not be ideal for opsin purposes, extant species are very limited.

Lophotrochozoa: Platynereis dumerilii (polychaete) .. 3 opsins

This small annelid may be an emerging model organism, though plans for genome sequencing in France have apparently collapsed. (Indeed all metazoan genomic sequencing in Europe has ceased.) Three recent papers have established that Platynereis qualifies as a living fossil, at least with respect to ancestral anatomy and development, slowly evolving protein sequences and retention of genes and ancestral introns, and further has retained ciliary opsins.

That is to say, fruit fly and nematode have proven unfortunate choices because of so many lost genes and signalling pathways, rapid evolution and highly derived characters. Lophotrochozoa may thus give us very significant insight into the bilaterian ancestor that had appeared lost from consideration of just Arthropoda. It should be noted though that not all insects should be written off: Anopheles has also retained ciliary opsins.

Opsin platynereis.png

Platynereis develops various pairs of eyes going by opsin expression: inverse larval eyes used in phototaxis (just one pigment cell and one photoreceptor cell) and two pairs of everse adult eyes needed for adult vision. These originate from an initially unsplit single anlage. These eyes use exclusively rhabdomeric photoreceptor cells and corresponding rhabdomeric-class opsins as expected from phylogenetic position. However two paired structures in the developing median brain dorsal to the apical organ express an opsin that unambiguously classifies as ciliary. Further, a retinal homeobox (specific to ciliary pineal eyes) and circadian rhythm regulator bmal are also expressed at this location in Platynereis. However the pigment cells necessary for directional photoreception are missing. This all fits with a role for the ciliary opsin as the primary receptor underlying circadian rhythm.

The emerging picture is Ur-Bilatera having both ciliary and rhabdomeric structures. The latter structure was lost but the photoreceptor component retained in vertebrates in the form of melanopsins in retinal ganglion cells.

Remarkably using the transcript ciliary opsin as probe on a 171,779 bp forgotten high throughput genomic sequence CT030681, I found a second ciliary opsin fragment with encephalopsin character which could be intronated (unlike the original). This needs confirmation of its signalling partner through study of diagnostic binding residues. One of the exons lies on the minus strand suggesting partial assembly of the 6 unordered contig pieces. (These are stored in the Opsin Classifier as CILL1_plaDum and CILL2_plaDum, resp.). The only other apparent coding gene on this contig (blastx vs metazoan proteins) is alpha-tubulin, positions 124517-122811, downstream from the second ciliary opsin at 46848-87956 using GenBank contig order.

That's a striking coincidence in that antibody to acetylated alpha-tubulin was used as marker for stabilized microtubules in cilia and axons -- almost close enough for coupled expression especially when that contig piece is flipped (reverse complemented so that all the opsin exons are on the same strand). That brings the intergene distance to 8200 bp (end of opsin 53730 - start of tubulin 61901). That's something to keep in mind in terms of establishing syntenic orthologs in other lophotrochozoa.
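The coordinate arithmetic behind that estimate can be written out as below. This is only a sketch: the flip helper shows the general rule for mapping a 1-based inclusive interval onto the reverse complement of a contig piece, and its example numbers (101, 250, piece length 1000) are hypothetical since the piece boundaries are not given here. Only the 53730 and 61901 coordinates come from the text.

```python
def flip(start, end, piece_length):
    """Map a 1-based inclusive interval onto the reverse complement of a piece."""
    return piece_length - end + 1, piece_length - start + 1


def distance(upstream_end, downstream_start):
    """Distance from the end of one feature to the start of the next."""
    return downstream_start - upstream_end


# Coordinates quoted in the text: end of opsin, start of tubulin.
print(distance(53730, 61901))

# Hypothetical interval on a hypothetical 1000 bp piece.
print(flip(101, 250, 1000))
```

The first call gives 8171, which the text rounds to 8200 bp. Flipping preserves the length of an interval and only changes which end comes first.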


Opsin parallels.png

Opsin everse.png



Lophotrochozoa: Capitella sp (annelid) .. 2 opsins

Capitella is a small segmented benthic marine worm most closely related, in the genome project sense, to its fellow annelid Platynereis. The taxonomy of the genus Capitella was thoroughly muddled by a quaint 1976 starch gel electrophoresis allozyme study; Linnean nomenclature has never been developed for the 6 alleged species defined there. The isolate used in the JGI genome project is called Capitella sp. I ES-2005 instead of Capitella capitata.

The last of 3,709,316 trace reads were taken in Nov 2005. As with Lottia, a multi-year lag ensued in release of the assembly, deposition in GenBank, and publication of central paper. As of Dec 2008, the only access to the genome is through JGI Blast. The genome is small at 240 Mb and distributed across 10 chromosomes.

This is a subsurface deposit feeder associated with organic-rich mud, seemingly not conducive to an extensive visual system. However an extensive 1993 study of both larval and adult eyes was published in the now-defunct Journal of Morphology (online access $25). Developing larvae have a pair of eyespots consisting of one sensory cell, one pigment cell, and one support cell. The photoreceptor cell has an array of parallel microvilli with cisternae. It is surrounded by a diaphragm formed by a pigment cell ring of microvilli-like structures. These last but a few days because at metamorphosis the larval eyespots are greatly reduced. Adults have one pair of eyes built of 2-3 pigment cells and one sensory cell in juveniles increased by 2-3 more in adults.

Unusual morphological aspects of Capitella eyes can be placed within the overall context of photoreceptor cells and eyes in Annelida as reviewed by Purschke et al. in an off-PubMed journal "Arthropod Structure & Development" v35:4, 2006 where viewing full text costs $175. In addition to rhabdomeric and ciliary types, less-known phaosomous photosensory cells are discussed. Phaosomes (Greek: phaos = light, soma = body) were first described in the earthworm dermal photoreceptors as a central intracellular cavity (phaosome) filled with microvilli but may represent a derived form. They occur at various extraocular sites such as dermis and genitalia (in butterflies). Multiple types of photoreceptors thus provide a potential role for the diversity of opsins observed in the genome.

I found a number of related opsin fragments in Capitella using various queries but surprisingly no counterpart to Platynereis ciliary opsins. One, stored as MEL1_capCap, clusters consistently with melanopsins and shares two exon breaks. It may be an ortholog of the rhabdomeric Platynereis opsin. The second MEL2_capCap is more distantly related. Reliable full length genes will require a cdna program which so far is totally lacking.

Opsin capitella.png


Lophotrochozoa: Helobdella robusta (leech) .. 2 opsins

The JGI genome project for the leech Helobdella robusta is well along with 3,168,749 traces, a very recent assembly to blast, but no cdna. The genome is fairly small at 300 Mb but does not appear reduced in terms of gene count. Fifteen unannotated 100 kbp contigs are available at the HTG division of GenBank; these do not contain opsins but might otherwise suggest gene and retroposon densities and extent of synteny retention. The genome had not been submitted to GenBank by Dec 07.

Helobdella could be considered a promising emerging experimental system because techniques such as large-scale whole-mount in situ hybridization screening, RNA interference, and morpholino knock-down are established. It's not clear however that leech retains the same degree of ancestral characters as nereid polychaetes. Until a cdna program is established, it will prove very difficult to annotate complete coding genes. The nearest species with a transcript program is the earthworm Lumbricus rubellus with 19,934 ests (but no opsins).

Helobdella is a rhynchobdellid, which is to say (ελεο marsh, ῥύγχος snout, βδελλα leech) a California marsh leech with a muscular straw-like proboscis in a retractable sheath for puncturing prey. Thus it is not closely related to the medicinal leech, Hirudo medicinalis. The anatomy of the closely-spaced single pair of eyes was intensively studied 40 years ago. An eye in this group consists of 30-100 photoreceptive cells in a deep pigment cup providing directional vision. Larvae are not free swimming but stay in the albuminous fluid of a cocoon. The 88 PubMed articles include many on body plan gene expression but only two on eyes and these tangentially. We can only hope the genome project will stimulate additional studies of leech photoreceptors. It seems that every lab uses a different strain if not a different species.

I recovered two Helobdella opsin genes on 4 Dec 07 from the erratic JGI server (if no matches, close and restart with a fresh window). The full length gene, stored as MEL2_helRob, has 2 conserved introns characteristic of melanopsins and has its best matches there. It is likely an ortholog of a similar gene in Schistosoma, Schmidtea, Capitella, and Platynereis. The 231 aa fragment stored as MEL2_helRob has best match to octopus and chordate melanopsins and shares the first (and possibly second) intron position and phase with them. The parent scaffold 39 may contain tandem opsins or alternatively represent a misassembly. No counterpart to the ciliary opsin of Platynereis emerged. That gene -- which must have been present in the common ancestor with annelid -- could have been lost or is simply missing from the current assembly.

Opsin helobdella.png


Lophotrochozoa: Aplysia californica (sea hare).. 2 opsins

Aplysia has a pair of cephalic dorsal pit eyes just anterior to the rhinophores. The eyes are quite small at 600 microns diameter, with a spherical lens and a tiny one square millimeter retina with approximately 7000 rhabdomeric photoreceptors. Despite a fair number of studies of eyes and rhinophores involved in vision, circadian rhythm and phototactic head-waving, the opsins have not been characterized beyond immunoblot (positive for retinal photoreceptors, rhinophores, cerebral ganglia and ventral abdominal ganglia giant cell R2). There is evidence for G protein alpha subunits of the Gq, Gi, and Go families, phospholipase C, and an inositol 1,4,5-trisphosphate receptor in the rhinophore but this may be for chemoreception.

The sea hare genome has recently been sequenced by the Broad Institute. Sizeable assembled contigs are now open to tblastn at the "wgs" division of GenBank (which allows the exon pattern to be extracted). Despite the assembly, sequencing continues: 212,159 new traces were added in the last week of Nov 07. This illustrates the need to always check the primary data repository when a gene seems missing -- millions of traces might not be used in the assembly. However a close-in query is needed to get a match.

I located the first known Aplysia opsin in the 20874 bp contig AASC01108363 on 2 Dec 2007. It had a significant expectation value (e-60) but the best match percent identity within the opsin reference collection (to fellow mollusks) was only 118/319 (36%). Otherwise the best matches are consistently vertebrate melanopsins. This gene is a strong candidate for an invertebrate melanopsin ortholog. It is stored as MOLL_MEL_aplCal.

Indeed, there are four exons but precise boundaries are difficult to locate at this low percent identity without cdna or reliably intronated guide sequence from a closely related species. However 2 introns clearly have identical position and phase to vertebrate melanopsins and a 3rd quite likely; otherwise there has been intron loss in Aplysia. The contig unfortunately does not contain any information (according to blastx) on adjacent genes (synteny) despite 10 kbp still available 3'. A second fragmentary 175 aa opsin can be found on AASC01108363. Its affinities are less clear but it too is rhabdomeric. No counterpart to the Platynereis ciliary opsin could be found.

At this point, Aplysia is not a Rosetta stone for opsin evolution. It is however the first mollusk with a genome assembly. This may eventually allow confident transfer of orthology validated by synteny, intron pattern, and indels. The eyes appear homologous in many aspects to those of Arthropoda, supporting the common ancestor of Protostomia having rhabdomeric lensing eyes, though true across-the-board homology of all eye components is a very complex subject.

Opsin aplysia.png



Lophotrochozoa: Lottia gigantea (limpet) .. 1 opsin

The limpet Lottia gigantea was intended to be the first lophotrochozoan for whole genome sequencing but that goal slipped. It has ancestral-like spiral cleavage and trochophore larva. The genome is small relative to other molluscs at 500 Mbp. Some 5.3 million traces were sequenced by May 2005. In Jan 2007 the sequencing center presented the genome at a meeting talk. However by Dec 2007 no paper had appeared. Recently JGI enabled blast of the assembly and display on their funky browser. However nothing was submitted to GenBank. JGI predicts 4 rhodopsins for its KOG gene collection; however none are recognized by the Opsin Classifier. No transcripts are available, though other molluscs have numerous ests. A German group suggests that the genome sequenced was in fact Lottia scutum.

Under these circumstances, I annotated just one Lottia opsin on 3 Dec 07. Its best match is to other Gq-coupled molluscan opsins. The gene has 3 exons with the two splice positions and phases identical to those of melanopsin (which in vertebrates has numerous other introns). It needs to be established whether these introns are ancestral generic GPCR introns or diagnostic and informative of melanopsins as a gene class. No counterpart to the ciliary opsin of ragworm was immediately apparent.

Lottia is not emerging as a model organism. There are only a handful of studies at PubMed and none on vision. The adult limpet has a pair of eyespots at the base of its cephalic tentacles that likely house a rhabdomeric opsin, perhaps the one annotated here. There may be a second role for paired eyespots in the free-swimming larva for those five days (thoroughly reviewed for chiton trochophores by Arendt and Wittbrodt but not Lottia specifically). Circadian rhythm might involve an additional opsin. The adult is an algal gardener that clears and defends intertidal areas -- raiding limpets are sensed (visually?) and driven off. The opsin sequence found here, stored as MOLL_MEL_lotGig, suggests rapid divergence rather than living fossil character. However patellogastropods such as Lottia with symmetrical non-coiled, conical shells are sometimes taken as ancestral form.


Opsin lottia.png


Lophotrochozoa: Schmidtea mediterranea (planaria) .. 1 opsin

The common planaria Schmidtea mediterranea has an 865 Mb genome very recently assembled from 17 million traces to 10x and placed in the wgs division of GenBank, after an initial impasse attributable to high AT (69%), repeat content (46%) and high clonal heterozygosity. The genome project is described in a [http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/PlanarianSEQ.pdf white paper] and has a dedicated site SmedDb. It has a strong EST collection as well.

The planarian central nervous system consists of a bilobed brain and two longitudinal ventral nerve tracts connected by commissural neurons. When planarians are decapitated they can completely regenerate a new brain, including new eyes, a boon to opsin research. The structure of the eye had already been described by 1915. Regeneration of the nervous system is a very active research area.

I began with various fragmentary opsins and ESTs and recovered a nearly complete melanopsin (including all introns) from trace archives. It is stored at the Opsin Classifier as RHAB_schMed and discussed in the Schistosoma section as a likely ortholog. Since the site of expression is known from hybridization and no other Schmidtea opsins are apparent, this is likely the principal photoreceptor both here and in Schistosoma. No counterpart to the Platynereis ciliary opsins can be found in the current assembly, indicating (since they could hardly have been invented in Platynereis) their loss in Platyhelminthes is a derived condition.


Opsin planaria.png

Lophotrochozoa: Schistosoma mansoni (trematode) .. 2 opsins

The blood fluke Schistosoma mansoni is a major agent of schistosomiasis (bilharziasis), infecting more than 200 million people worldwide, with the fresh water snail (Biomphalaria glabrata -- a large EST project) as intermediate host. As an endoparasite residing deep inside lungs, hepatoportal circulation, and mesenteric veins, it would not seem a promising species for eyespots or even circadian rhythm opsins. However at least two life stages are affected by light: the hatching of the miracidium from the egg and emergence of cercaria from the snail. These swim upwards to the surface of the water and are also affected by shadows and turbulence.

GPCR proteins are the target of approximately half of all pharmaceuticals. For that reason, a Schistosoma opsin came to be studied. That gene is expressed in the miracidia and cercaria stages but down-regulated in the adult. Expression is localized to sub-tegumental structures at the front end of cercariae. Full text of the 2001 article remains locked behind a sick commercial firewall, as does a 1975 electron microscopy study of photoreceptor lamellae seen as extensions of modified cilia.

Version 4.0 of the genome is readily available for blast though it is missing from GenBank, as are two million of the 3.8 million total traces (7x), despite NIAID funding. It's unclear whether the extensive EST set of 31000 assembled sequences is available there. The Schistosoma genome is approximately 270 Mb with low GC content (34%), moderate retroposon levels, and an estimated 15-20,000 coding genes.

I determined the intron structure of the published opsin gene (called MEL1_schMan in the opsin classifier) which classifies with melanopsins. Using this as probe, a second full length paralogous opsin MEL2_schMan was annotatable. While percent identity was only 46%, the intron structure and alignment classification were identical. Possibly this second gene has a role in the miracidium, though the first gene is expressed in both stages, more compatibly with "two color non-imaging" eyes.

The first opsin is more closely related in sequence to the sole known opsin in Schmidtea, RHAB_schMed, where it possibly plays a homologous role. As queries, these proteins turn up closest matches at GenBank EST in other platyhelminthes. These observations do not support the notion of horizontal gene transfer of opsins from the host snail, another Lophotrochozoan which by itself might favor sequence clustering. It would be feasible to explore synteny in both platyhelminthes.

I investigated conservation of intron position and phase using the reliably intronated match with either MEL1_gasAcu of stickleback minnow (or equally MEL1a_braFlo of amphioxus). Here the percent identity is fairly low (39%) but enough patches of good matching suffice to reliably anchor the alignment. There is perfect agreement of the first three intron positions and phases, below.

This is strong evidence for a very deep connection, namely vertical descent of these genes from a common ancestor (ie, orthology), because these introns are highly specific to melanopsin within the opsin superfamily, ie are not generic GPCR introns as seen from the total mismatch to Ixodes, Apis, and vertebrate ciliary opsins. These same introns are predicted for opsins from transcript species such as LOPH_RHO_plaDum (Platynereis dumerilii) and MOLL_MEL_patYes (scallop). It remains to be demonstrated that all these melanopsins play a conserved consistent homologous role.


Opsin loph mel introns.png
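The comparison of intron position and phase can be expressed as a simple set operation once each intron is recorded as a (column in the anchored alignment, phase) pair. The sketch below is illustrative only: every coordinate and phase in it is hypothetical, chosen to mimic the situation described (three shared introns with a melanopsin, none with a ciliary opsin), and it assumes the alignment is reliable enough that identical columns really are identical positions.

```python
def shared_introns(a, b):
    """Introns present in both genes at the same alignment column and phase."""
    return sorted(set(a) & set(b))


def position_only_matches(a, b):
    """Same column but different phase; listed separately, not counted as conserved."""
    phases_b = dict(b)
    return sorted((pos, phase, phases_b[pos])
                  for pos, phase in a
                  if pos in phases_b and phases_b[pos] != phase)


# Hypothetical (alignment column, phase) lists for three genes.
flatworm_mel = [(112, 2), (187, 0), (243, 0)]
vertebrate_mel = [(112, 2), (187, 0), (243, 0), (301, 1), (355, 2)]
ciliary_opsin = [(95, 1), (160, 2), (243, 1), (298, 0)]

print(shared_introns(flatworm_mel, vertebrate_mel))
print(shared_introns(flatworm_mel, ciliary_opsin))
print(position_only_matches(flatworm_mel, ciliary_opsin))
```

Requiring agreement in both column and phase is the stricter test. An intron at the same column but a different phase is reported on its own so that it is not mistaken for a conserved ancestral intron.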


Panarthropoda: Hypsibius dujardini (water bear) tardigrade 0 opsins

A 5x genome project for Hypsibius dujardini, a member of a phylum of microscopic ecdysozoans, was approved in July 2007 but Broad has not yet begun trace reads on the small 70 Mbp genome (suggesting densely spaced genes with small introns as this is not likely highly derived). It could prove very useful for opsins as tardigrades are basal to all of Arthropoda and so shed light on that last common ancestor. In fact with accompanying centipede, horseshoe crab, amphipod, and priapulid genomes, the whole ecdysozoan ancestor will be accessible.

The only known fossil specimens are found in Siberian mid-Cambrian deposits and much later amber. The older fossils have three pairs of legs rather than four, a simplified head morphology, and no posterior head appendages and probably represent a stem group of extant tardigrades. Aysheaia from the Burgess Shale might be related to tardigrades.

Nothing is currently known about photoreception or opsins in tardigrades -- or even if they have eyes. However it looks like we can expect some rhabdomeric opsins at the minimum in front of these pigment cups. However the current GSS and EST collections (about 6000 sequences) do not contain any convincing matches using various rhabdomeric and ciliary opsins as tblastn queries. Tardigrade photos and movies provided by Goldstein Lab.

Tardi.png



Chelicerata: Ixodes scapularis (tick) 1 opsin

The genome project was completed long ago but has experienced a multi-year bottleneck in assembly release and publication. However contigs built from 19.4 million traces should become available to tblastn of the GenBank "wgs" division by late 2007. Ixodes has a very conservative genome (regrettably 2.1 Gbp in size), seemingly far less derived than drosophilids in matters such as intron and gene retention and protein sequence conservation. This, in conjunction with the helpful phylogenetic position of chelicerate outgroup to the many insect genomes, has improved prospects for reconstructing the ancestral opsin repertoire of Arthropoda and eventually Protostomia and UrBilatera.

A large collection of annotated Ixodes ESTs is available at the DFCI Gene Index of which 3 are marked up (2 wrongly) as opsins. Using the Opsin Classifier, I recovered the full length gene for the first of these, TC19272, on 24 Nov 07, intronated the transcript at the Trace Archives (4 introns, superb coverage), and added it to the classifier fasta collection as RHAB1_ixoSca. It classifies with rhabdomeric opsins (ie with deuterostome melanopsins) with a very respectable 57% maximal percent protein identity. The second and third introns have classical ancestral position (following GWSR and LAK) and phase (2 and 0). Synteny awaits assembly of large contigs -- adjacent exons are not spanned by single traces. An apparent ciliary opsin fragment in Ixodes was located using that of Platynereis dumerilii as probe; it is stored as CILI_ixoSca but needs further analysis.

Crustacea: Daphnia pulex (water flea) .. 37 opsins

An 8.7x genome assembly was released in July 2007 at JGI with further support at wFleaBase. This crustacean provides a potentially important outgroup to insects (together forming Pancrustacea). However the opsin story, summarized in a meeting abstract, is an embarrassment of riches, not conducive to deducing ancestral arthropod genome content. The total number of opsin genes came in at 37, comprising 22 rhabdomeric opsins (mostly long wavelength), 7 ciliary opsins (pteropsins), and 8 in a novel family without close affiliates. This seems excessive but Daphnia has ommatidia (compound eyes), circadian rhythms, and a need to assess water turbidity and depth. Planned in situ hybridization studies may illuminate biological roles of these opsins. The pteropsins are probably of most interest here.

Gene models have not been submitted yet to GenBank but are likely extractable by text query at wFleaBase. What is needed here however is not the clutter of 37 sequences but their collapse into UV, blue, long, pteropsin, and novel ancestral representatives. This would remove 'noise' from lineage-specific expansions. The intron structure could provide very important support to classification schemes.

The expansions may have arisen through retroprocessing (rather than segmental duplication) of a few master exonic genes, which would then be the orthologs to other arthropod opsins. Indeed the intronation pattern -- typically far more deeply conserved than protein sequence -- could link pteropsins more convincingly to lophotrochozoan and deuterostome opsins than alignments with percent identities in the 20's.

This blast twilight zone is especially dangerous for photoreceptor opsins because they are embedded in a much larger gene family of generic rhodopsins and GPCRs which share many structural and signaling properties. A slowly evolving generic rhodopsin might well score higher than fast evolving photoreceptor opsins. Gene expansions are noted for markedly enhanced rates as copies neo- or subfunctionalize. The generic rhodopsin might also share diagnostic residues through convergence, at least at the level of statistical significance ambiguity. Consequently intron location/phase and synteny can provide important backup.

The synteny circle surviving at this phylogenetic depth will be local (optimistically Pancrustacean). That is, the blue opsin of Daphnia might be in synteny with Drosophila (ie establish orthology) but not with Platynereis ciliary opsin much less any vertebrate opsin (eg encephalopsin). This could be remedied to some extent by ancestral gene order reconstruction. The degree to which synteny can contribute to validating orthology relations within opsins is not currently known.

Cnidaria: Carybdea marsupialis (jellyfish) cnidarian .. 0 opsins

Cnidarians are the earliest diverging invertebrates with multicellular light-detecting organs, called ocelli. Photodetectors include simple eyespots, pigment cups, complex pigment cups with lenses, and camera-type eyes with a cornea, lens, and retina. These remarkable eyes are located on sensory clubs called rhopalia, with four lining the bell. Each houses six eyes: a pair of pit ocelli, a pair of slit ocelli, and two unpaired lens eyes with counterparts to cornea, cellular lens and retina of ciliated photoreceptors. Anatomically, ocelli have bipolar sensory photoreceptor cells interspersed among nonsensory pigment cells with the apical end making the light-receptor and the basal end forming an axon that synapses with second-order neurons to form what amounts to ocular nerves.

The spectral sensitivity of neritic (near-shore) lens eyes of a box jellyfish, Tripedalia cystophora, recently considered by M Coates et al, was interpreted as a single vitamin A-1 based opsin with peak sensitivity near 500 nm (blue-green). However nothing was sequenced. This species was most helpfully reviewed by Piatigorsky and Kozmik who note Eakin already commented on seemingly ciliary photoreceptors in 1962. However, 45 years later we still don't know if opsins in cnidarians would classify with vertebrate ciliary opsins. They might even share conserved intron positions.

Opsin cnid larva.png

Furthermore, as noted by Nordstrom et al, planula larvae of Tripedalia have a series of single-cell pigment cup rhabdomeric-like photoreceptors directly connected to motor cilia. These lack neural connections in line with Gehring's notion of the eye preceding the brain in evolution, rather than being a later add-on. So cnidaria might actually retain descendants of both types of ancestral opsins.


Opsins cubomedusae.png

The most striking jellyfish from the perspective of a complex set of eyes is Carybdea marsupialis, as reviewed by VJ Martin. Antibody studies based on vertebrate cone/rod opsins are doubtful because of cross reactivity to generic GPCR proteins; again no opsins have been sequenced yet. This would make a great genome to study provided the retroposon and base composition are not unwieldy. Nematostella and Hydra, whatever their other genomic merits, sit in the Anthozoa and Hydrozoa respectively, types of cnidarian lacking elaborate visual systems.

Vision has roles in the reproduction and feeding of cubomedusae which can find each other and chase, catch, and eat teleost fish. A patch of Pelagia noctiluca 10 square miles in extent and 35 feet deep recently destroyed a salmon farm off Northern Ireland.


Cnidaria: Nematostella (starlet sea anemone) .. ? opsins

The Nematostella genome has been released along with major papers and an upgrade to Stellabase. Not all 6.1 million traces were used up by the assembly, so any gene missing from the assembly should be sought directly in the trace archives.

The sea anemone, an anthozoan within Cnidaria having epithelial cells, neurons, stem cells, complex extra-cellular matrix, muscle fibers, and symmetry axis, is emerging as a high-profile evo-devo model species to elucidate the emergence and deployment of genes that determine animal body plans. However those plans don't seem to include eyes or overt photoreceptor structures such as pigment cells -- for that cubomedusae would be far better. PAX6 and RX are especially relevant to photoreceptor structures; their expression has been thoroughly studied in Nematostella without uncovering any sensory system though they contribute to patterning specific components of the ectodermal nerve net.

Four putative opsins have been proposed by the Oakley lab. Accessions of the supporting gene models are given in the JGI protein ID system (non-GenBank) as Nematostella1 219988, Nematostella2 85309, Nematostella3 130042, and Nematostella4 108738; alternatively, the fragments shown in the alignment graphic allow recovery of the respective cdnas by tblastn of GenBank WGS. As noted in the Hydra section, multiple lines of evidence are necessary to establish the first bona fide opsins in cnidarians.

There appear to be some Nematostella opsin-like cdnas at GenBank that cannot be found in the genome assembly or trace archives. While genes can be missing from first assemblies, it is bizarre for 4 to be missing considering coverage is 6x. Upon back-blast to GenBank nr or the Opsin Classifier, very strong matches are seen consistently within Crustacea. Thus it appears that these are contaminants from another species, possibly a brine shrimp widely used in aquarium food. It is not unusual to see transcript projects (at issue here) and genome projects contaminated with dna from other species such as commensals, parasites, and food sources -- this is reminiscent of Xenoturbella being confused with a mollusc in its diet.
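The contamination reasoning above combines two signals: the transcript is absent from the genome data, and its best back-blast hit is a very strong match in an unexpected clade. A minimal sketch of that rule follows. The record fields, the 0.8 identity threshold, and the toy records are all hypothetical and chosen for illustration; they are not real search results.

```python
def flag_possible_contaminants(transcripts, expected_clade="Cnidaria",
                               min_identity=0.8):
    """Return IDs of transcripts that look like cross-species contaminants.

    A transcript is flagged when it is absent from the genome assembly and
    traces AND its best back-blast hit is a strong match (identity at or
    above min_identity, an assumed threshold) outside the expected clade.
    """
    flagged = []
    for t in transcripts:
        strong_foreign_hit = (t["best_hit_clade"] != expected_clade
                              and t["best_hit_identity"] >= min_identity)
        if not t["found_in_genome"] and strong_foreign_hit:
            flagged.append(t["id"])
    return flagged


# Hypothetical toy records, NOT real back-blast output.
toy_transcripts = [
    {"id": "cdna_A", "found_in_genome": False,
     "best_hit_clade": "Crustacea", "best_hit_identity": 0.92},
    {"id": "cdna_B", "found_in_genome": True,
     "best_hit_clade": "Crustacea", "best_hit_identity": 0.35},
    {"id": "cdna_C", "found_in_genome": False,
     "best_hit_clade": "Crustacea", "best_hit_identity": 0.30},
]

flagged_ids = flag_possible_contaminants(toy_transcripts)
print(flagged_ids)
```

Requiring both signals keeps a genuinely diverged cnidarian gene with only weak foreign matches from being flagged merely because it fell into an assembly gap.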

A third group has taken a serious look at photoreception in Nematostella. No paper or dissertation has emerged as yet; no cnidarian opsins have been posted to GenBank.

Evolution of photoreception: the eyeless anthozoan 
Nematostella vectensis  as a model
Poster talk March 22-23, 2007
Heather Q. Marlow,  Daniel I. Speiser, David Q. Matus and Mark Q. Martindale (Email: marlow@hawaii.edu)
 
"Eyes have evolved numerous times within the animals, yet there has been surprising convergence in
the morphology, function and molecular basis of development in these structures. Although these diverse
eye types have arisen independently, many taxa utilize similar cassettes of genes to specify them. These
developmental genes include members of the SIX class of homeodomain proteins (sine oculis and optix), eyes
absent, dachshund and famously, the Pax genes (Pax6). Additionally, all animals in which photoreception has
been investigated use the opsin family, a class of seven transmembrane receptors, to detect light. Cnidarians
are an early branching lineage that are likely to have diverged from the rest of the animals before the
evolution of discrete eye structures.

The ancestral cnidarian did not possess eyes, however like the extant anthozoan cnidarians (sea anemones, 
corals, and sea pens), it was likely to have had photoreceptive cells. In order
to determine the level at which cnidarian photoreceptive cells may share homology with bilaterian eyes, we
have examined the expression of these “eye” genes during development in the anthozoan cnidarian model
Nematostella vectensis through in situ hybridization. 

We have also identified, cloned, and studied the expression of many members of the visual opsin class of receptors in N. vectensis.
 Our data indicate that N. vectensis possesses putative photoreceptive cells which express several orthologs to the visual opsins, 
that the organization of photoreceptor cells differs between different life history stages of the animal, and that presumptive
photoreceptor cells express many of the same developmental molecules that specify eye development in
bilaterian animals. These findings support the hypothesis that eyes may share homology only at the level of
the photoreceptor, and that additional “eye” genes may have been co-opted into the eye specification
pathway from more general neural roles in bilaterians." 

Cnidaria: Hydra magnipapillata (hydra) .. ? opsins

Because opsin photoreception is quite ancient, pre-bilaterians clearly have a major role to play in illuminating the origins of photoreception systems. What is less clear is whether the two cnidarians chosen so far for genome projects are optimal in this regard.

Hydra does not have overt photoreceptive structures or cells obviously specialized for light detection, yet it exhibits marked behavioral photosensitivity (noted by Trembley in 1744). Studies beginning in 2000 flagged the ectoderm (using antibodies to squid rhodopsin), known to contain epidermal sensory neurons, as responsible for extraocular photoreception. Musio and coworkers sought to recover opsins using degenerate primers, targeting melanopsin as the most plausible in Hydra because that opsin seems not to require advanced relationships with neighboring cells (i.e., it acts as photosensor and its own photoisomerase), but no opsin sequences have been submitted to GenBank to date.

The Hydra cdna CB073527 was proposed as a peropsin based on best-blast to mouse peropsin. However, testing it against a much larger collection of demonstrably orthologous chordate peropsins in the Opsin Classifier conflicts with this interpretation: the putative cnidarian gene needs to associate consistently with this gene family (equivalently, have its best match to it among all reconstructed ancestral opsins) but does not. Furthermore, the best match is very weak at 31%. These are exactly the signatures of generic non-opsin rhodopsin superfamily members (which we expect any eumetazoan to have by the hundreds).

With the imminent availability of the Hydra genome assembly (or just using the trace archive), the 161 amino acids of the fragmentary transcript can be extended for example with trace 1121878952 to apparent full length (309 aa) and its introns determined (none). This does not improve its best-blast score nor family coherence. It does not cluster consistently with ciliary opsins -- what would the signaling partner be when the matches are scattered between Gt, Go, and Gq opsins? The blast probability of 1.6e-34 does not mean much under these circumstances.
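The family-coherence argument can be made concrete: a genuine member of a gene family should have its top hits concentrated in that family rather than scattered across several. A minimal sketch follows, assuming hits are available as (family label, bit score) pairs. The function name, the `top_n` cutoff, and both toy hit lists are hypothetical illustrations, not Opsin Classifier output.

```python
from collections import Counter


def family_consistency(hits, top_n=10):
    """Return (most common family, fraction of the top_n hits in that family).

    hits: list of (family_label, bit_score) tuples in any order.
    A fraction near 1.0 means the candidate associates consistently with one
    family; a low fraction means its matches are scattered.
    """
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)[:top_n]
    if not ranked:
        return None, 0.0
    counts = Counter(family for family, _ in ranked)
    family, n = counts.most_common(1)[0]
    return family, n / len(ranked)


# Hypothetical toy hit lists, NOT real search results.
coherent_hits = [("peropsin", 210), ("peropsin", 205),
                 ("peropsin", 198), ("RGR", 120)]
scattered_hits = [("peropsin", 95), ("Gq opsin", 94),
                  ("Go opsin", 93), ("Gt opsin", 92)]

print(family_consistency(coherent_hits))
print(family_consistency(scattered_hits))
```

In the scattered case the nominal "best" family is only an artifact of tie order, which is the situation described above where a small e-value does not mean much.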

Opsin hydra doubtful.png

Two putative hydra opsin fragments can be extracted from Fig.1 of an Oct 2007 paper, AKSSTIINPTISCIIYKE and AKLSAVLNALVNCYINKS. These too fail to extend to fully convincing opsins. Expression centers around the hydropore. Note though that this fits GPCR chemoreceptor localization even better -- there is no behavioral evidence for photoreception near the hydropore and no possibility of an imaging eye. No ultrastructure study was conducted -- that is needed to demonstrate that the putative opsin is expressed in specialized photoreceptor cells. No non-opsin controls were included in the alignment -- all GPCR proteins bind heterotrimeric G proteins and many have a lysine without binding retinal. The cdna accession numbers were inadvertently omitted from the supplemental material.

These papers highlight the special difficulties in working with cnidarian opsin candidates. We know from the outset that they will be quite diverged. Multiple forms of supporting data are needed, preferably in the form of diagnostic introns, alignments demonstrating conservation of critical residues and structures, in situ hybridization to anatomically plausible neuronal photoreceptors, and specific loss of photobehavior upon knockdown.

A higher standard of proof is needed for the first cnidarian opsins because validated ones will surely be used to pull in further homologs via annotation transfer. There is a definite risk in admitting inadequately documented opsins to the Opsin Classifier because once that database is tainted, it could draw in even more non-opsins from the GPCR world.

CnidBase provides a blast service to cnidarians including hydra, but this appears restricted to ESTs and so only duplicates GenBank. GenBank does not carry any contigs or genome assembly as of 12 Dec 07. Some 10.2 million Hydra traces have been provided by JCVI, ample for the 1290 Mbp estimated genome size. However, the hydra genome project is no longer mentioned on that website. The draft genome expected in Dec 2005 has not surfaced, possibly because high AT content complicated assembly.
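Whether 10.2 million traces are "ample" can be checked roughly with the usual depth-of-coverage estimate: reads times mean read length, divided by genome size. The sketch below takes the trace count and genome size from the text; the 800 bp mean usable read length is an assumption made for illustration and is not stated in the source.

```python
def expected_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Expected average depth of coverage: total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


HYDRA_TRACES = 10.2e6        # from the text
HYDRA_GENOME_BP = 1290e6     # from the text (1290 Mbp estimate)
ASSUMED_READ_LEN_BP = 800    # ASSUMPTION: typical usable Sanger read length

coverage = expected_coverage(HYDRA_TRACES, ASSUMED_READ_LEN_BP, HYDRA_GENOME_BP)
print(f"approximate coverage: {coverage:.1f}x")
```

Under that assumption the depth comes out a little over 6x; a shorter usable read length would lower it proportionally.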

Because trace archive searches require a good query, the best current opsin search strategy uses tblastn of the excellent EST collection and then extends, with trace reads, those results that score well in the Opsin Classifier. Using that method, I recovered an intronless melanopsin candidate that could be further characterized bioinformatically but, in the absence of experimental support, cannot be proven to be involved in hydra photoreception.

>MEL?_hydMag Hydra magnipapillata CV465424
MAGNDTLEKFSKEIIIIKSLYLVICIILGLIGNLVVLITILKYRKLQTITNYFVLNLSITDLLFIICCMPTIIITTIGEKWLLGNAVCNIIGFLNVLLCTNSIWNLVMISINRYLNVAKPKKIKEIYTRKKTILMI 
ISVWIVSGLVSVPPLLNWSSYKPGPNFCTVDRKGAKSFYLLILLIMYILPLLILVSLYSCIFFILKKKGKKILLKCNINYIEHSDCKGNASIYKNGLLSYQAG
NKQISNKKIMINYFITKNNDTYKVSVKKAQNHNKKLQKCVKLYKQYQITKRLMVLVLSFF
LFWTPFFIGSFLITYGVKNKKNFHFTTFGVMCGCLSSISNPFIYSMNSSFRNHLRKLSRNFFNEKNY*
 
>Per?_hydMag Hydra magnipapillata cdna CB073527
MAFVFIIVFLSFLCGFSVILNVTVVLTILAKGNTKNTRDVILMSLAICDGVQCTIGY
PVELFGYANYKNPSLSEKFCKPSGFIVMYLALTAIAHLVCLCIYRYLTIVYPLKLQIFLT
KSNWSACGCIAFCWIYGLFWSLSPLLGWNEIVRENKDTYKCSINLYPDNEIKSSYLYALA
IFCYLIPLIIIIYCSLKVHSELRNMLKMCKQISGVEANITKVTYRIEKQDFISVSFIIAS
FFTVWTPYAVCVFYLTIGKKLPPSFLTYCALFAKSSTILNPIIYCLMYKKFRQTLQSKFG
KLFNNPTVTPAV* 0
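The records above wrap at irregular widths and end in a stop symbol, so sequence lengths are easiest to check programmatically. A minimal sketch follows, assuming the records are pasted in as plain text; the function name and the toy record are illustrative only.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines.

    Only alphabetic characters are kept from sequence lines, so a trailing
    stop symbol '*', stray spaces, or annotation digits are dropped.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += "".join(ch for ch in line if ch.isalpha())
    return records


# Made-up toy record, used only to show the wrapping and stop-symbol handling.
toy_fasta = """>toy_record (made-up example)
MAGNDT LEK
FSKE* 0
"""

parsed = read_fasta(toy_fasta)
for name, seq in parsed.items():
    print(name, len(seq))
```

Applied to the Per?_hydMag record, the resulting length can be compared with the 309 aa quoted above for the extended CB073527 transcript.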

Porifera: Amphimedon queenslandica (sponge) and earlier metazoa .. 0 opsins

Sponges lie at the base of multicellular animals. They are not noted for eyes. However, demosponge larvae do exhibit phototaxis (shadow seeking under coral rubble), but the action spectrum is reportedly a better fit to a flavin or carotenoid chromophore. The genome of Amphimedon queenslandica has been available for years at the Trace Archives but never assembled. The species was formerly called Reniera spp. and it is still carried under that name at JGI Genome. Because no assembly exists, tblastn of contigs is not available without do-it-yourself assembly.

The Oakley group reported searching for sponge opsins but finding only "non-opsin, rhodopsin-class GPCR genes" from Amphimedon. Similarly, no opsins were located in the placozoan Trichoplax, the choanoflagellate Monosiga, or fungal genomes. This fits a picture of photoreceptor opsins first appearing after the divergence of sponges, in eumetazoans such as cnidarians. However, these were not de novo genes but rather evolved out of the already-rich cauldron of GPCR gene copies.

Some later diverging species such as the model organism C. elegans lost all of their opsin genes, making them useless in reconstructing the urbilaterian ancestor. This argues for much more intensive genomic sampling so as to sidestep the widespread problem of gene loss in model organisms.

Coming real soon!

Cephalochordata: Branchiostoma (amphioxus) .. 7 opsins