SECIS binding proteins: KIAA0256 and SBP2: Difference between revisions

From genomewiki
(New page: == KIAA0256: mis-annotated and forgotten ancestral SECIS binding protein == KIAA0256 originally arose in a GenBank submission package from a large-scale mRNA sequencing project at the Kal...)
 
 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
== KIAA0256: mis-annotated and forgotten ancestral SECIS binding protein ==
== KIAA0256: mis-annotated and forgotten ancestral SECIS binding protein ==


KIAA0256 originally arose in a GenBank submission package from a large-scale mRNA sequencing project at the Kaluza Institute. While high quality, it skips over a highly conserved exon 8. This does not alter downstream reading frame since the skipped exon resides in a series of consecutive phase 00 splices. NCBI staff then confounded the record by posting experimentally unsupported gene predictions -- all exon-skipping -- from various genome assemblies lacking significant transcript programs '''mis-labelling them as mRNAs''', thus entrenching the incomplete variant as normal form.
KIAA0256 originally arose in a GenBank submission package from a large-scale mRNA sequencing project at the Kaluza Institute. While high quality, it skips over a highly conserved exon 8, VGF...YFE. This does not alter downstream reading frame since the skipped exon resides in a series of consecutive phase 00 splices. NCBI staff then confounded the record by posting experimentally unsupported gene predictions -- all exon-skipping -- from various genome assemblies lacking significant transcript programs '''mis-labelling them as mRNAs''', thus entrenching the incomplete variant as normal form.
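The frame argument can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis. It assumes that in the exon listings on this page the leading and trailing digits on each line record the split-codon overhang at the start and end of the exon, so that two exons can be joined in frame when the trailing digit of the first and the leading digit of the second sum to a multiple of 3 (0 with 0, 1 with 2, 2 with 1). The function names are hypothetical.

```python
def parse_exon(line):
    """Split a line such as '0 VGF...YFE 0' into (start_phase, peptide, end_phase)."""
    parts = line.split()
    return int(parts[0]), parts[1], int(parts[-1])


def skip_preserves_frame(exons, i):
    """True if dropping exon i still joins its neighbours in frame.

    Assumption: neighbours are compatible when the end phase of the
    upstream exon plus the start phase of the downstream exon is 0 mod 3.
    """
    if i <= 0 or i >= len(exons) - 1:
        raise ValueError("exon to skip must have both neighbours")
    upstream_end = exons[i - 1][2]
    downstream_start = exons[i + 1][0]
    return (upstream_end + downstream_start) % 3 == 0


# Human KIAA0256 exons 5-9 as listed on this page.
lines = [
    "2 DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSGTMN 0",
    "0 HVESSMCA 1",
    "2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0",
    "0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0",
    "0 DEDGFQELNENGNAKDENIQQKLSSKV 0",
]
exons = [parse_exon(line) for line in lines]

for i in range(1, len(exons) - 1):
    peptide = exons[i][1]
    print(peptide[:3] + "..." + peptide[-3:], skip_preserves_frame(exons, i))
```

Under these assumptions only the VGF...YFE exon can be dropped without shifting the frame; skipping either of the two exons before it would not be frame-neutral.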


Two [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=86695 subsequent papers] featuring the [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=10637234 twice-published Figure 1B] aligned full length SBP2 to an intermediate region of the exon-skipping gene model of KIAA0256 that by coincidence began with a methionine, namely residues 422-849 of the 1101 residue protein which did include the motif-bearing residues 632-829 of exons 14-16. Making matters worse, KIAA0256 contains an additional immensely conserved exon 11 that lacks a detectable counterpart in SBP2. SwissProt provides the proper full length protein Q93073 without a supporting accession.
Two [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=86695 subsequent papers] featuring the [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=10637234 twice-published Figure 1B] aligned full length SBP2 to an intermediate region of the exon-skipping gene model of KIAA0256 that by coincidence began with a methionine, namely residues 422-849 of the 1101 residue protein (which did include the motif-bearing residues 632-829 of exons 14-16). SwissProt provides the proper full length protein Q93073 without a supporting accession.


As a result of these early errors, early SECIS binding experiments used KIAA0256 protein lacking the conserved exon 8. This variant lacked relevant SECIS binding properties, which led to abandonment of further experimentation. Consequently we know nothing about the SECIS binding properties of full length KIAA0256 protein.
SBP2 exhibits a fusion of two exons (2 ELSWTPMGYVVRQTLSTEL 00 SAAPKNVT...) relative to KIAA0256 within an indel-rich area of the protein. The intronation of KIAA0256 exhibits the ancestral form. The fusion occurred prior to teleost fish divergence but is hard to date beyond that. After consideration of anchoring patches of semi-conserved residues, the alignment of human paralogs in this region is:
 
  SBP2 exons 5-8 showing <span style="color: #990099;">fusion</span>                            KIAA0256 exons 5-9
2 ELSWTPMGYVVRQTLS<span style="color: #990099;">TEL 0</span>                                  2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
<span style="color: #990099;">0 SAA</span>PKNVTSMINLKTIASSADPKNVSIPSSEALSSDPSYNKEKHIIHPTQK 0  0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0
0 SKASQGSDLEQNEASRKNKKKKEKSTSKYEVLTVQEPPRIE 0            0 DEDGFQELNENGNAKDENIQQKLSSKV 0                 
0 DAEEFPNLAVASERRDRIETPKFQSKQQPQ 0                      0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 DNFKNNVKKSQLPVQLDLGGMLTALEKKQHSQHAKQSSKPVVVS 1        0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
 
As a result of these early errors, early SECIS binding experiments used KIAA0256 protein lacking the immensely conserved exon 8. This variant unsurprisingly lacked relevant SECIS binding properties, which led to abandonment of further experimentation. Consequently we know nothing about the SECIS binding properties of full length KIAA0256 protein.


There is no reason to believe an odd fragment studied on a small subset of SECIS elements could accurately reflect binding properties of full length protein in regards to all 25 orthology classes of human SECIS elements. However these results remain <span style="color: #990099;">accepted folklore</span> within the selenocysteine research community even today.
There is no reason to believe an odd fragment studied on a small subset of SECIS elements could accurately reflect binding properties of full length protein in regards to all 25 orthology classes of human SECIS elements. However these results remain <span style="color: #990099;">accepted folklore</span> within the selenocysteine research community even today.
Line 15: Line 24:
Transcriptional processing in mammals is error-rich, producing numerous defective mRNA variants that never amount to useful regulation or stable protein -- downstream quality controls quickly eliminate them. While exon-skipping in some cases may have adaptive significance, without significant comparative genomics support, the default hypothesis is they do not. Here it is difficult to distinguish between a weakened exon 8 splice acceptor in apes leading to a fraction of defective transcripts versus an innovative functional truncated form. These alternatives might be resolvable from 3D structural considerations -- deleting 45 conserved residues in an ancient exon is highly problematic.
Transcriptional processing in mammals is error-rich, producing numerous defective mRNA variants that never amount to useful regulation or stable protein -- downstream quality controls quickly eliminate them. While exon-skipping in some cases may have adaptive significance, without significant comparative genomics support, the default hypothesis is they do not. Here it is difficult to distinguish between a weakened exon 8 splice acceptor in apes leading to a fraction of defective transcripts versus an innovative functional truncated form. These alternatives might be resolvable from 3D structural considerations -- deleting 45 conserved residues in an ancient exon is highly problematic.


However within this same protein, the apparent ancestral loss of exon 11 in SBP2 provides proof that exon-skipping can be evolutionarily fixed. That extraordinary event (which occurred prior to chondrichthyes divergence: ES788504) certainly differentiated SBP2 from KIAA0256. Exon 11 is in the 99th percentile of vertebrate exon conservation, being 100% conserved back to amniotes (though divergence is high in early deuterostomes, below). It did not arise as exon gain from retroposon co-optation. This surely implies a critical role in KIAA0256 lost in SBP2.
KIAA0256 and SECISBP2 align only moderately well, but they do so over their [http://genome-test.cse.ucsc.edu/cgi-bin/hgNear?near_search=uc004aqj.1&hgsid=1521383&near.do.affineAli=uc001zxd.1 entire lengths] and share 17 exactly comparable exons (trillion:1 odds for coincidence), meaning they reflect a segmental gene duplication (which can be dated to post-amphioxus, pre-chondrichthyes). It is imperative to enforce exon boundaries to achieve true homological alignment of two proteins this diverged and so gappy N-terminally; structure-based alignment has different rules (allowing convergent evolution) and different goals.


KIAA0256 and SECISBP2 actually align moderately well over their [http://genome-test.cse.ucsc.edu/cgi-bin/hgNear?near_search=uc004aqj.1&hgsid=1521383&near.do.affineAli=uc001zxd.1 entire lengths] and have 17  perfectly comparable exons (trillion:1 odds for coincidence), meaning they reflect a segmental gene duplication. It is imperative to enforce exon boundaries to achieve true homological alignment of two proteins this diverged and so gappy N-terminally; structure-based alignment has different rules (allowing convergent evolution) and  different goals.
The teleost fish Pimephales promelas has sufficient transcript coverage to allow recovery of an accurate full length KIAA0256 gene with a respectable 62% identity to human. No fish has sufficient transcripts to recover full length SBP2 as of Dec 08. Some initial exons are quite well conserved over this billion years of branch length, strong evidence that they retain an unknown function under strong selection. However the gaps in other early exons are incompatible with retention of tertiary protein structure. No early pfam domain can be found.


The teleost fish Pimephales promelas has sufficient transcript coverage to allow recovery of an accurate nearly full length KIAA0256 gene with a respectable 62% identity to human. (No fish has sufficient transcripts to recover full length SBP2.) That gene is shown below as homologically gapped exon-by-exon to human. Some early exons are quite well conserved over many billions of years of branch length, strong evidence that they retain an unknown function under strong selection. However the gaps in other early exons are incompatible with retention of tertiary protein structure. No early pfam domain can be found.
We have to wonder how sea urchin, which has a full length apparent ortholog of KIAA0256 on Scaffold18963 but nothing clustering to SBP2, can insert selenocysteine into its numerous selenoproteins (SEPHS1, SELU1, SELU2, SELM, SELO, SELW, SELN1, GPX3, GPX2, GPX4, GPX7,...). Unless a second copy has been lost, all SECIS interaction at the ribosome at sea urchin divergence appears to have been handled by ancestral KIAA0256.


We have to wonder how sea urchin, which has a full length apparent ortholog of KIAA0256 on Scaffold18963:101,648-115,302 but nothing clustering to SECISBP2, can insert selenocysteine into its numerous selenoproteins (SEPHS1, SELU1, SELU2, SELM, SELO, SELW, SELN1, GPX3, GPX2, GPX4, GPX7,...). Unless a second copy has been lost, all SECIS interaction at the ribosome at sea urchin divergence appears to have been handled by KIAA0256.
The same can be said for amphioxus and tunicate. These species too have numerous selenoproteins yet their genome assemblies contain but a single homologous gene with vastly higher homology to KIAA0256. Lamprey genome lacks adequate coverage; elephant shark has fragments of both genes. It's difficult to extend orthologous annotation into protostomes and cnidaria because divergence is high even within the L7ae motif, though three long overlapping cDNAs from clam allow recovery of a long terminal fragment.  


The same can be asked for amphioxus and tunicate. These species too have numerous selenoproteins yet their genome assemblies contain but a single homologous gene with vastly higher homology to KIAA0256. Lamprey genome lacks coverage. It's difficult to extend orthologous annotation into protostomes and cnidaria because divergence is high even within the L7ae region.  
In summary, the genomic evidence strongly supports the scenario of a single-copy gene fulfilling all roles of SECIS binding at the ribosome for hundreds of millions of years. This gene resembled KIAA0256 much more strongly than SBP2, which arose much later and only in the deuterostome lineage. SBP2 apparently took on the SECIS binding role for a subset of these elements (split-functionalization) but continues to evolve much more rapidly, suggesting weaker selection and partial compensatory overlap with the parental gene KIAA0256.


  >SECISBP2_homSap Homo sapiens (human) full length
  >SECISBP2_homSap Homo sapiens (human) full length
Line 45: Line 54:
   
   
  <span style="color: #996633;">407–525</span> domain required for U insertion but not SECIS binding (399–516 in rat)
  <span style="color: #996633;">407–525</span> domain required for U insertion but not SECIS binding (399–516 in rat)
  <span style="color: #FF0000;">540</span>     R540Q allele of SBP2 decreases GPX1 and DIO2
  <span style="color: #FF0000;">R</span>       R540Q allele of SBP2 decreases GPX1 and DIO2
  <span style="color: #0066CC;">650–752</span> L7Ae motif kink-turn binding motif
  <span style="color: #0066CC;">650–752</span> L7Ae motif kink-turn binding motif
  <span style="color: #990099;">676</span>    invariant glycine (669 in rat)
  <span style="color: #990099;">676</span>    invariant glycine (669 in rat)
<span style="color: #00CC66;">exon 8</span>  skipped in improbable RefSeq alternative splice of KIAA0256
   
   
  >KIAA0256_homSap Homo sapiens (human) length=1101
  >KIAA0256_homSap Homo sapiens (human) length=1101
Line 58: Line 68:
  2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
  2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
  <span style="color: #00CC66;">0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0</span>  
  <span style="color: #00CC66;">0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0</span>  
  0 DED<span style="color: #996633;">GFQELNENGNAKDENIQQKLSSKV 0</span>
  0 DEDGFQELNENGNAKDENIQQKLSSKV 0
  <span style="color: #990099;">0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0</span>
  0 LDD<span style="color: #996633;">LPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
  <span style="color: #996633;">0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
  0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
  2 VVTAASFHTKDSTNRKPLTKSQPCLTSFNSVDIASSKAKKGKEKEIAK</span>LKRPTALKK 0
  2 VVTAASFHTKDSTNRKPLTKSQPCLTSFNSVDIASSKAKKGKEKEIAK</span>LKRPTALKK 0
  0 VILKE<span style="color: #FF0000;">R</span>EEKKGRLTVDHNLLGSEEPTEMHLDFIDDLPQEIVSQE 1
  0 VILKE<span style="color: #FF0000;">R</span>EEKKGRLTVDHNLLGSEEPTEMHLDFIDDLPQEIVSQE 1
Line 70: Line 80:
  GMLEEEEDEDEEEEEDYTHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGSEEDNVEQSGEEEAEAPEVLEPGMDSEAWTADQQASPGQQKSSNCSSLNKEHSDSNYTTQTT* 0
  GMLEEEEDEDEEEEEDYTHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGSEEDNVEQSGEEEAEAPEVLEPGMDSEAWTADQQASPGQQKSSNCSSLNKEHSDSNYTTQTT* 0
   
   
  <span style="color: #FF0000;">phosphoserines</span> predicted at SwissProt; no counterparts in SECISBP2
   
<span style="color: #00CC66;">exon 8 skipped in RefSeq KIAA0256</span>
    KIAA0256 exon 8 conservation suggests functionality:
<span style="color: #990099;">exon 11 lacks counterpart in SECISBP2</span>
 
          KIAA0256 skipped exon 8 conservation:
  VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Homo sapiens  
  VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Homo sapiens  
  VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Macaca fascicularis
  VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Macaca fascicularis
Line 92: Line 99:
  LGYRLRGQSTSSERRHNLQRKQDNKTGTPASSNKSGQSPDHLYFE Xenopus tropicalis
  LGYRLRGQSTSSERRHNLQRKQDNKTGTPASSNKSGQSPDHLYFE Xenopus tropicalis
   
   
          KIAA0256 exon 11 conservation (lost in SBP2):
    KIAA0256 exon 10 conservation (weak in SBP2):
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Homo sapiens  
  <font color="blue">LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Homo sapiens  
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Pongo abelii
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Pongo abelii
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Macaca mulatta
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Macaca mulatta
Line 116: Line 123:
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Echinops telfairi
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Echinops telfairi
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKiQ Monodelphis domestica
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKiQ Monodelphis domestica
  LagLPENSPIsIVQTPIPITaSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Ornithorhynchus anatinus
  LagLPENSPIsIVQTPIPITaSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Ornithorhynchus anatinus</font>
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Gallus gallus  
  <font color="green">LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Gallus gallus  
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Taeniopygia guttata
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Taeniopygia guttata
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Anolis carolinensis
  LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Anolis carolinensis
  LngLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Xenopus tropicali
  LNGLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Xenopus tropicalis
  LDnLPENSPINIVQTPIPITTSVPKRAKSQrKKALAAALATAQEYSEISMEQKKLQ Oryzias latipes
  LDNLPENSPINIVQTPIPITTSVPKRAKSQRKKAMAAALATAQEYSEISMEQKKLQ Gasterosteus aculeatus
LDnLPENSPINIVQTPIPITTSVPKRAKSQrKKAmAAALATAQEYSEISMEQKKLQ Gasterosteus aculeatus
  LDNLPENSPISIVQTPIPITSSVPKRAKSQRKKALAAALATAQEYSEISMEQKKLQ Pimephales promelas</font>
   
  <font color="magenta">LPGSQEPLNPATVVSTPVEVKKEGKNARKKRKKALLAAKAAAEEYSEITQVISENQ Branchiostoma floridae
  LPGSQEPLNPATVVSTPVEVKKEGKNARKKRKKALLAAKAAAEEYSEITQVISENQ Branchiostoma floridae
  ....INSSAPYPSSAANLNEKSQAQKTKKRRKKAERAARAADEEYAEISKEQENIQ Ciona intestinalis  
  ....INSSAPYPSSAANLNEKSQAQKTKKRRKKAERAARAADEEYAEISKEQENIQ Ciona intestinalis
  ....NVQNQVYPPS..NSNEKAQAQKSKKRRKKAERAAKAADEEYAEISKEHENIQ Ciona savignyi
  ....IKPEELLSPANVMSTIKEG-KNARKRRKKAIMATQAAAKEYSEITEEQRQLH Strongylocentrotus purpuratus
DPSSIKPEELLSPANVMSTIKEG.KNARKRRKKAIMATQAAAKEYSEITEEQRQLH Strongylocentrotus purpuratus</font>
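
The conservation rows above are already aligned column by column, so identity between any two rows can be tallied directly. The following is a minimal Python sketch, offered as an illustration and not as the method used for the scores on this page. It assumes that lowercase letters only flag substitutions for display and that "." and "-" mark gaps to be excluded from the count. The function name is hypothetical.

```python
def percent_identity(row_a, row_b, gap_chars=".-"):
    """Return (identical, compared, percent) for two pre-aligned rows.

    Columns where either row has a gap character are skipped.
    Case is ignored, since lowercase here only highlights substitutions.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must be the same aligned length")
    identical = 0
    compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return identical, compared, 100.0 * identical / compared


# Rows copied from the conservation block above.
human = "LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ"
opossum = "LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKiQ"
platypus = "LagLPENSPIsIVQTPIPITaSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ"

for name, row in (("opossum", opossum), ("platypus", platypus)):
    same, total, pct = percent_identity(human, row)
    print(f"{name}: {same}/{total} identical ({pct:.1f}%)")
```

A per-exon tally of this kind is only a rough proxy for evolutionary rate; it ignores substitution type and branch length.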


Using the kink-turn binding motifs of the two human proteins in turn as blastp query against both collections of deuterostome KIAA0256 and SECISBP2 sequences establishes <span style="color: #990099;">KIAA0256 as the slower evolving protein</span> by a wide margin. This fits KIAA0256 retaining ancestral function and its gene duplicate SECISBP2 specializing via a neofunctionalization.
Using the kink-turn binding motifs of the two human proteins in turn as blastp query against both collections of deuterostome KIAA0256 and SECISBP2 sequences establishes <span style="color: #990099;">KIAA0256 as the slower evolving protein</span> by a wide margin. This fits KIAA0256 retaining ancestral function and its gene duplicate SECISBP2 specializing via a neofunctionalization.
Line 141: Line 148:
   calMil  1.43 65% identity
   calMil  1.43 65% identity


It has not been previously noted that both proteins bristle with potential NxT/S glycosylation sites, 13 for KIAA0256 and 6 for SECISBP2, with implications for cellular localization. These do not lie in homologous positions, unsurprisingly in view of the deep divergence of these genes and volatility of glycosylation sites as seen in other gene families, e.g. the [http://www.mad-cow.org/00/annotation_frames/tools/genbrow/sulfatases/sulfatases.html#mmm2 17 human sulfatases.] Even within orthologs of the one gene here, they are conserved only to moderate depth (and that could be for reasons unrelated to glycosylation). Hence these sites do not provide reliable anchors in regions of poor sequence conservation.
Both proteins bristle with potential NxT/S glycosylation sites, 13 for KIAA0256 and 6 for SECISBP2, with implications for cellular localization. These do not lie in homologous positions, unsurprisingly in view of the deep divergence of these genes and volatility of glycosylation sites in [http://www.mad-cow.org/00/annotation_frames/tools/genbrow/sulfatases/sulfatases.html#mmm2 other gene families]. These sites are conserved only to moderate depth (and that could be for reasons unrelated to glycosylation). Hence glycosylation sites do not provide reliable anchors in regions of poor sequence conservation. SwissProt predicts phosphoserine sites in exon 5 (of unknown functionality); those too have only moderate phylogenetic conservation.
 
Comparative genomics of 4 glycosylation sites in exon 7 of KIAA0256:
GGV<span style="color: #FF0000;">NWSNVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Homo sapiens (human)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Macaca mulatta (rhesus)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNLQ Mus musculus (mouse)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Rattus norvegicus (rat)
GSV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Canis familiaris (dog)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>SQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQAEQR<span style="color: #FF0000;">NNS</span>Q Sus scrofa (pig)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Equus caballus (horse)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQGTQKKPWLEK<span style="color: #FF0000;">NQT</span>FSKGGRQMEQR<span style="color: #FF0000;">NNS</span>Q Dasypus novemcinctus (armadillo)
GHV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEKHQTFSRGGRQTEQRNNAQ Loxodonta africana (elephant)
GGASWS<span style="color: #FF0000;">NVT</span>SQATQKKPWMEKSQPFSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Monodelphis domestica (opossum)
.GVSWTNVNSQATQKKPWIEKTQTFIRGGRQAEQR<span style="color: #FF0000;">NSS</span>Q Gallus gallus (chicken)
AGATWA<span style="color: #FF0000;">NVS</span>SQATQKKPWMERTPAFSRGGRQAEQH<span style="color: #FF0000;">NSS</span>Q Anolis carolinensis (lizard)
     Potential for phosphoserine conservation in exon 5 of KIAA0256:
     Potential for phosphoserine conservation in exon 5 of KIAA0256:
  DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGA<span style="color: #FF0000;">SEADIDSDS</span>GYCSPKHSNNQPAAGALRNPDSGTMN homSap
  DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGA<span style="color: #FF0000;">SEADIDSDS</span>GYCSPKHSNNQPAAGALRNPDSGTMN homSap
Line 157: Line 178:
  DFPDDIADKSLRDKPSPLLRKSKARRLASRRPQDPSSTDSEEDEGGIDSD<span style="color: #FF0000;">S</span>GYSSPKHGRNQSA..............braFlo
  DFPDDIADKSLRDKPSPLLRKSKARRLASRRPQDPSSTDSEEDEGGIDSD<span style="color: #FF0000;">S</span>GYSSPKHGRNQSA..............braFlo
  DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNASSSSDSEVENTPHDSD<span style="color: #FF0000;">S</span>GYYSPLHAQQ................ strPur QTGRD insertion
  DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNASSSSDSEVENTPHDSD<span style="color: #FF0000;">S</span>GYYSPLHAQQ................ strPur QTGRD insertion
Comparative genomics of 4 glycosylation sites in exon 7 of KIAA0256:
GGV<span style="color: #FF0000;">NWSNVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Homo sapiens (human)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Macaca mulatta (rhesus)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNLQ Mus musculus (mouse)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Rattus norvegicus (rat)
GSV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Canis familiaris (dog)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>SQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQAEQR<span style="color: #FF0000;">NNS</span>Q Sus scrofa (pig)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEK<span style="color: #FF0000;">NQT</span>FSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Equus caballus (horse)
GGV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQGTQKKPWLEK<span style="color: #FF0000;">NQT</span>FSKGGRQMEQR<span style="color: #FF0000;">NNS</span>Q Dasypus novemcinctus (armadillo)
GHV<span style="color: #FF0000;">NWS</span><span style="color: #FF0000;">NVT</span>CQATQKKPWMEKHQTFSRGGRQTEQRNNAQ Loxodonta africana (elephant)
GGASWS<span style="color: #FF0000;">NVT</span>SQATQKKPWMEKSQPFSRGGRQTEQR<span style="color: #FF0000;">NNS</span>Q Monodelphis domestica (opossum)
.GVSWTNVNSQATQKKPWIEKTQTFIRGGRQAEQR<span style="color: #FF0000;">NSS</span>Q Gallus gallus (chicken)
AGATWA<span style="color: #FF0000;">NVS</span>SQATQKKPWMERTPAFSRGGRQAEQH<span style="color: #FF0000;">NSS</span>Q Anolis carolinensis (lizard)
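
The highlighted NxT/S motifs in the exon 7 comparison can be located programmatically. The following is a minimal Python sketch. It assumes the common convention that the middle residue may be anything except proline, which the text above does not state explicitly, and it makes no claim that a matched site is actually glycosylated. The function name is hypothetical.

```python
import re

# N, then any residue except P, then S or T.
# The lookahead lets overlapping motifs (e.g. NWSNVT) be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(peptide):
    """Return (1-based position, tripeptide) for each potential NxT/S site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(peptide.upper())]


# Exon 7 rows copied from the comparison above, with the span markup removed.
exon7 = {
    "human": "GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ",
    "mouse": "GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNLQ",
    "rat": "GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNSQ",
}

for species, peptide in exon7.items():
    sites = find_sequons(peptide)
    print(species, len(sites), sites)
```

On these three rows the scan finds four potential sites in human, none in mouse and one in rat, matching the highlighting in the comparison.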


== Reference set of metazoan KIAA0256 full length sequences ==
== Reference sets of metazoan KIAA0256 and SBP2 sequences ==


It is very difficult to extract full length genes from phylogenetically representative organisms in the case of KIAA0256. That's because the gene is twice average size (thus seldom tiled completely by transcripts), has two very short exons that do not emerge from alignment methods, several consecutive poorly conserved exons rife with indels, and a run-on carboxy terminus. Nearly every pipeline entry in GenBank non-redundant contains gross errors, including long internal repeats and severely truncated genes.  
It is very difficult to extract accurate full length genes from phylogenetically representative organisms in the case of KIAA0256. That's because the gene is twice average size (thus seldom tiled completely by transcripts), has two very short exons that do not emerge consistently from alignment methods, several consecutive poorly conserved exons rife with indels, and a run-on indeterminate carboxy terminus. Nearly every pipeline entry in GenBank non-redundant contains gross errors including gratuitous long internal repeats and severely truncated genes.  


It will prove imperative to initiate massive cDNA programs in non-teleost species for this (and many other anomalous genes) for which homological modelling will never work. Tiled coverage will be necessary, not merely end-sequencing.
It will prove imperative to initiate massive cDNA programs in non-teleost species for this (and many other anomalous genes) for which homological modelling will never work. Tiled coverage will be necessary, not merely end-sequencing.
Line 328: Line 335:
2 ETNWRTMVENADAPEPPDSEPISRGNNRDQREVVSPPPQPTANQSLTPSPGVARAPDESRTDDRLEWASLSTETGSLDGSGRDRLNSSHHSTTSTLVPGMLEEE 0
2 ETNWRTMVENADAPEPPDSEPISRGNNRDQREVVSPPPQPTANQSLTPSPGVARAPDESRTDDRLEWASLSTETGSLDGSGRDRLNSSHHSTTSTLVPGMLEEE 0


>KIAA0256_calMil mixed with dogfish
>KIAA0256_calMil elephantfish fragments
QDIQLSAEVEPFIPQKKGTETLVPMALPNDGNGSGVEAPPIPSYLITCYPFVQ
QDIQLSAEVEPFIPQKKGTETLVPMALPNDGNGSGVEAPPIPSYLITCYPFVQ
ENQANRPVYNGDIRWQQANPNSPGPYLAYPILPTPQPPVSTDYAYYQLMPAPCTPMMGFY
ENQANRPVYNGDIRWQQANPNSPGPYLAYPILPTPQPPVSTDYAYYQLMPAPCTPMMGFY
SPFPTPYTGTLPPASVVNAVSECSERP
SPFPTPYTGTLPPASVVNAVSECSERP
DDEEFPDLASAISSSDKTNSAGQNKFFSYKQVTKRQLEPSMDGELVLSVTNSHAIGGEQ
ISLDPDQKPVQILRKESAQENVSSSPQKKAMEAPKKSGKKSKIPMQLDFGNMLAVLEQKQ
QEKKSKHTPKPIVLAVGGTFPLVPKEPTTSKRQPQSSSQEKVPHNPLDSSAPLVKRGKQR
EVPKAKKPTSLKKIILKEREERKQR
NPLDSTAPRVKRGKEKEIPKAKRPTALKKVKSFER
NPLDSTAPRVKRGKEKEIPKAKRPTALKKVKSFER
YCNQVLSKDIDECVTLLLQELVRFQERVYQKDPIKAKMKRRLVMGLREVTKHMKLRKIKCVIISPNCEKIQSKG
YCNQVLSKDIDECVTLLLQELVRFQERVYQKDPIKAKMKRRLVMGLREVTKHMKLRKIKCVIISPNCEKIQSKG
Line 343: Line 346:


>KIAA0256_petMar mediocre fragments
>KIAA0256_petMar mediocre fragments
LHKLRALIISPNCEKIQAKG
LHKLRALIISPNCEKIQAKG
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASA
THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASA
Line 390: Line 393:
AEEMNLNHRRALDKSFSTCSTLKPEGGVSPRISTTSESSSLIPDDVSSQSSAQDRIQLWLEDATRSVVDLDLNDVVPDAEDVNSESKLVTPDVNESK* 0
AEEMNLNHRRALDKSFSTCSTLKPEGGVSPRISTTSESSSLIPDDVSSQSSAQDRIQLWLEDATRSVVDLDLNDVVPDAEDVNSESKLVTPDVNESK* 0


>KIAA0256_stoPur Strongylocentrotus purpuratus XM_001188118 = bad internal dup 1430 aa  
>KIAA0256_stoPur Strongylocentrotus purpuratus XM_001188118 = bad internal dup 1430 aa
0 MTAMYYNAPSHQHQQQQHHHAPQPLHPHQHQQHHHQQTIPGMVPQPSPSQVVSGMLSEATAAMPGLKPPPPSQPQGGGGGGGGMQQYQTSSASAVATMNGKKVPLTELPRYITTCYPFVQDS 2
0 MTAMYYNAPSHQHQQQQHHHAPQPLHPHQHQQHHHQQTIPGMVPQPSPSQVVSGMLSEATAAMPGLKPPPPSQPQGGGGGGGGMQQYQTSSASAVATMNGKKVPLTELPRYITTCYPFVQDS 2
1 STGAAPATETWMGYPNSSQQPNQPHPQPQQHQHPPLPLPPTSQHPLSHQQPPQTTPMYAPPPPPPPGHQPPSAHLTQQQNQEYFPVHPGYN 0
1 STGAAPATETWMGYPNSSQQPNQPHPQPQQHQHPPLPLPPTSQHPLSHQQPPQTTPMYAPPPPPPPGHQPPSAHLTQQQNQEYFPVHPGYN 0
Line 397: Line 400:
0 VRGQRPMNKDHRYPGGYQNKGREHYQAYVPPPTDLPKPKTKTVVFAEACAQT 1
0 VRGQRPMNKDHRYPGGYQNKGREHYQAYVPPPTDLPKPKTKTVVFAEACAQT 1
2 DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNQTGRDASSSSDSEVENTPHDSDSGYYSPLHAQQHNSTGLVSTYSTQTGKPTYSNVAMNNKSSPHQESR
2 DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNQTGRDASSSSDSEVENTPHDSDSGYYSPLHAQQHNSTGLVSTYSTQTGKPTYSNVAMNNKSSPHQESR
TVEQNTFTQNQPLVVPQGPPLGPQLGPAPVIQRGRFTPVQPGIPSFRPVMPMSYANMLTKPRAANPPPPPLANVGYPQRPPNVFPTQPPPTYRNMAVSPA
TVEQNTFTQNQPLVVPQGPPLGPQLGPAPVIQRGRFTPVQPGIPSFRPVMPMSYANMLTKPRAANPPPPPLANVGYPQRPPNVFPTQPPPTYRNMAVSPAPMLYQQQQQQQQRRMQSPVPAPQ 0
PMLYQQQQQQQQRRMQSPVPAPQKPPVTPEDTPRKRKQKRTKGKKDGEVELEKPKMVNAATYAKPPQIQDKEEYPGLPLGSPAGNKFGMSTGGRPISYSSALQQRAPVQL 0
0 KPPVTPEDTPRKRKQKRTKGKKDGEVELEKPKMVNAATYAKPPQIQDKEEYPGLPLGSPAGNKFGMSTGGRPISYSSALQQRAPVQL 0
0 VNESSSEEEEEESGGDPSSIIKPEELLSPANVMSTIKEGKNARKRRKKAIMATQAAAK 0
0 VNESSSEEEEEESGGDPSSIIKPEELLSPANVMSTIKEGKNARKRRKKAIMATQAAAK 0
0 EYSEITEEQRQLHENMKKQGKRTKMPIEFDLGDMLAALE 0
0 EYSEITEEQRQLHENMKKQGKRTKMPIEFDLGDMLAALE 0
Line 436: Line 439:
</pre>
</pre>


=== SBP2 L7Ae motifs from 27 vertebrate ===
=== SBP2 L7Ae motifs from 27 vertebrates ===
<pre>
<pre>
>SECISBP2_homSap Homo sapiens (human) 855 aa full length 17 exons
>SECISBP2_homSap Homo sapiens (human)
0 MASEGPREPESE 0
0 GIKLSADVKPFVPRFAGLNVAWLESSEACVFPSSAATYYPFVQEPPVTE 2
1 QKIYTEDMAFGASTFPPQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQYLYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 0
0 KKTYDEKKTYDQQKFDSERADGTISSEIKSARGSHHLSIYAENSLKS 1
2 DGYHKRTDRKSRIIAKNVSTSKPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVHSVSTDISLLREVVKPAAVLSK 0
0 GEIVVKNNPNESVTANAATNSPSCTR 1
2 ELSWTPMGYVVRQTLSTELSAAPKNVTSMINLKTIASSADPKNVSIPSSEALSSDPSYNKEKHIIHPTQK 0
0 SKASQGSDLEQNEASRKNKKKKEKSTSKYEVLTVQEPPRIE 0
0 DAEEFPNLAVASERRDRIETPKFQSKQQPQ 0
0 DNFKNNVKKSQLPVQLDLGGMLTALEKKQHSQHAKQSSKPVVVS 1
2 VGAVPVLSKECASGERGRRMSQMKTPHNPLDSSAPLMKKGKQREIPKAKKPTSLKK 0
0 IILKERQERKQRLQENAVSPAFTSDDTQDGESGGDDQFPEQAELS 1
2 GPEGMDELISTPSVEDKSEEPPGTELQRDTEASHLAPNHTTFPKIHSRRFRD 2
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTVAARQAYKTMLENVQQELVGEPRPQAPPSLPTQGPSCPAEDGPPALKEKEEPHY 1
0 DQFHKMVELTVAARQAYKTMLENVQQELVGEPRPQAPPSLPTQGPSCPAEDGPPALKEKEEPHY 1
2 IEIWKKHLEAYSGCTLELEESLEASTSQMMNLNL* 0


>SECISBP2_panTro Pan troglodytes (chimp)
>SECISBP2_panTro Pan troglodytes (chimp)
Line 586: Line 575:
2 GGLDDALHNIISIACEQEIPFVFALNRKALGQCVNKPVPVSVLGIFSYDGAE 0
2 GGLDDALHNIISIACEQEIPFVFALNRKALGQCVNKPVPVSVLGIFSYDGAE 0
0 NQFHQMVEITEEARKAYQEMLDALQQELEADEEKGDSEEQPLISSESSTIHFNNVTSQPFSEADEPEY 1
0 NQFHQMVEITEEARKAYQEMLDALQQELEADEEKGDSEEQPLISSESSTIHFNNVTSQPFSEADEPEY 1
>SECISBP2_braFlo Branchiostoma floridae (amphioxus) extra exon
1 YCNQVLDKEIDATVTMLLQDLVRFQDRQYHK 00 DPIKAKAKRRIVMGLREVTKHLKLRKLKCIIIAPNLEKIQSK 1
2 GGLDDAIETILNLCMEQDVPFVFALGRKALGRAVNKLVPVSVVGVFNYDGAE 0
0 1
</pre>
</pre>


=== KIAA0256 L7Ae motifs from 23 deuterostomes ===
=== KIAA0256 L7Ae motifs from 23 deuterostomes ===
<pre>
<pre>
>KIAA0256_homSap Homo sapiens (human) 1101 aa 19 exons 13 glycosylation sites
>KIAA0256_homSap Homo sapiens (human)
0 MDRAPTEQ 0
0 NVKLSAEVEPFIPQKKSPDTFMIPMALPNDNGSVSGVEPTPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDIRWQQPNPNPTGPYFAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPTPYSNTFQAANTVNAITTECTERPSQLGQVFPLSSHRSRNSNRGSVVPK 0
0 QQLLQQHIKSKRPLVKNVATQKETNAAGPDSRSKIVLLVDASQQT 1
2 DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSGTMN 0
0 HVESSMCA 1
2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0
0 DEDGFQELNENGNAKDENIQQKLSSKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
2 VVTAASFHTKDSTNRKPLTKSQPCLTSFNSVDIASSKAKKGKEKEIAKLKRPTALKK 0
0 VILKEREEKKGRLTVDHNLLGSEEPTEMHLDFIDDLPQEIVSQE 1
2 DTGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSTITKIHSKRFRE 2
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLEASENEKEVSCKHSTSEKPSKLPFDTPPIGKQPSLVATGSTTSATSAGKSTASDKEEVKPDDLEWASQQSTETGSLDGSCRDLLNSSITSTTSTLVP
GMLEEEEDEDEEEEEDYTHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGSEEDNVEQSGEEEAEAPEVLEPGMDSEAWTADQQASPGQQKSSNCSSLNKEHSDSNYTTQTT* 0


>KIAA0256_panTro Pan troglodytes (chimp)
>KIAA0256_panTro Pan troglodytes (chimp)
2 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
0 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
1 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1


>KIAA0256_macMul Macaca mulatta (rhesus)
>KIAA0256_macMul Macaca mulatta (rhesus)
2 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
0 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
1 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1


>KIAA0256_tupBel Tupaia belangeri (treeShrew)
>KIAA0256_tupBel Tupaia belangeri (treeShrew)
Line 632: Line 600:


>KIAA0256_musMus Mus musculus (mouse)
>KIAA0256_musMus Mus musculus (mouse)
0 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKTVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKTVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1


>KIAA0256_ratNor Rattus norvegicus (rat)
>KIAA0256_ratNor Rattus norvegicus (rat)
2 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
0 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
1 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKAVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKAVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1


>KIAA0256_canFam Canis familiaris (dog)
>KIAA0256_canFam Canis familiaris (dog)
Line 657: Line 625:


>KIAA0256_monDom Monodelphis domestica (opossum)
>KIAA0256_monDom Monodelphis domestica (opossum)
0 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVKAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVKAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
1 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYCGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYCGAE 0
2 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1


>KIAA0256_galGal Gallus gallus (chicken)
>KIAA0256_galGal Gallus gallus (chicken)
Line 697: Line 665:


>KIAA0256_oryLap Oryzias latipes (medaka)
>KIAA0256_oryLap Oryzias latipes (medaka)
0 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPSKAKSKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPSKAKSKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
1 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
2 GLFNQLVSLTEEARKAYKEMVSALEQEQAEEALKHDKKVPHHMGHSRNHSAASAISFCSILSEPISEVNEKEY 1
0 GLFNQLVSLTEEARKAYKEMVSALEQEQAEEALKHDKKVPHHMGHSRNHSAASAISFCSILSEPISEVNEKEY 1


>KIAA0256_pimPro Pimephales promelas (minnow) based on transcript tiling; exons by homology; 62% identity
>KIAA0256_pimPro Pimephales promelas (minnow) based on transcript tiling; exons by homology; 62% identity
Line 715: Line 683:
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
0 THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASASPPSVPAHDPTALLHLLRPQQGPREDDPAEASGRSPGRNA 1
0 THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASASPPSVPAHDPTALLHLLRPQQGPREDDPAEASGRSPGRNA 1
YCNQVLDKEIDATVTMLLQDLVRFQDRQYHK 0
0 DPIKAKAKRRIVMGLREVTKHLKLRKLKCIIIAPNLEKIQSK 1
2 GGLDDAIETILNLCMEQDVPFVFALGRKALGRAVNKLVPVSVVGVFNYDGAE 0


>KIAA0256_cioInt Ciona intestinalis (tunicate)
>KIAA0256_cioInt Ciona intestinalis (tunicate)

Latest revision as of 20:06, 2 January 2009

KIAA0256: mis-annotated and forgotten ancestral SECIS binding protein

KIAA0256 originally arose in a GenBank submission package from a large-scale mRNA sequencing project at the Kazusa DNA Research Institute. While high quality, it skips over a highly conserved exon 8, VGF...YFE. This does not alter downstream reading frame since the skipped exon resides in a series of consecutive phase 00 splices. NCBI staff then confounded the record by posting experimentally unsupported gene predictions -- all exon-skipping -- from various genome assemblies lacking significant transcript programs, mis-labelling them as mRNAs and thus entrenching the incomplete variant as the normal form.
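
The reading-frame argument can be checked mechanically. The following is a minimal sketch, not part of the original analysis. It assumes the notation used in the listings on this page (start phase, residues, end phase), with the pairing rule inferred from those listings: an end phase of 0 is followed by a start phase of 0, 1 by 2, and 2 by 1. The function names are hypothetical.

```python
def parse_exons(lines):
    """Parse lines of the form '<start_phase> <residues> <end_phase>'."""
    exons = []
    for line in lines:
        start, residues, end = line.split()
        exons.append((int(start), residues, int(end)))
    return exons


def junctions_in_frame(exons):
    """Return True if every exon-exon junction preserves the reading frame.

    Assumed convention (inferred from the listings on this page): the end
    phase of one exon and the start phase of the next sum to 0 or 3.
    """
    return all((a[2] + b[0]) % 3 == 0 for a, b in zip(exons, exons[1:]))


# Human KIAA0256 exons 6-10, copied from the full-length listing on this page.
exons_6_to_10 = parse_exons([
    "0 HVESSMCA 1",
    "2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0",
    "0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0",
    "0 DEDGFQELNENGNAKDENIQQKLSSKV 0",
    "0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0",
])

# Drop exon 8 (index 2): it is flanked by phase 0 splices on both sides.
without_exon_8 = exons_6_to_10[:2] + exons_6_to_10[3:]
# Drop exon 7 (index 1) for contrast: its flanking phases differ.
without_exon_7 = exons_6_to_10[:1] + exons_6_to_10[2:]

print(junctions_in_frame(exons_6_to_10))   # all junctions as listed
print(junctions_in_frame(without_exon_8))  # exon 8 skipped
print(junctions_in_frame(without_exon_7))  # exon 7 skipped
```

Skipping exon 8 leaves every junction in frame, whereas skipping exon 7 would not. That is why the exon-skipping transcript yields a plausible-looking full-length protein.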

Two subsequent papers featuring the twice-published Figure 1B aligned full length SBP2 to an intermediate region of the exon-skipping gene model of KIAA0256 that by coincidence began with a methionine, namely residues 422-849 of the 1101 residue protein (which did include the motif-bearing residues 632-829 of exons 14-16). SwissProt provides the proper full length protein Q93073 without a supporting accession.

SBP2 exhibits a fusion of two exons (2 ELSWTPMGYVVRQTLSTEL 00 SAAPKNVT...) relative to KIAA0256 within an indel-rich area of the protein. The intronation of KIAA0256 exhibits the ancestral form. The fusion occurred prior to teleost fish divergence but is hard to date beyond that. After consideration of anchoring patches of semi-conserved residues, the alignment of human paralogs in this region is:

  SBP2 exons 5-8 showing fusion                            KIAA0256 exons 5-9
2 ELSWTPMGYVVRQTLSTEL 0                                  2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
0 SAAPKNVTSMINLKTIASSADPKNVSIPSSEALSSDPSYNKEKHIIHPTQK 0  0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0
0 SKASQGSDLEQNEASRKNKKKKEKSTSKYEVLTVQEPPRIE 0            0 DEDGFQELNENGNAKDENIQQKLSSKV 0                  
0 DAEEFPNLAVASERRDRIETPKFQSKQQPQ 0                       0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 DNFKNNVKKSQLPVQLDLGGMLTALEKKQHSQHAKQSSKPVVVS 1         0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1

As a result of these early errors, early SECIS binding experiments used KIAA0256 protein lacking the immensely conserved exon 8. This variant unsurprisingly lacked relevant SECIS binding properties, which led to abandonment of further experimentation. Consequently we know nothing about the SECIS binding properties of full length KIAA0256 protein.

There is no reason to believe an odd fragment studied on a small subset of SECIS elements could accurately reflect binding properties of full length protein with regard to all 25 orthology classes of human SECIS elements. However, these results remain accepted folklore within the selenocysteine research community even today.

An Irish family with an SBP2 compound mutation (paternal allele inactive, maternal allele a splice donor mutation leading to early truncation: K438stop/IVS8ds+29G/A) has been incorrectly described by these same authors as an SBP2 knockout; in fact 48% production of wildtype maternal allele still occurs. In addition, KIAA0256 may be able to partially compensate for reduction in level or loss of SECISBP2. Knockout mice for tRNA(Sec), unable to make any selenoproteins, die in utero.

While baboon also has experimental transcripts skipping this exon (FC145891, FC178616), mammalian transcripts almost always retain the exon, for example human (AK307480 but not BF055173, CN482709, BE930773, DW431473), macaque (CJ457866), mouse (AK145135), rat (CK602552), dog (CO708934), horse (CX604216), cow (CK846448), sheep (EE864720), and even chicken (DR417186), frogs, and fish. Thus inclusion of exon 8 is the ancestral state. Skipping is documented to date only in these two primates.

Transcriptional processing in mammals is error-rich, producing numerous defective mRNA variants that never amount to useful regulation or stable protein -- downstream quality controls quickly eliminate them. While exon-skipping in some cases may have adaptive significance, without significant comparative genomics support, the default hypothesis is that it does not. Here it is difficult to distinguish between a weakened exon 8 splice acceptor in apes leading to a fraction of defective transcripts versus an innovative functional truncated form. These alternatives might be resolvable from 3D structural considerations -- deleting 45 conserved residues in an ancient exon is highly problematic.

KIAA0256 and SECISBP2 align only moderately well, but they do so over their entire lengths and share 17 exactly comparable exons (trillion:1 odds against coincidence), meaning they reflect a segmental gene duplication (which can be dated to post-amphioxus, pre-chondrichthyes). It is imperative to enforce exon boundaries to achieve true homological alignment of two proteins this diverged and so gappy N-terminally; structure-based alignment has different rules (allowing convergent evolution) and different goals.

The teleost fish Pimephales promelas has sufficient transcript coverage to allow recovery of an accurate full length KIAA0256 gene with a respectable 62% identity to human. No fish has sufficient transcripts to recover full length SBP2 as of Dec 08. Some initial exons are quite well conserved over this billion years of branch length, strong evidence that they retain an unknown function under strong selection. However, the gaps in other early exons are incompatible with retention of tertiary protein structure. No early pfam domain can be found.

We have to wonder how sea urchin, which has a full length apparent ortholog of KIAA0256 on Scaffold18963 but nothing clustering to SBP2, can insert selenocysteine into its numerous selenoproteins (SEPHS1, SELU1, SELU2, SELM, SELO, SELW, SELN1, GPX3, GPX2, GPX4, GPX7,...). Unless a second copy has been lost, all SECIS interaction at the ribosome at sea urchin divergence appears to have been handled by ancestral KIAA0256.

The same can be said for amphioxus and tunicate. These species too have numerous selenoproteins, yet their genome assemblies contain but a single homologous gene with vastly higher homology to KIAA0256. Lamprey genome lacks adequate coverage; elephant shark has fragments of both genes. It's difficult to extend orthologous annotation into protostomes and cnidaria because divergence is high even within the L7Ae motif, though three long overlapping cDNAs from clam allow recovery of a long terminal fragment.

In summary, the genomic evidence strongly supports the scenario of a single-copy gene fulfilling all roles of SECIS binding at the ribosome for hundreds of millions of years. This gene resembled KIAA0256 much more strongly than SBP2, which arose much later and only in the deuterostome lineage. SBP2 apparently took on the SECIS binding role for a subset of these elements (split-functionalization) but continues to evolve much more rapidly, suggesting weaker selection and partial compensatory overlap with the parental gene KIAA0256.

>SECISBP2_homSap Homo sapiens (human) full length
0 MASEGPREPESE 0
0 GIKLSADVKPFVPRFAGLNVAWLESSEACVFPSSAATYYPFVQEPPVTE 2
1 QKIYTEDMAFGASTFPPQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQYLYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 0
0 KKTYDEKKTYDQQKFDSERADGTISSEIKSARGSHHLSIYAENSLKS 1
2 DGYHKRTDRKSRIIAKNVSTSKPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVHSVSTDISLLREVVKPAAVLSK 0
0 GEIVVKNNPNESVTANAATNSPSCTR 1
2 ELSWTPMGYVVRQTLSTELSAAPKNVTSMINLKTIASSADPKNVSIPSSEALSSDPSYNKEKHIIHPTQK 0
0 SKASQGSDLEQNEASRKNKKKKEKSTSKYEVLTVQEPPRIE 0
0 DAEEFPNLAVASERRDRIETPKFQSKQQPQ 0
0 DNFKNNVKKSQLPVQLDLGGMLTALEKKQHSQHAKQSSKPVVVS 1
2 VGAVPVLSKECASGERGRRMSQMKTPHNPLDSSAPLMKKGKQREIPKAKKPTSLKK 0
0 IILKERQERKQRLQENAVSPAFTSDDTQDGESGGDDQFPEQAELS 1
2 GPEGMDELISTPSVEDKSEEPPGTELQRDTEASHLAPNHTTFPKIHSRRFRD 2
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTVAARQAYKTMLENVQQELVGEPRPQAPPSLPTQGPSCPAEDGPPALKEKEEPHY 1
2 IEIWKKHLEAYSGCTLELEESLEASTSQMMNLNL* 0

407–525 domain required for U insertion but not SECIS binding (399–516 in rat)
R       R540Q allele of SBP2 decreases GPX1 and DIO2
650–752 L7Ae motif kink-turn binding motif
676     invariant glycine (669 in rat)
exon 8  skipped in improbable RefSeq alternative splice of KIAA0256

>KIAA0256_homSap Homo sapiens (human) length=1101
0 MDRAPTEQ 0
0 NVKLSAEVEPFIPQKKSPDTFMIPMALPNDNGSVSGVEPTPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDIRWQQPNPNPTGPYFAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPTPYSNTFQAANTVNAITTECTERPSQLGQVFPLSSHRSRNSNRGSVVPK 0
0 QQLLQQHIKSKRPLVKNVATQKETNAAGPDSRSKIVLLVDASQQT 1
2 DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSGTMN 0
0 HVESSMCA 1
2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0 
0 DEDGFQELNENGNAKDENIQQKLSSKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
2 VVTAASFHTKDSTNRKPLTKSQPCLTSFNSVDIASSKAKKGKEKEIAKLKRPTALKK 0
0 VILKEREEKKGRLTVDHNLLGSEEPTEMHLDFIDDLPQEIVSQE 1
2 DTGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSTITKIHSKRFRE 2
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLEASENEKEVSCKHSTSEKPSKLPFDTPPIGKQPSLVATGSTTSATSAGKSTASDKEEVKPDDLEWASQQSTETGSLDGSCRDLLNSSITSTTSTLVP
GMLEEEEDEDEEEEEDYTHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGSEEDNVEQSGEEEAEAPEVLEPGMDSEAWTADQQASPGQQKSSNCSSLNKEHSDSNYTTQTT* 0


   KIAA0256 exon 8 conservation suggests functionality:
VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Homo sapiens 
VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE Macaca fascicularis
VGFRCRGHSTSSERRQNLQKRQDNKQLNPSQSHRSDSNSESLYFE Tupaia belangeri
VGFRCRGHSTSSERRQNLQKRQDNKHLNSTQSHRSDPNSESLYFE Mus musculus 
VGFRCRGHSTSSERRQNLQKRQDNKHLNSTQSHRSDPNSESLYFE Rattus norvegicus 
VGFRCRGHSTSSERRQNLPKRQDNNKQLNASQSHRGDSNSESLYFE Canis familiaris 
VGFRCRGHSTSSERRQNLQKRQDNKQLNPSQSHRGNPNSESLYFE Equus caballus 
VGFKCRGHSTSSERRQNLQKRQDNKQLNPNQSHRSDPNSESLYFE Myotis lucifugus
VGFRCRGHSTSSERRQNLQKRQDNKQLNPSQSHRGDPNSESLYFE Bos taurus   
VGFRCRGHSTSSERRQNLQKRQDNKQLNPSQSHRGDPNSESLYFE Ovis aries 
VGFRCRGHSTSSERRQNLQKKQDNKQLNSSQSHRGDPNSESLYFE Dasypus novemcinctus
VGFRCRGHSTSSERRQNLQKRQDNKQLNPIQSQRGDPNSESLYFE Loxodonta africana
VGFRCRGHSTSSERRQSLQKRQDNKPL-GNHSHRVETSSDPLYFE Monodelphis domestica
SGFRCRGHSTSSERRQNLQKRHE-KPLTTSQSSRAEQSPEPLYFE Gallus gallus 
PAFRCRGHSTSSERRQNLQKKPE-KPVSSSQSSKREQSPGSLYFE Anolis carolinensis
LGYRLRGQSTSSERRHNLQRKQDNKTGTPASSNKSGQSPDHLYFE Xenopus tropicalis
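
Conservation in an alignment like the one above can be quantified as percent identity. This is a minimal sketch, assuming the rows have already been aligned to equal length, with '-' or '.' marking gaps. The function name is hypothetical, and how gap columns are counted is a choice, not a standard: here any column with a gap in either row is skipped.

```python
def identity_counts(row_a, row_b, gap_chars="-."):
    """Return (matches, compared) for two pre-aligned rows of equal length.

    Columns where either row has a gap are skipped (an assumption; other
    conventions count gaps as mismatches). Comparison ignores case because
    some alignments on this page use lowercase to flag differing residues.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must be aligned to the same length")
    matches = compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# KIAA0256 exon 8 rows copied from the alignment above.
human = "VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE"
mouse = "VGFRCRGHSTSSERRQNLQKRQDNKHLNSTQSHRSDPNSESLYFE"

matches, compared = identity_counts(human, mouse)
print(f"{matches}/{compared} identical = {100 * matches / compared:.1f}%")
```

Rows of unequal length are rejected rather than compared, since they would first need proper gap placement.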

   KIAA0256 exon 10 conservation (weak in SBP2):
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Homo sapiens 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Pongo abelii
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Macaca mulatta
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Tarsius syrichta
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Otolemur garnettii
LDDLPEiSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Tupaia belangeri
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Mus musculus 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Rattus norvegicus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Dipodomys ordii
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Spermophilus tridecemlineatus 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Cavia porcellus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Oryctolagus cuniculus 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Ochotona princeps 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Felis catus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Canis familiaris 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Equus caballus 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Myotis lucifugus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Pteropus vampyrus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Tursiops truncatus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Lama pacos
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Sorex araneus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Echinops telfairi
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKiQ Monodelphis domestica
LagLPENSPIsIVQTPIPITaSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Ornithorhynchus anatinus
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Gallus gallus 
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Taeniopygia guttata
LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQrKLQ Anolis carolinensis
LNGLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ Xenopus tropicalis
LDNLPENSPINIVQTPIPITTSVPKRAKSQRKKAMAAALATAQEYSEISMEQKKLQ Gasterosteus aculeatus
LDNLPENSPISIVQTPIPITSSVPKRAKSQRKKALAAALATAQEYSEISMEQKKLQ Pimephales promelas
LPGSQEPLNPATVVSTPVEVKKEGKNARKKRKKALLAAKAAAEEYSEITQVISENQ Branchiostoma floridae
....INSSAPYPSSAANLNEKSQAQKTKKRRKKAERAARAADEEYAEISKEQENIQ Ciona intestinalis 
....NVQNQVYPPS..NSNEKAQAQKSKKRRKKAERAAKAADEEYAEISKEHENIQ Ciona savignyi 
DPSSIKPEELLSPANVMSTIKEG.KNARKRRKKAIMATQAAAKEYSEITEEQRQLH Strongylocentrotus purpuratus

Using the kink-turn binding motifs of the two human proteins in turn as blastp query against both collections of deuterostome KIAA0256 and SECISBP2 sequences establishes KIAA0256 as the slower evolving protein by a wide margin. This fits KIAA0256 retaining ancestral function and its gene duplicate SECISBP2 specializing via a neofunctionalization.

 Blastp score ratio KIAA0256/SECISBP2 (human query): ratio > 1 indicates slower evolution of KIAA0256
  galGal  1.41 72% identity
  anoCar  1.35
  xenTro  1.41 68% identity
  danRer  1.44
  tetNig  1.59
  takRub  1.45 64% identity
  gasAcu  1.60
  oryLat  1.52
  calMil  1.43 65% identity

Both proteins bristle with potential NxT/S glycosylation sites, 13 for KIAA0256 and 6 for SECISBP2, with implications for cellular localization. These do not lie in homologous positions, unsurprisingly in view of the deep divergence of these genes and volatility of glycosylation sites in other gene families. These sites are conserved only to moderate depth (and that could be for reasons unrelated to glycosylation). Hence glycosylation sites do not provide reliable anchors in regions of poor sequence conservation. SwissProt predicts phosphoserine sites in exon 5 (of unknown functionality); those too have only moderate phylogenetic conservation.

Comparative genomics of 4 glycosylation sites in exon 7 of KIAA0256:
GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ Homo sapiens (human)
GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ Macaca mulatta (rhesus)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNLQ Mus musculus (mouse)
GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNSQ Rattus norvegicus (rat)
GSVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ Canis familiaris (dog)
GGVNWSNVTSQATQKKPWMEKNQTFSRGGRQAEQRNNSQ Sus scrofa (pig)
GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ Equus caballus (horse)
GGVNWSNVTCQGTQKKPWLEKNQTFSKGGRQMEQRNNSQ Dasypus novemcinctus (armadillo)
GHVNWSNVTCQATQKKPWMEKHQTFSRGGRQTEQRNNAQ Loxodonta africana (elephant)
GGASWSNVTSQATQKKPWMEKSQPFSRGGRQTEQRNNSQ Monodelphis domestica (opossum)
.GVSWTNVNSQATQKKPWIEKTQTFIRGGRQAEQRNSSQ Gallus gallus (chicken)
AGATWANVSSQATQKKPWMERTPAFSRGGRQAEQHNSSQ Anolis carolinensis (lizard)
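
Potential sites of this kind can be located with a short pattern scan. This is a minimal sketch, not the method used for the counts on this page. It assumes the commonly used sequon rule N-x-S/T with x not proline; the proline exclusion is an added assumption, since the text says only NxT/S. The function name is hypothetical.

```python
import re

# Zero-width lookahead so that overlapping candidate sites are all reported.
# Assumed rule: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(seq):
    """Return 0-based positions of the Asn in each potential N-x-S/T site."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


# KIAA0256 exon 7 rows copied from the alignment above.
exon7_human = "GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ"
exon7_mouse = "GGVNWPKVTCQATQKRPWMEKNQAFSRGGRQTEQRNNLQ"

print("human:", sequon_positions(exon7_human))
print("mouse:", sequon_positions(exon7_mouse))
```

On these two rows the scan finds four candidate sites in human and none in mouse, which illustrates how volatile such sites are even among mammals.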

   Potential for phosphoserine conservation in exon 5 of KIAA0256:
DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSGTMN homSap
.FPSDIANKSLSESTATMLWKAKGRRRRASHPAVESSSEQGASEADIDSDSGYCSPKH-NNQSAPGALRDPASGTMN musMus
DFPSDIANKSLSESSATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSSTMN canFam
DFPSDIANKSLSESSSTMLWKSKGRRRRSSHPTAESSSEQGASEADIDSDSGYCSPKHSNNQATAMTSRNTDSGSIN monDom
DFPLDIANKSLSESAATVLWKSKGRRRRASHPAAESSSEQGASEADIDSDSGYCSPKHGNNQAAGPAARSADSGPAN ornAna G insertion
DFPSDIANKSLSESASTMLWKSKGRRRRASHPAAESSSEQGASEADIDSDSGYCSPKHGNNQAAAVTSRNADSCAMN galGal
DFPSEIASKSLSESMSTMHWKPKTRRRRSSHP-AESSSEQGASEADIDSDSGYCSPKHS-NQAAAVTSRSVESAAGN anoCar
DFPNEIANKTICESVGATPWKSKVRRRRLSHPAAESSSEQGASEADIDSDSGYCSPKHC--QAAAMCTRHADCGAV. xenTro
DFPGEASGGVRCVSDQVSPQQWKNKPRRRRTSQQESSSEQGASEADIDSDSGYCSPKH--NQGAA............ danRer
DFPGEVSGRCAAERASPQLWKNKTKRRRASHP-AENYSEQGASEADIDSDSGYCSPKH--NQAAGVTQR........ gasAcu
DFPGEAAVRCVSDQASPQLWSNKARRRRTSQ--QESSSEQGVSEADIDSDSGYCSPKHSTNQPAAAV----DAGVM  pimPro SGSG NQGANNT HT insertions
DFPDDIADKSLRDKPSPLLRKSKARRLASRRPQDPSSTDSEEDEGGIDSDSGYSSPKHGRNQSA..............braFlo
DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNASSSSDSEVENTPHDSDSGYYSPLHAQQ................ strPur QTGRD insertion

Reference sets of metazoan KIAA0256 and SBP2 sequences

It is very difficult to extract accurate full length genes from phylogenetically representative organisms in the case of KIAA0256. That's because the gene is twice average size (thus seldom tiled completely by transcripts), has two very short exons that do not emerge consistently from alignment methods, several consecutive poorly conserved exons rife with indels, and a run-on indeterminate carboxy terminus. Nearly every pipeline entry in GenBank non-redundant contains gross errors including gratuitous long internal repeats and severely truncated genes.

It will prove imperative to initiate massive cDNA programs in non-teleost species for this gene (and many other anomalous genes) for which homological modelling will never work. Tiled coverage will be necessary, not merely end-sequencing.

Why would a gene seemingly essential to making numerous essential selenoproteins evolve so erratically? The ribosome and SECIS elements with which it interacts are exceedingly conserved and its role must have been stable for over half a billion years. KIAA0256 has a dumbbell conservation structure, possibly suggesting a fusion of two proteins. Only the amino terminal region of the upstream partner was conserved, along with the SECIS and L7Ae motif of the downstream partner, with little selection on the run-on carboxy terminal tail (not uncommon in proteins).

Full length metazoan KIAA0256 sequences

>KIAA0256_homSap Homo sapiens (human) length=1101
0 MDRAPTEQ 0
0 NVKLSAEVEPFIPQKKSPDTFMIPMALPNDNGSVSGVEPTPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDIRWQQPNPNPTGPYFAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPTPYSNTFQAANTVNAITTECTERPSQLGQVFPLSSHRSRNSNRGSVVPK 0
0 QQLLQQHIKSKRPLVKNVATQKETNAAGPDSRSKIVLLVDASQQT 1
2 DFPSDIANKSLSETTATMLWKSKGRRRRASHPTAESSSEQGASEADIDSDSGYCSPKHSNNQPAAGALRNPDSGTMN 0
0 HVESSMCA 1
2 GGVNWSNVTCQATQKKPWMEKNQTFSRGGRQTEQRNNSQ 0
0 VGFRCRGHSTSSERRQNLQKRPDNKHLSSSQSHRSDPNSESLYFE 0 
0 DEDGFQELNENGNAKDENIQQKLSSKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKNKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
2 VVTAASFHTKDSTNRKPLTKSQPCLTSFNSVDIASSKAKKGKEKEIAKLKRPTALKK 0
0 VILKEREEKKGRLTVDHNLLGSEEPTEMHLDFIDDLPQEIVSQE 1
2 DTGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSTITKIHSKRFRE 2
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLEASENEKEVSCKHSTSEKPSKLPFDTPPIGKQPSLVATGSTTSATSAGKSTASDKEEVKPDDLEWASQQSTETGSLDGSCRDLLNSSITSTTSTLVP
GMLEEEEDEDEEEEEDYTHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGSEEDNVEQSGEEEAEAPEVLEPGMDSEAWTADQQASPGQQKSSNCSSLNKEHSDSNYTTQTT* 0

>KIAA0256_monDom Monodelphis domestica XM_001380435=flawed
0 MDRAAADQ 0
0 NVKLSAEVEPFVPQKKTPDTLMIPMALPGDSGSVSGVEPTPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDLRWQQPNPNPPGPYLAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPTPYSSTFPAANTLNTIPTECTDRPNQLGQVFPLSSHRSRSSNRGPIVQK 0
0 QQLLQQHVKTKRPPVKSVATQKETSAAGPDNRSKIVLLVDASQQT 1
2 DFPSDIANKSLSESSSTMLWKSKGRRRRSSHPTAESSSEQGASEADIDSDSGYCSPKHSNNQATAMTSRNTDSGSIN 0
0 LMEPSICS 1
2 GGASWSNVTSQATQKKPWMEKSQPFSRGGRQTEQRNNSQ 0
0 VGFRCRGHSTSSERRQSLQKRQDNKPLGNHSHRVETSSDPLYFE 0
0 DEDEFTELNETGSAKDENIQQKISAKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKIQ 0
0 EALSKAAGKKSKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLTYT 1
2 VVSAVPLQSKDSANRKSLTKSQPCLAPLNPLDTTSPKIKRGKEKEIAKLKRPTALKK 0
0 VILKEREEKKGRFTVDHSLLGSEEPIEMPLDFIDDLPQEIASQE 1
2 DTGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSAITKIHSKRFRE 2
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVKAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYCGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLETSENERDTSYKVISPETSNKVPNDKVLVNKQLPSVITGGTASTTNPGKCTVSDKEEVKPDDLEWASQQSTETGSLDGSCRDILNSSITSTTSTLVPGMLEE
EEDEDDDEDEDYPHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLNGAEEDNTEQSGEEEIEVPEQTDPVNDSEEWTADKQISNVQEKPNSCNSLNKEHSDSITT* 0

>KIAA0256_galGal Gallus gallus XM_413816=flawed 
0 MDKADK 0
0 NVKLSAEVEPFIPQKKGPETLMIPMALPNDSGGINGVEPTPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDIRWQQPNPNPAGPYLAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPPPYSAPFQTANAVNTVTTECTERPNPPGQVFPLSTQRSRSSNRGPIIPK 0
0 QQQLQMHIKNKRPPVKNVATQKETSSSGPENRSKIVLLVDASQQT 1
2 DFPSDIANKSLSESASTMLWKSKGRRRRASHPAAESSSEQGASEADIDSDSGYCSPKHGNNQAAAVTSRNADSCAMN 0
0 VVEPSINA 1
2 TGVSWTNVNSQATQKKPWIEKTQTFIRGGRQAEQRNSSQ 0
0 SGFRCRGHSTSSERRQNLQKRHEKPLTTSQSSRAEQSPEPLYFE 0
0 DEDEFPELNSDNGNSKSSNIQQKISPKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKSKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
2 VGSAAPFHTKESANRKSLTKGQPSMGCLNPLDSTAPKVKRGKEREISKLKRPTALKK 0
0 IILKEREEKKGRLSVDHSLLGSDEQKQVHISLPTDQSQELASQE 1
2 ETGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSAITKIHSKRFRE 2
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPMRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 DLFNKLVSLTEEARKAYRDMVAAMEQEQAEEALKNVKKAPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLETSENERESSSQTAVPEKAANGQIAKSTLHKQPPLAATSTTSATNHGKATPGEKEEVKPDDNLEWASQQSTETGSLDGSCRDILNSSMISTTSTLVP
GMLEEEDEEDEEDDEDYAHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLSGAEEDNAEQSEEEEIETSEQVDPAVDSEEWTNDKHASNIQHKPTICGSLNKEHTDSIYMP* 0

>KIAA0256_taeGut zebrafinch
0 MDKSNKI 0
0 NVKLSAEVEPFIPQKKGPETLMIPMALPNDSGGINGMEPAPIPSYLITCYPFVQENQSNR 2
1 QFPLYNNDIRWQQPSPNPAGPYLAYPIISAQPPVSTEYTYYQLMPAPCAQVMGFYHPFPTPYPAPFQTANAVNTVTTECTERPSPSGQVFPLSTQRSRSSNRGPVIQK 0
0 QQQLQMHIKSKRPPVKNVATQKETSSSGPENRSKIVLLVDASQQT 1
2 DFPSDIANKSLSESTSTMLWKSKGRRRRTSHPAAESSSEQGASEADIDSDSGYCSPKHGNNQAAAMASRNTDSCAMN 0
0 VVEPSINA 1
2 TGIGWTNVNSQATQKKPWIEKTLTFSRGGRQAEQRNNPQ 0
0 SGFRCRDHSTSSERMQSLQKREKPLAMSQASRTEQSPEPLYFE 0
0 DEDEFPELNDNGSSKSSSIQQKISPKV 0
0 LDDLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKSKTPVQLDLGDMLAALEKQQQAMKARQITNTRPLSYT 1
2 VGSAAPFHTKESASRKSITKGQPSMGCLNPLDSTAPKVKRGKEREIAKLKRPTALKK 0
0 IILKEREEKKGRLSADHSLLGSDEQKEAHLNLTADQSQELASQE 1
2 ETGLSMPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMASSAITKIHSKRFRE 2
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPTRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 DLFNKLVSLTEEARKAYRDMVAAMEQEQAEEALKNVKKTPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRNMVETSDGLETSENERESVCKAAVPEKAGNGQMEKTTLNKQQLATTGTTSATNHGKSTPGDKDEVKPDDLEWASQQSTETGSLDGSCRDLLNSSMTSTTSTLVP
GMLEEEEEEEDDDDEDYAHEPISVEVQLNSRIESWVSETQRTMETLQLGKTLSGAEEDNAEQSEEEEMETSEQADPITDGEEWTNDKHASSTQHKPTICSSLNKEHTDSIYMP* 0

>KIAA0256_xenTro Xenopus tropicalis BC167330
0 MEMNEQ 0
0 NGKLSAEVEPFVPQKKGAEALAIPMALPSDGGSVGGLEPTPIPSYLITCYPFVQENQSNR 2
1 QFPSYNNDIRWQQSNSSPAGPYLAYPIISTQPPVSQDYMYYQLMPAPCAQVMGFYHPFPTPYTTPLQATNAVSVDCSERASQQSQINALTSQRNRNTRAPLIHK 0
0 PQPALPQPRCKRPPMKSVAIQKETCASSPETRSKIVLLVDACQQT 1
2 DFPNEIANKTICESVGATPWKSKVRRRRLSHPAAESSSEQGASEADIDSDSGYCSPKHCQAAAMCTRHADCGAVS 0
0 ISDPAVPA 1
2 AGGSWASVASQATQKRPWNEKGQTFSRGGRQTEIRNNAQ 0
0 LGYRLRGQSTSSERRHNLQRKQDNKTGTPASSNKSGQSPDHLYFE 0
0 DEDAFPELNSSNGARNDNAQTKIPTKV 0
0 LNGLPENSPINIVQTPIPITTSVPKRAKSQKKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKASGKKSKTPVQLDLGDMLAELERQQQAMKARQITNTRPLSYT 1
2 VGSAVPFHIKEHTNRNVFTKAQAVMGSPNPLDSTAPRVKRGKEKEVPKLKRPTALKK 0
0 IILKEREEKKGRLPVDPSVLGSEEQKDALSFADDQSEELASQE 1
2 EAGLSAPSDTSLSPASQNSPYCMTPVSQGSPASSGIGSPMATSTLTKIHSKRFRE 2
1 YCNQVLSKEIDECVTVLLQELVSFQERVYQKDPVKAKSKRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFSYSGAE 0
0 SLFHNLVSLTEEARKAYKDMVSSMEQEQAEEALKNIKKVPHMGHSRNPSAASAISFCSVISEPISEVNEKDY 1
2 ETNWRNMVETSDGLETSENEECSVTTTGSEQAASAPLVRNNTQKQEPKTASSTTSSATLEKPTPADKEEVKQDDNLEWASQQSTETGSWDGSGRDVLNSSMTSTASTLVP
EMLEEDDDEEEDDDEYPQEPISVSRIESWVSETQRTMESLQLVNSNSPEEDNIEHSEEDEVGQCEQSEAADCKERTAEMHVRNGSHTQTGRKSSLKEKVNSTFM* 0

>KIAA0256_gasAcu Gasterosteus aculeatus (stickleback) 
0 MDAGDIK 0
0 DVKLSAEVEPFIPQKKGMEGSQVSMSLSGEAGGGGSGGGSGGVETTPIPSYLITCYPFVQENQPNRY 2
1 QHPMYNGGELRWWQQPNPSPGGSYLAYPILSSPQPPVSNDYAYYQIMPAPCPPVMGFYQPFPGPYAGPVQAGVVNPVSAEVGERPLPLGPAYGMNSQRGRGMVRPNVPPN 0
0 QLGVCQPLRGRRPPTRSVAVQKEVCTLGPDGRTKTVMLVDAAQQT 1
2 DFPGEVSGRCAAERASPQLWKNKTKRRRASHPAENYSEQGASEADIDSDSGYCSPKHNQAAGVTQRSAENTAAPTV 0
0 AVETGVMT 1
2 AGTWVNVASQATQSWGDRNGHFHRADQRKNSEQRNFSQ 0
0 EFHTGYAGRGPPGLSHQRPQPAVVSGTQVSPHPLYFE 0
0 DEDEFPDLASGGAAQRCTKAESTSAQTHAQPKLPKNL 0
0 LDNLPENSPINIVQTPIPITTSVPKRAKSQRKKAMAAALATAQEYSEISMEQKKLQ 0
0 EAFTKAAGKKSKTSVELDLGDMLAALEKHQQAMKARQLNNTKPLSFT 1
2 VGTTAPFHGSGLVSLPSALKGHQQPYSVPHNSLDSTAPRIKRGKEREIPKVKRPTALKK 0
0 IILKEREGKKGKTSVEQESSGQEEHADESLHFTDDLAREPASQE 1
2 ETGLSMPSDASLSPASQNSPYSITPVSQGSPASSGIGSPMASNAITKIHSRRFRE 2
1 YCNQVLSKEIDESVTMLLQELVRFQERIYQKDPTKAKTKRRLVMGLREVTKHMKLNKIKCVLISPNCEKIQAK 1
2 GGLDEALYNVIAMARDQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 GLFNRLVSLTEEARKAYKDMVSALEQEQAEEAQKNDKKLPHHMGHSRNHSAASAISFCSIFSEPISEVNEKEY 1
2 ETNWRSMVENSDALEPVESEPRRPAPPTSTPKVGEAAAATPPATSASTATPSSTAPQTARTAPPTLTQGNGERDEVRVDDRLELASQQSTETGSLDGSCRGPLNSSITSTTSTLVPGMLA
EEEEEEDYTPEPIAVEVPTLSSRIEYWVSKTLENLQLGKSQESTEEEDEDEEEEEEEERGHSEEEEDLDSADIAETRSEDKDQVEVKKVQG* 0

>KIAA0256_pimPro Pimephales promelas tiled cDNAs
0 MDAGERK 0
0 DVKLSAEVEPFIPQKKGVEASLLPMSLCGEGGAEPTQIPSYLITCYPFVQENQSNSR 2
1 QLPMYNGGDQRWQQLNPSPGGPYLAYPILSSPQPPVTSDYATYYHAIMPTPCPPVMGFYQPFPGPFAGPVPAGVLNPVSDCSDRPTPQRGRGVPRTPVLH 0
0 KQPMAQPMRAKRPVMRSVAVQKEVCATGPDGRTKTVLLVDAAQQT 1
2 DFPGEASGSGAVRCVSDQASPQLWSNKARRRRTSQQESSSEQGVSEADIDSDSGYCSPKHNQGANNTSTNQHTPA 0
0 AAVDAGVM 1
2 TAVSWGNVSSQAVQKPWPDRNTPFFRGSRTPERSYTQDF 0
0 QMSFGCRAAGPRRSTPPETPNTHLTPEPLYFQ 0
0 DEDEFPDLATGGAAQRNKPDPVQPKLPKT 0
0 LLDNLPENSPISIVQTPIPITSSVPKRAKSQRKKALAAALATAQEYSEISMEQKKLQ 0
0 EALSKAAGKKSRTPVQLDLGDMLAALEKQQQAMRARQLNNTKPLSYT 1
2 VGTVSSLHSKDCGSRVTGLKNTHTPPHNILDSSAPRIKRGKEREIPKVKKTTAMKK 0
0 IILQEREVKKGKSSADQGVSGADEQRDSLSFTDTLTQEQDENG 1
2 LSMPSDASLSPASQNSPYSITPVSQGSPASSGIGSPMAASAITKIHSRRFRE 2
1 YCNQVLSKDIDESVTLLLQELVRFQERVYQNEPSKAKAKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 ALFNTLVSLTEEARRAYKEMVSALEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1
2 ETNWRTMVENADAPEPPDSEPISRGNNRDQREVVSPPPQPTANQSLTPSPGVARAPDESRTDDRLEWASLSTETGSLDGSGRDRLNSSHHSTTSTLVPGMLEEE 0

>KIAA0256_calMil elephantfish fragments
QDIQLSAEVEPFIPQKKGTETLVPMALPNDGNGSGVEAPPIPSYLITCYPFVQ
ENQANRPVYNGDIRWQQANPNSPGPYLAYPILPTPQPPVSTDYAYYQLMPAPCTPMMGFY
SPFPTPYTGTLPPASVVNAVSECSERP
NPLDSTAPRVKRGKEKEIPKAKRPTALKKVKSFER
YCNQVLSKDIDECVTLLLQELVRFQERVYQKDPIKAKMKRRLVMGLREVTKHMKLRKIKCVIISPNCEKIQSKG
GGLDDALHNIISIACEQEIPFVFALNRKALGQCVNKPVPVSVLGIFSYDGAE
FHQMVEITEEARKAYQEMLDALQQELEADEEKGDSEEQPLISSESSTIHFNNVTSQPFSEADEPEYGT
DKEEGKTDDILEWASQQSTETGSLDGSCRDVLNSSMTSTTSTLVPDMLEEEEEEEEDDDDEDDEEEDYVHEPVSIAGTFSSRVDDWVSEAQKTLETLQLSKNIDSTEEDCDEQSDTEELDTVEQIDLTAESED

>KIAA0256_petMar mediocre fragments
 LHKLRALIISPNCEKIQAKG
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASA
QVSCKHSRLPSALARTTPIPHPPQQLNTPPPPARRPQCPRELHYSLHSLALSLTAQPTSPHGRPPGKVPTVERCRRQRARVV

>KIAA0256_braFlo Branchiostoma floridae fragment with low support 1116 aa
0 0
0 VSQLSAEVEPFVPSALPLPTSDPSGGTQPHVLPRYVTSCYPFVQPPEVTP 2
1 EGYVQEVRWPSSVPNPQYNPYPPLSPQPHLPHYYPPHNTPPPPGPFLPHPSPYPPPLYAGYPPPPHLYPPPYGTRSPTQQ 0
0 TRRRSGSRSVQTKSIAVQKEASSNSPVHNRHRTIILLDASQQT 1
2 DFPDDIADKSLRDKPSPLLRKSKARRLASRRPQDPSSTDSEEDEGGIDSDSGYSSPKHGRNQSANSSTSEAATGTCI 0
0 1
2 0
0 0 
0 0
0 LPGSQEPLNPATVVSTPVEVKKEGKNARKKRKKALLAAKAAAEEYSEITQVISENQ 0
0 EIQKKASGKKSKQPMQLDLGDMLAALEKRQQELKLKTAAGPAKTAAVSTGTVPVQ 1
2 DNKQWSGKKEASNVSMPHNPLDSHAPAVKRGKERETPHKKKPSALKK 0
0 VILKEREDKKMQKLMEDQAHSDAETGEVPSSTACYIPFSMEDSDGGTSQE 1
2 GSELSPLSQAMSPINFSPLSSASPLSSGTGSPLCAPSPIGPKIHSRRFRE YCNQVLDKEIDATVTMLLQDLVRFQDRQYHK 0
0 DPIKAKAKRRIVMGLREVTKHLKLRKLKCIIIAPNLEKIQSK 1
2 GGLDDAIETILNLCMEQDVPFVFALGRKALGRAVNKLVPVSVVGVFNYDGAE 0
0 EHFKTMVELTTQARNAYIDMVTIYRQEWEQMQ 0
0 AMRNSGQPIYPAHLGHSRNPSAASAVSFSSVLSETISECHPE HDGDIEGPKIKVEAAKVTEESKLKGQEECTEQGAIKAEPTKESLNSETENNLKENSSESNSDRADVESESSEGPESVSRHSEIVEFPPAYDDVLTSSAAT
TVVNGAESEVTDVVEEEGDVLNTSCSSSKLRVLDTGRIESWVVEASQCVEKLDLDPQQHAEEKIDPDQKKAESKVTSEQDSSKDLRADSKPDVDTRPGRNVSPTKQMDSSPAEEQNTANCDLSNLKQNVQLQSEGGEVSAEKK
TIASKDEDSTAGQTGSLSEPADDVGKFNGTVSTEINDR* 0

>KIAA0256_cioInt Ciona intestinalis XM_002123197 FK199357 BW542841 FF776957 FF925374 BW008530 1128 aa
0 MFSPGSD 0
0 RTNLRAEVPPFVPRREWPEGSMEQHNGGPLPRYVTTCYPFVQDNQD 0
1 HPAQIGINMNQRMANQNMRNSYSSVNYLSNNPNPATQLTNANTQMGMVQQNFSAQLFSR 21 GNLADTAHITAVYGDSFQCQYPPNQTSMIVQKTSSLSR 0
0 SSSRGSGKKILKRNVGTQKEVSRRSPVSPEMVDSCQQT 1
2 DFPMSVACKSLTDHPSSLRRATKSRRRRETCSSQSGNCDSSSDHADADVDSDSGYYS 21 PKHRLGHKRNGGTSTNGLWSRNE 0
0 KDPVPQVIYITPTNVAPPVSLFQVHSSAHNHMFPNSLPPQVTPPPNSLLGYGHPGPPIILSPPPNQIL 12 ANRPTPPFPIHPSMIGNRNPNQ 0
0 CPSNNNWNINQLALPPGYWPNNSTHPPQHRHQTRNPSLDFRHQRNLKKFEFYGEQPYVSAAGSQFLQGHFDKRKHDKRKTVDEVPSRERSPVVMQTHEQPNTNNLHFHGDNSMIMLS 0
0 DTQEFPGLDGNFFSTSSSPSINAFSYSAAVMGKIPRPIAP 0
0 INSSAPYPSSAANLNEKSQAQKTKKRRKKAERAARAADEEYAEISKEQENIQ 0
0 KVLKKTASSRNKNKNQVLDLGEFLTSKFEEKKLLSDSPTKNTEVAKSWEEGHLVAKPPLDLNMK 2
1 IKPTGPPANALDSTAPLIKKGKEREVPKPKKPSALKK 0
0 VILKEREEKKEHHLKLKEREEKKEHHLKQLTTMLSPETDVPPYPPALYKIP 1
2 VPSDDEKSLGHDTNTEVSVSIPPTVPQIHSRRYRE 2
1 YCCQVLDKRVDEMSNQMLQRLVYFQDR 21 LYKTDPAKAKRKRRVVLGFREVTKHLKMKKLRCVIISPNLEKIESK 1
2 GGLDDVLHEILDLCKEQNIPYVFALGKKALGRAVSKTVPVSIVGVFDYSGAEAQ 0
0 FKQLVELVKEAQLQYKDMVQIYQKQVAEANKP 00 VQSAGPSKRYAYMTHSRNASATSHLSVTSIISEPISEMNE 1
2 GSNWRVIMDAGEDGLSPPPEDDVSEEEEVEESPGKEKPAPPVPVKGGEELSRKDSGSTVVECPHPELQPDDNFVSELPGEEESSSVDAED
AEEMNLNHRRALDKSFSTCSTLKPEGGVSPRISTTSESSSLIPDDVSSQSSAQDRIQLWLEDATRSVVDLDLNDVVPDAEDVNSESKLVTPDVNESK* 0

>KIAA0256_strPur Strongylocentrotus purpuratus XM_001188118 = bad internal dup 1430 aa
0 MTAMYYNAPSHQHQQQQHHHAPQPLHPHQHQQHHHQQTIPGMVPQPSPSQVVSGMLSEATAAMPGLKPPPPSQPQGGGGGGGGMQQYQTSSASAVATMNGKKVPLTELPRYITTCYPFVQDS 2
1 STGAAPATETWMGYPNSSQQPNQPHPQPQQHQHPPLPLPPTSQHPLSHQQPPQTTPMYAPPPPPPPGHQPPSAHLTQQQNQEYFPVHPGYN 0
0 QVPHQTPPPAAASPGGPLYQQGAYQQHGGTYQPHLTGTAPHHPTHHHHTQSPTPMPLASQSSMPA 1
2 GGVPVSHTPFAPPPMMTPPSQSPSPYPFVPPPPHGAATPGGYDAALPGTQPTLPSYGQYGYGAYPGPQVK 0
0 VRGQRPMNKDHRYPGGYQNKGREHYQAYVPPPTDLPKPKTKTVVFAEACAQT 1
2 DFPEAIANKPLSDKTSNLTSRSKAKTRKKSQGNQTGRDASSSSDSEVENTPHDSDSGYYSPLHAQQHNSTGLVSTYSTQTGKPTYSNVAMNNKSSPHQESR
TVEQNTFTQNQPLVVPQGPPLGPQLGPAPVIQRGRFTPVQPGIPSFRPVMPMSYANMLTKPRAANPPPPPLANVGYPQRPPNVFPTQPPPTYRNMAVSPAPMLYQQQQQQQQRRMQSPVPAPQ 0
0 KPPVTPEDTPRKRKQKRTKGKKDGEVELEKPKMVNAATYAKPPQIQDKEEYPGLPLGSPAGNKFGMSTGGRPISYSSALQQRAPVQL 0
0 VNESSSEEEEEESGGDPSSIIKPEELLSPANVMSTIKEGKNARKRRKKAIMATQAAAK 0
0 EYSEITEEQRQLHENMKKQGKRTKMPIEFDLGDMLAALE 0
0 KQQQEIRAKQQQQQQLIQRGPVAPSRNVQFAPNVATMDPYSQSRPVKDVPR 0
0 GHNPLDMTAPVKRGKERELPAKKKPSALKRVILKEREEKKRLRTLEESRLSDD 2
1 DPVVSQGLSQGLSQGLSQGFSQGFSYGFSHGFSQGLSQD 1
2 APSDRGSFPGFNASQSDLSPLSQMSPLSMSPLSPGSPLSSGLSSPATGMGRSNPTQVATKIHSRRFRE 2
1 YCNQVLDKDIDGCCTTLLQTLVKFQDRQYHKDPAK 0
0 AKMKRRLVMGLREVTKHLKLKKIKCVVVSPNLERIQSK 1
2 GGLDEAMDRISSLASEQNVPLIFALGRKALGRAVNKVVPVSVVGIFNYDGAE 0
0 DTYKQLLDLSTRARNAYADMVRKFQQELEAANAASAARMAKHRHHMGHNRNLSGCSAISFSSVISEPISENYPNPEPEVDSQGREIEPDPPTTPTYSPQ
GGGCSSDAGQQHPSAPMRSLSFTGTGSVISNSTDDTIHKEEKDGGGSSVGKDYVMSETSSRTLTAGEGDQDLEEGSKEDVGRVELEELEAGLVDQDHDEE
EEDEEEEEDEDEDAEVIKANILLPEDGAPEKRVADWVAEAQQCIESLTVDDESGDDGGDAKKKGVGEKKDEKPSDANISPEQVGKMLTSLEV* 0

>Mytilus californianus ES395733 GE754305 to KIAA0256_homSap  Identities = 156/269 (57%) 570 aa
          GSSAVGISYSAILQTVPVS 0
0 RPSTVERNKTSSSEDSPRKDNSSLEDKGTRASRRRRKRKDILNTAAEN 0
0 ELAEIGLEQQMLKEQCLKTQGQKSHKDEKGQTPGILKVNP 0
0 KAQNSGKKSKQNVSLDLGAVIDALEQKKTISLTSGARTEQKVKAEQPKNKEEQKSK 0
0 GSHNVLDASAPIKRGKERETPKAKKPSPLKKVILKEREEKKLLKMLEGT 2
1 ESGSTEAAVGIGVVSAESDLSQD 1
2 AMSTKSSIDYTGTPGSANLSPVSQTSPISMSPLSPGTSPLSSEVNSPIAGAVGKDVVKKIHSRRFRE 2
1 YCNQVLDKDIDECATTLLQDLVRFQDRMYHKDPSK 0
0 AKLKRRLVLGLREVAKHLKLRKIKCDIISPNLEKIQSK 1
2 GGLDDALNNILTLCNEQNVPFVFALGRRALGRACAKMVPVSVVGIFNYSGSE 0
0 ENFKQLIDLTAKARESYGEMVAAIEIEIKEYPMKKQQPTIPHVFAHMGHSRTPSGASVLSFTSSILSEPISENYPHSEPETDSKGYEIVKDDALIKQGLPTDSSGYQTQMRI
IHSNTKDDDGNEADNEEEGDRINRDYYRT*

>Nematostella vectensis NZ_ABAV01022736 fragments
1 QGSTEQENQSVKKKKKRKKKKKPTETEGES 1
0 VFHNMLDSTAPVIKRGKEREVPKKKKPSALKR 0
0 IILKEREEKKKERENAEHEKTDDGDAS 1 
1 YCDQVLDKELNTVTLKLLSELVRFQDRVYFKDPEK 0
0 AKAKRRYVVGLREVTKHLKLKKIKCVILSPNIEQIKSA 1
2 GGLDDALHNIISLAHTNRIPVVFSLRRQILGRAVCKKVPVSAVGIFNYDGAQ 0
0 DLFKNLMELTENGRKVYAERWNAAQEALREELDNEHPVISCNTEQGGP 1

SBP2 L7Ae motifs from 27 vertebrates

>SECISBP2_homSap Homo sapiens (human)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTVAARQAYKTMLENVQQELVGEPRPQAPPSLPTQGPSCPAEDGPPALKEKEEPHY 1

>SECISBP2_panTro Pan troglodytes (chimp)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLENVQQELVGEPRPQAPPSLPTQGPSCPAEDGPPALTEKEEPHY 1

>SECISBP2_macMul Macaca mulatta (rhesus)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLENVQQELAGEPRPQAPPSPPTQGPSCPAEDGPPALTEKEEPHY 1

>SECISBP2_otoGar Otolemur garnettii (bushbaby)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKERRLVLGLREVLKHLKLKKLICVISPNCERQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLENVQRELAGEPGPQVPSSLPMEGPSCSVEDSPPAPTEKEEPHY 1

>SECISBP2_tupBel Tupaia belangeri (treeShrew)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRVVLGLREVLKHLKLKKLKCVIISPIZEKIQSK 1
2 GGLDDTLHTIIAYACAQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMEARQAYRSMLESARQELAGEPGLQAPPQPPVQGPRASSEGSAPAPTGRQEPHC 1

>SECISBP2_musMus Mus musculus (mouse)
1 YCSQMLSKEVDACVTGLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLRKLKCIIISPNCEKTQSK 1
2 GGLDDTLHTIIDCACEQNIPFVFALNRKALGRSLNKAVPVSIVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLETMRQEQAGEPGPQSPPSPPMQDPIPSTEEGTLPSTGEEPHY 1

>SECISBP2_ratNor Rattus norvegicus (rat) exons 14-16
1 YCSQMLSKEVDACVTGLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLRKLKCIIISPNCEKTQSK 1
2 GGLDDTLHTIIDCACEQNIPFVFALNRKALGRSLNKAVPVSIVGIFSYDGAQDQ 0
0 FHKMVELTMAARQAYKTMLETMRQEQAGEPGPQTPPSPPMQDPIQSTDEGTLASTGEEPHY 1

>SECISBP2_cavPor Cavia porcellus (guineaPig)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCIIISP 1
2 GLDDTLHTIIDYACAQNIPFVFALNRKALGRSLNKTVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLENVRQELAGEPRPQMPPDPPSEGPSSSLEDTAPDPSAEEPHY 1

>SECISBP2_oryCun Oryctolagus cuniculus (rabbit)
1 YCSQMLSKEVDACVTDLFKELVRFHDLMYQDPVKATTKCQFELRVGKALDHLRLKKLKCIIVFPKHKKQS 1
2 TIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKTMLENMRHELAGEPGPPTPQPVQGPSCSAEDGPPAPTEGEVPHY 1

>SECISBP2_canFam Canis familiaris (dog)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLRKLKCIIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHRMVELTMAARQAYKTMLENVRQELAGEPGTPALANPPMQGLGCSTQDSPPAPTEKEEPHY 1

>SECISBP2_felCat Felis catus (cat)
1 YCSQMLSKEVDACVTDLLRELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLRKLKCIIISPNCEKIQSK 1
2 GGLDDTLHTIIGYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHRMVELTMAARQAYKTMLENARQELAGEPGPPAPGSPPPQPPAPAGRDEPRY

>SECISBP2_equCab Equus caballus (horse)
1 YCSQILSKEVDACVTELLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLRKLKCIIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPCVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTKAARQAYKAMLENVHQELAGEPGPQAPASPPAQGPSCSTEGAPPAPTGKEEPHY 1

>SECISBP2_bosTau Bos taurus (cow)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKAKRRLVLGLREVLKHLKLRKLKCIIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACDQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYRTMLENARQELPGELGPCAPVGPPSQGPGCPVEDSPLAPTEKEEPHY 1

>SECISBP2_eriEur Erinaceus europaeus (hedgehog)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCIIISPNCEKIQSK 1
2 GGLDETLHTIIDCACEQNIPFVFALNRKALGRSLNKGVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKALLENMRQELAEESGSPAPSSPPVQSPSEDGPPAPAEKEEPHY 1

>SECISBP2_dasNov Dasypus novemcinctus (armadillo)
1 YCSQVLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGELDDTLHTIIDYAASRHSICVALNRKALGRSLNKAVPVSVVGIFSYDGAQ 0
0 DQFHKMVELTMAARQAYKAMLENVRKELAGEPGPRSPPSPPALGPHSSAGDVHPTSAGKEEPHY 1

>SECISBP2_loxAfr Loxodonta africana (elephant)
1 YCSQMLSKEVDACVTDLLKELVRFQDRMYQKDPVKAKTKRRLVLGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDDTLHTIIDYACEQNIPFVFALHRKALGRSLNKPVPVSVVGIFSYDRAQ 0
0 DQFHKMVELTMAARQEYKTMLESVRQELAEEPRAGSPPSPPTQGPGCSAEVPRPAPTEKEEPRY 1

>SECISBP2_monDom Monodelphis domestica (opossum)
1 YCSQMLSKEVDDCVMDLLKELVRFQDRMYQKDPVKAKTKRRLVMGLREVLKHLKLKKLKCVIISPNCEKSKSK 1
2 GGLDETLHTIIDYACEQNVPFVFALNRKALGRSVNKVVPVSVVGIFSYDGAQ 0
0 DQFHKMIALTMEARQAYKIMLSTLKEEPALETENPPSPSLPRPSESCPSELGQTDPTQEEEPNY 1

>SECISBP2_triVul Trichosurus vulpecula (possum)
1 YCSQMLSKEVDDCVMDLLKELVRFQDRMYQKDPVKAKTKRRLVMGLREVLKHLKLKKLKCVIISPNCEKSKSK 1
2 GGLDETLHTIIDYACEQNVPFVFALNRKALGRSVNKVVPVSVVGIFSYDGAQ 0
0 DQFRKMIELTMEARQAYKVMLATLKEGAEALQTENPLPTSLTPQGQGCSSELSKTTDPTKEEEPNY 1

>SECISBP2_galGal Gallus gallus (chicken)
1 YCSQVLSKEVDSCVTDLLKELVRFQDRLYQKDPVKAKIKRRLVMGLREVLKHLRLKKLKCVIISPNCEKIQSK 1
2 GGLDETLHNIIDCACEQNIPFVFALNRKALGRCVNKAVPVSVVGIFSYDGAQ 0
0 DHFHRMVQLTTEARKAYKDMVAALEEELKELSKPLNZKSCLSETGKTSSTKEDIPNY 1

>SECISBP2_anoCar Anolis carolinensis (lizard)
1 YCTQVLSKEVDSCVTDLLKELVRFQDRLYQKDPVKAKTKRRLVMGLREVLKHLKLKKLKCVIISPNCEKIQSK 1
2 GGLDETLHLIIDSACEQNIPFVFALNRKALGRCLNKAVPVSVVGIFSYDGAQ 0
0 DYFHKMVELTMEARQAYKDMISALERELKKKTVRKKPLQSRPLDTVEASSTEEDVPDY 1

>SECISBP2_xenTro Xenopus tropicalis (frog) NM_001097262
1 YCSQVLSKDVDNCVMELLKELVRFQDRLFLKEPAKAKSKRRLVMGLREVLKHLKLQKLKCIIISPNCEKIQSK 1
2 GGLDDTLQTIISHACEQNVPFVFALNRKALGRCLNKAVPVSVVGVFSYDGAQ 0
0 DHFHKLCELTVQARQAYKDMIAAAQEQQSETEAGKNEEDPVAVNGQNKSDDMREESKAEEPDEPNY 1

>SECISBP2_danRer Danio rerio (zebrafish)
1 YCNQVLSKDVDECVSNLLKELVRFQDRLYQKDPMKARMKRRLVMGLREVLKHLKLKKVKCVIISPNCERIQSK 1
2 GGLDEALHNIIDTCRDQSVPFVFALSRKALGRCVNKAVPVSLVGIFNYDGAQ 0
0 DFYHKMIELSSEARTAYEVMLLNLEQTDAEEAQQTSPLAEKVETSSGDPQPEEPEY 1

>SECISBP2_tetNig Tetraodon nigroviridis (pufferfish)
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPTKAKSKRRLVMGLREVTKHMKLQTIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 DFYHKMIELSSEARIAYEVMLSNLEQTSAEEEPQTCTLAEKINTSSEDAQPEEPEY 1

>SECISBP2_takRub Takifugu rubripes (fugu)
1 YCTQMLSKDVDECVTTLLKELVRFQDRLYQKDPIKARMKRRIVMGLREVQKHLKLRKLKCVIISPNCERIQSK 1
2 GGLDEALHTIIDTCREQAVPFVFALSRRALGRCVNKAVPVSLVGIFNYDGAQ 0
0 DFYHKMIELSSEARTAYEVMLLNLEQTDAEEAQQTSPLAEKVETSSGDPQPEEPEY 1

>SECISBP2_gasAcu Gasterosteus aculeatus (stickleback)
1 YCNQVLSKEIDESVTMLLQELVRFQERIYQKDPTKAKTKRRLVMGLREVTKHMKLNKIKCVLISPNCEKIQAK 1
2 GGLDEALYNVIAMARDQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 DFYHKMIELSSEARRAYEVMVSSLEQTGQADPESVEEKLQISSAAEEAELGRDITPPEEPEY 1

>SECISBP2_oryLap Oryzias latipes (medaka)
1 YCSQMLRKDVDECVTVLLKELVRFQDRLYHKDPIKARMKRRLVMGLREVLKHLKLRKVKCVIISPNCEQIQSK 1
2 GGLDEALHTIIQTCREQAVPFVFALSRKALGHCVNKAVPVSLVGIFNYDGAQ 0
0 DHYHKMIELSAEARKAYEVLVSSLERDQQEESHPDRGTCFGSVTAEPEKPHY 1

>SECISBP2_calMil Callorhinchus milii (elephantfish) AAVX01044988
1 YCSQVLSKDVDSCVTDLLKELVRFQDRLYQKDPIKAKKKRRIVMGLREVLKHLKLKRLKCIIISPNCEKIQSR 1
2 GGLDDALHNIISIACEQEIPFVFALNRKALGQCVNKPVPVSVLGIFSYDGAE 0
0 NQFHQMVEITEEARKAYQEMLDALQQELEADEEKGDSEEQPLISSESSTIHFNNVTSQPFSEADEPEY 1

KIAA0256 L7Ae motifs from 23 deuterostomes

>KIAA0256_homSap Homo sapiens (human)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_panTro Pan troglodytes (chimp)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_macMul Macaca mulatta (rhesus)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_tupBel Tupaia belangeri (treeShrew)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_musMus Mus musculus (mouse)
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKTVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_ratNor Rattus norvegicus (rat)
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNRLVELTEEARKAYKDMVAATEQEQAEEALRSVKAVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_canFam Canis familiaris (dog)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_equCab Equus caballus (horse)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVALTEEARRAYKDMVAALEQEQAEEASKNVKKGPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_dasNov Dasypus novemcinctus (armadillo)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSKG 1
2 GLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_monDom Monodelphis domestica (opossum)
1 YCNQVLCKEIDECVTLLLQELVSFQERIYQKDPVKAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYCGAE 0
0 SLFNKLVELTEEARKAYKDMVAAMEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_galGal Gallus gallus (chicken)
1 YCNQVLSKEIDECVTLLLQELVSFQERIYQKDPMRAKARRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 DLFNKLVSLTEEARKAYRDMVAAMEQEQAEEALKNVKKAPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_anoCar Anolis carolinensis (lizard)
1 YCNQVLSKEIDECVTLLLQELVSFQEQIYQKDPMRAKAKRRLVMGLREVTKHMKLSKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 NLFNKLVSLTEEARKAYRDMVAAMEQEQEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_xenTro Xenopus tropicalis (frog)
1 YCNQVLSKEIDECVTVLLQELVSFQERVYQKDPVKAKSKRRLVMGLREVTKHMKLNKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFSYSGAE 0
0 SLFHNLVSLTEEARKAYKDMVSSMEQEQAEEALKNIKKVHMGHSRNPSAASAISFCSVISEPISEVNEKDY 1

>KIAA0256_danRer Danio rerio (zebrafish)
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKEPSKAKAKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
2 GGLDEALHNIIDTCRDQSVPFVFALSRKALGRCVNKAVPVSLVGIFNYDGAQ 0
0 GLFNKLVSLTEEARRAYKEMVSALEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_tetNig Tetraodon nigroviridis (pufferfish)
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPTKAKSKRRLVMGLREVTKHMKLQTIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 SLFNQLVSLTEEARKAYKDMVSALEQEQTEEALKNEKKVPHQMGHYRNHSAASAVSFCSIFSEPISEVNEKEY 1

>KIAA0256_takRub Takifugu rubripes (fugu)
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPTKAKSKRRLVMGLREVTKHMKLQTIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNFSGAE 0
0 SLFNQLVSLTEEARKAYKDMVSALEQEQTEEALKNEKKVPHQMGHYRNHSAASAVSFCSIFSEPISEVNEKEY 1

>KIAA0256_gasAcu Gasterosteus aculeatus (stickleback)
1 YCNQVLSKEIDESVTMLLQELVRFQERIYQKDPTKAKTKRRLVMGLREVTKHMKLNKIKCVLISPNCEKIQAK 1
2 GGLDEALYNVIAMARDQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 GLFNRLVSLTEEARKAYKDMVSALEQEQAEEAQKNDKKLPHHMGHSRNHSAASAISFCSIFSEPISEVNEKEY 1

>KIAA0256_oryLap Oryzias latipes (medaka)
1 YCNQVLSKEIDESVTLLLQELVRFQERVYQKDPSKAKSKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 GLFNQLVSLTEEARKAYKEMVSALEQEQAEEALKHDKKVPHHMGHSRNHSAASAISFCSILSEPISEVNEKEY 1

>KIAA0256_pimPro Pimephales promelas (minnow) based on transcript tiling; exons by homology; 62% identity
1 YCNQVLSKDIDESVTLLLQELVRFQERVYQNEPSKAKAKRRLVMGLREVTKHMKLHKIKCVIISPNCEKIQAK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYSGAE 0
0 ALFNTLVSLTEEARRAYKEMVSALEQEQAEEALKNVKKVPHHMGHSRNPSAASAISFCSVISEPISEVNEKEY 1

>KIAA0256_calMil Callorhinchus milii (elephantfish) AAVX01105236
1 YCNQVLSKDIDECVTLLLQELVRFQERVYQKDPIKAKMKRRLVMGLREVTKHMKLRKIKCVIISPNCEKIQSK 1
2 GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE 0
0 1

>KIAA0256_petMar Petromyzon marinus (lamprey)
1              LHKLRALIISPNCEKIQAK 1
2 GGLDEALQTVIALASEQSVPFVFALNRKALGHCLNKKVPVSVVGVFHYGGAE 0
0 THFQRLVALTEEARSAYRNMVSSLQRQEAAATSEPTGHTEDPLEASASPPSVPAHDPTALLHLLRPQQGPREDDPAEASGRSPGRNA 1

>KIAA0256_braFlo Branchiostoma floridae (amphioxus)
YCNQVLDKEIDATVTMLLQDLVRFQDRQYHK 0
0 DPIKAKAKRRIVMGLREVTKHLKLRKLKCIIIAPNLEKIQSK 1
2 GGLDDAIETILNLCMEQDVPFVFALGRKALGRAVNKLVPVSVVGVFNYDGAE 0


>KIAA0256_cioInt Ciona intestinalis (tunicate)
1 YCCQVLDKRVDEMSNQMLQRLVYFQDR 21 RLYKTDPAKAKRKRRVVLGFREVTKHLKMKKLRCVIISPNLEKIESK 1
2 GGLDDVLHEILDLCKEQNIPYVFALGKKALGRAVSKTVPVSIVGVFDYSGAE 0
0 1

>KIAA0256_strPur Strongylocentrotus purpuratus (sea_urchin)
1 YCNQVLDKDIDGCCTTLLQTLVKFQDRQYHKDPAK 00 AKMKRRLVMGLREVTKHLKLKKIKCVVVSPNLERIQSK 1
2 GGLDEAMDRISSLASEQNVPLIFALGRKALGRAVNKVVPVSVVGIFNYDGAE 0
0 DTYKQLLDLSTRARNAYADMVRKFQQELEAANAASAARMAKHRHHMGHNRNLFKG 1
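
The SBP2 and KIAA0256 motif exons listed above can be compared position by position when two exons have the same length. The following is a minimal sketch, not the method used to build this page: the function name `ungapped_identity` is hypothetical, no gaps or substitution matrix are considered, and the two strings are the human middle motif exon (phase 2 to 0) copied from the SECISBP2_homSap and KIAA0256_homSap records.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Fraction of identical residues between two equal-length sequences.

    Assumption: the sequences are already in register (no gaps), which
    holds only for exons of identical length such as the pair below.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


# Middle L7Ae motif exon, human, copied from the records on this page.
SBP2_HOMSAP = "GGLDDTLHTIIDYACEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSYDGAQ"
KIAA0256_HOMSAP = "GGLDEALYNVIAMAREQEIPFVFALGRKALGRCVNKLVPVSVVGIFNYFGAE"

identity = ungapped_identity(SBP2_HOMSAP, KIAA0256_HOMSAP)
print(f"SBP2 vs KIAA0256, human, middle motif exon: {identity:.1%}")
```

Exons of unequal length (for example the C-terminal exons, which diverge strongly between the two paralogs) would need a real alignment tool rather than this positional comparison.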

Ribosomal L30 L7Ae motifs from 10 deuterostomes

>L30_homSap Homo sapiens (human) 4 exons numerous pseudogenes
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPEQTGEK* 0

>L30_tupBel Tupaia belangeri (treeShrew)
0 MVAAKKT 0
0 KKSLESINSQLQLAMKDGKYVLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRACTLAIMDP 1
2 GDSDIIRSMPEQTGEK* 0

>L30_ratNor Rattus norvegicus (rat) Sep15 Gpx4 Gpx1 Dio1 quite weak homology 35% with SBP2 exons
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPEQTGEK* 0

>L30_myoLuc Myotis lucifugus (microbat)
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYLLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 ISEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSD-IRSMPEQTGEK* 0 

>L30_echTel Echinops telfairi (tenrec)
0 MVAAKKT 0
0 KNSLESINSRLQLVMKSGKYMLGYKQMLKMIRQGKAKLVVLANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGHNIELGTACGKSCRVCTLAITDP 1
2 GDADIIRSMPEQTGEK* 0

>L30_anoCar Anolis carolinensis (lizard)
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQTLKMIQQGKAKLVILANNCPALG 2
1 KSEIEYYAMLAKTGVHHYSGNNIEMGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMQEQTAEK* 0

>L30_danRer Danio rerio (zebrafish) 94%
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQSQKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPDQQQGGEK* 0

>L30_squAca Squalus acanthias (spiny dogfish) 97%
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPEQISEK* 0

>L30_petMar Petromyzon marinus (lamprey) 94%
0 MSAKKT 0
0 KKAIESINSRLQLVMKSGKYCLGYRQTLKMIRQGKAKLVLLANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIEMGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPEQQQPQPGDK* 0

>L30_braFlo Branchiostoma floridae (amphioxus) 84% to homSap
0 MKQK 0
0 RKTMESINSRLQLVMKSGKYVLGLKETLKVLRQGKAKLIIIANNTPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYFRVCTLAITDP 1
2 GDSDIIRSMPAEDKGESK* 0
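
In the records above, each exon line carries a splice phase before and after the peptide, and in complete records the ending phase of one exon plus the starting phase of the next sums to 0 or 3 (0/0, 1/2, 2/1). The sketch below shows one way such records could be read and checked. It is an illustration under assumptions: the names `parse_exons` and `phase_breaks` are hypothetical, only simple `phase SEQUENCE phase` lines are handled, and lines with inline junction markers (as in the Ciona record) or missing phases are skipped.

```python
import re

# One exon per line: start phase, peptide (may contain '*' or '-'), end phase.
EXON_LINE = re.compile(r"^([012])\s+([A-Z*\-]+)\s+([012])$")


def parse_exons(lines):
    """Return (start_phase, peptide, end_phase) for each well-formed exon line."""
    exons = []
    for raw in lines:
        m = EXON_LINE.match(raw.strip())
        if m:
            exons.append((int(m.group(1)), m.group(2), int(m.group(3))))
    return exons


def phase_breaks(exons):
    """Indices i where exon i and exon i+1 do not join in a compatible phase."""
    return [
        i
        for i in range(len(exons) - 1)
        if (exons[i][2] + exons[i + 1][0]) % 3 != 0
    ]


# L30_homSap record copied from this page (header line is ignored by the parser).
record = """>L30_homSap Homo sapiens (human) 4 exons numerous pseudogenes
0 MVAAKKT 0
0 KKSLESINSRLQLVMKSGKYVLGYKQTLKMIRQGKAKLVILANNCPALR 2
1 KSEIEYYAMLAKTGVHHYSGNNIELGTACGKYYRVCTLAIIDP 1
2 GDSDIIRSMPEQTGEK* 0
""".splitlines()

exons = parse_exons(record)
protein = "".join(seq for _, seq, _ in exons)
print(len(exons), "exons;", "phase breaks at:", phase_breaks(exons))
```

A non-empty result from `phase_breaks` does not by itself prove an error: fragmentary records on this page (for example the Nematostella fragments) are expected to show breaks where exons are missing.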