Coding indels: PRNP: Difference between revisions

From genomewiki

Revision as of 16:06, 22 November 2008

The prion gene PRNP exhibits a 6 bp indel in its amino-terminal signal peptide that contributed historically to establishing the clade Euarchontoglires. From consideration of outgroups, the indel is a deletion (reducing signal peptide length from 31 to 29 residues) rather than an insertion. It occurs in all species of rodents, rabbits, treeshrews, flying lemurs and primates sequenced to date but not in any other species of mammal.

Remarkably, this indel distribution has held up even as the number of genera sequenced has come to exceed 100. The billions of years of branch length represented by this data suggest that the deletion was a very rare event not subject to independent recurrence (in effect homoplasy-free). Note that it does not occur in a compositionally simple region (strings of leucines are common in the interior). A typical mammalian gene, as of November 2008, can only be recovered from about 40 species, so similar rare genetic events cannot be evaluated as stringently as in PRNP.

Consequently this data set strongly conflicts with the never-ending computer proposals placing mouse basal relative to dog and human, i.e. (mouse,(dog,human)). That topology would require both a global revision of the well-established super-ordinal mammalian tree and, in PRNP, highly non-parsimonious multiple events, both bizarrely located basally at the two unrelated divergence stems (very dense phylogenetic sampling has the effect of squeezing the window on homoplasy).
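The parsimony argument can be illustrated with a minimal sketch of Fitch's small-parsimony count on a four-taxon tree. The character states are read off the alignment on this page (human and mouse carry the 2-residue gap; dog and opossum do not). The tree encodings, the function name `fitch` and the state labels "short"/"long" are assumptions made for this illustration only, and the four taxa stand in for their much larger clades.

```python
# Fitch small-parsimony count for a single character on a rooted binary tree.
# Trees are nested 2-tuples of leaf names; `states` maps leaf -> character state.
# Illustrative sketch only: four taxa stand in for whole clades.

def fitch(tree, states):
    """Return (candidate_state_set, minimum_number_of_changes) for `tree`."""
    if isinstance(tree, str):                 # leaf node
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_cost + right_cost
    # children disagree: one change is needed on this pair of branches
    return left_set | right_set, left_cost + right_cost + 1

# "short" = signal peptide with the 2-residue deletion, "long" = without it.
states = {"human": "short", "mouse": "short", "dog": "long", "opossum": "long"}

# Opossum (a marsupial) serves as the outgroup in both hypotheses.
euarchontoglires_tree = ("opossum", ("dog", ("mouse", "human")))
mouse_basal_tree = ("opossum", ("mouse", ("dog", "human")))

changes_euarchontoglires = fitch(euarchontoglires_tree, states)[1]
changes_mouse_basal = fitch(mouse_basal_tree, states)[1]

print("Euarchontoglires tree:", changes_euarchontoglires, "event(s)")
print("(mouse,(dog,human)) tree:", changes_mouse_basal, "event(s)")
```

Under these assumptions the Euarchontoglires tree explains the data with a single event, while the mouse-basal tree needs two independent events.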

Signal region indels are not especially rare among orthologs of the 4500-odd human genes with signal peptides, of which 595 are experimentally validated, despite steric requirements of the binding pocket of the signal recognition particle (SRP). In fact the distribution of signal peptide length is fairly broad. These indels can be rapidly screened in batches of 25 by Blat alignment against the 44 available vertebrate genomes.

However, few of these indels have any phylogenetic depth. It does not appear that the PRNP indel in Euarchontoglires has any significant effect on cell targeting by the signal peptide (or on subsequent membrane topology). It is not that indels in signal peptides are so rare, but rather that narrowly windowed basal events in large clades are.

Below is data from 96 species:

MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Homo sapiens
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Pan troglodytes
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Gorilla gorilla
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Pongo pygmaeus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Nomascus leucogenys
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Hylobates lar
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Symphalangus syndactylus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Macaca arctoides
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Macaca fascicularis
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Macaca fuscata
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Macaca mulatta
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Macaca nemestrina
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Papio hamadryas
MA--NLGCWMLFLFVATWSDLGLCKKRPKPG     Callithrix jacchus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Cebus apella
MA--NLGCWMLVVFVATWSDLGLCKKRPKPG     Cercopithecus aethiops
MA--NLGCWMLVVFVATWSDLGLCKKRPKPG     Cercopithecus diana
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Colobus guereza
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Presbytis francoisi
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Saimiri sciureus
MA--KLGYWLLVLFVATWSDVGLCKKRPKPG     Tarsius syrichta
MA--NLGCWMLVVFVATWSDVGLCKKRPKPG     Microcebus murinus
MA--RLGCWMLVLFVATWSDIGLCKKRPKPG     Otolemur garnettii
ME--NLGCWMLILFVATWSDIGLCKKRPKPG     Cynocephalus variegatus
MA--QLGCWLMVLFVATWSDVGLCKKRPKPG     Tupaia belangeri
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG     Mus musculus
MA--NLGYWLLALFVTTCTDVGLCKKRPKPG     Rattus norvegicus
MA--NLGYWLLALFVTTCTDVGLCKKRPKPG     Rattus rattus
MA--NAGCWLLVLFVATWSDTGLCKKRPKPG     Cavia porcellus
MA--NLGYWLLALFVTTWTDVGLCKKRPKPG     Apodemus sylvaticus
MA--NLGCWLLVLFVATWSDLGLCKKRTKPG     Dipodomys ordii
MA--NLSYWLLAFFVTTWTDVGLCKKRPKPG     Clethrionomys glareolus
MA--NLSYWLLALFVATWTDVGLCKKRPKPG     Cricetulus griseus
MA--NLSYWLLALFVATWTDVGLCKKRPKPG     Cricetulus migratorius
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG     Meriones unguiculatus
MA--NLSYWLLALFVAMWTDVGLCKKRPKPG     Mesocricetus auratus
MA--NLGYWLLALFVATWTDVGLCKKRPKPG     Sigmodon fulviventer
MA--NLGYWLLALFVATWTDVGLCKKRPKPG     Sigmodon hispidus
MV--NPGCWLLVLFVATLSDVGLCKKRPKPG     Spermophilus tridecemlineatus
MV--NPGYWLLVLFVATLSDVGLCKKRPKPG     Sciurus vulgaris
MA--HLGYWMLLLFVATWSDVGLCKKRPKPG     Oryctolagus cuniculus
MA--HLSYWLLVLFVAAWSDVGLCKKRPKPG     Ochotona princeps
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Bos taurus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Bison bison
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Rangifer tarandus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Alces alces
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Capreolus capreolus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Kobus megaceros
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Connochaetes taurinus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Ammotragus lervia
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Hippotragus niger
MVKSHMGSWILVLFVVTWSDVGLCKKRPKPG     Camelus dromedarius
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Capra hircus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Cervus elaphus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Cervus elaphus nelsoni
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Dama dama
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Odocoileus hemionus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Odocoileus virginianus
MVKSHIGSWILVLFVAMWSDVALCKKRPKPG     Oryx leucoryx
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Ovibos moschatus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Ovis aries
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Ovis canadensis
MVKSHIGSWILVLFVAMWSDVALCKKRPKPG     Tragelaphus strepsiceros
MVKSHIGGWILVLFVAAWSDIGLCKKRPKPG     Sus scrofa
MVKSHMGSWILVLFVVTWSDMGLCKKRPKPG     Vicugna vicugna
MVKSHVGGWILVLFVATWSDVGLCKKRPKPG     Equus caballus
MVRSHVGGWILVLFVATWSDVGLCKKRPKPG     Diceros bicornis
MVKSLVGGWILLLFVATWSDVGLCKKRPKPG     Myotis lucifugus
MVKNYIGGWILVLFVATWSDVGLCKKRPKPG     Pteropus vampyrus
MVKSHIANWILVLFVATWSDMGFCKKRPKPG     Tursiops truncatus
MVKSHIGGWILLLFVATWSDVGLCKKRPKPG     Canis lupus familiaris
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Felis catus
MVKSHIGSWLLVLFVATWSDIGFCKKRPKPG     Mustela putorius
MVKSHIGSWLLVLFVATWSDIGFCKKRPKPG     Mustela vison
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Ailuropoda melanoleuca
MVKNHVGCWLLVLFVATWSEVGLCKKRPKPG     Erinaceus europaeus
MVTGHLGCWLLVLFMATWSDVGLCKKRPKPG     Sorex araneus
MVKSHLGCWIMVLFVATWSEVGLCKKRPKPG     Cyclopes didactylus
MVRSRVGCWLLLLFVATWSELGLCKKRPKPG     Dasypus novemcinctus
MVKGTVSCWLLVLVVAACSDMGLCKKRPKPG     Echinops telfairi
MVKSSLGCWILVLFVATWSDMGLCKKRPKPG     Elephas maximus
MVKSSLGCWILVLFVATWSDMGLCKKRPKPG     Loxodonta africana
MVKSSLGCWMLVLFVATWSDVGLCKKRPKPG     Procavia capensis
MMKSGLGCWILVLFVATWSDVGLCKKRPKPG     Orycteropus afer
MVKSGLGCWILVLFVATWSDVGVCKKRPKPG     Trichechus manatus
MAKIQLGYWILALFIVTWSELGLCKKPKTRPG    Macropus eugenii
MGKIHLGYWFLALFIMTWSDLTLCKKPKPRPG    Monodelphis domestica
MGKIQLGYWILVLFIVTWSDLGLCKKPKPRPG    Trichosurus vulpecula
MARLLTTCCLLALLLAACTDVALSKKGKGKPS    Gallus gallus
MAKLPGTSCLLLLLLLLGADLASCKKGKGKPG    Taeniopygia guttata
MGKHQMTCWLAIFLLLIQANVSLAKK-KPKPS    Anolis carolinensis
MRRFLVTCWIAVFLILLQTDVSLSKKGKNKPG    Gekko gecko
MGRYRLTCWIVVLLVVMWSDVSFSKKGKGKGG    Trachemys scripta
MGRHLISCWIIVLFVAMWSDVSLAKKGKGKTG    Pelodiscus sinensis
MPQSLWTCLVLISLICTLTVSSKKSGGGKSKTG   Xenopus laevis
MLRSLWTSLVLISLVCALTVSSKKSGSGKSKTG   Xenopus tropicalis
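
The distribution claimed above can be checked mechanically. The following is a minimal sketch, assuming the alignment is stored as plain text with one "sequence, whitespace, species" row per line, as on this page. Only ten rows are copied here for brevity. The clade membership set follows the first paragraph, and the names `ROWS`, `EUARCHONTOGLIRES`, `parse_rows` and `has_deletion` are hypothetical helpers introduced for illustration.

```python
# Check that the 2-residue gap (alignment columns 3-4) is present in exactly
# the Euarchontoglires rows. Rows are copied from the alignment above.

ROWS = """
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG     Homo sapiens
ME--NLGCWMLILFVATWSDIGLCKKRPKPG     Cynocephalus variegatus
MA--QLGCWLMVLFVATWSDVGLCKKRPKPG     Tupaia belangeri
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG     Mus musculus
MA--HLGYWMLLLFVATWSDVGLCKKRPKPG     Oryctolagus cuniculus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG     Bos taurus
MVKSHIGGWILLLFVATWSDVGLCKKRPKPG     Canis lupus familiaris
MVKSSLGCWILVLFVATWSDMGLCKKRPKPG     Loxodonta africana
MGKIHLGYWFLALFIMTWSDLTLCKKPKPRPG    Monodelphis domestica
MARLLTTCCLLALLLAACTDVALSKKGKGKPS    Gallus gallus
"""

# Primates, flying lemurs, treeshrews, rodents and rabbits, per the text.
EUARCHONTOGLIRES = {
    "Homo sapiens",
    "Cynocephalus variegatus",
    "Tupaia belangeri",
    "Mus musculus",
    "Oryctolagus cuniculus",
}

def parse_rows(text):
    """Map species name -> aligned sequence for each non-empty row."""
    alignment = {}
    for line in text.strip().splitlines():
        sequence, species = line.split(None, 1)   # split on first whitespace run
        alignment[species.strip()] = sequence
    return alignment

def has_deletion(sequence):
    """True if alignment columns 3-4 (0-based slice 2:4) are both gaps."""
    return sequence[2:4] == "--"

alignment = parse_rows(ROWS)
deleted = {sp for sp, seq in alignment.items() if has_deletion(seq)}
residues = {sp: len(seq.replace("-", "")) for sp, seq in alignment.items()}

print("deletion confined to Euarchontoglires:", deleted == EUARCHONTOGLIRES)
print("Homo sapiens residues:", residues["Homo sapiens"])
print("Bos taurus residues:", residues["Bos taurus"])
```

The same check should extend to all 96 rows once each species is assigned to a clade. The residue counts for the human and cow rows reproduce the 29 versus 31 lengths quoted in the first paragraph.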