CDH23 SNPs: Difference between revisions

Line 411:
=== Localization of disease alleles ===


Of the 49 known disease alleles of CDH23, 14 have a clear explanation in terms of directly disrupting a calcium binding motif. Quite a few linker domains are also affected, suggesting that L1122V is positioned where it could have an adverse impact (it matches location of deafness allele L0480Q). The first graphic below shows CDH23 marked up for these disease alleles as well as for signal peptide, linkers, cadherin domains, the single pass transmembrane region, potential glycosylation sites, and some apparent anomalies in wildtype calcium binding sites.


Glycosylation with a bulky complex carbohydrate would preclude that residue from participating in the parallel dimer. However potential sites are not always utilized; these might be phylogenetically conserved but that conservation would be difficult to distinguish from overall protein invariance. In any event, there is no evident pattern to sites with correct motifs so no constraints emerge for the dimer.
Line 530:
   
   


>CDH23_calMil Callorhinchus milii (elephant shark) dots = missing residues, spaces = exons, right-aligned to cadherin domains
...................... 
...NRLPYFKNYFFDQYFLIYEDTPV                GASITTLLGEDSDHDPLVYGVVGEEASRYFAVESQTGVVWLRQPLDRE ................ VIKRIVNIQVGDVNDNVPIF EC1 
HSQPYSVRILE                       NTPVGTPIYIVNATDADQGTGGSVLYSFQPPSPFFSIDGARGVVTVVKPLDYEITQAYQLQVNAT DQDEIRPLSTLANLAITITDIQDMDPIF EC2 
INLPYSTNIYENSPP GKTVRVITAIDQDRGRPRGIGYTIVS              GNTNSIFALDYISGALILNGPLDRENPLYSSGFILTVR GTELNDDRTPSNATVSTTFNILVIDVNDNPPEF EC3 
NRSQYSVSIPELAQVGFALPLFIQVQDKDD                 .................................................. LFANETASDHVGFARVKINLINENDNRPVF EC4 
SQPLYNVNLFENASVGITVIRVI                           ..................... ...............................................DVNDNVPTF EC5 
.......................                  AIDDDSPPNNQITYSIFNASIQSNYFDISVSEGYG .......................................... ENDNSPTF EC6 
SQTSYIIAVSENIIA GATVLFVTATDLDQSREYGQESMIYSLEGSSQFRINTR                    GEITTTSLLDRETKFEYILIVRAVDGGMGYNQKTGIAT VNITLLDMNDNHPLW EC7 
KDEPYLVNVVEMSPAHTDVIT               VSAFDPDLSENGTVAYTIHPPNRFYHINSTTGKIRTSGAVLDRENMNVRAAEMMRKIIVSVTD .GNPPLRASSSTTVTVNLLDLNDNDPSF EC8 
ENLPFVAEVPEGLTAASSVFQ                      VLAVDPDENLNGLVTFTMQVGMPRLDFIMNTTTGLITSTALLDREKIAEYYLRIIASDAGMVPRSSTSTLTVR ..DVNDETPTF EC9 
........................................                    ...................................... DNGPAGSRRTGTATVYIRVLDVNDNRPIF EC10 
LQNTYEASVPENITMSTSILQ ....................                    .................................................. VILYVEDVNDESPVF EC11 
TQQQYSRLGLRETAGIGTSVNVVRATDKDT                       GDGGMVAYRILAGSERKFAIDESTGLITTIDYLDYETRTNYLMNISATDQAAPFNRGYCTVYITLMNELDEPVQF EC12 
TNATYEVTLMENIATGTDVIQIHAQSADIMNQITYRFDPDTSALALSLFQINRVT                       GIITVRGQVDRERGDFYLITVIADNGGPRKDSTV ......DENDNSPRF EC13
...................................                  VNYSILAGNTNGAFRIRTTNNSRGEVYIAKLLDREWISRYVL. ......................DINDNPPVI EC14
............                    ............................................................................ VTTMVFITILDENDNYPVF EC15
RQQLYEITLDEGPLTLYSFNITVNATDQDEGLNGTISYSILEGNIGNTFVI....                 GLIKAIKELDYEISHGRYTLIVAAIDQCADRERRLTSTTT ............PTF EC16
ARSYEGPFDITEGQPGPRVWTFIAADKDAGPNGQVEYSVIAGDPL                     EFIISPVDGELRVKRDAELDRETLSFYNITIMGKDRGTP..... ........DINDNDPVL EC17
.....................................................                   SGSVYVNRPLDRERTEEYRLTITVKDNPENPRHARR DSDFLIISISDENDNRPIF EC18 
TQNTYQAEILENSRA GTQITLVNGPIMAHDRDEGPNGVVSYQLLGNRTDFFTIDRTS                 ............................................DDNDNRPTF EC19
...............                      GFSVVQVKATDADIGSNQQLNYRIEGGAQDKFIIDLLTGVIKVANGVTVDREERESYTLIVIAMDRGIPSLSASATVNIIIDDVNDYRPDF EC20
INPIQTVSVSESAAIGTIIANVTAIDQDLRPRLEYYIIELVAKDDTDAVIQDQQRAFGINFET        ....................................... AKLTVNVQDVNDNAPRF EC21 
RPFGVKAFTERILEGATAGTTLISVTAVDSDRGLNGRIIYELLNLPTGGYIRLEDPAA                    ...ANRTVDYEEVHWLNFTVRAVDHGTPPRSVQVPVNLQIVDINDNNPVF EC22 
VQSSYQ ..VFEDVSLGTVIIQVSATDADYGSFALIEYSLVDGEGRFGINPTT                .DIYILSALDREKKDHYTLTAVAKDNPEGNPSKRRENSVQ ........NDYRPQFSKR EC23 
EFSTSVYENEPGGASVITMTARDLDEGENAVLQFSVEGPGA                             DAFRMDGDTGLITTSRLLQSHEMFNLTVVATDNGRPPLWGTTLLKVDVIDVNDNRPVF EC24 
VRPPNGTILHIME                .......VYEVLAVDSDEGLNGAVRYSFLKTGVNKDWEYFNIDSLSGIINTTIRLDREKQPVYN LILLAYDLGHPVPYETTQLLQVALDDIDDNEPSF EC25 
LKPP RGRFQYQLLSVPEHSRPGVIVGNVTGAVDADEGPNAVVYYFIA AGDPDRNFQLNRNGLLKILKDLDREINPYYSITVKASSNRNWSPARSARRNRGFNLSTDLTLQEVRIYLEDINDQPPRF EC26 
LKPEYTA GVAADAKVGSELIKVDAYDADIGNNSIVYYQILNIRYIKLQSNNSEEVGNVFII  GEKSGIVRTFDLFTAYSPGYFVVEIMVSDLAGHNDTAVIGIYILRDDQRVKIVINEIPERVREF EC27
GEQFIKLLSNITGAIVNTDDVQ FHVDKKGRVNFAQTNVLIHVVNRETNQILDVER VIQMIDENKEQLRNLFRNYNVLDVQPAVTVRPADDMTALQ ........................... 
............. GNQGFMDILDMPNTNKYSFEG ANPVWLDPFCRNLELAAQAEHEDDLPENLSDITDLWNSPARTH GTFGREPVTSKPEDDRYLRAAIQEYDSITKLGQIMREGPIK 
GSLLKVVLDDYLRLKKLFAARLAHKSTGSGDQSSLTE LIPSDMDED  DEKPMSRGTLRLKHRHPVEFKGPDGIHVVHGSTGTLLASDLNSLPEDDQKVLVRSLETLNTDSGSYNDRNARTESAKSTPMHKIKETIMDAPLEITEL*
  >PCDH15_homSap Homo sapiens (human)
  MFRQFYLWTCLASGIILGSLFEICLG QYDDDCKLARGGP

Revision as of 14:13, 28 August 2009

CDH23 SNPs

CDH23 (cadherin 23) on 10q22.1 is one of the better understood genes of the Usher disease complex. These genes generally encode structural proteins used in both the hearing and visual systems, so mutations can affect both. Stop codons within CDH23 cause both deafness and blindness (USH1D) whereas missense alleles can affect hearing only (DFNB12). Both conditions are autosomal recessive. However one bad copy of CDH23 in conjunction with one bad allele of PCDH15, protocadherin 15 on 10q21.1 (17 million bp over, not tandem), can perhaps give rise to digenic disease USH1H. That could have a simple physical explanation in defective hetero-oligomeric binding of the two terminal domains where the respective cSNPs occur.

Many Usher genes function both transiently during development of cochlea and retina and permanently in adult structures. These functions may localize to multiple sites within each organ, for example ribbon synapses and stereocilia. CDH23, like many of these proteins, has different binding partner issues in cytoplasmic (USH1C harmonin, MYO7A myosin, USH1G sans) versus extracellular and transmembrane domains. Other unrelated cell types elsewhere in the body may use these gene products (or particular splice variants) though mutant alleles manifest most sensitively in hearing and vision (where mouse serves erratically as human disease model). The role of CDH23 in hair tip links has recently been disentangled from its transient but critical role in hair cell development.

However some coding variants of CDH23 are simply near-normal (or even adaptive) polymorphic variants not giving rise to problems during the carrier's lifespan, though subtle subclinical effects on age related (or noise-induced) hearing loss or night vision acuity might still occur. In the past, such variations would occasionally be detected within genealogies of affected individuals but not track with their disease; today, coding SNPs are far more likely to emerge -- and in far greater numbers -- simply in the course of genomic screening. That trend will only accelerate with the advent of rapid screening platforms such as Nimblegen that can affordably screen the entire human proteome.

Note these myriad new cSNPs needing interpretation will come with accurate population frequencies further stratified by ethnic group distribution. That can be viewed as 'close-up' comparative genomics that complements the longer view of reduced alphabet currently afforded by CDH23 orthologs across the phylogenetic tree of 50-odd vertebrate genomes. These considerations, along with accurate 3D models of both the cadherin module affected and its protein binding partner, greatly help in interpreting disease implications of particular observed SNPs (for example E737V), yet uncertainty will remain in many instances.

Here a newly observed cSNP in a Kalahari Bushman, heterozygous L1122V in exon 26, falls just before the boundary of the 11th of 27 cadherin ectodomains of the 3354 residue, 67 exon protein. This would appear unremarkable except for the observation that valine is the ancestral mammalian value here and it is conserved over a vast phylogenetic time scale.

It does not suffice to consider the structural impact of L1122V in isolation (say by modeling the two adjacent cadherin domains and the intervening region). Even though CDH23 is highly extended (meaning other cadherin domains are irrelevant to L1122V), it forms a parallel dimer in hair tip links, implying a side-by-side uncharacterized interaction with a second copy of the relevant L1122V-containing domains. Leucine, valine and isoleucine are hydrophobic residues often important to tight packing of globular fold interiors (where they are often not interchangeable) but can also occur similarly occluded in dimer surface patches.

Calcium plays an important role in connecting consecutive cadherin domains into a stable non-extensible structure. (If the structure extended in response to an auditory vibration, the tip link would not then open the mechanotransduction channel.) Three Ca+2 ions are bound by each domain, so 81 for CDH23 and 33 for PCDH15 if all sites are utilized. About a third of the 49 known disease alleles directly alter a calcium chelating residue.
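The site counts quoted above are simple arithmetic, sketched here in Python. The domain counts (27 and 11), the three ions per domain and the 14-of-49 figure all come from this page; the variable and function names are illustrative only.

```python
# Calcium site bookkeeping, using only figures quoted on this page.
IONS_PER_DOMAIN = 3
EC_DOMAINS = {"CDH23": 27, "PCDH15": 11}


def max_calcium_sites(n_domains, ions_per_domain=IONS_PER_DOMAIN):
    """Upper bound on bound Ca2+ ions if every site is occupied."""
    return n_domains * ions_per_domain


sites = {gene: max_calcium_sites(n) for gene, n in EC_DOMAINS.items()}

# 14 of the 49 known disease alleles directly disrupt a calcium binding motif.
calcium_fraction = 14 / 49

print(sites)
print(f"{calcium_fraction:.1%} of known disease alleles hit a calcium motif")
```

The fraction comes out somewhat under one third, which is consistent with the "about a third" wording.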

The current view of the placement of this residue in auditory hair cell stereocilia tip links involves the relationship between PCDH15 and CDH23 as shown below. Twisted homodimers of each form before they meet at their special first cadherin domain to form the two domain-swapped dimers because both proteins are anchored by transmembrane regions and cytoplasmic interactions. If PCDH15 continues the twist of CDH23 (say clockwise viewed N to C), then the net result is neither homodimer can unravel. However if a substitution weakens the initial twisting, the connection might not stably form.

It's currently unclear how two vaguely homologous proteins could abut at their ends (which are actually the amino-termini minus signal peptides) as a non-covalent tetramer that is stable for 7-8 decades -- there is little evidence for repair in mammals. The much-studied domain swap model in other cadherins appears not applicable here (key tryptophan and initial strands lacking) and a lack of cysteines in the first 3 cadherin domains rules out a disulfide. Because the proteins are oriented C<--N|N-->C, PCDH15 cannot continue a Ca+2 binding motif begun in CDH23 with joint binding providing a seamless connection.

CadDom.jpg

Comparative genomics

Orthologs of CDH23 are available from 42 vertebrates in the exon containing L1122V. The following exon is quite short, therefore difficult to obtain by blast and transcripts are uncommon so deep in a gene. An unusual GC-AG phase 0 intron separates these exons and may help identify ancient orthologs. Residue 1122 lies in a linker region between two adjacent cadherin domains but to model this region it will prove necessary to model consecutive globular domains as well.

Observe that while leucine is sometimes found at this position in other species, that occurrence is concentrated strictly in early diverging vertebrates. In all 33 species of tetrapods (where sound is conducted primarily through air), the value here is exclusively valine. Note in particular the four other species of great apes have valine with no indication of heterozygosity.
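The observation above can be checked mechanically by tallying the residue in the column aligned to human position 1122. The sketch below is only illustrative: it uses eight of the 42 exon 26 sequences, copied from the alignment on this page, and assumes the column of interest is 0-based index 48 of the 50-residue exon 26 fragment (the position marked by the caret in the alignment ruler). The helper name is hypothetical.

```python
from collections import Counter

# Exon 26 fragments copied from the alignment on this page (subset only).
EXON26 = {
    "homSap": "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ",
    "panTro": "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "musMus": "DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "galGal": "DNGPTGNRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPAASSIVQ",
    "xenTro": "DNGPAGNRKTGTATVSVTVLDINDNKPIFLKSSYEASVPENVPFSSSIVQ",
    "danRer": "DNGPAGGRRTGTATVYVEVLDVNDNRPIFLQNSYETSVLENIPRGTSILQ",
    "calMil": "DNGPAGSRRTGTATVYIRVLDVNDNRPIFLQNTYEASVPENITMSTSILQ",
    "petMar": "DHGPAGSRRTGTTTLDVLVLDVNDNRPLFLEGSYZVSVPDNVTRGAIFLQ",
}

COLUMN = 48  # 0-based column aligned to human residue 1122 (assumed from the caret)
NON_HUMAN_TETRAPODS = {"panTro", "musMus", "galGal", "xenTro"}


def residue_tally(seqs, column, species):
    """Count the residues found at one alignment column for the given species."""
    return Counter(seqs[name][column] for name in species)


tetrapod_tally = residue_tally(EXON26, COLUMN, NON_HUMAN_TETRAPODS)
other_tally = residue_tally(
    EXON26, COLUMN, set(EXON26) - NON_HUMAN_TETRAPODS - {"homSap"}
)

print("human:", EXON26["homSap"][COLUMN])
print("non-human tetrapods:", dict(tetrapod_tally))
print("fish, shark, lamprey:", dict(other_tally))
```

Extending the dictionary to all 42 sequences would reproduce the full count; the subset is used here only to keep the example short.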

From this perspective, L1122V may reflect retention of the ancestral value in one allele, rather than result from de novo back mutation from an L1122 homozygote. In other words, L1122 could be viewed as a mutation apparently fixed for better or worse in all other human populations -- at a position conserved over billions of years of branch length in phylogenetically related species.

Dozens of disease alleles are known, for example D124G, P240L, E247K, R301Q, A366T, N452S, L480Q, R582Q, H755Y. These often directly disrupt a Ca+2 binding motif (thus in EC1 at 124: DVNDNAPTF --> GVNDNAPTF) so demonstration of residue phylogenetic conservation would support that criterion in the analysis of L1122V.
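As a small illustration of how a substitution such as D124G removes a calcium binding motif, the sketch below applies the change to the EC1 segment quoted above and tests for the D.ND pattern (one of the motif patterns named elsewhere on this page). It assumes the quoted segment begins at residue 124; the function name is hypothetical.

```python
import re

# One of the calcium binding motif patterns named on this page ('.' = any residue).
CA_MOTIF = re.compile(r"D.ND")


def substitute(seq, offset, ref, alt):
    """Apply a single residue substitution, checking the expected reference residue."""
    if seq[offset] != ref:
        raise ValueError(f"expected {ref} at offset {offset}, found {seq[offset]}")
    return seq[:offset] + alt + seq[offset + 1:]


wild_type = "DVNDNAPTF"  # EC1 segment, assumed to start at residue 124
mutant = substitute(wild_type, 0, "D", "G")  # D124G

print(wild_type, "motif present:", bool(CA_MOTIF.search(wild_type)))
print(mutant, "motif present:", bool(CA_MOTIF.search(mutant)))
```

A pattern match says nothing about whether a site is actually occupied; it only flags that the canonical residues are present or absent.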

Posy et al observe cadherin Ca2+ binding domains like DxNDN are located in a linker region so cannot be clearly associated with one of the folded beta-sheet domains, observing further that the domain definition at SMART v.34 omits the N-terminal region critical to EC domains (A*, A and part of B strands) whereas Pfam35 drops the first two. These tools also err in defining calx-beta domains important to vlgr1 binding to usherin.

To see if the domain swap dimer model is applicable to link EC1 of PCDH15 to EC1 of CDH23 -- it seems not because the conserved tryptophan is not present -- it is preferable to associate the Ca2+ binding inter-domain residues with the domain from which they originate but not include residues more naturally part of the following domain. Note the first 3 EC regions of human CDH23 lack any cysteine so a disulfide linkage can be ruled out.

              <1......................exon 26.................0> <0..exon 27.......1> <2...............exon 28........................0>
              <-----------EC10-------------><---interdomain----- -><----------------- -------------------------EC11-------------------->
              ....................Ca+2........................^. ....Ca+2............ ..................................................
CDH23_homSap  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ LKATDADEGEFGRVWYRILH GNHGNNFRIHVSNGLLMRGPRPLDRERNSSHVLIVEAYNHDLGPMRSSVR
CDH23_panTro  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_gorGor  dNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_ponAbe  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_nomLeu  DNGPVGKRHTGTATVFITVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_rheMac  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_calJac  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILQ
CDH23_tarSyr  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_micMur  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_musMus  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_ratNor  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_cavPor  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_speTri  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ
CDH23_oryCun  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LRATDADEGEFGRVWYRILH
CDH23_ochPri  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIVEGHSIVQ LRATDADEGEFGRVWYRILH
CDH23_bosTau  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVSEDIPEGHSIVQ LKATDADEGEFGRVWYRIVH
CDH23_canFam  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKAADADEGEFGRVWYRILH
CDH23_felCat  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_pteVam  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LMATDADEGEFGRVWYRILH
CDH23_turTru  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LTATDADEGEFGRVWYRILH
CDH23_susScr  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ
CDH23_equCab  DNGPVGKRRTGTATVFITVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILQ
CDH23_eriEur  dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_loxAfr  DNGPVGKRRTGTTTVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGLVWYRILH
CDH23_proCap  DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_echTel  dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_choHof  dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGRSIVQ
CDH23_monDom  DNGPVGKRRTGTATIYVTVLDVNDNRPIFLQSSYEASVPEDIPEGSSIVQ LMATDADEGDNGRVWYRILH
CDH23_macEug  DNGPVGKRRTGTATVYVTVLDVNDNRPIFLHSSYEASISEDIPEGSSIVQ LMATDADEGDNGRVWYRILH
CDH23_ornAna  DNGPSGKRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPEASSIVQ LKATDADEGEYGRVWYRIIS
CDH23_galGal  DNGPTGNRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPAASSIVQ VKATDADEGVNGRVWYRIVK
CDH23_taeGut  DNGPSGNRRTGTATVYVTVLDVNDNRPIFLQSSYEVSVPEDIPAASSIVQ VKATDADEGINGRVWYRIVK
CDH23_anoCar  DNGPTGKRRTGTATVHVTVLDVNDNRPYFLQSSYEATVPEDIPDYSSIVQ VKATDADEGINGRVWYRIVK
CDH23_xenTro  DNGPAGNRKTGTATVSVTVLDINDNKPIFLKSSYEASVPENVPFSSSIVQ LEATDADEGDNGLVWYRILS
CDH23_oryLat  DNGPAGSRRTGTATVFVEVLDVNDNRPIFLQNSYETSVLETVPQGTSILQ VQATDADQGENGRVLYRILS
CDH23_takRub  DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETGILESVPQGTSILQ VQATDADQGENGRVLYRILT
CDH23_danRer  DNGPAGGRRTGTATVYVEVLDVNDNRPIFLQNSYETSVLENIPRGTSILQ VQATDADQGENGKVLYRILS
CDH23_gasAcu  DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETSILESVPQRTSILK VQATDADQGENGKVLYRILT
CDH23_tetNig  DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETSVLESVPQGTSILQ VQATDADQGENGSVLYRILT
CDH23_ictPun  DNGPAGDRKTGTATVYVEVLDVNDNRPIFLQNSYETTVLENVPRGSSVLQ
CDH23_calMil  DNGPAGSRRTGTATVYIRVLDVNDNRPIFLQNTYEASVPENITMSTSILQ VSATDADTGQNGRLTYQILQ
CDH23_petMar  DHGPAGSRRTGTTTLDVLVLDVNDNRPLFLEGSYZVSVPDNVTRGAIFLQ
              ................................................^.
CDH23_homSap  DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ
CDH23_panTro  ................................................V.
CDH23_gorGor  ................................................V.
CDH23_rheMac  ................................................V.
CDH23_calJac  ................................................V.
CDH23_pteVam  ................................................V.
CDH23_ponAbe  .....................................I..........V.
CDH23_nomLeu  ................I....................I..........V.
CDH23_tarSyr  ........R.......................................V.
CDH23_micMur  ........R.......................................V.
CDH23_musMus  ........R.......................................V.
CDH23_ratNor  ........R.......................................V.
CDH23_cavPor  ........R.......................................V.
CDH23_speTri  ........R.......................................V.
CDH23_oryCun  ........R.......................................V.
CDH23_canFam  ........R.......................................V.
CDH23_felCat  ........R.......................................V.
CDH23_turTru  ........R.......................................V.
CDH23_susScr  ........R.......................................V.
CDH23_echTel  ........R.......................................V.
CDH23_eriEur  ........R.......................................V.
CDH23_proCap  ........R............................I..........V.
CDH23_equCab  ........R.......I...............................V.
CDH23_loxAfr  ........R...T...................................V.
CDH23_choHof  ........R....................................R..V.
CDH23_bosTau  ........R.............................S.........V.
CDH23_ochPri  ..........................................V.....V.
CDH23_monDom  ........R.....IY.............................S..V.
CDH23_macEug  ........R......Y..............H......IS......S..V.
CDH23_ornAna  ....S...R......Y............................AS..V.
CDH23_galGal  ....T.N.R......Y...........................AAS..V.
CDH23_taeGut  ....S.N.R......Y...................V.......AAS..V.
CDH23_anoCar  ....T...R......H...........Y........T......DYS..V.
CDH23_xenTro  ....A.N.K......S.....I...K....K.........NV.FSS..V.
CDH23_oryLat  ....A.S.R........E.............N...T..L.TV.Q.T....
CDH23_takRub  ....A.S.R........E.Q...........N...TGIL.SV.Q.T....
CDH23_tetNig  ....A.S.R........E.Q...........N...T..L.SV.Q.T....
CDH23_gasAcu  ....A.S.R........E.Q...........N...T.IL.SV.QRT...K
CDH23_danRer  ....A.G.R......Y.E.............N...T..L.N..R.T....
CDH23_ictPun  ....A.D.K......Y.E.............N...TT.L.NV.R.S.V..
CDH23_calMil  ....A.S.R......YIR.............NT.......N.TMST....
CDH23_petMar  .H..A.S.R...T.LD.L.........L..EG..ZV...DNVTR.AIF..
   Consensus  .n..a...r...a.v..t.l.......i..qs..#as!p#.!p.g.siv.
              ................................................^.
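The second block above is a difference alignment: each row shows a dot where the species matches the human reference and the residue itself where it differs. A minimal sketch of how such rows can be generated follows, using the human, chimp and frog exon 26 fragments copied from the first block; the function name is illustrative.

```python
def difference_row(reference, sequence, match_char="."):
    """Return the sequence with positions identical to the reference shown as dots."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        match_char if ref == res else res for ref, res in zip(reference, sequence)
    )


human = "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ"
chimp = "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ"
frog = "DNGPAGNRKTGTATVSVTVLDINDNKPIFLKSSYEASVPENVPFSSSIVQ"

print("CDH23_panTro ", difference_row(human, chimp))
print("CDH23_xenTro ", difference_row(human, frog))
```

The printed rows should agree with the CDH23_panTro and CDH23_xenTro rows of the difference alignment.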

Comparative anatomy

The remarkable auditory hair cell linker provided by CDH23 and PCDH15 is not a vertebrate innovation. Instead it must date back to the pre-bilaterian ancestor because contemporary cnidarians such as Nematostella have a very similar overall structure that incorporates a strong homolog to CDH23. This cannot be plausibly attributed to convergent evolution given the extent of structural agreement of kinocilium, stereocilia, lateral and tip links. Note in mouse that the link is polarized with PCDH15 attached to the shorter stereocilium and CDH23 to the longer.

CDH25compAnat.jpg

The Nematostella protein most resembling CDH23 has 6,074 residues, three transmembrane helices and 44 contiguous cadherin ectodomains with 4x-periodicity. Thus the correspondence at the protein level is imperfect. However antibodies show it is distributed on stereocilia of anemone hair bundles and required for tentacle sensitivity to vibration (prey detection). It provides both lateral and tip linkages. It can be predicted to form a coiled parallel homodimer like mammalian CDH23.

Nematostella also has long but weak matches to PCDH15, namely XM_001638202 and EU289217. However upon back-blast to human, these do not quite pull up PCDH15 as best match but rather its closest paralog FAT4 (unsurprising given its 34 cadherin, 6 EGF and 2 lamG domains), perhaps because of lineage-specific expansion or because the blast score is inflated. Possibly CDH23 had not yet undergone duplication and divergence to protocadherins and it alone may play a double homodimer linker role in anemone stereocilia.

Nematocyst discharge is sensitive to calcium levels and streptomycin (like vertebrate mechanotransduction channels) but is insensitive to the MET channel blocker amiloride. The channel itself has not been identified in anemone either.

Prey capture can result in significant trauma to anemone tentacle hair bundles but this can be repaired using a protein again with similarities to a vertebrate stereocilia repair protein, ARL5B, which acts on the extracellular face of the plasma membrane along stereocilia in the vicinity of tip links. Human ARL5B and cnidarian protein XM_001629283 are 77% identical:

homSap ARL5B  MGLIFAKLWSLFCNQEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIVVKNTFLMWDIGGQESLRSSWNTYYSNTEFIILV
              MGL+FAK +S F N+EHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIV KNHF+MWDIGGQESLRS+WNTYY+NTEF+ILV 
nemVec repair MGLLFAKFFSWFSNEEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIVWKNIHFIMWDIGGQESLRSAWNTYYTNTEFLIL

homSap ARL5B  HVDSIDRERLAITKEELYRMLAHEDLRKAAVLIFANKQDMKGCMTAAEISKYLTLSSIKDHPWHIQSCCALTGEGLCQGLEWMTSRI
               +DS DRERLAI+K ELY+MLA+E+L++AA+ LI ANKQD+KG M+ AEIS+L L+ IKDH WHIQ+CCALTGEGL QGLEW+T+++
nemVec repair VIDSTDRERLAISKAELYQMLANEELKQAALLILANKQDIKGSMSVAEISEQLNLACIKDHGWHIQACCALTGEGLYQGLEWITTQL
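A percent identity figure of this kind is the number of identical aligned columns divided by the number of compared columns. The sketch below applies that to the first 56 columns of the alignment above, which are gap-free on this page; it is not expected to reproduce the 77% quoted for the full-length proteins, and the function name is illustrative.

```python
def percent_identity(a, b, gap="-"):
    """Return (identical, compared, percent) over gap-free aligned columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    identical = sum(x == y for x, y in columns)
    return identical, len(columns), 100.0 * identical / len(columns)


# First 56 aligned residues of each protein, copied from the alignment above.
human_arl5b = "MGLIFAKLWSLFCNQEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIV"
nemvec_repair = "MGLLFAKFFSWFSNEEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIV"

identical, compared, pct = percent_identity(human_arl5b, nemvec_repair)
print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

Conventions differ on whether gapped columns count in the denominator, so figures reported by different tools need not agree exactly.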

Note 'stereocilia' is an anatomical misnomer. These are instead actin-based membrane protrusions. It is the kinocilium that is a true cilium in both anemone and developing vertebrate hair cells. If parallels to ciliary photoreceptors are sought, these should be with the kinocilium rather than stereocilia. Since no known counterpart to the PCDH15-CDH23 linker occurs in vision, the commonality (Usher syndrome) may reside primarily in ribbon synapses of auditory and photoreceptive neurons.

Pseudogene and paralog issues

No potential exists here for mis-determining orthologous exons, even in remote species such as lamprey with poor assemblies. Exceedingly long genes such as this are not well-represented as retrogenes (which begin 3' and truncate early). Position 1122 is too remote from the 3' terminus. Relevant pseudogenes are not observed by Blat of human, macaque, and dog. Indeed, this exon gives a unique match at this level of sensitivity, even though cadherins are very widespread in the proteome.

The top matches to CDH23 within the human genome are shown as provided by GeneSorter at UCSC. Observe the established binding partner PCDH15 is by no means the best match; the e-value is high but only because of many weak alignments of tandem cadherin domains.

PCDH15 has 11 cadherin domains and a transmembrane region C-terminally. It's problematic to call it a paralog of CDH23 even though all cadherin domains must ultimately coalesce. The tandem domains of PCDH15, which cannot be put in 1:1 correspondence with the 27 cadherin domains of CDH23, might have had quite different sources and histories. The evolution of these gene families might be better worked out using interdomain spacer regions and intron position and phasing. Particular regions can be very conserved in themselves while not displaying much conservation between spacers.

Here the spacer region of CDH23 containing L1122 has best match within the human proteome to PCDHB14 (protocadherin beta 14), far down on the list of overall best matches. The interdomain region is shown in blue below. Note the best matches internally to other spacers are quite weak and neither L nor V is conserved in them.

PCDH15   S-TLTLAIKVLDIDDNSPVFTNSTYTVLVEENLPAGTTILQIEAKDVDLG---ANVSYRIRS
         + | |+ + |||++|| |+|  |+|   | |++| | +|||++| | | |     | |||  
CDH23    TGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQILKATDADEGEFGRVWYRILH  
         +GT  V + VLD+NDN P F QS YE  VPED P G  I  + A D D G +G++ Y   H
PCDHB14  SGTTLVLIKVLDINDNAPEFPQSLYEVQVPEDRPLGSWIATIISAKDLDAGNYGKISYTFFH

                           IFLQSSYEASVPEDIPEGHSILQLK 1122
                           TFQNLPFVAEVLEGIPAGVSIYQVV 928  CDH23 internal spacer
                           IFSQPLYNISLYENVTVGTSVLTVL 497  CDH23 internal spacer
                           TFFPAVYNVSVSEDVPREFRVVWLN 1034 CDH23 internal spacer
                           TFHNQPYSVRIPENTPVGTPIFIVN 169  CDH23 internal spacer
                           TWKDAPYYINLVEMTPPDSDVTTVV 809  CDH23 internal spacer

CDH23    0      chr10:72826710-73245710   cadherin-like 23
FAT4     0      chr4:126457017-126633537  FAT tumor suppressor homolog 4
DCHS1    0      chr11:6599134-6633650     dachsous 1
FAT3     0      chr11:91724910-92269283   FAT tumor suppressor homolog 3
FAT1     0      chr4:187745931-187881981  FAT tumor suppressor 1
FAT2     0      chr5:150863846-150928698  FAT tumor suppressor 2
DCHS2    0      chr4:155375138-155531899  dachsous 2 isoform 1
CELSR2  1e-115  chr1:109594164-109619901  cadherin EGF LAG seven-pass G-type receptor 2
CELSR1  5e-113  chr22:45135395-45311731   cadherin EGF LAG seven-pass G-type receptor 1
CELSR3  5e-109  chr3:48637835-48684985    cadherin EGF LAG seven-pass G-type receptor 3
PCDH24  3e-87   chr5:175908971-175955375  protocadherin LKC
...
PCDHB14  5e-53  chr5 140,584,653          protocadherin beta 14

Structural significance to normal function

Blastp at PDB of the region around residue L1122 establishes that the best fit to an already determined structure is the 39% identity match to mouse cadherin CDH8, 2A62. Within the 25 residue interdomain region, the percent identity is somewhat higher at nearly 50%. While not ideal, this should still allow accurate modeling of the adjacent cadherin domains and the critical spacer region, although the structural effects of the L1122V may be fairly subtle.

CDH23  1    GTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQLKATDADEGEFGRVWYRILHGNHGNNFRI
            GT T+ VT+ DVNDN P F QS Y  SVPED+  G +I ++KA D D GE  +  Y I+ G+    F I 
CDH8   174  GTTTLTVTLTDVNDNPPKFAQSLYHFSVPEDVVLGTAIGRVKANDQDIGENAQSSYDIIDGDGTALFEI
CDHe737v.jpg

Since CDH23 is known to form a helical homodimer in tip links -- and such binding patches can involve hydrophobic residues that otherwise would be buried -- the quaternary structure here is the main unknown. Crystallographic adjacency in the unit cell does not always reflect oligomeric solution structure. Consequently, it may not be possible to fully evaluate L1122V despite the favorable match at PDB.

There is no reason to think L1122V would directly affect the calcium binding motifs (LDRE, D.ND, D.D) of either adjacent cadherin domain in the manner of the E737V salsa mouse mutation in exon 22 or D124G, R1060W, E1595K and D2202N, none of which have syndromic effects on the retina but demonstrably weaken calcium dependent binding to PCDH15 even though they do not lie in the amino-terminal cadherin binding domains.

An effect of L1122V on the MET (mechanotransduction) channel is also implausibly indirect because this is at the lower (PCDH15) end of the tip link, though that is disputed for larger sound displacement effects.

Similarly an up-link intracellular effect of extracellular L1122V on CDH23 binding to harmonin would be a stretch. That binding is now thought mediated by an autonomously folding region proximal to the harmonin PDZ motif with a short internal peptide of CDH23 that extends from 3180-3211, KPDDDRYLRAAIQEYDNIAKLGQIIREGPIK, over 2,000 residues away.

The diagram below summarizes what is currently known about homotypic and heterotypic binding of proteins within the Usher network. Some of these remain unclear, like those of whirlin USH2D which also localizes at stereocilia tips and has an N-terminal domain like that of harmonin but yet does not bind CDH23. These interactions must be understood before SNPs such as L1122V can be modelled in their quaternary context and assessed with any confidence.
USHprots.jpg

Comparison of CDH23 domains to PCDH15

Given that CDH23 and PCDH15 binding constitutes the critical part of hair links joining a lower stereocilium to the next higher, it's worth investigating the evolutionary relationship between the 27 and 11 cadherin ectodomains (ie the extent to which these proteins are paralogs and so naturally dimerize). It emerges that the cadherin domains have little residual sequence identity to each other within or across proteins, even though each individually has very considerable phylogenetic conservation.

Further, the cadherin domains of CDH23 best corresponding to each cadherin domain of PCDH15 (as linearly ordered in its primary sequence) are out of order with respect to the primary sequence of CDH23: 25, 16, 15, 13, 12, 16, 14, 2, 9, 18, 12. This suggests that if a deep relationship ever existed between these two proteins, different rates of cadherin domain evolution have obscured it.
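The lack of collinearity in that list can be made explicit with a few lines of Python. The list is taken directly from the text, read as the best-matching CDH23 domain for PCDH15 domains 1 through 11; the function names are illustrative.

```python
# Best-matching CDH23 domain for PCDH15 domains 1..11, as listed in the text.
best_cdh23_match = [25, 16, 15, 13, 12, 16, 14, 2, 9, 18, 12]


def is_collinear(order):
    """True if the matches strictly increase along the PCDH15 sequence."""
    return all(a < b for a, b in zip(order, order[1:]))


def reused_domains(order):
    """CDH23 domains that are the best match for more than one PCDH15 domain."""
    return sorted({d for d in order if order.count(d) > 1})


def ascending_steps(order):
    """Number of adjacent pairs that preserve the CDH23 domain order."""
    return sum(a < b for a, b in zip(order, order[1:]))


print("collinear:", is_collinear(best_cdh23_match))
print("reused CDH23 domains:", reused_domains(best_cdh23_match))
print("ascending steps:", ascending_steps(best_cdh23_match),
      "of", len(best_cdh23_match) - 1)
```

Only a minority of adjacent steps preserve the CDH23 order, and two CDH23 domains each serve as best match twice.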

However when the 27 spacer domains separating cadherin units are concatenated, SMART detects an internal repeat within CDH23 of the first 11 to the last 11 spacers. Furthermore, these repeats give weak but full length alignments to the 11 concatenated spacer units of PCDH15. (Use of spacers avoids confounding issues of cadherin domain cross-matching.) This suggests the latter is ancestral length and that CDH23 experienced an internal duplication that lengthened its ectodomain. 'Spacer' is surely a misnomer as the degree of phylogenetic conservation suggests each is under considerable selection along with its associated cadherin domains.

When the cytoplasmic domains are aligned (using uncorrected genomic alignments from the 46 species UCSC data set), it emerges that CDH23 (below, top graphic) is far better conserved, and to far greater phylogenetic depth, than PCDH15 (bottom). This conservation does not extend into early deuterostomes, protostomes or cnidarians but rather emerged abruptly in lamprey (i.e. synchronously with ciliary opsins and the retina).

Presumably both are specifically anchored to the internal Usher protein network at one or more domains, via the terminal PDZ binding motifs (ITEL* and STSL* respectively), with the internal motif IM KPDDDRYLRAAIQEYDNIAKLGQIIREGPIK additionally serving to anchor CDH23, but not PCDH15, bidentately to the harmonin N-domain. However, these domains can explain only a small fraction of observed CDH23 cytoplasmic conservation. Note that runs of compositionally simple sequence cannot form ordered secondary or tertiary structure.

Intriguingly, whirlin (USH2D) has a similar domain structure to harmonin; in particular, the N-domains align. Although whirlin does not bind the cytoplasmic internal motif of CDH23, it could plausibly bind a comparable region in PCDH15, in effect quasi-paralog binding to quasi-paralog.

Overview of cytoplasmic tail conservation in vertebrate CDH23 and PCDH15: red: 80% conserved residues, blue 50% conserved, black variable:

CadTails.jpg

Difference alignment of human CDH23 cadherin domains relative to L1122V in CDH23.11

 Consensus  ....y.........p.g...* .v.a.d.d.g.n....y..............f......................dre......l...a.d................v.i.v.#.#.n...f 
 CDH23.11   LQSSYEASVPED-IPEGHSIL QLKATDADEGEFGRVWYRILH----GNHGNNF-RIHVS----NGLLMRG-PRPLDRERNSSHVLIVEAYNHD--LGPMRSSV---RVIVYVEDINDEAPVF 
 CDH23.09   QNLPFV.E.L.G-..A.V..Y .VV.I.L...LN.L.S..MPV----.MPRMD.-L.NS.----S.VV-VT-TTE.....IAEYQ.R.V.SDAG--TPTKS.TS---TLTIH.L.V...T.T. 
 CDH23.20   SPATLTVHLL.N-C.P.F.V. .VT...E.S.LN.ELV...EA----.AQDR-.-L..LV----T.VIRV.-NATI...EQE.YR.T.V.TDRG--TV.LSGTA---I.TILID....SR.E. 
 CDH23.15   SPFG.NV..N.N-VGG.TAVV .VR...R.I.INSVLS.Y.TE----..KDMA.-.MDRI----S.EIATR-.A.P....Q.FYH.VATVEDEG--TPTLSATT---H.Y.TIV.E..N..M. 
 CDH23.18   .NLPMNITIS.N-S.VSSFVA HVL.S...S.CNA.LTFN.TA----..RERA.-F.NAT----T.IVTV--N.......IPEYK.TISVKDNP--EN.RIARRDYDLLLIFLS.E..NH.L. 
 CDH23.14   FT.DSAV.I...-C.V.QRVA TV..W.P.A.SN.Q.VFSLAS----..IAGA.EIVTTN----DSIGEVFVA......ELDHYI.Q.V.SDRG--TP.RKKDH---ILQ.TIL....NP..I 
 CDH23.17   RDYEGPFE.T.G-Q.-.PRVW TFL.H.R.S.PN.Q.E.S.MD----.DPLGE.-V.SPV----E.V.RVRKDVE....TIAFYN.TIC.RDRG--MP.LS.TM---L.GIR.L....ND..L 
 CDH23.05   S.PL.NI.LY.N-VTV.T.V. TVL...N.A.T..E.S.FF------SDDPDR.-SLDKD----T..I.L--IAR..Y.LIQRFT.TII.RDGGG-EETTGR------.RIN.L.V..NV.T. 
 CDH23.23   D.P..QEA.F..-V.V.TI.. TVT.....S.N.ALIE.SL------.DGESK.-A.NPT----T.DIYV--LSS....KKDHYI.TAL.KDNPG-DVASNRRENSVQ.VIQ.L.V..CR.Q. 
 CDH23.07   SKPA.FV..V.N-.MA.ATV. F.N...L.RSREYGQESI.YS----LEGSTQ.-..NAR----S.EITT--TSL....TK.EYI...R.VDGG---VGHNQKTGIAT.NITLL....NH.TW 
 CDH23.10   FPAV.NV..S..-V.REFRVV W.NC..N.V.LNAELS.F.TG----..VDGK.-SVGYR----DAVVRT--VVG....TTAAYM..L..IDNG---PVGKRHTGTAT.F.T.L.V..NR.I. 
 CDH23.02   HNQP.SVRI..N-T.V.TP.F IVN...P.L.AG.S.L.SFQP----PS-----QFFAID----SARGIVTVI.E..Y.TTQAYQ.T.N.TDQ.K-TR.LSTLA---NLAIIIT.VQ.MD.I. 
 CDH23.25   PPNGTILHIR.E-..LRSNVY EVY...K...LN.A.R.SF.K----TAGNRDWEFFIID----PISGLIQTAQR....SQAVYS..LV.SDLGQ-PV.YETMQ---PLQ.AL...D.NE.L. 
 CDH23.12   Q.QYSRLGLR.T-AGI.T.VI VVQ...R.S.DG.L.N....S----.AE.K----FEID----ESTGLIITVNY..Y.TKT.YMMN.S.TDQAP-PFNQGFCS---VY.TLLNELDEAVQF  
 CDH23.03   INLP.STNIY.H-S.P.TTVR IIT.I.Q.K.RPRGIG.T.VS----..TNSI.ALDYI.G----V.TLN.LLDRENPLYSHGFI.T.KGTELND-DRTPSDATVTTTFNIL.I....N..E. 
 CDH23.08   KDAP.YINLV.M-T.PDSDVT TVV.V.P.L..N.TLV.S.QP------PNKFYSLNSTTG---KIRTTHAMLDRENPDPHEAELMRKIVVSVTDCGR.PLKATSSAT.F.NLL.L..ND.T. 
 CDH23.06   QKDA.VGALR.N-E.SVTQLV R.R...E.SPPNNQIT.S.VS----AS-AFGSYFDISLYEGYGVISVSRPL-DYEQIS.GLIY.T.M.MDAG---N.PLN.T--VP.TIE.F.E..NP.T. 
 CDH23.24   SKPQFST..Y.N-E.A.T.VI TMM...Q...PN.ELT.SLEG----PG-VEA.HVDMD.----GLVTTQRPLQSYEKFS-----.T.V.TDGG---E.PLWGT--TMLL.E.I.V..NR... 
 CDH23.16   Q.PH..VLLD.GPDTLNT.LI TIQ.L.L...PN.T.T.A.VA----..IV.T.RIDRHM----GVITAAKEL-DYE-ISHGRYT...T.TDQCPI.SHRLT.T--TT.L.N.N....NV.T. 
 CDH23.22   GITY.MERIL.GAT.-.TTLI AVA.V.P.K.LN.L.T.TL.D----LVPPGYVQLEDS.A---GKVIANRTV-DYEEVH--WLNFT.R.SDNG---S.P.AAE--IP.YLEIV....NN.I. 
 CDH23.27   TKAE.T.G.AT.-AKV.SELI .VL.L...I.NNSL.F.S..AIHYFRALA.DSEDVGQVFTMGSMDGILRTFDLFMAYSPGYF.VDIV.RDLAGHNDTAIIGIYIL.DDQR.KIVIN.I.DR
 CDH23.01   FTNHFFDTYLLISEDTPVGSS VTQLLAQ.MDND-PLVFG---VSGEEASR-F.AVEPDTGVVWLRQ-------.....TK.EFTVEFSVSD.QG--------.ITRK.NIQ.G.V..N..T. 
 CDH23.21   .NPIQTV..L.SAE.GTVIAN IT-.I.H.LNPK-LEYHIVGIVAKDDTDR-LVPNQEDAFAVNINTGSVMVKS.MN..LVATYEVTLSVIDNASD.PERSV..PNAKLT.N.L.V..NT.Q. 
 CDH23.26   SPQYQLLT...HSPRGTLVGN VTG.V.....PN-AIV.Y--FIAAGNEEK-..HLQPDGCLLVLRD.D.EREAIFSFIVKA.SNRSWTPPRGPSPTLDLVADLTLQE.R.VL.....QP.R. 
 CDH23.04   NS.E.SVAIT.LAQVGFALP. FIQVV.K..NLG-LNSMFEVYLVGNNS.HFIISPTS.QGKADIRI-----RVAIPLDYETVDRYDFDLFANESV----PDH.GYAK.KITLINE..NR.I. 
 CDH23.13   SNA....AIL.NLALGTEIVR -VQ.YSI.-NLN-QIT..FNAYTSTQAKA-L.KIDAITGVITVQG-------LV...--KGDFYTLTVVAD.GG----PKVDSTVK.YIT.L.E..NS.R. 
 CDH23.19   TK.T.Q.E.M.NSPAGTPLTV -.NGPILALDAD-QDI.AVVTYQLL.AQSGL.DINSSTGVVTVRS-----GVII...AF.PPI.ELLLLAE.IG-----LLNSTAHLLITIL.D..NR.T. 
 Consensus  ....y.........p.g.... .v.a.d.d.g.n....y..............f......................dre......l...a.d................v.i.v.#.#.n...f 
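
The difference-alignment convention used above (a dot where a row matches the reference row CDH23.11, the residue itself where it differs, gaps kept as dashes) can be sketched in a few lines of Python. This is only a minimal illustration: the function name difference_row is an assumption made here, not the tool that produced the alignment, and the two sequences are the first 21 aligned columns of CDH23.11 and CDH23.09 as shown above.

```python
def difference_row(reference: str, row: str) -> str:
    """Render one aligned row relative to an aligned reference row.

    Identical residues become '.', gaps stay '-', differences are kept.
    Both strings must already be aligned (same length).
    """
    if len(reference) != len(row):
        raise ValueError("rows must come from the same alignment (equal length)")
    out = []
    for ref_char, char in zip(reference, row):
        if char == "-":
            out.append("-")          # gaps are always shown as gaps
        elif char == ref_char:
            out.append(".")          # identical to the reference
        else:
            out.append(char)         # substitution relative to the reference
    return "".join(out)


# First 21 aligned columns of CDH23.11 (reference) and CDH23.09.
cdh23_11 = "LQSSYEASVPED-IPEGHSIL"
cdh23_09 = "QNLPFVAEVLEG-IPAGVSIY"
diff_09 = difference_row(cdh23_11, cdh23_09)
```

Applied to every row in turn, this reproduces the dotted layout, which makes the few columns shared across most domains stand out.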

Difference alignment of human PCDH15 cadherin domains

PCDH15.CA.01           GTAGGPDPTIELSLKDN---VDYWVLMDPVKQMLFLNSTGRVLDRDPPMNIHSIV-----VQVQCINKKVGTIIYHEVRIVVRDRNDNSP
PCDH15.CA.02       NGATDIDD..NGQ..YVIQY.PDDPTSNDTFEIPLMLTGNIVLRKR.NYEDKTRYFV.I----QANDRAQ.LNERRTTTTTLTVD.L.GD.LG.
PCDH15.CA.07       VKATDPDA.INGQVHY..G------NFNN.FRITS--NGSIY.AVK.N.EVRDYYELV.----VATDGAVH---PRHSTLTLA.K.L.ID....
PCDH15.CA.05       LTAVDADE.SNGE.TYEILV-----GAQGDFIIN.T-TG.ITIAPGVEMIVGRTYALT.----QAADNAPPA-ERRNSICT.Y.E.LPP.NQ..
PCDH15.CA.06          LQATDREGDS.TYAIEN----G.PQRVFNLSET-TGILTL.KA...ESTDRYIL--------IITASDGRPDGTSTAT.N...T.V...A.
PCDH15.CA.10        VGVIS.AAINQS.VY.IVS----GNEEDTFGINNI-TGVIYVNGP..YETRTSYVLR.QADSLEV.LANLRVPSKSNTAK.Y.EIQ.E.NHP.
PCDH15.CA.08         IEAKDVDLGANVSYRIRS----PEVKHFFALHPF-TGEL.LL.S..YEAFPDQEASI---TFLVEAFDIYGTMPPGIAT.TVI.K.M..YP.
PCDH15.CA.03  IQAIDQDRNIQPPSDR.G.LY.ILVGTPEDYPRFFHMHPR--TAEL.LLEPVN..FHQKFDLVI-------KAEQDNGHPLPAFAGLH.EIL.E.NQ..
PCDH15.CA.04    IVALDKDIEDTKDPELHLFL------NDYTS.FTVTQTGITRYLTLLQPV..EEQQTYTFSI-------TAFDGVQESEPVI--.N.Q.M.A...T.
PCDH15.CA.09    VYAEDAD-PP.L.ASRVRYRVD.VQFPYPASIFEVEED--SGRVI.RVN.NEE.TTIFKLV.-------.AFDDGEPVMSSSAT.K.L.LHPGEIPR
PCDH15.CA.11    VKATDKD-.GNY--SVMAYR.IIPPIKEGKEGFVVETY--TG.IK.AMLFHNMRRSYFKFQ.-------IATDDYGK.LSGKAD.LVS.VNQL.MQV
 Consensus      .........d......!.Y.i...........ff..... TG.i.l...l#r#....%.l.!    ....aa..........at.....l..##...
>CDH23_homSap domains: signal, 27 spacers, 27 cadherin domains (last weak), unknown extracellular, single pass transmembrane, unknown cytoplasmic: 
MGRHVATSCHVAWLLVLISGCWG
QVNRLPFFTNHFFDTYLLISEDTPVGSSVTQ
LLAQDMDNDPLVFGVSGEEASRFFAVEPDTGVVWLRQPLDRETKSEFTVEFSVSDHQGVITRKVNIQVGDVNDNAPTF
HNQPYSVRIPENTPVGTPIFI
VNATDPDLGAGGSVLYSFQPPSQFFAIDSARGIVTVIRELDYETTQAYQLTVNATDQDKTRPLSTLANLAIIITDVQDMDPIF
INLPYSTNIYEHSPPGTTVRI 
ITAIDQDKGRPRGIGYTIVSGNTNSIFALDYISGVLTLNGLLDRENPLYSHGFILTVKGTELNDDRTPSDATVTTTFNILVIDINDNAPEF      
NSSEYSVAITELAQVGFALPLF 
IQVVDKDENLGLNSMFEVYLVGNNSHHFIISPTSVQGKADIRIRVAIPLDYETVDRYDFDLFANESVPDHVGYAKVKITLINENDNRPIF
SQPLYNISLYENVTVGTSVLT
VLATDNDAGTFGEVSYFFSDDPDRFSLDKDTGLIMLIARLDYELIQRFTLTIIARDGGGEETTGRVRINVLDVNDNVPTF
QKDAYVGALRENEPSVTQLVR
LRATDEDSPPNNQITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYLTVMAMDAGNPPLNSTVPVTIEVFDENDNPPTF
SKPAYFVSVVENIMAGATVLF
LNATDLDRSREYGQESIIYSLEGSTQFRINARSGEITTTSLLDRETKSEYILIVRAVDGGVGHNQKTGIATVNITLLDINDNHPTW
KDAPYYINLVEMTPPDSDVTT
VVAVDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPDPHEAELMRKIVVSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF
QNLPFVAEVLEGIPAGVSIYQ
VVAIDLDEGLNGLVSYRMPVGMPRMDFLINSSSGVVVTTTELDRERIAEYQLRVVASDAGTPTKSSTSTLTIHVLDVNDETPTF
FPAVYNVSVSEDVPREFRVVW
LNCTDNDVGLNAELSYFITGGNVDGKFSVGYRDAVVRTVVGLDRETTAAYMLILEAIDNGPVGKRHTGTATVFVTVLDVNDNRPIF
LQSSYEASVPEDIPEGHSILQ
LKATDADEGEFGRVWYRILHGNHGNNFRIHVSNGLLMRGPRPLDRERNSSHVLIVEAYNHDLGPMRSSVRVIVYVEDINDEAPVF
TQQQYSRLGLRETAGIGTSVIV
VQATDRDSGDGGLVNYRILSGAEGKFEIDESTGLIITVNYLDYETKTSYMMNVSATDQAPPFNQGFCSVYITLLNELDEAVQF
SNASYEAAILENLALGTEIVR
VQAYSIDNLNQITYRFNAYTSTQAKALFKIDAITGVITVQGLVDREKGDFYTLTVVADDGGPKVDSTVKVYITVLDENDNSPRF
DFTSDSAVSIPEDCPVGQRVAT
VKAWDPDAGSNGQVVFSLASGNIAGAFEIVTTNDSIGEVFVARPLDREELDHYILQVVASDRGTPPRKKDHILQVTILDINDNPPVI
ESPFGYNVSVNENVGGGTAVVQ
VRATDRDIGINSVLSYYITEGNKDMAFRMDRISGEIATRPAPPDRERQSFYHLVATVEDEGTPTLSATTHVYVTIVDENDNAPMF
QQPHYEVLLDEGPDTLNTSLIT
IQALDLDEGPNGTVTYAIVAGNIVNTFRIDRHMGVITAAKELDYEISHGRYTLIVTATDQCPILSHRLTSTTTVLVNVNDINDNVPTF
PRDYEGPFEVTEGQPGPRVWT
FLAHDRDSGPNGQVEYSIMDGDPLGEFVISPVEGVLRVRKDVELDRETIAFYNLTICARDRGMPPLSSTMLVGIRVLDINDNDPVLL
NLPMNITISENSPVSSFVAH
VLASDADSGCNARLTFNITAGNRERAFFINATTGIVTVNRPLDRERIPEYKLTISVKDNPENPRIARRDYDLLLIFLSDENDNHPLF
TKSTYQAEVMENSPAGTPLTVLNGP
ILALDADQDIYAVVTYQLLGAQSGLFDINSSTGVVTVRSGVIIDREAFSPPILELLLLAEDIGLLNSTAHLLITILDDNDNRPTF
SPATLTVHLLENCPPGFSVLQ
VTATDEDSGLNGELVYRIEAGAQDRFLIHLVTGVIRVGNATIDREEQESYRLTVVATDRGTVPLSGTAIVTILIDDINDSRPEF
LNPIQTVSVLESAEPGTVIAN
ITAIDHDLNPKLEYHIVGIVAKDDTDRLVPNQEDAFAVNINTGSVMVKSPMNRELVATYEVTLSVIDNASDLPERSVSVPNAKLTVNVLDVNDNTPQF
KPFGITYYMERILEGATPGTTLIA
VAAVDPDKGLNGLVTYTLLDLVPPGYVQLEDSSAGKVIANRTVDYEEVHWLNFTVRASDNGSPPRAAEIPVYLEIVDINDNNPIF
DQPSYQEAVFEDVPVGTIILT
VTATDADSGNFALIEYSLGDGESKFAINPTTGDIYVLSSLDREKKDHYILTALAKDNPGDVASNRRENSVQVVIQVLDVNDCRPQF
SKPQFSTSVYENEPAGTSVIT
MMATDQDEGPNGELTYSLEGPGVEAFHVDMDSGLVTTQRPLQSYEKFSLTVVATDGGEPPLWGTTMLLVEVIDVNDNRPVF
VRPPNGTILHIREEIPLRSNVYE
VYATDKDEGLNGAVRYSFLKTAGNRDWEFFIIDPISGLIQTAQRLDRESQAVYSLILVASDLGQPVPYETMQPLQVALEDIDDNEPLF
VRPPKGSPQYQLLTVPEHSPRGTLVGNV
TGAVDADEGPNAIVYYFIAAGNEEKNFHLQPDGCLLVLRDLDREREAIFSFIVKASSNRSWTPPRGPSPTLDLVADLTLQEVRVVLEDINDQPPRF
TKAEYTAGVATDAKVGSELIQ
VLALDADIGNNSLVFYSILAIHYFRALANDSEDVGQVFTMGSMDGILRTFDLFMAYSPGYFVVDIVARDLAGHNDTAIIGIYILRDDQRV
KIVINEIPDRVRGFEEEFIHLLSNITGAIVNTDNVQFHVDKKGRVNFAQTELLIHVVNRDTNRILDVDRVIQMIDENKEQLRNLFRNYNVLDVQPAISVRLPDDMSALQM
AIIVLAILLFLAAMLFVLMNWYY
RTVHKRKLKAIVAGSAGNRGFIDIMDMPNTNKYSFDGANPVWLDPFCRNLELAAQAEHEDDLPENLS 
RTVHKRKLKAIVAGSAGNRGFIDIMDMPNTNKYSFDGANPVWLDPFCRNLELAAQAEHEDDLPENLSEIADLWNSPTRTHGTFGREPAAVKPDDDRYLRAAIQEYDNIAKLGQIIREGPIKGSLLKVVLEDYLRLKKLFAQRMVQKASSCHSSISELIQTELDEEPGDHSPGQGSLRFRHKPPVELKGPDGIHVVHGSTGTLLATDLNSLPEEDQKGLGRSLETLTAAEATAFERNARTESAKSTPLHKLRDVIMETPLEITEL

>PCDH15_homSap domains: signal, 11 spacers, 11 cadherin domains, unknown extracellular, single pass transmembrane, unknown cytoplasmic: 
MFRQFYLWTCLASGIILGSLFEICLG
QYDDDCKLARGGPPATIVAIDEESRNGTILVDNMLIK
GTAGGPDPTIELSLKDNVDYWVLMDPVKQMLFLNSTGRVLDRDPPMNIHSIVVQVQCINKKVGTIIYHEVRIVVRDRNDNSPTF
KHESYYATVNELTPVGTTIFTGFSGD
NGATDIDDGPNGQIEYVIQYNPDDPTSNDTFEIPLMLTGNIVLRKRLNYEDKTRYFVIIQANDRAQNLNERRTTTTTLTVDVLDGDDLGPMF
LPCVLVPNTRDCRPLTYQAAIPELRTPEELNPIIVTPP
IQAIDQDRNIQPPSDRPGILYSILVGTPEDYPRFFHMHPRTAELSLLEPVNRDFHQKFDLVIKAEQDNGHPLPAFAGLHIEILDENNQSPYF
TMPSYQGYILESAPVGATISDSLNLTSPLR
IVALDKDIEDTKDPELHLFLNDYTSVFTVTQTGITRYLTLLQPVDREEQQTYTFSITAFDGVQESEPVIVNIQVMDANDNTPTF
PEISYDVYVYTDMRPGDSVIQ
LTAVDADEGSNGEITYEILVGAQGDFIINKTTGLITIAPGVEMIVGRTYALTVQAADNAPPAERRNSICTVYIEVLPPNNQSPPRF
PQLMYSLEISEAMRVGAVLLN
LQATDREGDSITYAIENGDPQRVFNLSETTGILTLGKALDRESTDRYILIITASDGRPDGTSTATVNIVVTDVNDNAPVF
DPYLPRNLSVVEEEANAFVGQ
VKATDPDAGINGQVHYSLGNFNNLFRITSNGSIYTAVKLNREVRDYYELVVVATDGAVHPRHSTLTLAIKVLDIDDNSPVF
TNSTYTVLVEENLPAGTTILQ
IEAKDVDLGANVSYRIRSPEVKHFFALHPFTGELSLLRSLDYEAFPDQEASITFLVEAFDIYGTMPPGIATVTVIVKDMNDYPPVF
SKRIYKGMVAPDAVKGTPITT
VYAEDADPPGLPASRVRYRVDDVQFPYPASIFEVEEDSGRVITRVNLNEEPTTIFKLVVVAFDDGEPVMSSSATVKILVLHPGEIPRF
TQEEYRPPPVSELATKGTM
VGVISAAAINQSIVYSIVSGNEEDTFGINNITGVIYVNGPLDYETRTSYVLRVQADSLEVVLANLRVPSKSNTAKVYIEIQDENNHPPVF
QKKFYIGGVSEDARMFTSVLR
VKATDKDTGNYSVMAYRLIIPPIKEGKEGFVVETYTGLIKTAMLFHNMRRSYFKFQVIATDDYGKGLSGKADVLVSVVNQLDMQV
IVSNVPPTLVEKKIEDLTEILDRYVQEQIPGAKVVVESIGARRHGDAFSLEDYTKCDLTVYAIDPQTNRAIDRNELFKFLDGKLLDINKDFQPYYGEGGRILEIRTPEAVTSIKKRGESLGYTE
GALLALAFIIILCCIPAILVVLV
SYRQFKVRQAECTKTARIQAALPAAKPAVPAPAPVAAPPPPPPPPPGAHLYEELGDSSILFLLYHFQQSRGNNSVSEDRKHQQVVMPFSSNTIEAHKSAHVDGSLKSNKLKSARKFTFLSDEDDLSAHNPLYKENISQVSTNSDISQRTDFVDPFSPKIQAKSKSLRGPREKIQRLWSQSVSLPRRLMRKVPNRPEIIDLQQWQGTRQKAENENTGICTNKRGSSNPLLTTEEANLTEKEEIRQGETLMIEGTEQLKSLSSDSSFCFPRPHFSFSTLPTVSRTVELKSEPNVISSPAECSLELSPSRPCVLHSSLSRRETPICMLPIETERNIFENFAHPPNISPSACPLPPPPPISPPSPPPAPAPLAPPPDISPFSLFCPPPSPPSIPLPLPPPTFFPLSVSTSGPPTPPLLPPFPTPLPPPPPSIPCPPPPSASFLSTECVCITGVKCTTNLMPAEKIKSSMTQLSTTTVCKTDPQREPKGILRHVKNLAELEKSVANMYSQIEKNYLRTNVSELQTMCPSEVTNMEITSEQNKGSLNNIVEGTEKQSHSQSTSL 

PCDH15muts.jpg

>CDH23_spacers color shows repeat reported by SMART
QVNRLPFFTNHFFDTYLLISEDTPVGSSVTQ
TFHNQPYSVRIPENTPVGTPIFI
IFINLPYSTNIYEHSPPGTTVRI
EFNSSEYSVAITELAQVGFALPLF
IFSQPLYNISLYENVTVGTSVLT
TFQKDAYVGALRENEPSVTQLVR
TFSKPAYFVSVVENIMAGATVLF
TWKDAPYYINLVEMTPPDSDVTT
TFQNLPFVAEVLEGIPAGVSIYQ
TFFPAVYNVSVSEDVPREFRVVW
IFLQSSYEASVPEDIPEGHSILQ
VFTQQQYSRLGLRETAGIGTSVIV
QFSNASYEAAILENLALGTEIVR
RFDFTSDSAVSIPEDCPVGQRVAT
VIESPFGYNVSVNENVGGGTAVVQ
MFQQPHYEVLLDEGPDTLNTSLIT
TFPRDYEGPFEVTEGQPGPRVWT
VLLNLPMNITISENSPVSSFVAH
LFTKSTYQAEVMENSPAGTPLTVLNGP
TFSPATLTVHLLENCPPGFSVLQ
EFLNPIQTVSVLESAEPGTVIAN
QFKPFGITYYMERILEGATPGTTLIA
IFDQPSYQEAVFEDVPVGTIILT
QFSKPQFSTSVYENEPAGTSVIT
VFVRPPNGTILHIREEIPLRSNVYE
LFVRPPKGSPQYQLLTVPEHSPRGTLVGNV
RFTKAEYTAGVATDAKVGSELIQ

>PCDH15_spacers colored region shows 26% identity alignment to CDH23 spacer repeat
QYDDDCKLARGGPPATIVAIDEESRNGTILVDNMLIK
TFKHESYYATVNELTPVGTTIFTGFSGD
MFLPCVLVPNTRDCRPLTYQAAIPELRTPEELNPIIVTPP
YFTMPSYQGYILESAPVGATISDSLNLTSPLR
TFPEISYDVYVYTDMRPGDSVIQ
PRFPQLMYSLEISEAMRVGAVLLN
VFDPYLPRNLSVVEEEANAFVGQ
VFTNSTYTVLVEENLPAGTTILQ
VFSKRIYKGMVAPDAVKGTPITT
FTQEEYRPPPVSELATKGTM
VFQKKFYIGGVSEDARMFTSVLR

Localization of disease alleles

Of the 49 known disease alleles of CDH23, 14 have a clear explanation in terms of directly disrupting a calcium binding motif. Quite a few linker domains are also affected, suggesting that L1122V is positioned where it could have an adverse impact (it matches location of deafness allele L0480Q). The first graphic below shows CDH23 marked up for these disease alleles as well as for signal peptide, linkers, cadherin domains, the single pass transmembrane region, potential glycosylation sites, and some apparent anomalies in wildtype calcium binding sites.

Glycosylation with a bulky complex carbohydrate would preclude that residue from participating in the parallel dimer. However potential sites are not always utilized; these might be phylogenetically conserved but that conservation would be difficult to distinguish from overall protein invariance. In any event, there is no evident pattern to sites with correct motifs so no constraints emerge for the dimer.

Several anomalous calcium binding sites occur in human. This raises the possibility that the reference human sequence arose from a carrier or diseased individual. However, when these are investigated in the 44 species alignment, the 'anomalies' are invariant back to lamprey. So rather than being questionable alleles, they are instead highly conserved variants in these sites.

The second graphic examines each of the 49 mutations for phylogenetic conservation at its position, finding extreme conservation in almost all cases. Only rarely does the altered value of the allele show up in other species and in no case as part of the reduced alphabet at that position. At position 755, an unusual phyloSNP H755P is observed after frog divergence, possibly correlating with the air/water shift in hearing. A more recent phyloSNP Q1496K occurs in gorilla, chimp and human but not gibbon and earlier diverging species.

The third graphic displays the location of introns within the CDH23 gene. These do not coincide at all with cadherin domain boundaries, nor is there any repeat pattern evident in exon size or in intron phases (00 01 21 bp overhangs). This suggests that the gene attained its current structure very early on, prior to the main era of intron insertion (which was random with respect to domain boundaries). Gain and loss of cadherin domains via internal tandem repeats (e.g. from inhomogeneous recombination) is thus implausible, given that domains spill over 2-3 partial exons and that reading frame would be quite difficult to preserve. Indeed, a nearly complete sequence from elephant shark with an identical exonic structure to human establishes that CDH23 gene structure is little changed over the course of vertebrate evolution. These observations also apply to PCDH15.
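
The reading-frame argument can be made concrete with a small sketch. Intron phase is the number of codon bases already used when the intron interrupts the coding sequence, i.e. the cumulative coding length modulo 3, and a block of whole exons can be duplicated in tandem without a frameshift only if its total coding length is a multiple of three. The function names and the exon lengths below are hypothetical, chosen for illustration only; they are not CDH23 exon sizes.

```python
from itertools import accumulate
from typing import List


def intron_phases(coding_exon_lengths: List[int]) -> List[int]:
    """Phase (0, 1 or 2) of the intron that follows each exon except the last."""
    cumulative = list(accumulate(coding_exon_lengths))
    return [total % 3 for total in cumulative[:-1]]


def preserves_frame(exon_block: List[int]) -> bool:
    """True if duplicating this block of whole exons keeps the reading frame."""
    return sum(exon_block) % 3 == 0


# Hypothetical coding-exon lengths in bp (NOT taken from the CDH23 gene).
example_lengths = [100, 51, 74, 30]
example_phases = intron_phases(example_lengths)       # one phase per intron
middle_block_ok = preserves_frame(example_lengths[1:3])  # exons 2-3 as a unit
```

With real exon coordinates, the same two checks would show which exon blocks, if any, could be duplicated without disturbing the frame.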

Finally, the bottom alignments illustrate conservation locally about 3 alleles. This establishes that CDH23 has high conservation overall (relative to the average protein) but many positions are not nearly as well conserved as the disease allele positions outside of mammals. Consequently, this has considerable predictive value for evaluating other alleles that surface in genome sequencing projects or in clinical settings: only variations on very deeply conserved residues have surfaced to date in deafness or syndromic disease.

*  context    change disease   location  effect

D VNDNAPTFH   D0124G  DFNB12  EC1 Ca+2 binding motif DVNDNAPTF disrupted
P YSTNIYEHS   P0240L  DFNB12  link   
E HSPPGTTVR   E0247K   USH1D  link    
R ENPLYSHGF   R0301Q  DFNB12  EC3 Ca+2 binding motif DRE disrupted
A LPLFIQVVD   A0366T   USH1D  link  
N ENDNRPIFS   N0452S  DFNB12  EC4 Ca+2 binding motif NENDNRPIF disrupted
L TVLATDNDA   L0480Q  DFNB12  link
A TDNDAGTFG   A0484P   USH1D  link
R LRATDEDSP   R0582Q  DFNB12  link
H NQKTGIATV   H0755Y   USH1D  sheet
D ETPTFFPAV   D0990N  DFNB12  EC5 Ca+2 binding motif DVNDETPTF disrupted
R ETTAAYMLI   R1060W  DFNB12  EC6 Ca+2 binding motif DRE disrupted
V TVLDVNDNR   V1090I   USH1D  link
N RPIFLQSSY   N1098S   USH1D  EC6 Ca+2 binding motif DVNDNRPIF disrupted
G PMRSSVRVI   G1186D  DFNB12  sheet
P VFTQQQYSR   P1206R   USH1D  EC7 Ca+2 binding motif DINDEAPVF disrupted
T QQQYSRLGL   T1209A   USH1D  link
D NLNQITYRF   D1341N  DFNB12  EC9 Ca+2 binding motif DNL disrupted
Q VVASDRGTP   Q1496H   USH1D  sheet
R KKDHILQVT   R1507Q   USH1D  sheet
A TRPAPPDRE   A1586P  DFNB12  sheet
E RQSFYHLVA   E1595K  DFNB12  EC11 Ca+2 binding motif DRE disrupted
Q CPILSHRLT   Q1716P  DFNB12  sheet
R DYEGPFEVT   R1746Q   USH1D  link
P LGEFVISPV   P1788L   USH1D  sheet
D NDPVLLNLP   D1846N  DFNB12  EC17 Ca+2 binding motif DINDNDPVL disrupted
F NITAGNRER   F1888S  DFNB12  sheet
R PLDRERIPE   R1912W   USH1D  sheet
D NPENPRIAR   D1930N   USH1D  sheet
G VVTVRSGVI   G2017S   USH1D  sheet
R EAFSPPILE   R2029W  DFNB12  EC19 Ca+2 binding motif DRE disrupted
D IGLLNSTAH   D2045N  DFNB12  sheet
D RGTVPLSGT   D2148N  DFNB12  sheet
D LNPKLEYHI   D2202N  DFNB12  EC21 Ca+2 binding motif DHD disrupted
D NGSPPRAAE   D2376V   USH1D  sheet
R EKKDHYILT   R2465W  DFNB12  EC23 Ca+2 binding motif DRE disrupted
S VYENEPAGT   S2517G   USH1D  link
T MMATDQDEG   T2530I   USH1D  link
R PVFVRPPNG   R2608H  DFNB12  EC23 Ca+2 binding motif DVNDNRPVF disrupted
G TLVGNVTGA   G2744S   USH1D  link
G NEEKNFHLQ   G2771S   USH1D  sheet
R VVLEDINDQ   R2833G   USH1D  sheet
I LRDDQRVKI   I2950N  DFNB12  sheet
R VKIVINEIP   R2956C  DFNB12  sheet
V RGFEEEFIH   V2968A   USH1D  link
P DDMSALQMA   P3059T  DFNB12  cytoplasm
R EPAAVKPDD   R3175H   USH1D  cytoplasm
R AAIQEYDNI   R3189W   USH1D  cytoplasm
S ELIQTELDE   S3245F   USH1D  cytoplasm
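
For readers who want to tally the table above programmatically, the following is a minimal sketch of how its rows could be parsed. It assumes the whitespace-separated layout shown (wild-type residue, nine-residue context, change, disease, then free-text location/effect); the names Allele and parse_row are introduced here for illustration, and only five rows are copied in as sample data.

```python
import re
from collections import Counter
from typing import NamedTuple


class Allele(NamedTuple):
    wild_type: str
    position: int
    variant: str
    disease: str
    location: str


# Change notation such as D0124G: wild type, zero-padded position, variant.
CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_row(line: str) -> Allele:
    fields = line.split()
    residue, _context, change, disease = fields[:4]
    match = CHANGE.match(change)
    if match is None or match.group(1) != residue:
        raise ValueError(f"unexpected change notation in row: {line!r}")
    location = " ".join(fields[4:])
    return Allele(match.group(1), int(match.group(2)), match.group(3),
                  disease, location)


# Five rows copied from the table above as sample input.
SAMPLE_ROWS = """
D VNDNAPTFH   D0124G  DFNB12  EC1 Ca+2 binding motif DVNDNAPTF disrupted
P YSTNIYEHS   P0240L  DFNB12  link
E HSPPGTTVR   E0247K   USH1D  link
L TVLATDNDA   L0480Q  DFNB12  link
R AAIQEYDNI   R3189W   USH1D  cytoplasm
"""

alleles = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]
by_disease = Counter(allele.disease for allele in alleles)
calcium_hits = [a for a in alleles if "Ca+2" in a.location]
```

Run over all 49 rows, the same Counter gives the DFNB12/USH1D split, and the Ca+2 filter recovers the alleles attributed to disrupted calcium binding motifs.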

CDH23markup.jpg

CDH23mutPhylo.jpg

CDH23exons.jpg

        EC1           D0124G             Link1-2    P0240L E0247K        EC3        R0301Q  
homSap  VITRKVNIQVGDVNDNAPTFHNQPYSVRIPE  VQDMDPIFINLPYSTNIYEHSPPGTTVRII  SGVLTLNGLLDRENPLYSHGFILTVK
panTro  ...............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ..A.......................
gorGor  ...............................  ..............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,
ponAbe  ...............................  ..............................  ..A.......................
rheMac  ...............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ..A.......................
calJac  ...............................  ..............................  ..A.......................
tarSyr  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ..A.......................
micMur  ...............................  ..............................  ..A.......................
otoGar  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,
tupBel  ...............................  ............................V.  ,,,,,,,,,,,,,,,,,,,,,,,,,,
musMus  ...............................  M...........................V.  ..A.......................
ratNor  ...............................  ............................V.  ..A.......................
dipOrd  ...............................  I...........................V.  ,,,,,,,,,,,,,,,,,,,,,,,,,,
cavPor  ...............................  I...........................V.  ..A.......................
speTri  ........-.........-............  ............................V.  ..A..................L....
oryCun  ...............................  ......V......................V  ..A.......................
ochPri  ...............................  ..............................  ..A.......................
vicPac  ...............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,
turTru  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ................V.....L.....V.  ..A.....V.................
bosTau  ...............................  ..............................  ..A.......................
equCab  ...............................  I.............................  ..A.......................
felCat  ...............................  ............................V.  ..A.......................
canFam  ...............................  ............................V.  ..A.......................
myoLuc  ...............................  ............................V.  ..A.......................
pteVam  ...............................  ..............................  ..A.......................
eriEur  ...............................  ......V.......................  ..A.......................
sorAra  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,
loxAfr  ...............................  ..............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,
proCap  ...............................  I.............................  ,,,,,,,,,,,,,,,,,,,,,,,,,,
echTel  ...............................  ..............................  ..A.......................
dasNov  ...............................  I......................         ,,,,,,,,,,,,,,,,,,,,,,,,,,
choHof  ...............................  ............................V.  ,,,,,,,,,,,,,,,,,,,,,,,,,,
monDom  ...............................  ............................M.  ..A.....V.................
ornAna  ..................R............  I...........................M.  ,,,,,,,,,,,,,,,,,,,,,,,,,,
galGal  ..KGT.............R............  ...................N........M.  ..A.....P......F..S.......
taeGut  ..KGT...........T.Q............  ...................N........M.  ..A.....P......F..A..V....
anoCar  ..KGS.............R............  ...................N........M.  ..A.....P......F.GA.......
xenTro  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  I..................N...         ..A.....Q........TA.......
tetNig  .VKDI.K..I.E....VAI.QGE..I.H.E.  ........T........E.DV.L---..K.  ..S..VS.Q.........A..T....
takRub  .VKDT....I........I..G...T.H...  ........T........E.DV.L---..K.  ..S..VS.Q.........A..T....
gasAcu  .VKDT....I........V..G...T.....  ........T........E.DV.L.FE..K.  ..S..V..Q.........A..S...R
oryLat  .VKDT....I...........G...T.....  ........T....N..LQ.NV.L.YQ..N.  ..S..V..Q.........S..T....
danRer  .VKDT....I........S.Y....AIQ...  I.......T........M.DA..         ..S..VS.Q.........S...I...
petMar  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  ,,,,,,,,,,,,,,,,,,,,,,,,,,
homSap  VITRKVNIQVGDVNDNAPTFHNQPYSVRIPE  VQDMDPIFINLPYSTNIYEHSPPGTTVRII  SGVLTLNGLLDRENPLYSHGFILTVK
        EC1:          D0124G             LK1-2      P0240L E0247K  EC3:             R0301Q
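
The "reduced alphabet" at a position is simply the set of residues observed in that alignment column across species. A minimal sketch of how it could be read off a dotted alignment like the one above follows, assuming '.' means identical to the human row and that ',' , '-' and blanks mark missing or gapped data. The function name column_alphabets is an assumption, and the sample rows are the last six columns of the Link1-2 block for panTro, musMus, monDom, tetNig, oryCun and oryLat.

```python
from typing import List, Set

# Characters that carry no residue information in the dotted alignment.
MISSING = set(",- ")


def column_alphabets(reference: str, diff_rows: List[str]) -> List[Set[str]]:
    """Set of residues observed in each column, reference included."""
    alphabets = [{residue} for residue in reference]
    for row in diff_rows:
        for column, char in enumerate(row[:len(reference)]):
            if char in MISSING:
                continue                      # no data for this species here
            residue = reference[column] if char == "." else char
            alphabets[column].add(residue)
    return alphabets


# Last six columns of the Link1-2 block (human reference: ...TTVRII).
human = "TTVRII"
rows = [
    ",,,,,,",   # panTro (missing)
    "....V.",   # musMus
    "....M.",   # monDom
    "--..K.",   # tetNig
    ".....V",   # oryCun
    "YQ..N.",   # oryLat
]
alphabets = column_alphabets(human, rows)
```

A disease allele whose variant residue never appears in the alphabet of its column is the pattern described in the text for nearly all 49 mutations.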

>CDH23_calMil Callorhinchus milii (elephant shark) dots = missing residues, spaces = exons, right-aligned to cadherin domains
...................... 
...NRLPYFKNYFFDQYFLIYEDTPV                GASITTLLGEDSDHDPLVYGVVGEEASRYFAVESQTGVVWLRQPLDRE ................ VIKRIVNIQVGDVNDNVPIF EC1 
HSQPYSVRILE                       NTPVGTPIYIVNATDADQGTGGSVLYSFQPPSPFFSIDGARGVVTVVKPLDYEITQAYQLQVNAT DQDEIRPLSTLANLAITITDIQDMDPIF EC2 
INLPYSTNIYENSPP GKTVRVITAIDQDRGRPRGIGYTIVS              GNTNSIFALDYISGALILNGPLDRENPLYSSGFILTVR GTELNDDRTPSNATVSTTFNILVIDVNDNPPEF EC3 
NRSQYSVSIPELAQVGFALPLFIQVQDKDD                 .................................................. LFANETASDHVGFARVKINLINENDNRPVF EC4 
SQPLYNVNLFENASVGITVIRVI                           ..................... ...............................................DVNDNVPTF EC5 
.......................                  AIDDDSPPNNQITYSIFNASIQSNYFDISVSEGYG .......................................... ENDNSPTF EC6 
SQTSYIIAVSENIIA GATVLFVTATDLDQSREYGQESMIYSLEGSSQFRINTR                    GEITTTSLLDRETKFEYILIVRAVDGGMGYNQKTGIAT VNITLLDMNDNHPLW EC7 
KDEPYLVNVVEMSPAHTDVIT               VSAFDPDLSENGTVAYTIHPPNRFYHINSTTGKIRTSGAVLDRENMNVRAAEMMRKIIVSVTD .GNPPLRASSSTTVTVNLLDLNDNDPSF EC8 
ENLPFVAEVPEGLTAASSVFQ                      VLAVDPDENLNGLVTFTMQVGMPRLDFIMNTTTGLITSTALLDREKIAEYYLRIIASDAGMVPRSSTSTLTVR ..DVNDETPTF EC9 
........................................                    ...................................... DNGPAGSRRTGTATVYIRVLDVNDNRPIF EC10 
LQNTYEASVPENITMSTSILQ ....................                    .................................................. VILYVEDVNDESPVF EC11 
TQQQYSRLGLRETAGIGTSVNVVRATDKDT                       GDGGMVAYRILAGSERKFAIDESTGLITTIDYLDYETRTNYLMNISATDQAAPFNRGYCTVYITLMNELDEPVQF EC12 
TNATYEVTLMENIATGTDVIQIHAQSADIMNQITYRFDPDTSALALSLFQINRVT                       GIITVRGQVDRERGDFYLITVIADNGGPRKDSTV ......DENDNSPRF EC13
...................................                  VNYSILAGNTNGAFRIRTTNNSRGEVYIAKLLDREWISRYVL. ......................DINDNPPVI EC14
............                    ............................................................................ VTTMVFITILDENDNYPVF EC15
RQQLYEITLDEGPLTLYSFNITVNATDQDEGLNGTISYSILEGNIGNTFVI....                 GLIKAIKELDYEISHGRYTLIVAAIDQCADRERRLTSTTT ............PTF EC16
ARSYEGPFDITEGQPGPRVWTFIAADKDAGPNGQVEYSVIAGDPL                     EFIISPVDGELRVKRDAELDRETLSFYNITIMGKDRGTP..... ........DINDNDPVL EC17
.....................................................                   SGSVYVNRPLDRERTEEYRLTITVKDNPENPRHARR DSDFLIISISDENDNRPIF EC18 
TQNTYQAEILENSRA GTQITLVNGPIMAHDRDEGPNGVVSYQLLGNRTDFFTIDRTS                 ............................................DDNDNRPTF EC19
...............                      GFSVVQVKATDADIGSNQQLNYRIEGGAQDKFIIDLLTGVIKVANGVTVDREERESYTLIVIAMDRGIPSLSASATVNIIIDDVNDYRPDF EC20
INPIQTVSVSESAAIGTIIANVTAIDQDLRPRLEYYIIELVAKDDTDAVIQDQQRAFGINFET        ....................................... AKLTVNVQDVNDNAPRF EC21 
RPFGVKAFTERILEGATAGTTLISVTAVDSDRGLNGRIIYELLNLPTGGYIRLEDPAA                    ...ANRTVDYEEVHWLNFTVRAVDHGTPPRSVQVPVNLQIVDINDNNPVF EC22 
VQSSYQ ..VFEDVSLGTVIIQVSATDADYGSFALIEYSLVDGEGRFGINPTT                .DIYILSALDREKKDHYTLTAVAKDNPEGNPSKRRENSVQ ........NDYRPQFSKR EC23 
EFSTSVYENEPGGASVITMTARDLDEGENAVLQFSVEGPGA                             DAFRMDGDTGLITTSRLLQSHEMFNLTVVATDNGRPPLWGTTLLKVDVIDVNDNRPVF EC24 
VRPPNGTILHIME                .......VYEVLAVDSDEGLNGAVRYSFLKTGVNKDWEYFNIDSLSGIINTTIRLDREKQPVYN LILLAYDLGHPVPYETTQLLQVALDDIDDNEPSF EC25 
LKPP RGRFQYQLLSVPEHSRPGVIVGNVTGAVDADEGPNAVVYYFIA AGDPDRNFQLNRNGLLKILKDLDREINPYYSITVKASSNRNWSPARSARRNRGFNLSTDLTLQEVRIYLEDINDQPPRF EC26 
LKPEYTA GVAADAKVGSELIKVDAYDADIGNNSIVYYQILNIRYIKLQSNNSEEVGNVFII  GEKSGIVRTFDLFTAYSPGYFVVEIMVSDLAGHNDTAVIGIYILRDDQRVKIVINEIPERVREF EC27
GEQFIKLLSNITGAIVNTDDVQ FHVDKKGRVNFAQTNVLIHVVNRETNQILDVER VIQMIDENKEQLRNLFRNYNVLDVQPAVTVRPADDMTALQ ........................... 
............. GNQGFMDILDMPNTNKYSFEG ANPVWLDPFCRNLELAAQAEHEDDLPENLSDITDLWNSPARTH GTFGREPVTSKPEDDRYLRAAIQEYDSITKLGQIMREGPIK 
GSLLKVVLDDYLRLKKLFAARLAHKSTGSGDQSSLTE LIPSDMDED  DEKPMSRGTLRLKHRHPVEFKGPDGIHVVHGSTGTLLASDLNSLPEDDQKVLVRSLETLNTDSGSYNDRNARTESAKSTPMHKIKETIMDAPLEITEL*

>PCDH15_homSap Homo sapiens (human)
MFRQFYLWTCLASGIILGSLFEICLG QYDDDCKLARGGP
PATIVAIDEESRNGTILVDNMLIK                         GTAGGPDPTIELSLKDNVDYWVLMDPVKQMLFLNSTGRVLDRD     PPMNIHSIVVQVQCINKKVGTIIYHEVRIVVRDRNDNSPTF	EC1  R134G
KHESYYATVNELTPVGTTIFTGFSGD                NGATDIDDGPNGQIEYVIQYNPDDPTSNDTFEIPLMLTGNIVLRKRLNYE    DKTRYFVIIQANDRAQNLNERRTTTTTLTVDVLDGDDLGPMF	EC2  D178G G262D
LPCVLVPNTRDCRPLTYQAAIPELRTPEELNPIIVTPP IQAIDQDRNIQPPSDRPGILYSILVGTPEDYPRFFHMHPRTAELSLLEPVNRD       FHQKFDLVIKAEQDNGHPLPAFAGLHIEILDENNQSPYF	EC3  
TMPSYQGYILESAPVGATISDSLNLTSPLR               IVALDKDIEDTKDPELHLFLNDYTSVFTVTQTGITRYLTLLQPVDRE         EQQTYTFSITAFDGVQESEPVIVNIQVMDANDNTPTF	EC4
PEISYDVYVYTDMRPGDSVIQ                           LTAVDADEGSNGEITYEILVGAQGDFIINKTTGLITIAPGVEMI    VGRTYALTVQAADNAPPAERRNSICTVYIEVLPPNNQSPPRF	EC5
PQLMYSLEISEAMRVGAVLLN                             LQATDREGDSITYAIENGDPQRVFNLSETTGILTLGKALDRE        STDRYILIITASDGRPDGTSTATVNIVVTDVNDNAPVF	EC6
DPYLPRNLSVVEEEANAFVGQ                             VKATDPDAGINGQVHYSLGNFNNLFRITSNGSIYTAVKLNRE       VRDYYELVVVATDGAVHPRHSTLTLAIKVLDIDDNSPVF	EC7
TNSTYTVLVEENLPAGTTILQ                            IEAKDVDLGANVSYRIRSPEVKHFFALHPFTGELSLLRSLDYE   AFPDQEASITFLVEAFDIYGTMPPGIATVTVIVKDMNDYPPVF	EC8
SKRIYKGMVAPDAVKGTPITT                     VYAEDADPPGLPASRVRYRVDDVQFPYPASIFEVEEDSGRVITRVNLNEE        PTTIFKLVVVAFDDGEPVMSSSATVKILVLHPGEIPRF	EC9
TQEEYRPPPVSELATKGTM                             VGVISAAAINQSIVYSIVSGNEEDTFGINNITGVIYVNGPLDYETRTSYVLRVQADSLEVVLANLRVPSKSNTAKVYIEIQDENNHPPVF	EC10
QKKFYIGGVSEDARMFTSVLR                       VKATDKDTGNYSVMAYRLIIPPIKEGKEGFVVETYTGLIKTAMLFHNMRRSYFKFQVIATDDYGKGLSGKADVLVSVVNQLDMQVIVSNVPPTL	EC11
                                                     VEKKIEDLTEILDRYVQEQIPGAKVVVESIGARRHGDAFSLEDYTKCDLTVYAIDPQTNRAIDRNELFKFLDGKLLDINKDFQPY	.... Q1342K

>PCDH15_calMil Callorhinchus milii (elephant shark) dots = missing residues, spaces = exons
.............................. DCKLARSRP
PATIVAIDEESRN                          GTLLVDNMQIKGTAGGPSPTIALLLRDNHDFWVILDAVNQRLYLNSTGRVLDRD     ...............NLQVGSIINHEIRIIVRDRNDNSPQF	EC1
QHPQYYVAINE               LTPIGTTIFTGFSRNNGAIDIDDGPNRQIEYVIRQNPDDP  .SSKTFDIPLTLSGAVILRERLNYEEKTRYYVLVQAN    DRAPNPNDRRTSTTTLTVDVLDGDDLGPMF	EC2
LPCKLVNNTRDCRPLTYKASVPELTDP .RINPVNVTPPILAVDQDRNIQPADDRPGILFSILI GNPEDYANYFFLNNTTAELHLLQPINRDLHQKFDLVIK       AEQDNRHPLPVFANLHIEVLDENNQGPYF	EC3
ARTTYQGFIVESAPVGTSLSDSKNLTMALKIVALDNDVEE                TKDPHLRISVNDFSTVFGVT.TGITRYLTLLKPVDRERQQNYTFT         MTASDGVQESTSVIVHVFVIDANDNTPTF	EC4
HNISYSVDIYNDMRPGETVIQ                               ................................................................. NSITTVYIEVLPPNNQSPPRY	EC5
PQMLYNLEVSEAMRIGATLLSLQ                                    ATDREGDPITYRIQNGDPHQVFNITQ. ............................... .................PIF	EC6
DPIFLRNFTVLEEEANALVGQVR                                     AADADAGMNGQIRYSLGNYNGIFRITSNGSIYTTTPLDRETQDAYDLIVEASDGAVDPRRSTLTVPVRVLDIDDNSPIF	EC7
SQSSYTVTLPENNSPGIVILQLK                               ........................................................................... DMNDHTPEF	EC8
SKPIYKGMVAPDALKGTLIATVSAEDKDPP                     GGPASRIRYKVDMAHTQYSASIFDVEEDTGRVLTRVNLNEEPSSIFE         .VVIAFDDGEPVKFNSTTVVITVLQPSVIPRF	EC9
TQEEYR PPPVSEGARIGTTVVTVTAAAINQTIVYSIIAGND.                            ..FKIDERTGVITTNNTLDYETTKSFEIRIQADPLQLVRSNLRVPSR ANTAKVFIEVQDENDHAPVF	EC10
TRPLYLGGVAEDAKTFTSVLQVE                        .............RLIIPPSKDGKDGFVIEAYTGLIKSAITFRNMRRSYFKFEVIATDNYGTGLSSSAQVV ....................	EC11
.......... ........................................................... FLDGKLLDINKEFQSYLGQGGRILEIRTPDVVSNVKKQAQAVGYTEG.....................

Final assessment of CDH23 allele L1122V

Note that L1122V will test out as benign at PolyPhen and SIFT because it is a common conservative substitution with comparative genomics support in fish that is not de-weighted for phylogenetic remoteness. The comparative genomics in these tools is limited to sequences at SwissProt and does not incorporate phylogenetic relations; consequently they miss the stable L-->V transition at the level of the tetrapod ancestor and all its descendant clades. Much is known about cadherin domains and their stability but that literature is not used. For example, with 27 cadherin domains within the same molecule, the mutational record at homologous residues is likely informative. Consequently, such initial screening tools do not utilize a significant part of the available information and are not authoritative here.

The evidence discussed above suggests a moderately adverse auditory outcome for the V1122V homozygote without syndromic retinitis pigmentosa, similar to but milder than non-syndromic L0480Q, which also occurs 6 residues back from a DxD motif. Offsetting this, weakly aligning linker regions 6, 10 and 15 have valine (as seen in the difference alignment above). The change is unlikely to improve CDH23 performance in hair cell tip links because leucine in this position has been stable for billions of years of branch length despite presumed opportunities for change (indeed residues around it have experienced change and fixation in the population). If valine were neutral, it would be part of the reduced alphabet at this position by now.
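
The positional comparison with L0480Q can be checked with a short sketch that measures the distance from a residue to the next downstream DxD motif. The function name offset_to_next_dxd is introduced here for illustration. The L0480 context is taken from the disease allele table; the L1122 context joins the end of spacer 11 to the start of cadherin domain 11, and it is an assumption of this sketch that residue 1122 is the leucine marked by the asterisk in the consensus line of the difference alignment (the L of ...HSIL).

```python
import re
from typing import Optional

# Calcium-binding DxD motif: aspartate, any residue, aspartate.
DXD = re.compile(r"D.D")


def offset_to_next_dxd(sequence: str, index: int) -> Optional[int]:
    """Residues between position `index` and the start of the next DxD motif."""
    match = DXD.search(sequence, index)
    return None if match is None else match.start() - index


# L0480 with its downstream context from the disease allele table.
l0480_context = "LTVLATDNDA"
l0480_offset = offset_to_next_dxd(l0480_context, 0)

# End of spacer 11 joined to the start of cadherin domain 11.
l1122_context = "EGHSILQLKATDADEGEF"
l1122_index = l1122_context.index("SIL") + 2   # assumed position of L1122
l1122_offset = offset_to_next_dxd(l1122_context, l1122_index)
```

Under that assumption both leucines sit the same distance upstream of their DxD motif, which is the correspondence the assessment relies on.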

However, CDH23 is expressed at many other cellular sites, including but not limited to the retina. Perhaps gain in functionality elsewhere could offset a slight loss of optimality in tip links. However, compensation by an unseen allele in another gene seems unlikely given the position of 1122 in a linker domain. The bottom line is that only clinical observation of older homozygotes and mode of familial inheritance of effects can resolve the impact of this allele.