Personal genomics: ACTN3

From genomewiki
Revision as of 13:12, 10 December 2008 by Tomemerald (talk | contribs)

Introduction to ACTN3 comparative genomics

The alpha actinin gene ACTN3 is a coding gene on human chromosome 11, quite interesting in its own right but best known as ground zero in the debate over the frivolity and unexpected consequences of personal genomics. Before that controversy can be considered, the gene first needs careful and exhaustive re-annotation, because its existing peer-reviewed scientific literature (some 22 papers) is a mixture of pre-genomic era obsolescence and gross factual errors, such as the claim that expression is specific to skeletal muscle.

Some unfortunate historic terminology needs to be explained. Actinins were erroneously thought similar to actin in early studies of myofibrillar components; instead they are non-homologous proteins that happen to bind actin. These 'actinins' were then improperly divided into 3 classes (alpha, beta, gamma) before it became known that their respective gene families were wholly unrelated (not homologous). For example, 'beta' actinins refer to heterodimers functioning as actin barbed-end capping proteins in skeletal muscle; they are composed of products of the distinct gene families CAPZA and CAPZB, themselves non-homologous and thus further misnamed. 'Gamma' actinins refer to yet other unrelated genes.

In this article, actinin shall mean alpha actinin, ie a protein encoded by one of the four paralogous genes ACTN1-ACTN4. Gene names are used for both gene and gene product (the meaning is always clear from context). Genus and species are indicated with the standard 6-letter code (eg ACTN3_homSap). Care must still be taken with published articles and GenBank entries, which may fail to specify the alpha actinin under consideration despite a 2003 article calling for adherence to the HGNC international nomenclature standards followed here.
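The species suffix can be generated mechanically from a binomial name. The following is a minimal sketch, assuming the convention is the first three letters of the genus in lower case followed by the first three letters of the species with an initial capital (which reproduces homSap); the function names are illustrative and do not belong to any genomewiki or UCSC tool.

```python
def six_letter_code(binomial: str) -> str:
    """Build a genus/species code such as 'homSap' from 'Homo sapiens'."""
    genus, species = binomial.split()[:2]
    return genus[:3].lower() + species[:3].capitalize()


def tag_gene(gene: str, binomial: str) -> str:
    """Label a gene symbol with its species code, e.g. 'ACTN3_homSap'."""
    return f"{gene}_{six_letter_code(binomial)}"


print(tag_gene("ACTN3", "Homo sapiens"))
print(tag_gene("ACTN3", "Ornithorhynchus anatinus"))
```

The second call yields the ornAna (platypus) code used later in this article.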

The comparative genomics situation is further confused by high sequence conservation within the ACTN gene family, by paralog loss in some clades, by possible independent duplication events, and by the presence of only pre-duplication parental genes in early-diverging deuterostomes. It is not easy to assign transcripts or genomic fragments to the correct orthology class by methods such as best reciprocal BLAST, especially when the query itself is a fragment (eg third spectrin domain of ACTN3). Many GenBank entries are unlabelled, mislabelled, or ambiguously labelled as to correct ortholog family.

However, a reliable actinin classifier can be built by requiring flanking gene synteny together with diagnostic signature residues and indels when assembling the seed collection of reference sequences, focusing on signature regions in which the ortholog classes differ significantly from each other. For example, ACTN2/3 share a five-residue deletion in exon 19 relative to (ancestral-length) ACTN1/4.
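The indel criterion lends itself to a simple automated check. The sketch below assumes the query has already been aligned to an ancestral-length ACTN1/4 reference across exon 19, with gaps written as '-'. The function names are hypothetical and the sequences are artificial placeholders (the amino acid alphabet in order), not actinin sequence. It covers only the indel signature; synteny and signature residues would need separate tests.

```python
import re


def longest_internal_gap(aligned: str) -> int:
    """Longest run of '-' inside the alignment, ignoring terminal gaps
    (terminal gaps usually just mean the query is a fragment)."""
    runs = re.findall(r"-+", aligned.strip("-"))
    return max((len(run) for run in runs), default=0)


def indel_class(aligned_query: str, signature_len: int = 5) -> str:
    """Classify a query by the exon 19 deletion signature."""
    gap = longest_internal_gap(aligned_query)
    if gap == signature_len:
        return "ACTN2/3-like"   # carries the five-residue deletion
    if gap == 0:
        return "ACTN1/4-like"   # ancestral length
    return "unresolved"         # some other indel: inspect by hand


# Placeholder alignments, NOT real actinin sequence.
print(indel_class("ACDEFGH-----PQRSTVWY"))
print(indel_class("ACDEFGHIKLMNPQRSTVWY"))
print(indel_class("---EFGHIKLMNPQRSTV--"))
```

Returning "unresolved" rather than forcing a call keeps fragments and misassemblies out of the reference seed collection.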

The human ortholog of ACTN3 is unusual (but not unique) in having a fairly abundant null allele, R577x, meaning the arginine at position 577 of the 901-residue protein has been replaced by a stop codon. Worldwide, 18% of the human population is homozygous 577x/577x. It is very unlikely that a functional truncated protein can be produced: even if the promoter region is still functional, the mRNA would be degraded by nonsense-mediated decay (with no possibility of selenocysteine substitution), and the stable quaternary dimer necessary for function cannot form (as explained below).

It has not been established whether 577x was the initial inactivating mutation, as 3 additional amino acid changes (Q523R, R628C, R776Q) have also accrued at otherwise invariant sites in this allele (ie, in the DNA donor to the public human genome relative to GenBank reference sequence NM_001104). The latter two substitutions are also at CpG mutational hotspots (the entire mRNA has 131 such sites). It is not known whether these other changes became widespread before or after R577x, nor whether they affect ACTN3 function. However, it is not easy to inactivate a large structural protein composed of independent modules by single substitutions.
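The link between CpG hotspots and these particular changes can be checked against the standard genetic code. The sketch below enumerates the four arginine codons that begin with CG and lists what a CpG transition produces: C>T on the coding strand (CG to TG), or G>A when the deamination occurs on the opposite strand (CG to CA). It does not consult the actual NM_001104 sequence, and the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def cpg_transitions(codon: str) -> dict:
    """Outcomes of a CpG transition at codon positions 1-2.

    Only codons starting with CG are considered; a CpG that spans
    position 3 and the next codon is outside this sketch.
    """
    if codon[:2] != "CG":
        return {}
    coding_strand = "TG" + codon[2]    # C>T at position 1
    opposite_strand = "CA" + codon[2]  # G>A at position 2
    return {
        coding_strand: CODON_TABLE[coding_strand],
        opposite_strand: CODON_TABLE[opposite_strand],
    }


for codon in ("CGA", "CGC", "CGG", "CGT"):
    print(codon, cpg_transitions(codon))
```

Under this model an arginine can become a stop codon only from CGA, cysteine from CGC or CGT, and glutamine from CGA or CGG, so R577x, R628C and R776Q are all reachable by a single CpG transition, whereas Q523R is not.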

Curiously, Q523R, R577x, and R628C all occur in the third spectrin repeat despite this region constituting only 11% of the gene, yet not a single synonymous base change has occurred in the 2706 bp coding region. With the advent of HapMap and similar projects, the phenotypic associations of these changes, their possible co-occurrence with wildtype R577, and the date(s) of 577x founder mutations could be resolved.
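How curious the clustering is can be gauged with a rough binomial calculation. This sketch assumes that each of the four coding changes in the allele lands independently and uniformly along the gene, so that each has an 11% chance of falling in the third spectrin repeat. That assumption is certainly too simple (CpG hotspots and selection both violate it, and the region was singled out after the fact), so the result is only an order-of-magnitude guide.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 4 coding changes in the allele; 3 fall in a region covering 11% of the gene.
p_cluster = prob_at_least(3, 4, 0.11)
print(f"P(at least 3 of 4 in the repeat) = {p_cluster:.4f}")
```

Under these assumptions the chance is roughly half a percent.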

All mammals with assembled genomes encode a CpG hotspot at codon 577. This has transitioned to TpG in the human 577x allele but is not a polymorphic site in any other known mammal, though the search has been restricted to the individual animals used in genome projects (since transcripts rarely extend this far), plus 36 unrelated baboons and 33 chimpanzees all genotyping to ‘wild-type’ 577R. Thus there is no support for 577x as a balanced polymorphism in any mammal other than human, even though Z-line skeletal muscle structures may be very similar.

ACTN3 alleles.jpg

With regard to the supposed evolutionary advantage of various allele combinations (proxied by sprinting or endurance sports prowess), humans remain slow and weak relative to other mammals regardless of their codon 577 status. The fastest human sprinter cannot outrun a dog with heartworms, much less a rabbit or chubby grizzly bear. A wild male chimp -- without any training or drug enhancement -- has the strength and aggression to rip apart the fittest human cage fighter.

Complete loss of ACTN3 does not give rise to a disease state or even an observable phenotype in humans, despite a fallacious initial association with dystrophinopathy. Double knockouts of the orthologous gene in mouse are quite viable but exhibit various measurable effects. Over evolutionary time, the gene has been lost by natural genomic deletion in chicken and finch without known impact, yet retained in lizard, snake, and frog and even doubled in zebrafish (but not other ray-finned fish), again without known effect.

This suggested to some that human ACTN3 had an inessential or inconsequential physiological role to begin with (or has become so since divergence from chimps), with its loss readily compensatable by other genes (presumably the 80% identical paralog ACTN2, with which it forms a heterodimer). In this view, ACTN3 may be on its way out the door, to disappear over time as a biallelic pseudogene. Given that over 80 other human genes have been lost since chimp divergence, some of them very conserved over long evolutionary spans, it would not be surprising to catch a loss-in-progress.

The primary hurdle to clear here is the extraordinary conservation (proteome: 90th percentile) over 450 million years of ACTN3 amino acid sequence. It appears that this gene arose by segmental duplication from ancestral ACTN2 after the divergence of chondrichthyes (where all matches are ambiguous with respect to ACTN2/3). Gene duplicates are not usually retained over such a time frame unless they have a distinct functional niche that provides selective protection from constantly accruing deleterious mutations ('use it or lose it').

It is quite possible that earlier loss of a companion gene essential to the functional chain of ACTN3 has triggered its subsequent degeneration. The skeletal myosin gene MYH16 is potentially an attractive candidate, though actinins do not bind myosins directly but rather via actin. If ACTN3 functionality overlapped with ACTN2 apart from a critical role involving MYH16, then the loss of the latter gene would leave ACTN3 with no role that could not be compensated for by ACTN2. MYH16 was also lost to an internal stop codon after a half-billion years of conservation following its separation from other myosins.

Another explanation is balanced polymorphism ('somewhat less is more') along the lines of the allele proportions maintained in sickle-cell hemoglobins (heterozygosity can be selectively advantageous in malarial resistance). The idea here is that human groups benefit if some individuals have exceptional speed or endurance.

For ACTN3, frequencies of the three possible diploid states (R577/R577, R577/577x, 577x/577x) vary by ethnic group and supposedly correlate -- imperfectly but predictively -- with athletic prowess. While it is preposterous to treat a single locus as determining a complex and vaguely specified phenotype, this idea has nonetheless gained traction in the popular mindset.
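The three diploid frequencies are tied together by the allele frequency. As a worked example, the sketch below takes the 18% worldwide homozygous-null figure quoted in the introduction and derives the other two genotype frequencies, assuming Hardy-Weinberg equilibrium in a single randomly mating population. A pooled worldwide sample does not strictly satisfy that assumption, so the numbers are indicative only.

```python
import math


def hardy_weinberg(hom_null_freq: float) -> dict:
    """Genotype frequencies implied by a homozygous-null frequency,
    assuming Hardy-Weinberg equilibrium at a two-allele locus."""
    q = math.sqrt(hom_null_freq)  # frequency of the 577x allele
    p = 1.0 - q                   # frequency of the R577 allele
    return {
        "R577/R577": p * p,
        "R577/577x": 2 * p * q,
        "577x/577x": q * q,
    }


freqs = hardy_weinberg(0.18)
for genotype, f in freqs.items():
    print(f"{genotype}: {f:.3f}")
```

Under these assumptions the 577x allele frequency is about 0.42, with roughly a third of people homozygous for R577 and just under half heterozygous.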

Correlations per se have a poor track record in establishing causality -- for example, a low IQ might also correlate quite strongly with interest and participation in sports. Perhaps R577x (ACTN3 is in fact expressed in brain) merely contributes to low IQ.

Phenotypic effects of ACTN1, ACTN2, and ACTN4 loss

Total absence of ACTN3 in human and double knockout mice does not result in genetic disease. What about loss of its three paralogs?

ACTN4 is the only member of the actinin gene family mapped to a human disease, focal segmental glomerulosclerosis (FSGS1). This is an autosomal dominant toxic gain of function: any of 5 distinct mutations (eg W59R, I149del, K255E, R310Q and V801) causes the protein to bind too tightly (a persistent switched-on mode) to the actin cytoskeleton in renal glomerular podocytes (visceral epithelial cells of the kidney involved in filtration). This eventually leads to renal failure.

ACTN4, however, is also widely expressed, with a variety of roles (motility, adhesion, endocytosis). Its two amino-terminal calponin homology domains crosslink actin filaments (F-actin) as regulated by Ca2+. Certain substitutions divert its usual localization away from actin stress fibers and focal adhesions. The question is whether paralogous mutations in the calponin homology domains of other actinins would have comparable effects. That could be studied in ACTN3 knock-in mice to help determine its normal functions.

The structure of the K255E protein shows that the calponin domains remain in compact configuration despite disruption of the inter-calponin bridge to Trp147 of CH1.

ACTN4 mutations.jpg


Less is known about disease alleles of ACTN2. A single report describes dilated cardiomyopathy (CMD) seemingly attributable to a heterozygous missense mutation at a conserved residue:

1. Mohapatra, Jimenez, Lin. Mutations in the muscle LIM protein and alpha-actinin-2 genes in dilated cardiomyopathy and endocardial fibroelastosis. Molec. Genet. Metab. 80: 207-215, 2003.


ACTN3 transcription IS NOT restricted to skeletal muscle

(section to be added shortly)

R577x as ACTN3 phyloSNP

An alignment of exon 15 (which comprises about half of the third spectrin domain) shows that codon 577 was ancestrally K (lysine) in early vertebrates, a residue persisting without exception to the present day in all extant vertebrates diverging before ornAna (platypus). In the mammalian stem, K was replaced by R (arginine) and that residue persisted in mammals (26/26 species). This is the definition of a phyloSNP: a clade-defining synapomorphy that replaces a persistent ancestral state, and whose conservation both before and after the change implies structural and/or functional significance in both the clade and the rest of the tree, though in different roles. Evidently residue 577 has long been significant in the third spectrin domain of ACTN3, implying that almost any change in humans is disadvantageous.
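The phyloSNP criterion amounts to a simple test on one alignment column. The sketch below is an illustration, not the procedure used to build the alignment: the function name is hypothetical, and the toy column holds only a handful of species codes set to the K/R pattern described above rather than the full 26-mammal alignment.

```python
def is_phylosnp(column: dict, clade: set) -> bool:
    """True if the column has one invariant residue inside the clade and a
    different invariant residue everywhere outside it.

    column maps species code -> residue at the aligned position;
    clade is the set of species codes belonging to the clade.
    """
    inside = {res for sp, res in column.items() if sp in clade}
    outside = {res for sp, res in column.items() if sp not in clade}
    return len(inside) == 1 and len(outside) == 1 and inside != outside


# Toy column for codon 577 (illustrative subset of species).
codon_577 = {
    "homSap": "R", "musMus": "R", "ornAna": "R",  # mammals
    "anoCar": "K", "xenTro": "K",                 # lizard, frog
}
mammals = {"homSap", "musMus", "ornAna"}

print(is_phylosnp(codon_577, mammals))

# Scoring human by the 577x allele instead breaks invariance within the clade.
print(is_phylosnp({**codon_577, "homSap": "*"}, mammals))
```

The second call shows why the human null allele stands out: it is the only known departure from an otherwise invariant mammalian residue.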

The non-mammalian sequences below had to be individually established as orthologs of the human ACTN3 gene using retained flanking synteny of neighboring genes because the UCSC comparative genomics track misaligns paralogs here. The retained synteny is not always two-sided and in some cases inversions have resulted in loss of immediate adjacency.

K577R mamm phyloSNP.jpg
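The synteny test used to establish orthology for the non-mammalian sequences can be sketched as follows. The gene names are placeholders (geneA to geneD), not the actual neighbors of ACTN3, and the function names are hypothetical. Flanks are compared as sets so that a local inversion, which swaps or reorders neighbors, still counts as support, and a one-sided match is reported separately from a two-sided one.

```python
def synteny_support(left, right, ref_left, ref_right):
    """Count reference flanking genes found on each side of a candidate.

    left/right: gene names flanking the candidate in the query genome.
    ref_left/ref_right: gene names flanking the reference ortholog.
    """
    reference = set(ref_left) | set(ref_right)
    return len(reference & set(left)), len(reference & set(right))


def synteny_call(left_hits: int, right_hits: int) -> str:
    """Summarize support as two-sided, one-sided or none."""
    if left_hits and right_hits:
        return "two-sided"
    if left_hits or right_hits:
        return "one-sided"
    return "none"


# Placeholder neighborhoods, NOT real gene order.
ref_left, ref_right = ["geneA", "geneB"], ["geneC", "geneD"]

inverted = synteny_support(["geneD", "geneC"], ["geneB", "geneA"],
                           ref_left, ref_right)
partial = synteny_support(["geneX", "geneB"], ["geneY", "geneZ"],
                          ref_left, ref_right)
print(synteny_call(*inverted), synteny_call(*partial))
```

A candidate with no synteny support on either side would be left out of the ortholog set, however good its sequence match.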

R577x and co-evolution of actinin spectrin repeats

(section to be added shortly)