IMPDH duplication and CBS domain
Introduction to the IMPDH Gene Family
Inosine 5'-monophosphate dehydrogenase (IMPDH) is a highly conserved, ubiquitous enzyme that catalyzes the first step unique to GTP synthesis. It plays a key role in de novo synthesis of nucleic acids and in maintaining the intracellular balance of A and G nucleotides, and hence in regulating cell cycle pauses, cell growth, differentiation, apoptosis, and signalling (for example via cGMP).
Vertebrates in general have two very close paralogs, constitutively expressed IMPDH1 and inducible IMPDH2, otherwise indistinguishable in catalytic activity, substrate affinities, tetrameric structure, and interaction with inhibitors. IMPDH2 is greatly up-regulated in proliferating cells, notably activated leukocytes and tumor cells. These two genes lie on different autosomal chromosomes but share 12 identically placed and phased introns, suggesting segmental duplication. That duplication took place after lamprey divergence but prior to chondrichthyan divergence; the deuterostome chain of events should not be conflated with independent duplications in other species (e.g. yeast and cnidarian) that may not offer valid structural, expressional, or functional parallels.
IMPDH genes are unusually conserved, at or above the 95th percentile of all human genes. Remarkably, there is no drift of amino or carboxy termini, nor is any indel of any length tolerated in any species over many billions of years of branch length. That conservation extends to both members of the IMPDH duplication; for example human IMPDH1 is still 84% identical and 91% similar to IMPDH2 some 500 million years after the duplication (without maintenance from gene conversion). More typically, one gene copy subfunctionalizes or neofunctionalizes after a duplication event and so diverges considerably more rapidly to optimize to that changed role. Here divergence may have been concentrated in upstream regulatory regions relevant to constitutive vs inducible expression.
Since structural features are maintained to 20% identity and less, very high correspondence can be expected between and across IMPDH proteins, even comparing remote clades. Thus small molecule effectors of activity will in general not discriminate well between paralogs in a given species. If IMPDH2 were targeted but IMPDH1 were also impacted, that could result in chemotherapeutic toxicity. However some degree of discrimination is possible since mycophenolic acid (MPA) has higher affinity for IMPDH2.
Curiously, IMPDH has two further remote paralogs in GMPR and GMPR2. These 9-exon guanosine monophosphate reductases catalyze the preceding reaction in the pathway, one of very few instances of support for the 1945 Horowitz retrograde theory of the origin of metabolic pathways. Their intronation pattern could illuminate the relative timing of the gene duplications involved -- if the intron positions are totally unrelated to those of IMPDH, then the gene duplication took place in prokaryotes or very early unicellular eukaryotes prior to the main intronation era.
IMPDH1 in human has generated 11 quite recent processed pseudogenes, indicating significant germ line expression. These have integrated into the genome at seemingly random locations, most recently in a gene desert in chrX p11.4. None of these appear to have retained functionality, though they still translate to percent identity in the high 80's, either indicative of a burst at some point in the phylogenetic tree (ie a shift in expression after a divergence) or simply better detectability of less decayed events. It emerges that rampant steady pseudogene production is restricted to old world primates (based on two dozen available genomes).
IMPDH2 has not generated a single pseudogene. Consequently specific targeting of IMPDH2 would not be expected to affect germ line nucleotide metabolism.
CBS domain
The CBS domain is an ancient non-catalytic paired region found in a wide range of otherwise non-homologous genes. These fusions are all very old; the CBS domain is not a mobile domain today. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. In IMPDH, the domain comprises part of exons 5,6,7 but does not correspond cleanly to exon boundaries, which fits the domain being appended to the catalytic domain in a common ancestor with prokaryotes prior to establishment of introns. GMPR and GMPR2 lack the CBS domain.
The CBS region can be deleted in its entirety without affecting catalytic properties or the ability to form homotetramers. However R224P and D226N mutations at the end of the CBS domain affect deeply invariant residues and result in retinitis pigmentosa (through a complex chain of events not understood). IMPDH1 may control the rate-limiting regulated step in regeneration of cyclic GMP needed intensively in GPCR photoreceptor signalling -- the bulk of GTP within photoreceptors is generated by IMPDH1 (found predominantly in retinal inner segment and synaptic termini). However this role would be relevant solely to ciliary opsins, meaning rhabdomeric photoreceptors in Drosophila (and other arthropods) could not serve as model systems for RP10 retinitis pigmentosa.
A comparative review of CBS domains in otherwise non-homologous proteins (IMPDH, cystathionine beta synthase, AMP kinase gamma2 subunit PRKAG2, chloride channels) suggests a common function: ATP (or related) binding as cooperative allosteric effector of catalytic activity, possibly in part through indirect regulation of oligomericity. The rate-limiting step in a pathway is a logical place for such a sensor of conditions. Mutations in the CBS domain of these proteins also can initiate disease: homocystinuria (I435T, D444N, and S466L abolish AdoMet activation), familial hypertrophic cardiomyopathy (H142R causes insensitivity to energy depletion signals -- high AMP, low ATP), and 5 CLC chloride channel paralogous conditions.
By cross-aligning mutations in these other genes onto the CBS domain of IMPDH1, residues important to CBS ligand specificities (and avoidance of toxic CBS cross-reactivity) can perhaps be identified. For example, H142R of AMP kinase transmaps to an exceedingly conserved glycine in the SKKGKLPIV region of the CBS domain of human IMPDH1. Because the CBS domains are so anciently diverged, sequence-based mappings are barely possible and need to be affirmed by structural alignment. (The cystathionine beta synthase mutation numbering is offset with respect to RefSeq.)
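The coordinate bookkeeping behind such a transmapping can be sketched as follows. This is a minimal illustration only: it assumes a pairwise alignment is already in hand (from a sequence or, preferably, structural aligner), and the function name `map_position` and the toy alignment rows are hypothetical, not a real AMP kinase / IMPDH1 alignment.

```python
from typing import Optional, Tuple


def map_position(aligned_src: str, aligned_dst: str, src_pos: int,
                 gap: str = "-") -> Optional[Tuple[int, str]]:
    """Map a 1-based residue number in the source protein to the 1-based
    residue number (and residue letter) it is aligned with in the
    destination protein. Returns None if it is aligned to a gap."""
    if len(aligned_src) != len(aligned_dst):
        raise ValueError("alignment rows must have equal length")
    src_count = 0  # residues seen so far in the source row
    dst_count = 0  # residues seen so far in the destination row
    for a, b in zip(aligned_src, aligned_dst):
        if a != gap:
            src_count += 1
        if b != gap:
            dst_count += 1
        if a != gap and src_count == src_pos:
            return None if b == gap else (dst_count, b)
    raise IndexError("src_pos lies beyond the end of the source sequence")


# Toy alignment rows (hypothetical): a gap in each row shifts the numbering.
src_row = "AC-DEF"
dst_row = "ACGD-F"
print(map_position(src_row, dst_row, 3))  # source residue 3 (D)
print(map_position(src_row, dst_row, 4))  # source residue 4 (E) faces a gap
```

Numbering offsets such as the one noted for cystathionine beta synthase would need to be applied to `src_pos` before the lookup.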
The ability of IMPDH to bind single stranded nucleic acid (not including cognate mRNA) is likely just a byproduct of the ability of the CBS domain to bind mononucleotides. This additional capability could still have utility in various assays of observed amino acid polymorphisms.
IMPDH1 knockouts in mice exhibit only slow retinal degeneration; the rapid course of RP10 in CBS mutations in humans has been attributed to IMPDH1 misfolding and aggregation also seen with MPA (and alleviated by elevated GTP). Toxic gain of dysfunction would account for autosomal dominance perhaps better than partial loss of function in a heterozygote. In this scenario, suppression of transcript translation from the bad allele could have value in RP10 in conjunction with elevated guanosine.
Five different IMPDH1 variants, Thr116Met, Asp226Asn, Val268Ile, Gly324Asp, and His372Pro, were identified in eight autosomal dominant RP families. Two additional IMPDH1 variants, Arg105Trp and Asn198Lys, were found in two patients with isolated LCA. None of the novel IMPDH1 mutants identified in this study altered the enzymatic activity of the corresponding proteins. In contrast, the affinity and/or the specificity of single-stranded nucleic acid binding were altered for each IMPDH1 mutant except the Gly324Asp variant. IMPDH1 mutations account for approximately 2% of families with adRP, and de novo IMPDH1 mutations are also rare causes of isolated Leber congenital amaurosis.
The alignment below of CBS domains reflects the evolutionary history before and after the gene duplication. Stable differences arose between CBS-IMPDH1 and CBS-IMPDH2 despite outstanding conservation in constrained regions that form the binding cleft, determine sensor specificity, and provide ability to communicate nucleotide cell status to the catalytic core. The degree of conservation, if visualized on the 3D structural coordinates of human IMPDH1 bound to 6-Cl-IMP (1JCN at PDB) or human IMPDH2 complexed with MPA (1JR1) using Combosa, would likely be most informative.
Timing the IMPDH duplication
Using released genomes, wgs contigs, trace archives, cDNA, and ad hoc GenBank sequences, the evolutionary history of IMPDH can be traced. It emerges that echinoderms, hemichordates, cephalochordates, and urochordates have but a single copy of the gene. Since it is implausible that a second copy was lost repeatedly in separate clades, by parsimony early deuterostomes did not experience an IMPDH duplication. Consequently, earlier-diverging species with two copies, such as the cnidarian Nematostella and yeast, experienced independent duplications not necessarily to the same purpose, making them irrelevant as model species to human.
The situation in lamprey is slightly ambiguous because, while numerous exons or short blocks of contiguous exons can be located, these are hard to assign unambiguously to IMPDH1 or IMPDH2 due to the high percent identity between the two. However it appears that lamprey contains a single gene of IMPDH1 character -- trace coverage averages 3-4 per exon and these are always of the same amino acid sequence (up to inherent trace sequence errors).
The cartilaginous fish, Callorhinchus milii, is also in a state of incomplete assembly. Here however distinct multi-exon fragments can be recovered, with duplicate coverage at five of the 13 exons. These cluster with IMPDH1 and IMPDH2 respectively. However, the contigs are too short to incorporate neighboring genes and syntenic correlation with other vertebrates is not yet possible. (The paralogs in elephantfish should have the same neighbors, unless subsequent chromosomal rearrangements have irrevocably scrambled gene order.)
Transcripts for skate, Leucoraja erinacea, support this view: by blastx, EE991359 clusters with IMPDH1 and DT726645 with IMPDH2. The situation is the same in dogfish, Squalus acanthias, for EG027286 and CV720525 respectively. No genome projects are underway in these species so again a duplication specific to chondrichthyans is difficult to rule out. These species confirm that distinctions between the respective CBS domains were established shortly after divergence and conserved ever since.
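As a rough illustration of how a fragment can be assigned to one paralog or the other, the sketch below scores a translated fragment against each reference by simple percent identity over pre-aligned, equal-length segments. This is a crude stand-in for the blastx clustering described above, not the procedure actually used; the function names are hypothetical, and the inputs are only the first 20 residues of the human and skate CBS domain sequences listed under CBS Domain Sequence Resources.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identical residues between two equal-length, pre-aligned
    peptide strings (case-insensitive, no gap handling)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def assign_paralog(fragment: str,
                   references: Dict[str, str]) -> Tuple[str, Dict[str, float]]:
    """Return the best-scoring reference name plus all identity scores."""
    scores = {name: percent_identity(fragment, ref)
              for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# First 20 residues of the CBS domain (excerpted for brevity).
references = {
    "IMPDH1_homSap": "ITDPVVLSPSHTVGDVLEAK",
    "IMPDH2_homSap": "ITDPVVLSPKDRVRDVFEAK",
}
skate_fragment = "ITDPVVLSPRHRVRDVFEAK"  # IMPDH2_leuRaj excerpt

best, scores = assign_paralog(skate_fragment, references)
print(best, scores)
```

With so short an excerpt the margin between the two scores is small; in practice the full-length fragment and several reference species per paralog would give a more trustworthy call.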
Five teleost fish genomes are available but these experienced a whole genome duplication with complex retention patterns here, making them unsuitable for addressing these evolutionary issues. Frog has the expected paralog pair, as does the amniote, Anolis. Chicken presents an odd situation with a clear IMPDH2 ortholog but an assembly gap where IMPDH1 should be. At the trace archives, a diverged and gappy second gene can be located that resembles IMPDH1. It is likely a pseudogene despite the lack of internal stop codons. The second bird genome, zebrafinch, also lacks IMPDH1. Thus in Aves it appears from the available data that IMPDH1 has been lost.
Synteny is important in this particular gene family for building an accurately labelled set of reference sequences. That is, in a fast evolving clade, IMPDH1 and IMPDH2 might be confused if neighboring gene information were not available to establish orthology. However, the useful lifespan of synteny for IMPDH1 especially is quite short. For example, Xenopus gene order (+SND1 -IMPDH1 -OPN1SW +CALU -SAPS2) already differs from human either by local inversion, misassembly, gene indels, or growing ortholog unrecognizability. Where synteny would be really helpful, say Branchiostoma or Petromyzon or Callorhinchus, it is either entirely lost or not yet available at the current state of assembly.
human gene order about IMPDH
IMPDH1 (chr7): +SND1 +LEP -RBM28 -IMPDH1 +HIG2 +METTL2B +CALU -OPN1SW
IMPDH2 (chr3): PH4 DALRD3 C3orf60 IMPDH2 QARS USP19
IMPDH Sequence Resources
Phylogenetically representative IMPDH inosine monophosphate dehdrogenase sequences from eumetazoans. Both genes -- and their regulatory regions -- could be sampled much more densely within mammals. >IMPDH1_homSap Homo sapiens (human) chr7 127,828,554 0 MADYLISGGTGYVPEDGLTAQQLFASADGLTYN 2 1 DFLILPGFIDFIADEV 0 0 DLTSALTRKITLKTPLISSPMDTVTEADMAIAMA 0 0 LMGGIGFIHHNCTPEFQANEVRKVK 0 0 KFEQGFITDPVVLSPSHTVGDVLEAKMRHGFSGIPITETGTMGSKLVGIVTSRDIDFLAEKDHTTLLSE 0 0 VMTPRIELVVAPAGVTLKEANEILQRSKK 1 2 GKLPIVNDCDELVAIIARTDLKKNRDYPLASKDSQKQLLCGAAVGTREDDKYRLDLLTQAGVDVIVL 0 0 DSSQGNSVYQIAMVHYIKQKYPHLQVIGGN 2 1 VVTAAQAKNLIDAGVDGLRVGMGCGSICITQE 1 2 VMACGRPQGTAVYKVAEYARRFGVPIIADGGIQTVGHVVKALALGAST 1 2 VMMGSLLAATTEAPGEYFFSDGVRLKKYRGMGSLDAMEKSSSSQKRYFS 2 1 EGDKVKIAQGVSGSIQDKGSIQKFVPYLIAGIQHGCQDIGARSLSVLR 2 1 SMMYSGELKFEKRTMSAQIEGGVHGLHS 2 1 YEKRLY* 0 >IMPDH2_homSap Homo sapiens (human) paired CDS domain exons 5-7 chr3 49,039,322 0 MADYLISGGTSYVPDDGLTAQQLFNCGDGLTYN 2 1 DFLILPGYIDFTADQV 0 0 DLTSALTKKITLKTPLVSSPMDTVTEAGMAIAMA 0 0 LTGGIGFIHHNCTPEFQANEVRKVK 0 0 KYEQGFITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDIDFLKEEEHDCFLEE 0 0 IMTKREDLVVAPAGITLKEANEILQRSKK 1 2 GKLPIVNEDDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAIGTHEDDKYRLDLLAQAGVDVVVL 0 0 DSSQGNSIFQINMIKYIKDKYPNLQVIGGN 1 2 VVTAAQAKNLIDAGVDALRVGMGSGSICITQE 1 2 VLACGRPQATAVYKVSEYARRFGVPVIADGGIQNVGHIAKALALGAST 1 2 VMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAMDKHLSSQNRYFS 2 1 EADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGAKSLTQVR 2 1 AMMYSGELKFEKRTSSAQVEGGVHSLHS 2 1 YEKRLF* >IMPDH1_musMus Mus musculus (mouse) MADYLISGGTGYVPEDGLTAQQLFANADGLTYNDFLILPGFIDFIADEVDLTSALTRKITLKTPLISSPM DTVTEADMAIAMALMGGIGFIHHNCTPEFQANEVRKVKKFEQGFITDPVVLSPSHTVGDVLEAKIQHGFS GIPITATGTMGSKLVGIVTSRDIDFLAEKDHTTLLSEVMTPRVELVVAPAGVTLKEANEILQRSKKGKLP IVNDQDELVAIIARTDLKKNRDYPLASKDSHKQLLCGAAVGTREDDKYRLDLLTQAGADVIVLDSSQGNS VYQIAMVHYIKQKYPHLQVIGGNVVTAAQAKNLIDAGVDGLRVGMGCGSICITQEVMACGRPQGTAVYKV AEYARRFGVPVIADGGIQTVGHVVKALALGASTVMMGSLLAATTEAPGEYFFSDGVRLKKYRGMGSLDAM 
EKSSSSQKRYFSEGDKVKIAQGVSGSIQDKGSIQKFVPYLIAGIQHGCQDIGAQSLSVLRSMMYSGELKF EKRTMSAQIEGGVHGLHSYEKRLY* >IMPDH2_musMus Mus musculus (mouse) MADYLISGGTSYVPDDGLTAQQLFNCGDGLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLVSSPM DTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPKDRVRDVFEAKARHGFC GIPITDTGRMGSRLVGIISSRDIDFLKEEEHDRFLEEIMTKREDLVVAPAGVTLKEANEILQRSKKGKLP IVNENDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAIGTHEDDKYRLDLLALAGVDVVVLDSSQGNS IFQINMIKYIKEKYPSLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKV SEYARRFGVPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAM DKHLSSQNRYFSEADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKF EKRTSSAQVEGGVHSLHSYEKRLF* >IMPDH1_bosTau Bos taurus (cow) MEEPRAPPAGSAPFPVSLQGGGTADVPEPGARQHPGHETAAQRY SARLLQAGYEPESMADYLISGGTGYVPEDGLTAQQLFANADGLTYNDFLILPGFIDFT ADEVDLTSALTRKITLKTPLISSPMDTVTEADMAIAMALMGGIGFIHHNCTPEFQANE VRKVKKFEQGFITDPVVLSPSHTVGDVLEAKIRHGFSGIPITETGTMGSKLVGIVTSR DIDFLAEKDHTTLLSEVMTPRNELVVAPAGVTLKEANEILQRSKKGKLPIVNDRDELV AIIARTDLKKNRDYPLASKDSHKQLLCGAAVGTREDDKYRLDLLTQAGADVIVLDSSQ GNSVYQIAMVHYIKQKYPHLQVIGGNVVTAAQAKNLIDAGVDGLRVGMGCGSICITQE VMACGRPQGTAVYKVAEYARRFGVPVIADGGIQTVGHVVKALALGASTVMMGSLLAAT TEAPGEYFFSDGVRLKKYRGMGSLDAMEKSSSSQKRYFSEGDKVKIAQGVSGSIQDKG SIQKFVPYLIAGIQHGCQDIGARSLSVLRSMMYSGELKFEKRTMSAQIEGGVHGLHSY EKRLY* >IMPDH2_bosTau Bos taurus (cow) MADYLISGGTSYVPDDGLTAQQLFNCGDGLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLVSSPM DTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPRDRVRDVFEAKARHGFC GIPITDTGRMGSHLVGIISSRDIDFLKEEEHDRLLGEIMTKREDLVVAPAGITLKEANEILQRSKKGKLP IVNENDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAIGTHEDDKYRLDLLSQAGVDVVVLDSSQGNS IFQINMIKYIKEKYPSIQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKV SEYARRFGVPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAM DKHLSSQNRYFSEADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKF EKRTSSAQVEGGVHSLHSYEKRLF* >IMPDH1_ornAna Ornithorhynchus anatinus frag MEGTPLISGGTGYVPLDGLTAQQLFAIADGLTYNDFLILPGFIDFTADEVDLTSALTRKITLK TPLISSPMDTVTEADMAIAMALMGGIGFIHHNCTPEFQANEVRKVKKFEQGFITDPVV 
LSPSHTVGDVLEAKARHGFSGIPITETGTMGSKLVGIVTSRDIDFLAEKDHATYLSEV MTRRTELVVAPAGVTLKEANEILQRSKKGKLPIVNDSDELVAIIARTDLKKNRDYPLA SKDAHKQLLCGAAVGTREDDKYRLDLLTQAGTDVIVLDSSQGNSVYQIAMVHYIKQKY PQLQVIGGNVVTAAQAKNLIDAGVDGLRVGMGCGSICITQEVMACGRPQGTAVYKVAE YARRFGVPVIADGGIQTVGHVVKALALGAST EGDKVKVAQGVSGSIQDKGSIQKFVP YLIAGIQHGCQDIGARSLSVLRSMMYSGELKFEKRTMSAQIEGGVHGLHSYEKRLY* >IMPDH2_ornAna Ornithorhynchus anatinus frag DFLILPGYIDFTADQVDLTSALTKKITLKTPLIS SPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPKD RVRDVFEAKARHGFCGIPITDNGKMGSRLMGIISSRDIDFLKEEEHDLYLGEIMTKWE DLVVAPAGVTLKEANEILQRSKKGKLPIVNEDNELVAIIARTDLKKNRDYPLASKDAK KQLLCGAAIGTHEDDKYRLDLLAQAGVDVVVLDSSQGNSIFQINMIKYIKEKYHNLQV IGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKVSEYARRF GVPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMG SLDAMDKNLGSQNRYFSEADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGA KSLTLVRAMMYSGELKFEKRTSSAQVEGGVHGLHSYEKRLF* >IMPDH1_galGal Gallus gallus (chicken) 60% indels eye (hatched) DR427747 tr 251351904 inconsistentTaeniopygia guttata missing too AIPATDGESGRLQMVAEAGADVVLLDSSQGNSACQISMIRYIKERYPDMQVVGGCVVTAA QCRRLIEAGADALRVGMGGGADCGDSDVPWGRPPGTAVYRVADYARRFGVPVIADGGIRA VGHIVKAAAVGASTVMLGSLLWGTTEAPGVLFSDGAVLKKFRGGAALEVTETGGGTQQHC YSDVAEGQMAQGVPAALQDKGSVQKFLPYLSAGIQRGCQHVGARSLSALRSMAYSGELKF ERRTLAAQLEGLQGLRACEKRLY* >IMPDH1_anoCar Anolis carolinensis (lizard) misassembly frag 0 2 1 0 0 0 0 LMGGIGIIHHNCTPEFQANEVRKVK 0 0 VQKFEQGFITDPVVLSPSHSVGDVFEAKVRHGFSGIPVTEAGKMGSTLVGIVTSRDIDFLSEKDYDTPLSE 0 0 VMTKRSDLVVAPAGVTLKEANEILQRSKK 1 2 GKLPIVNDADELVAIIARTDLKKNRDYPLASKDPRKQLLCGAAIGTREDDKYRLDLLTQAGVDVVVL 0 0 DSSQGNSVYQISMIHYIKHKYPELQVIGGN 2 1 VVTAAQAKNLIDAGVDALRVGMGCGSICITQE 1 2 VMACGRPQGTAVYKVAEYARRFGVPVIADGGIQTVGHVVKALSLGAST 1 2 VMMGSLLAATTEAPGEYFFSDGVRLKKYRGMGSLDAMEKNSSSQKRYFs 2 1 EGDKVKVAQGVSGSIQDKGSIQKFVPYLIAGIQHGCQDIGAKSLSILR 2 1 2 1* 0 >IMPDH2_galGal Gallus gallus (chicken) fully syntenic. 
no IMPDH1 MADYLISGGTGYVPDDGLTAQQLFSCGDGLTYNDFLILPGYIDFTADQVD LTSALTKRITLKTPLVSSPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPNDRVRDVFEAKARHGFCGIPITDNGKM GGKLVGIISSRDIDFLKESEHDLPLGEIMTKREDLVVAPSGVMLKEANEILQRSKKGKLPIVNEDDELVAIIARTDLKKNRDYPLASKDSKKQLLCGAAI GTHEDDKYRLDLLVQAGVDAVVLDSSQGNSIFQINMIKYIKEKYPNLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKV SEYARRFGVPVIADGGIQTVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAMDKNLGSQNRYFSETDKIKVAQGVSGAVQDK GSIHKFIPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKFEKRTTSAQVEGGVHGLHSYEKRLF* >IMPDH1_xenTro Xenopus tropicalis (frog) SND1 ... OPN1SW CALU SAPS2 not syntenic MADYLISGGTGYVPEDGLTAQHLFANSDGLTYNDFLILPGFIDFTADEVDLTSALTRKITLKTPLISSPM DTVTESDMAIAMALMGGIGIIHHNCTPEFQANEVRKVKKFEQGFITDPVVMSLNHTVGDVFEAKNRHGFS GIPVTETGKMGSKLVGIVTSRDIDFLTEKDYSTYLSEVMTKRDELVVAPAGVTLKEANEILQRSKKGKLP IVNDSDELVAIIARTDLKKNRDYPLASKDCRKQLLCGAAIGTREDDKYRLDLLTQAGVDVVVLDSSQGNS VYQINMIHYIKQKYPELQVVGGNVVTAAQAKNLIDAGVDALRVGMGCGSICITQEVMACGRPQGTAVYKV AEYARRFGVPVIADGGIQTVGHVVKALALGASTVMMGSLLAATTEAPGEYFFSDGVRLKKYRGMGSLDAM EKNTSSQKRYFSEGDKVKVAQGVSGSIQDKGSIHKFVPYLIAGIQHGCQDIGAKSLSILRSMMYSGELKL EKRTMSAQVEGGVHGLHSYEKRLY* >IMPDH2_xenTro Xenopus tropicalis (frog) fully syntenic MADYLISGGTSYVPDDGLTAQQLFGAGDGLTYNDFLILPGYIDFTADQVD LTSALTKKITLKTPMVSSPMDTVTEASMAIAMALTGGIGIMHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPKHRVRDVFEAKARHGFCGIPITENGKM GSKLAGIISSRDIDFLKSEEHDLALSEIMTRREDLVVAPAGVTLKEANEILQRSKKGKLPIVNGNDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAI GTHEDDKYRLDLLVQAGVDAVVLDSSQGNSIFQINMIKFIKEKYQDLQVIAGNVVTAAQAKNLIDAGADALRVGMGSGSICITQEVLACGRPQATAVYKV SEYARRFGVPVIADGGIQTVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAMDKNVSSQKRYFSEADKIKVAQGVSGAVQDK GSIHKFIPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKFEKRTMSAQVEGGVHGLHSYEKRLF* >IMPDH1_calMil Callorhinchus milii (elephantshark) fragment 0 2 1 DFLILPGFIDFSAEEV 0 0 0 0 0 0 0 0 1 2 GKLPIVNDADELIAIIARTDLKKNRDYPLASKDSRKQLLCGAAIGTREDDKYRLDLLVQAGVDVIVL 0 0 DSSQGNSVFQINMIHCIKQRYPELQVVGGN 2 2 1 2 1 2 2 1 EGDKVKVAQGVSGSVQDKGSIHKFAPYLITGIQHGCQDIGAKSLSILR 2 1 
SMMYSGELKFEKRTISAQVEGGVHGLHS 2 1 * 0 >IMPDH2_calMil Callorhinchus milii (elephantshark) fragment 0 MADYLISGGTGYVPDDGQSAQQLLGAGDGLTYn 2 1 DFLILPGYIDFTSDQV 0 0 DLTSALTKKITLKTPLVSSPMDTVTEANLAIAMA 0 0 LMGGIGIIHHNCTPEFQANEVRKVK 0 0 KYEQGFITDPVVMSPSEKVRDVFEAKARHGFSGIPITDNGKMGGKLKGIISSRDIDFLTEKEHDLLLSE 0 0 VMTRREDLVVAPAAVTLKEANEILQRSKK 1 2 GKLPIVNDNDELVAIIARTDLKKNRDYPLASKDSKKQLLCGAAIGTHEDDKYRLDQLFQAGVDVVVL 0 0 2 2 VVTAAQAKNLIDAGVDALRVGMGCGSICITQE 1 2 VMACGRPQGTAVYKVAEYARRFGVPVIADGGIQTVGHVVKALALGAST 1 2 2 1 ESDKIKVAQGVSGAVQDKGSIQKFIPYLIAGIQHGCQDIGAKSLTQLR 2 1 AMMYSGELKFEKRTLCAQVEGGVHGLHS 2 1 * 0 >IMPDH_petMar_frag2 Petromyzon marinus (lamprey) 854345740 854350173 0 MADFLISGGTGYVPDDGMTAQQLFAGADGLTYn 2 1 0 0 DLASALTKQITLKTPLVSSPMDTVTESSMAIAMA 0 0 LMGGVGVIHHNCTPEFQANEVRKVk 0 >IMPDH_petMar_frag3 Petromyzon marinus (lamprey) 565582434 565582613 0 MADYLISGGAGYVPEDGLTATQLFAGTDGLTYn 2 1 DFLILPGFIDFTVEEV 0 0 0 0 0 0 KYEQGFITDPVVLSPAHTVGDVLEAKARHGFSGIPITETGRMGGRLAGVITSRDIDFLTE 0 >IMPDH_petMar_frag1 Petromyzon marinus (lamprey) 700532459 700537638 2 GKLPIVNEAGELVAIMARTDLKKNRDFPLASKDAHKQLLCGAAVGTRPDDQHRLELLVKAGVDVVIL 0 0 DSSQGNSVFQINMIRYIKQTYPELQVIGGN 1 2 VVTAAQAKNLIDAGVDALRVGMGAGSICITQE 1 2 1 2 VMMGSLLATTEAPGEYFFSDGVRLKKYRGMGSLDAMEKNQGSQSRYFS 2 1 ENDKMKVAQGVSGSVQRQG 2 1 SMMYTGELKFEKRTMSAQVEGGVHGLHS 2 1 * 0 >IMPDH1_petMar Petromyzon marinus (lamprey) composite 0 MADYLISGGAGYVPEDGLTATQLFAGTDGLTYn 2 1 DFLILPGFIDFTVEEV 0 0 DLASALTKQITLKTPLVSSPMDTVTESSMAIAMA 0 0 LMGGVGVIHHNCTPEFQANEVRKVk 0 0 KYEQGFITDPVVLSPAHTVGDVLEAKARHGFSGIPITETGRMGGRLAGVITSRDIDFLTEKDHGIPLHQ 0 0 VMTKREDLVVAPSGVTLTEANELLQRSKK 1 2 GKLPIVNEAGELVAIMARTDLKKNRDFPLASKDAHKQLLCGAAVGTRPDDQHRLELLVKAGVDVVIL 0 0 DSSQGNSVFQINMIRYIKQTYPELQVIGGN 1 2 VVTAAQAKNLIDAGVDALRVGMGAGSICITQE 1 2 VMAVGRPQGTAVYRVAEYARRFGVPVMADGGIQTVGHITKALALGAST 1 2 VMMGSLLATTEAPGEYFFSDGVRLKKYRGMGSLDAMEKNQGSQSRYFS 2 1 ENDKMKVAQGVSGSVQRQG 2 1 SMMYTGELKFEKRTMSAQVEGGVHGLHS 2 1 * 0 >IMPDH_braFlo Branchiostoma floridae (lancelet) one copy, severe misassembly 0 
MADYLISGGTGYVPDDGMTASQLFGTGEGLTYn 2 1 DFLILPGFIDFTADEV 0 0 DLTSAMTKKIQLKAPLVSSPMDTVSESDMAIAMA 0 0 LTGGIGIIHSNCTPEFQANEVRKVK 0 0 KYEQGFIMDPIVLSPEHTVGDVCEMKRKHGFSGIPITENGKLGGKLLGIVTSRDIDFMNSDHHHIKLRD 0 0 VMTPFEELVVGHAGVSLKEANETLQRSKK 1 2 GKLPIVNENDELVSLIARTDLKKNRDYPLASKDSKKQLLCGAAIGTREEDKYRVELLVQAGVDLVVL 0 0 DSSQGNSIYQINMIRYLKQKYSELQVIGGN 2 1 VVTAAQAKNLIDAGVDGLRVGMGSGSICITQE 1 2 VMAVGRPQGTAVYKVAEYARRFGVPVIADGGISTVGHITKALALGAST 1 2 VMMGSLLAGTSEAPGEYFFQDGVRLKKYRGMGSLEAMEKGKASQNRYFR 2 1 ESDKLKVAQGVTGSIQDKGSVHKFVPYLIAGIQHGCQDIGAKSLSSLR 2 1 SMMYSGELRFETRTVSAQVEGGVHGLHS 2 1 * 0 >IMPDH_cioInt Ciona intestinalis (tunicate) mRNA AK114441 MAEFLISGETSYVPEDGLTAAQLLNTGDGLTYNDFLILPGFIDFTASEVDLTSALTKKISL KTPLLSSPMDTVTESDMAIGMALMGGMGFIHYNCTPEFQAAEVRRVKKYEQGFIQNPVTL GPKATVRDVTDVKAMYGFSGIPVTDDGTPTGKLIGLVSSRDFDFLKPEESNTPLEQVMTG RDKLITADTSVTLQEANHILSQSKKGKLPIVDADDRLVSLIARTDLKKNREFPLASKDER KQLLCGAAISTREEDKHRLELLVEAGVDAVILDSSQGNSIYQINSIRYIRHKYPHLQVIA GNVVTAAQAKNLIDAGADALRVGMGSGSICITQEVMAVGRPQATAVYKVSEYARRFNVPV IADGGIQNVGHVTKALALGASTVMMGSLLAATTESPGEYFYSDGIRLKKYRGMGSVDAME SCKSSQSRYFSEKDKIRVAQGVSGAVQDKGSVHTFLPYLIAGIQHGCQDIGSRSMPMLRS MMYSGELKFERRSTSAQVEGGVHGLHSFEKPHL* >IMPDH_strPur Strongylocentrotus purpuratus (sea urchin) single gene 2 intronation changes 0 MSKKVKLMNGAMEPQVDDGLSGQQLFGSGDGLTYn 2 1 DFLILPGYIDFTSDQV 0 0 DLQSQLTKDITLKAPLVSSPMDTVTESSMAIAMA 0 0 LCGGIGIIHHNCSPEFQANEVRKVK 0 0 KYEQGFIMDPVVLGPNDTVGDVFGSKAKHGFS 1 2 GIPITDTGRLGGKLLGIVTARDIDFLKPESYVKPLST 00 AMTCREDLVVAPANVTLKQANDLLQKAKK 1 2 GKLPIVNEKDELVSLISRTDLKKHREFPLASKDPRKQLLCGAAIGTREEDKHRLDLLVQAGVDVVIL 0 0 DSSQGNSSFQVSMIKCTKAKYPELQVVAGN 1 2 VVTVAQAKNLIQAGADALRVGMGSGSICITQE 1 2 VMAVGRPQGTAVYRVAQYARSCGVPIIADGGITTVGHITKALSLGASS 1 2 VMMGSLLAGTTEAPGEYFFSDGVRLKKYRGMGSLDAMEKNQSSAKRYFR 2 1 ERDKLKVAQGVSGSIVDKGSIHKFVPYLIAGIQHGCQDIGALSLTVLR 2 1 EKMYSGEVRFERRSPSAQVEGGVHSLHS 2 1 *0 >IMPDH_sacKow Saccoglossus kowalevskii (acornworm) frag one copy 82% homSap 0 2 1 0 0 0 0 0 0 
KYEQGFIMDALVMSANTTIKEVFAAKSQHGFSGIPITDNGKLGGRLLGIVTARDIDFVEPEFNDKPLEQFMTKREDLVVAPANVTLKEANDILQKSKK 1 2 GKLPIVNENDELVSLISRTDLKKHREFPLASKDSKKQLLCGAAIGTHESDKNRLDLLVQAGVDVIIL 0 0 1 2 VVTAAQAKNLIDAGVDALRVGMGSGSICITQE 1 2 VMAVGRPQGTSVYKVAEYARRFGVPVIADGGIGTVGHITKALALGAST 1 2 VMMGSLLAGTSEAPGEYYFSDGVRLKKYRGMGSLDAMEiNQSDRYFS 2 1 ESDKLKVAQGVSGSIIDKGSIHKFIPYLIAGIQHGCQDIGAKSMSMLR 2 1 2 1 *0 >IMPDH_helRob Helobdella robusta (leech) frag cDNA QAGVDFVVLDSSQGNSIYQIKLIKYIKEKYPNLQVIGGNVVTAAQAKNLIDAGVDALRVG MGSGSICITQEVMAVGRPQGTAVYKVAEYARRFGVPIIADGGIENVGHIVKALALGASTV MMGSLLAGTTEAPGEYYFADGVRLKKYRGMGSLDAMEQHKASQSRYFSDSDKVKVAQGVS GAVVDKGSIHKFLPYLISGVQHGCQDLGAKSLSCLRSMMYQGELKFEKRTTSSQIEGGVH GLHSYEKRLY* >IMPDH_schMan Schistosoma japonicum (flatworm) fragment FSLKVPFASSPMDTVTEAKMAIAMS LCGSIGFVHNNCSVEAQANEVKKVK KYNQGFILSPVVVSPRQPIYDIIEIKKKYGFG GIPVTEDGYMGSRLVGLVTLRDVDFLDPNDFNTPVEKVMTPFDDLVTAFSCVTLSEANDLLRKSKK GKLPIINENRELVALIARTDLQKNRDYPLASRDDENQLIVGAAISTHEGDFARVKALINSGVDIIVI DSSQGNSIYQLDMIKRIKSSFPDLQIIGGN IVTCAQAKNLIDAGVDGLRVGMGSGSICITQE VTAIGRSQAKAVYSVSEYAHKYDIPVIADGGIQNTGHIVKALSFGASS VMMGGLLAGTTESAGEYIFSDGVKLKKYRGMGSIEAMSQHTESQARYFSESDRIKVAQGVSGTIVDR GSVHQLVPYLVAGVKHGLQQIGARNITELHNMSRSGKLRFELRSPSAQLEGGVHSLYS YDKSMF* >IMPDH_schMed Schmidtea mediterranea (planaria) frag NZ_AAWT01010891 more GFSASELFNKKTGLTYNDFILLPGYIDFRSEEVDITTRITKNMYLKTPLVSSPMDTVTES NMAIAMALSGGIGILHHNCTIDTQANEIRRVKKYEQGFIVDPVVMSIDNTVGEVMNIKIK NGFTGIPITDNGKLGGKLVGLVTLRDIDFLPKSNWDLKVSEVMTPFENLITAPYGVALHE ANGILQRSKKGKLPIINENRELVALISRTDLKKNREYPLSSKDDR >IMPDH_ixoSca Ixodes scapularis MSDKKTTFSEDGLTAQHLIGSGDGLTYNDFLILPGFIDFNADDVD LTSKLTKNITLQAPLVSSPMDTVTESEMAIAMALCGGIGIIHHNCTPEHQANEVHKVKKYKHGFIHDPVVLSPNNC VADVFEVKRKHGFAGVPITENGKLGGKLVGMVTSRDIDFIPIEDHNRLLSEVMTSLK DLTVASSKVTLSEANSLLQKSKKGKLPLVNEGGELVSLIARTDLKKSRSYPLASKDENKQLIVGAAIGTREADKPRL ELLVQAGVDVVVLDSSQGNSVYQIEMVKYIKSKYPGLQVIGGNVVTTAQAKNLIEAGVDG LRVGMGSGSICITQEVMACGRPQATAVYKVAEYARRFGVPCVADGGVSSVGHIIKALALG ASTVMMGSMLAGTTESPGEYFFSNGVRLKKYRGMGSLDAMQSTEGGGSLNRYYQSDQDKV 
RVAQGVSGTIVDKGSIHRYVPYLITGIKYGCQDIGARSLDVLKANMYSGDIKFEKRSVSA QIEGGVHGLHSYEKRLY* >IMPDH_nemVec Nematostella vectenisis (anemone) 78% XM_001641825 0 MMADFLIAGGTSYVPEDGMTAGQLFQSDGLTYS 2 1 DFIILPGFIDFPATDV 0 0 DLTSPLTRRITIKTPLVSSPMDTVTESALATAMA 0 0 LNGGIGIIHHNCSIEFQANEIRKVKKFEQGFIMAPLVLSATNTVADVIDAKQRHGFSGIPIT 1 2 ENGQLGGILQGIVTSRDIDFLHGVENHKQLGE 0 0 VMTRLEDLVVAKAGITLNEANKILQMSKK 1 2 GKLPIVNEKGELVSLIARTDLKKNRDYPLASKDENKQLL 1 2 VGAAIGTREDDKARLHALVEAGVDVVVIDSSQGNSIYQLSLIS 2 1 HIKENYPNLQIVGGNVVTASQAKNLIDAGVDALRVGMGSGSICITQEVMAVGRPQGTAVYKVAEYARRFGVPVLADGGIQNVGHITKALSLGAST 1 2 VMMGSLLAGTSEAPGEYFFADGVRLKKYR 1 2 GMGSLSAMEKNSSSASRYFS 2 1 ENDKVKVAQGVSGSVVDKGSIHKFVPYLTAGIQHGCQDLGAKSLTSLR 2 1 SMMYSGELKFERRTTSSQIEGGVHGLHS 2 1 YEKRLF* 0
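Several of the records above interleave peptide segments with single digits. Assuming those digits are intron phase annotations at exon boundaries (an interpretation, consistent with the "identically placed and phased introns" noted earlier, but not stated explicitly for these records), a small helper can recover the exon peptides and the contiguous protein. The function names here are hypothetical.

```python
import re
from typing import List


def split_exons(record_body: str) -> List[str]:
    """Return the exon peptide segments of a phase-annotated record.
    Tokens made only of digits (assumed intron phases) and bare stop
    markers are dropped; a trailing '*' on a peptide is stripped."""
    exons = []
    for token in record_body.split():
        if re.fullmatch(r"[A-Za-z]+\*?", token):
            exons.append(token.rstrip("*").upper())
    return exons


def join_protein(record_body: str) -> str:
    """Concatenate exon peptides into one protein string."""
    return "".join(split_exons(record_body))


# First three exons of IMPDH1_homSap as listed above.
body = ("0 MADYLISGGTGYVPEDGLTAQQLFASADGLTYN 2 1 DFLILPGFIDFIADEV 0 0 "
        "DLTSALTRKITLKTPLISSPMDTVTEADMAIAMA 0")
print(len(split_exons(body)))
print(join_protein(body))
```

Fragmentary records with missing exons simply yield fewer segments, so exon counts from this helper should not be read as exon numbers.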
CBS Domain Sequence Resources
It is difficult to recover full length IMPDH sequences for a gene with 13 exons because of incomplete assemblies, confusion between IMPDH1 and IMPDH2 fragments, and the tendency for cDNA projects to begin at one end and not proceed quite to full length. Sequences more commonly are complete enough to have full length CBS domains. Thus to the extent that this domain regulates enzymatic activity through allosteric effects and that small molecule effectors should discriminate between the two CBS domains, it is informative to determine where conserved residues in the respective domains differ, i.e. to determine a subtractive profile.
>IMPDH1_homSap ITDPVVLSPSHTVGDVLEAKMRHGFSGIPITETGTMGSKLVGIVTSRDIDFLAEKDHTTLLSEVMTPRIELVVAPAGVTLKEANEILQRSKKGKLPIVNDCDELVAIIARTDLKKN >IMPDH1_bosTau ITDPVVLSPSHTVGDVLEAKIRHGFSGIPITETGTMGSKLVGIVTSRDIDFLAEKDHTTLLSEVMTPRNELVVAPAGVTLKEANEILQRSKKGKLPIVNDRDELVAIIARTDLKKN >IMPDH1_musMus ITDPVVLSPSHTVGDVLEAKIQHGFSGIPITATGTMGSKLVGIVTSRDIDFLAEKDHTTLLSEVMTPRVELVVAPAGVTLKEANEILQRSKKGKLPIVNDQDELVAIIARTDLKKN >IMPDH1_ornAna ITDPVVLSPSHTVGDVLEAKARHGFSGIPITETGTMGSKLVGIVTSRDIDFLAEKDHATYLSEVMTRRTELVVAPAGVTLKEANEILQRSKKGKLPIVNDSDELVAIIARTDLKKN >IMPDH1_anoCar ITDPVVLSPSHSVGDVFEAKVRHGFSGIPVTEAGKMGSTLVGIVTSRDIDFLSEKDYDTPLSEVMTKRSDLVVAPAGVTLKEANEILQRSKKGKLPIVNDADELVAIIARTDLKKN >IMPDH1_xenTro ITDPVVMSLNHTVGDVFEAKNRHGFSGIPVTETGKMGSKLVGIVTSRDIDFLTEKDYSTYLSEVMTKRDELVVAPAGVTLKEANEILQRSKKGKLPIVNDSDELVAIIARTDLKKN >IMPDH1_danRer ITDPVVMSPRHTVGDVFEAKVRHGFSGIPVTETGKMGSKLVGIVTSRDIDFLSEKDYDRPLEESMTKREDLVVAPAGVTLKEANDILQRSKKGKLPIVNDSDELVAIIARTDLKKN >IMPDH1_takRer ITDPVVMSPRHTVGDVFEAKIRHGFSGIPVTETGKMGSKLVGIVTSRDIDFLSEKDHDRPLEEAMTKREDLVVAPAGVTLKEANDILQRSKKGKLPIVNDSDELVAIIARTDLKKN >IMPDH1_leuRaj ITDPVVMSLHHTVGDVFEAKSRHGFSGIPITETGKMGSKLMGIVTSRDIDFLSEKDYDTALNEVMTKREDLVVAPAGVTLKEANGILQRSKKRKVPIVNDMDELISIIARTDLkkn >IMPDH1_petMar ITDPVVLSPAHTVGDVLEAKARHGFSGIPITETGRMGGRLAGVITSRDIDFLTEKDHGIPLHQVMTKREDLVVAPSGVTLTEANELLQRSKKGKLPIVNEAGELVAIMARTDLKKN >IMPDH2_homSap ITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDIDFLKEEEHDCFLEEIMTKREDLVVAPAGITLKEANEILQRSKKGKLPIVNEDDELVAIIARTDLKKN >IMPDH2_musMus ITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDIDFLKEEEHDRFLEEIMTKREDLVVAPAGVTLKEANEILQRSKKGKLPIVNENDELVAIIARTDLKKN >IMPDH2_bosTau ITDPVVLSPRDRVRDVFEAKARHGFCGIPITDTGRMGSHLVGIISSRDIDFLKEEEHDRLLGEIMTKREDLVVAPAGITLKEANEILQRSKKGKLPIVNENDELVAIIARTDLKKN >IMPDH2_galGal ITDPVVLSPNDRVRDVFEAKARHGFCGIPITDNGKMGGKLVGIISSRDIDFLKESEHDLPLGEIMTKREDLVVAPSGVMLKEANEILQRSKKGKLPIVNEDDELVAIIARTDLKKN >IMPDH2_ornAna ITDPVVLSPKDRVRDVFEAKARHGFCGIPITDNGKMGSRLMGIISSRDIDFLKEEEHDLYLGEIMTKWEDLVVAPAGVTLKEANEILQRSKKGKLPIVNEDNELVAIIARTDLKKN >IMPDH2_xenTro 
ITDPVVLSPKHRVRDVFEAKARHGFCGIPITENGKMGSKLAGIISSRDIDFLKSEEHDLALSEIMTRREDLVVAPAGVTLKEANEILQRSKKGKLPIVNGNDELVAIIARTDLKKN >IMPDH2_calMil ITDPVVMSPSEKVRDVFEAKARHGFSGIPITDNGKMGGKLKGIISSRDIDFLTEKEHDLLLSEVMTRREDLVVAPAAVTLKEANEILQRSKKGKLPIVNDNDELVAIIARTDLKKN >IMPDH2_leuRaj ITDPVVLSPRHRVRDVFEAKARHGFCGIPITDTGTLGGRLAGIISSRDIDFLEESEQELPLEQVMTRREELVVAPAGVRLKEANEILQRSKKGKLPIVNEQDQLVAIISRTDLkkn >IMPDH_braFlo IMDPIVLSPEHTVGDVCEMKRKHGFSGIPITENGKLGGKLLGIVTSRDIDFMNSDHHHIKLRDVMTPFEELVVGHAGVSLKEANETLQRSKKGKLPIVNENDELVSLIARTDLKKN >IMPDH_cioInt IQNPVTLGPKATVRDVTDVKAMYGFSGIPVTDDGTPTGKLIGLVSSRDFDFLKPEESNTPLEQVMTGRDKLITADTSVTLQEANHILSQSKKGKLPIVDADDRLVSLIARTDLKKN >IMPDH_strPur IMDPVVLGPNDTVGDVFGSKAKHGFSGIPITDTGRLGGKLLGIVTARDIDFLKPESYVKPLSTAMTCREDLVVAPANVTLKQANDLLQKAKKGKLPIVNEKDELVSLISRTDLKKH >IMPDH_sacKow IMDALVMSANTTIKEVFAAKSQHGFSGIPITDNGKLGGRLLGIVTARDIDFVEPEFNDKPLEQFMTKREDLVVAPANVTLKEANDILQKSKKGKLPIVNENDELVSLISRTDLKKH >IMPDH_schMed IVDPVVMSIDNTVGEVMNIKIKNGFTGIPITDNGKLGGKLVGLVTLRDIDFLPKSNWDLKVSEVMTPFENLITAPYGVALHEANGILQRSKKGKLPIINENRELVALISRTDLKKN >IMPDH_schMan ILSPVVVSPRQPIYDIIEIKKKYGFGGIPVTEDGYMGSRLVGLVTLRDVDFLDPNDFNTPVEKVMTPFDDLVTAFSCVTLSEANDLLRKSKKGKLPIINENRELVALIARTDLQKN >IMPDH_ixoSca IHDPVVLSPNNCVADVFEVKRKHGFAGVPITENGKLGGKLVGMVTSRDIDFIPIEDHNRLLSEVMTSLKDLTVASSKVTLSEANSLLQKSKKGKLPLVNEGGELVSLIARTDLKKS >IMPDH_nemVec IMAPLVLSATNTVADVIDAKQRHGFSGIPITENGQLGGILQGIVTSRDIDFLHGVENHKQLGEVMTRLEDLVVAKAGITLNEANKILQMSKKGKLPIVNEKGELVSLIARTDLKKN
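One way to compute a subtractive profile from the CBS domain sequences above is sketched below: report every column that is invariant within the IMPDH1 group, invariant within the IMPDH2 group, and different between the two. This is a minimal sketch under the assumption that the sequences are already aligned without gaps (as the equal-length records above appear to be); the function names are hypothetical, and only the first 20 residues of three species per paralog are used to keep the example short.

```python
from typing import List, Optional, Tuple


def conserved_residue(seqs: List[str], i: int) -> Optional[str]:
    """Return the residue at column i if all sequences agree, else None."""
    residues = {s[i].upper() for s in seqs}
    return residues.pop() if len(residues) == 1 else None


def subtractive_profile(group_a: List[str],
                        group_b: List[str]) -> List[Tuple[int, str, str]]:
    """List (1-based column, residue in A, residue in B) for columns that
    are invariant within each group but differ between the groups."""
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("all sequences must have the same aligned length")
    profile = []
    for i in range(length):
        a = conserved_residue(group_a, i)
        b = conserved_residue(group_b, i)
        if a is not None and b is not None and a != b:
            profile.append((i + 1, a, b))
    return profile


# First 20 residues of the CBS domain: homSap, musMus, xenTro.
impdh1 = ["ITDPVVLSPSHTVGDVLEAK", "ITDPVVLSPSHTVGDVLEAK", "ITDPVVMSLNHTVGDVFEAK"]
impdh2 = ["ITDPVVLSPKDRVRDVFEAK", "ITDPVVLSPKDRVRDVFEAK", "ITDPVVLSPKHRVRDVFEAK"]

print(subtractive_profile(impdh1, impdh2))  # [(12, 'T', 'R'), (14, 'G', 'R')]
```

Requiring strict invariance is deliberately conservative; adding more species to each group will shrink the profile toward the residues most likely to matter for paralog-specific effectors.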