Blog

Induced Pluripotent Stem Cells (iPSC) for Variant Biology Discovery – Dramatic Increases in Efficiency Are Starting to Occur!

guest co-author: Trisha Brock

The use of iPSCs is quickly gaining momentum as a tool of personalized medicine

Whether you call them stem-cell-like cells with unique expression signatures (Chin 2009), or stick with the seminal publication and call them induced Pluripotent Stem Cells (Takahashi 2007), iPSC technology is becoming an impressive system for helping understand genome-encoded disease biology.  We can expect iPSCs to have a major impact in regenerative medicine, disease modeling, and drug development.

iPSC technology is a relatively new technique for variant profiling.  

Since its first demonstration in mice (Takahashi 2006), progress has been rapid. The original systems were developed using retroviral integration tools, which carry a high risk of chromosomal instability and tumorigenesis. In the last 10 years, significant progress has been made in making the technique non-integrative and more efficient (Omole and Fakoya 2018).  Genomic integration methods using viruses have traditionally yielded the highest efficiency, but non-integrative methods are quickly catching up. The original methods used a standard set of gene expression modifiers: Oct3/4, Sox2, Klf4, and c-Myc, typically referred to as the “OSKM cocktail.” These are encoded in viruses and plasmids that are introduced into the cell and integrate into the genome. This one-time procedure contrasts with non-integrating methods, which require repeated transfections to push cells towards pluripotency (Warren 2010).  Once induced into the pluripotent state, the cells can be modified (by CRISPR-based gene editing or similar methods) to establish control lines with and without the variant in question. Finally, the cells are differentiated into the desired tissue by exposure to appropriate tissue-specification factors. With this capacity to analyze patient-derived tissue, you can feel fairly certain the biological differences seen between control and variant will translate well to the patient’s specific biology.

Adapted from El Hokayem 2016

Speed to variant biology data requires 2 to 5 months

Relative to rodent animal models, the system is fast.  Most human cell cultures can be coaxed to achieve the proper pluripotent expression profile after about 4 weeks of growth, and then another 2 months or more is needed to induce the desired tissue type.  For instance, in a popular method, DeRosa et al. used blood-derived, Sendai-virus-transformed iPSCs to derive cortical neurons and tested them for biological consequence with various functional assays (DeRosa 2018).  Transformation into iPSCs was confirmed by cell staining with antibodies against the biomarkers NANOG, Oct3/4, and SOX2. Next, cells were differentiated into the desired cortical neuron tissue type by exposure to relevant co-factors and monitored for sequential biomarker production (Nestin to DCX to CAMK2A to TBR1); finally, at 90 days, the last set of biomarkers was used (MAP2 and SYNAPSIN1).  As a result, the creation of iPSCs and their differentiation into appropriate tissue types takes about 3 months or more, depending on the cell type needed (McKinney 2017 and DeRosa 2018).

Low efficiency problem of easily sourced iPSCs is showing signs of dramatic improvement

Low efficiency in the creation of iPSCs has been the main drawback preventing routine use as a clinical diagnostic.  Efficiency is measured as the number of iPSC colonies obtained divided by the number of input cells present before starting transformation. Efficiencies vary widely, with multiple reports ranging from 0.001% to 0.1% (Malik and Rao 2013). The measured efficiency is heavily influenced by the choice of starting material. Fibroblasts show some of the highest efficiencies but typically require a biopsy plug from the skin to enable isolation of sufficient amounts of starting cells. Yet researchers are hard at work on conditions that improve efficiency and on finding material that is easier to source.  In 2016, a report showed that conversion of fibroblasts had improved to an efficiency near 3% (Pomeroy 2016). Deriving iPSCs from blood (PBMCs) is more convenient for the patient, but efficiency remains low at 0.15–0.32% (Zhou 2015). Generation of iPSCs from PBMCs is problematic because they are non-adherent. For human cells, the lack of adherence is especially problematic because it leads to high activation of cell death through apoptosis. Researchers are finding that the use of adhesion-promoting matrices (Geltrex and rhLaminin-521) and ROCK inhibitors of apoptosis can greatly improve the efficiency of making iPSCs from blood sources (Ye 2018). Additional recent developments are highly encouraging. With nearly nonexistent invasiveness to the patient, researchers have demonstrated that iPSCs can be sourced from patient urine (Gaignerie 2018). Further, in a very impressive efficiency feat, a meticulous study of fibroblast cell conversion was performed by sampling a wide variety of optimizing conditions (Kogut 2018).  Using mRNA-encoded transformation factors with a select set of microRNA inhibitors of transcription, the Kogut team demonstrated an amazing 800% efficiency in fibroblast conversion to iPSCs (values above 100% are possible under this definition because input cells divide during reprogramming, so one starting cell can give rise to more than one colony). Additionally, they found they could isolate single cells and get conversion to iPSCs in 90% of the isolates. Intriguingly, in an almost counter-intuitive finding, the efficiency of fibroblast conversion started to drop dramatically when more than 1000 source cells were present at the starting conditions. Finally, their time to conversion of source cells to iPSCs was only 15 days. In conclusion, it appears iPSC methodologies are breaking through the efficiency barrier, and soon the use of patient-derived iPSCs will become part of routine clinical diagnostic procedures.
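The efficiency measure described above is a simple ratio, and a few lines of Python make the scale of the quoted numbers easier to compare. This is only an illustrative sketch: the function name and the colony and cell counts are hypothetical, chosen to land on percentages quoted in the text rather than taken from any of the cited papers.

```python
def reprogramming_efficiency(ipsc_colonies: int, input_cells: int) -> float:
    """Return reprogramming efficiency as a percentage of input cells."""
    if input_cells <= 0:
        raise ValueError("input_cells must be positive")
    return 100.0 * ipsc_colonies / input_cells


# Hypothetical counts, chosen only to reproduce percentages quoted in the text.
low_end = reprogramming_efficiency(1, 100_000)     # 0.001 %
fibroblast = reprogramming_efficiency(3, 100)      # 3 %
rna_based = reprogramming_efficiency(4_000, 500)   # 800 %

for label, value in [("low end", low_end),
                     ("fibroblast", fibroblast),
                     ("RNA-based", rna_based)]:
    print(f"{label}: {value:g}%")
```

Because the denominator is the number of cells plated at the start, the ratio can exceed 100% when plated cells divide before forming colonies.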

Transplantation of iPSCs as therapeutics is challenged by tumorigenicity issues

The three main applications of iPSCs are regenerative medicine, disease modeling, and drug development, yet transplantation for regenerative medicine has issues of tumorigenesis to overcome (Focosi 2018).  The original work by Takahashi was plagued by significant tumorigenicity because the method used genome-integrating viral vectors carrying transformation factors known for their tumorigenic potential: mouse iPSCs derived by the method resulted in tumors about 20% of the time when the cells were reintroduced into mice (Omole and Fakoya 2018). Switching to a transformation cocktail that avoids the use of c-Myc (OSKM to OSNL) reduced tumor formation rates.  Yet the alternative transformation factors do not eliminate tumorigenesis, possibly due to genomic integration causing insertional mutagenesis. To reduce tumorigenicity even further, researchers have been switching to non-integrative methods. Yet even then, a low level of tumorigenic capacity remains, apparently inherent to the pluripotent status of iPSCs, which can become many different types of tissue, including cancerous ones. As a result, when differentiating iPSCs into tissue for reintroduction into the patient as treatment, the FDA remains concerned about the tumorigenic potential of any cells that remain in the iPSC state. Current thinking is that a 10-fold passage of a differentiated cell population may effectively eliminate the tumorigenic risk, yet this need for a high passage number has tempered the enthusiasm for use of iPSCs in regenerative medicine applications.

Polygenic Consequence – iPSCs from patient tissues have the unique advantage that the multiple “Risk Factor” variations of the patient background are retained.

One big advantage of using patient-derived tissue is that the genetic background of the test system is exactly as it occurs in the patient. For instance, DeRosa and team studied autism-related variants using patient-derived iPSCs differentiated into neuronal tissues (DeRosa 2018). They looked at 6 patients with variants in target genes suspected of involvement in Autism Spectrum Disorder.  Genomic data was provided on 5 of the patient conditions (see Table below). All suspect variants were observed as heterozygotes. As a result, clear pathogenicity of any variant is lacking when referenced against existing database sources. Nevertheless, they saw clear phenotypic consequences in all 6 cell lines examined in their studies (RNA-seq, multi-electrode array recordings, spontaneous calcium transients and scratch recovery assays).

In what might be a highly recommended next step, the DeRosa authors could use CRISPR on their iPSC lines to make isogenic controls.  For instance, in cell line 377110, the VPS13B variation is a prime candidate for using CRISPR-Cas9 gene editing techniques to revert the rs28940272 locus back to asparagine.  If this isogenic control behaved as wild type in all the phenotyping assays deployed, then the authors would have generated a definitive demonstration that the Asn2993Ser variant in VPS13B is pathogenic.

Modeling functional defects in autism variants could be done in C. elegans (see Table below).  5 of the 8 genes in the DeRosa work have homology that exceeds 42%. The PRICKLE1 gene has sufficient similarity. Either its Val57Phe or Glu185Ter variation could be installed into a PRICKLE1-humanized animal model.  The Glu185Ter would very likely model a loss-of-function allele effect, but Val57Phe is a missense variant that could go either way. The change of valine to phenylalanine may disrupt function through gain-of-function, or it may cause a loss-of-function, leading to a quite different phenotype in functional assays. Only testing directly in a model system will provide the needed definition of mechanism of action.

Modeling epilepsy with iPSCs requires 135 day preparation protocol

iPSC technology will likely become a widespread tool of personalized medicine, and it holds significant promise for neuronal regeneration (Wu 2018).  Although it suffers from challenges of asynchronicity, tumorigenicity, timeliness, and low efficiency (Vitrac 2018, Omole and Fakoya 2018), much progress has been made, as attested by this article’s focus on the DeRosa paper.

To list aspects of concern and advantage with iPSC tech, we have:

  • Tumorigenicity and Immunogenicity.  This is an especially important concern with regard to regenerative medicine uses of iPSCs.  Perhaps the most promising approaches involve the use of various RNA molecules to provide the requisite reprogramming factors with minimal potential for tumorigenicity and immunogenicity.
  • Clinical Diagnostic.  Sendai virus tech seems to be one of the most promising methods.  Stability of the reprogramming factors is high, so repetitive redosing is not as problematic.  Also, sourcing from easy-to-acquire primary cultures (blood, urine, and saliva) is promising for minimizing the procedure’s invasiveness to the patient.
  • Isogenic Control.  CRISPR-Cas9 tech can be used to return a suspect variant back to wild type.  Functional studies on the variant strain and its isogenic control will determine if the variant is pathogenic or benign.

Chin MH et al. Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell. 2009 Jul 2;5(1):111-23. doi: 10.1016/j.stem.2009.06.008.  https://www.ncbi.nlm.nih.gov/pubmed/19570518    

DeRosa BA et al. Convergent Pathways in Idiopathic Autism Revealed by Time Course Transcriptomic Analysis of Patient-Derived Neurons. Sci Rep. 2018 May 30;8(1):8423. doi: 10.1038/s41598-018-26495-1. https://www.ncbi.nlm.nih.gov/pubmed/29849033

El Hokayem J et al. Blood Derived Induced Pluripotent Stem Cells (iPSCs): Benefits, Challenges and the Road Ahead. J Alzheimers Dis Parkinsonism. 2016 Oct;6(5). pii: 275. doi: 10.4172/2161-0460.1000275. Epub 2016 Oct 25. https://www.ncbi.nlm.nih.gov/pubmed/27882265

Focosi D and Amabile G. Induced Pluripotent Stem Cell-Derived Red Blood Cells and Platelet Concentrates: From Bench to Bedside. Cells. 2017 Dec 27;7(1). pii: E2. doi: 10.3390/cells7010002. https://www.ncbi.nlm.nih.gov/pubmed/29280988

Gaignerie A, Lefort N, Rousselle M, Forest-Choquet V, Flippe L, Francois-Campion V, Girardeau A, Caillaud A, Chariau C, Francheteau Q, Derevier A, Chaubron F, Knöbel S, Gaborit N, Si-Tayeb K, David L. Urine-derived cells provide a readily accessible cell type for feeder-free mRNA reprogramming. Sci Rep. 2018 Sep 25;8(1):14363. doi: 10.1038/s41598-018-32645-2. https://www.ncbi.nlm.nih.gov/pubmed/30254308

Pomeroy JE, Hough SR, Davidson KC, Quaas AM, Rees JA, Pera MF. Stem Cell Surface Marker Expression Defines Late Stages of Reprogramming to Pluripotency in Human Fibroblasts. Stem Cells Transl Med. 2016 Jul;5(7):870-82. doi: 10.5966/sctm.2015-0250. Epub 2016 May 9. https://www.ncbi.nlm.nih.gov/pubmed/27160704

Kogut I et al. High-efficiency RNA-based reprogramming of human primary fibroblasts. Nat Commun. 2018 Feb 21;9(1):745. doi: 10.1038/s41467-018-03190-3. https://www.ncbi.nlm.nih.gov/pubmed/29467427

Malik N and Rao MS. A review of the methods for human iPSC derivation. Methods Mol Biol. 2013;997:23-33. doi: 10.1007/978-1-62703-348-0_3. https://www.ncbi.nlm.nih.gov/pubmed/23546745

McKinney CE. Using induced pluripotent stem cells derived neurons to model brain diseases. Neural Regen Res. 2017 Jul;12(7):1062-1067. doi: 10.4103/1673-5374.211180. https://www.ncbi.nlm.nih.gov/pubmed/28852383

Omole AE and Fakoya AOJ. Ten years of progress and promise of induced pluripotent stem cells: historical origins, characteristics, mechanisms, limitations, and potential applications. PeerJ. 2018 May 11;6:e4370. doi: 10.7717/peerj.4370. eCollection 2018. https://www.ncbi.nlm.nih.gov/pubmed/29770269

Takahashi K and Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006 Aug 25;126(4):663-76. Epub 2006 Aug 10. https://www.ncbi.nlm.nih.gov/pubmed/16904174

Vitrac A and Cloëz-Tayarani I. Induced pluripotent stem cells as a tool to study brain circuits in autism-related disorders. Stem Cell Res Ther. 2018 Aug 23;9(1):226. doi: 10.1186/s13287-018-0966-2. https://www.ncbi.nlm.nih.gov/pubmed/30139379

Warren L et al. Highly efficient reprogramming to pluripotency and directed differentiation of human cells with synthetic modified mRNA. Cell Stem Cell. 2010 Nov 5;7(5):618-30. doi: 10.1016/j.stem.2010.08.012. Epub 2010 Sep 30. https://www.ncbi.nlm.nih.gov/pubmed/20888316

Wu S et al. On the Viability and Potential Value of Stem Cells for Repair and Treatment of Central Neurotrauma: Overview and Speculations. Front Neurol. 2018 Aug 13;9:602. doi: 10.3389/fneur.2018.00602. eCollection 2018. https://www.ncbi.nlm.nih.gov/pubmed/30150968

Zhou H, Martinez H, Sun B, Li A, Zimmer M, Katsanis N, Davis EE, Kurtzberg J, Lipnick S, Noggle S, Rao M, Chang S. Rapid and Efficient Generation of Transgene-Free iPSC from a Small Volume of Cryopreserved Blood. Stem Cell Rev. 2015 Aug;11(4):652-65. doi: 10.1007/s12015-015-9586-8. https://www.ncbi.nlm.nih.gov/pubmed/25951995

Child Neurology Meeting: Bernard Sachs Award to William B. Dobyns and Precision Medicine in Epilepsy Lecture

The Child Neurology Society honored William (Bill) Dobyns for his highly impactful efforts in characterizing child neurology. Looking back on a prolific and highly influential career, Bill talked about how the key influencers in his life shaped the directions he pursued.  Reinforcing the title of his talk, “The Names of Things,” Bill was the first to discover and name the LIS1 gene of lissencephaly (Dobyns 1993).

What’s in a name?

A spicy little comment was made by Bill. He let it be known he did not appreciate the HUGO Nomenclature Committee’s decision to rename LIS1 to PAFAH1B1. It is definitely a mouthful, and a change that does not seem necessary, since LIS1 appears to cause no confusion in PubMed literature searches.

To worm it or not – Can comparative biology help?

It appears gene humanization in the nematode may be of use in characterizing lissencephaly genes. A quasi-random sampling of 9 genes mentioned in Bill’s presentation was obtained (all the ones I wrote down in my notes!). Next, a table of properties was made:

We first ask: what is the homology to C. elegans? 8 of 9 genes are identified as homologs. All but AUTS2 are found to have a worm gene equivalent, as scored for similarity percentage using the DIOPT website. Next, we ask: which of these homologs have sufficient similarity for application of a gene-swap technique? Using inspiration from Douglas Adams for a rather arbitrary cutoff of 42, we get 6 of 8 homologs with enough similarity that gene substitution with human cDNA has a chance of retaining function and rescuing a null phenotype. We are currently 3 for 3 in genes tried for gene-swap in epilepsy: STXBP1, KCNQ2, and CACNB4 can all function and rescue their respective C. elegans null. Finally, we ask: is there something to rescue in the Dobyns gene list? 4 of the 6 good homologs (LIS1, MTOR, ZIC1 and PIK3R2) are found to be lethal when removed by either genetic deletion or RNAi knock-down.  The remaining two genes (ATK2 and PC2CA), although not appearing to be lethal, have a detectable phenotypic consequence when their homolog is absent in the nematode.  Bottom line: 6 of 9 have good prospects for using comparative biology to answer variant pathogenicity questions.

Symposium IV: Precision Medicine – Epilepsy, The Next Frontier

This session was of high interest to the geneticist wanting to design high-throughput animal models for use in rare disease discovery. Jeffery Loeb opened with a “Big Data” discussion on linking electrode implantation in the brain with various properties of tissue biopsies (genomic, transcriptomic, metabolomic, and histological data). It was a fascinating and bewildering look at the 1700 genes that are associated with bursting/spiking phenomena in the brain. Next was Madhuri Hegde, who took us on a tour of the latest and greatest in whole genome analysis: basically, what is happening in the 5342 OMIM-identified disease genes. Much of the answer is that we do not know yet, because there is an ever-expanding universe of Variants of Uncertain Significance (VUS), fueled by the fact that each person averages 3,000,000 differences in sequence when compared to the reference human genome (Hegde 2017).  With 1% of the genome as coding, this suggests about 30,000 variations occur in exonic coding sequence.  The amount of this coding sequence variation that is known to be pathogenic is much less.  Filtering out high-frequency alleles and checking against databases where the variant call is either pathogenic or likely-pathogenic, we get an average of 2.1 pathogenic/likely-pathogenic variants per person (Rego 2018). The same data reveal that each person can expect another 17.6 variant alleles that are either VUS or unassigned in status.
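The back-of-the-envelope arithmetic in that paragraph can be checked in a few lines. This is a rough sketch using only the figures quoted above; the variable names are ours, and setting the Rego counts directly against the exonic estimate is an assumption made for illustration, not something the cited sources state.

```python
VARIANTS_PER_GENOME = 3_000_000       # differences vs. reference, as quoted
CODING_PERCENT = 1                    # roughly 1% of the genome is coding
PATHOGENIC_PER_PERSON = 2.1           # pathogenic/likely-pathogenic, as quoted
VUS_OR_UNASSIGNED_PER_PERSON = 17.6   # VUS or unassigned, as quoted

# Integer arithmetic avoids floating-point noise in the headline number.
coding_variants = VARIANTS_PER_GENOME * CODING_PERCENT // 100

# Assumption: the per-person counts are comparable to the exonic estimate.
flagged = PATHOGENIC_PER_PERSON + VUS_OR_UNASSIGNED_PER_PERSON
flagged_fraction = flagged / coding_variants

print(f"Estimated coding variants per person: {coding_variants:,}")
print(f"Flagged (P/LP + VUS/unassigned): {flagged:.1f}")
print(f"Share of coding variants flagged: {flagged_fraction:.4%}")
```

Under these assumptions, fewer than 1 in 1,000 coding variants per person ends up flagged, and most of those flagged are VUS or unassigned rather than known pathogenic.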

Demonstration of rapid pathogenicity assessment of a VUS allele.

The next presentation was a great tag-team duo of Alfred George and John Millichap speaking on the rapid pathogenicity assessment of a VUS in the KCNQ2 gene. Its title, “Emerging Paradigms for Precision Medicine in Early Life Epilepsy: Bed-to-Bench Workflow,” exemplified where genomic biology is now starting to focus. They showed data for a three-week turn-around in providing functional data on a patient variant.  Their data demonstrated that a newly discovered variant in KCNQ2 behaves as a loss-of-function allele.  In a very short time, from genomic data to clinically actionable results, the patient had a genetic mechanism revealed for their illness, and drug treatment options became clearer. This same variant, by the way, was assessed as a VUS 3 months later, when the genomics lab finished applying the traditional ACMG-AMP guideline assessment criteria.  Clearly, functional studies deployed quickly after genomic data acquisition can be quite impactful in helping the clinician and their patient find treatment choices.

Time to Get Organized

Finally, wrapping up what proved to be a lively lecture series and a very long Q&A afterward, Anne Berg gave us a plea for coming together as a community and doing what has been done in cancer, and with Spinraza for SMA: find ways to work together, get DNA diagnostics done as quickly as possible, and then screen variations for functional consequence.  There is a big need for an organized effort in epilepsy.  Early intervention in neurological diseases has the capacity to create a dramatic improvement in outcome. Anne provided us some data to help understand how big the problem is:

The size and coordination of effort in cancer has recently been giving us miraculous gains. For some types of cancer, what was nearly a death sentence 10-20 years ago is now, in many of today’s oncology practices, a responsive and well-controlled disease. Anne feels it is time for a profound shift in the way epilepsy is approached and treated. We need to apply the same rigor and interactivity that drove the success of cancer biology understanding. When it comes to pediatric populations, cancer is less frequent than epilepsy.  There are 29,000 new pediatric epilepsy cases each year, while pediatric cancer cases occur at 13,000 per year, a rate more than 2x lower.  The complexity of disease types in epilepsy and cancer is of similar size, so systematically fostering the level of researcher-clinician interaction seen in cancer biology should be highly productive for epilepsy understanding and discovery.  Ultimately, the epilepsy ecosystem is ripe for new approaches that can adequately and rapidly address the complexity of epilepsy biology and uncover new therapeutic modalities.

Dobyns WB et al. Lissencephaly. A human brain malformation associated with deletion of the LIS1 gene located at chromosome 17p13. JAMA. 1993 Dec 15;270(23):2838-42. https://www.ncbi.nlm.nih.gov/pubmed/7907669

Hegde M, Santani A, Mao R, Ferreira-Gonzalez A, Weck KE and Voelkerding KV. Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease. Arch Pathol Lab Med. 2017 Jun;141(6):798-805. doi: 10.5858/arpa.2016-0622-RA. Epub 2017 Mar 31. https://www.ncbi.nlm.nih.gov/pubmed/28362156

Shannon Rego, Orit Dagan-Rosenfeld, Wenyu Zhou, M. Reza Sailani, Patricia Limcaoco, Elizabeth Colbert, Monika Avina, Jessica Wheeler, Colleen Craig, Denis Salins, Hannes L. Rost, Jessilyn Dunn, Tracey McLaughlin, Lars M. Steinmetz, Jonathan A. Bernstein, Michael P. Snyder.
High Frequency Actionable Pathogenic Exome Mutations in an Average-Risk Cohort. bioRxiv 151225; doi: https://doi.org/10.1101/151225

Worming into Relevance – Human disease models in the C. elegans nematode

The philosopher Friedrich Nietzsche once said:

“You have evolved from worm to man, but much within you is still worm.”

Genetic diversity among individuals and between species is responsible for the bewildering variability and biological niche adaptation of life, yet many of the essential genes involved in disease presentation are highly conserved from yeast to humans. For instance, a direct comparison of the C. elegans genome to that of H. sapiens reveals only 44% of the genes are similar (Shaye and Greenwald 2011).  Yet when one restricts the comparison to the 6460 genes known to be associated with genetic disease (about 1/3rd of the human genome), clear similarity (orthology) to C. elegans occurs for 79% of the human disease genes (ClinVar database). This high degree of interspecies conservation between worm and human has recently become more recognized and appreciated for use in understanding disease biology (Golden 2017, Wang 2017, Wangler 2017, Apfeld and Alpers 2018).
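To put those percentages into gene counts, a quick calculation helps. This is a minimal sketch using only the numbers quoted above; the variable names are ours, and the implied total gene count is simply what "1/3rd" works out to, not a curated figure.

```python
DISEASE_GENES = 6460          # human disease-associated genes, as quoted
ORTHOLOG_PERCENT = 79         # share with a clear C. elegans ortholog, as quoted
DISEASE_SHARE_OF_GENOME = 3   # disease genes are about 1/3rd of human genes

# Disease genes that could, in principle, be studied through a worm ortholog.
genes_with_worm_ortholog = round(DISEASE_GENES * ORTHOLOG_PERCENT / 100)

# Total human gene count implied by the "1/3rd" statement (an approximation).
implied_total_genes = DISEASE_GENES * DISEASE_SHARE_OF_GENOME

print(f"Disease genes with a worm ortholog: about {genes_with_worm_ortholog:,}")
print(f"Implied human gene count: about {implied_total_genes:,}")
```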

Undiagnosed Disease

The Undiagnosed Disease Network (UDN) has been making significant strides in rare disease research and places an emphasis on developing animal models for use as tools of variant biology discovery (Splinter 2018). Tim Schedl at Washington University in St. Louis is a recent addition to the UDN.  Tim will run the C. elegans nematode Model Organisms Screening Center (MOSC). The worm is proving to be a useful tool for uncovering disease-related biology (Bend 2016, Wangler 2017, Luo 2017, Chao 2017, Oláhová 2018, Liu 2018, Guiberson 2018). Perhaps the biggest place where the nematode holds promise is its proven capacity for high-throughput screening (Leung 2013, Rangaraju 2015, O’Reilly 2016, Lucanic 2018, Partridge 2018).  The Leung publication achieved an impressive 340,000 compounds tested in 1536-well format in 5 weeks.  If we can find systems that accurately model human disease in the nematode animal model, we will have developed a system ripe for high-throughput discovery of new pharmaceuticals.
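To get a feel for what that screening throughput means at the bench, the quoted numbers can be turned into plates and weekly rates. This is a rough sketch only: it assumes one compound per well with no control wells or replicates, which is our simplification and not a description of the Leung protocol.

```python
import math

COMPOUNDS = 340_000      # compounds screened, as quoted
WELLS_PER_PLATE = 1536   # plate format, as quoted
WEEKS = 5                # screen duration, as quoted

# Assumption: one compound per well, no controls, no replicates.
min_plates = math.ceil(COMPOUNDS / WELLS_PER_PLATE)
compounds_per_week = COMPOUNDS / WEEKS
plates_per_week = min_plates / WEEKS

print(f"Minimum plates needed: {min_plates}")
print(f"Compounds per week: {compounds_per_week:,.0f}")
print(f"Plates per week: {plates_per_week:.1f}")
```

Any real screen would need more plates than this lower bound once controls and repeats are included.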

Can we model clinical variant biology in the worm?

Figure 1. A) A clinical variant is installed into the native homolog of the nematode. B) Position of STXBP1 pathogenic variants in the nematode unc-18 gene.

Using CRISPR and related techniques, it is now quite easy to insert DNA changes into the genome of animal models. For example, our company just hit a milestone of 2000 transgenics delivered. But the puzzle now becomes: are we doing it right? What is the best transgenic configuration to address disease biology?  We can put a clinical variant in the native gene of the nematode (Figure 1A). Although there is a large amount of similarity when making gene-to-gene connections in disease genes, a question arises: is there enough similarity at the amino acid level?  When one maps known pathogenic variants to their locations in the amino acid sequence, one occasionally finds that established or likely-pathogenic variants do not occur at a conserved position.  For instance, in a sequence occurring in the middle of the C. elegans homolog of STXBP1, three variants suspected of pathogenicity are shown (Figure 1B).  The patient variants R406H and L426P are at conserved positions in the worm unc-18 homolog. Unfortunately, M443R does not have the identical amino acid in C. elegans.  If we insert a Methionine (“M”) in place of the worm’s Alanine, will we get function?  …Probably.  If we put in an Arginine (“R”) instead, will the gene work? …Probably not.  Yet we are left with a bit of ambiguity because the residue at that location is not identical between species. A more robust way to model human biology is to put the entire human gene into the nematode genome.  To explore this concept, we have developed a technique that we call “gene-swap humanization” (Figure 2). We remove the endogenous gene of the worm and replace it with a cDNA copy of the human gene.  This creates a gene-humanized animal.

Figure 2. The human gene is swapped in for the native sequence, and the animal is then observed to see if function is retained.

We have only just started, but we have uncovered an interesting finding: gene-swap humanization appears to be better than the native locus as a platform for modeling disease variant biology.  We decided to compare variant installs in the native locus to installs in the humanized locus. The main driver to try this out was sequence similarity or, more precisely, the lack of it.  When one compares human to mouse for the STXBP1 gene, one sees 100% identical amino acid usage between the two species.  Contrast this with the human sequence compared to the worm’s unc-18 gene, where one sees only 59% identical amino acid usage.  With nearly half the amino acids not being fully conserved, can we model in the native locus effectively, or….
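For readers who want to see how an "identical amino acid usage" figure is derived, here is a minimal sketch of a percent-identity calculation over two pre-aligned protein sequences. The function name and the toy sequences are hypothetical (they are not STXBP1 or unc-18), and the choice to skip gapped columns is only one of several conventions in use, so values from alignment tools may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of non-gap alignment columns where both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Skip columns where either sequence has a gap.
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy, made-up sequences for illustration only.
toy_human = "MKVLAAGR"
toy_worm = "MKILASGR"
print(percent_identity(toy_human, toy_worm))  # 75.0
```

A per-column version of the same comparison is what tells you whether a given patient variant sits at a conserved position in the worm protein.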

What will happen if we swap-in the human gene as a gene replacement of the worm gene?

The new gene editing CRISPR technology is allowing us to address this question. Genes are now quite easy to order up as fully synthetic.  The DNA sequence comes in a plasmid format that we can inject, with the right CRISPR components, directly into the gonad of the worm. The progeny of the animal will have a high propensity to have picked up the genome edit. We use PCR technology to find the edited animal and then grow up the population for testing in various activity assays (Figure 3).  From concept to variant-specific animal data in less than one month.

Figure 3. Comparison of native vs humanized locus. Electrophysiology of A) native vs B) humanized locus. Chemotaxis assay of C) native vs D) humanized locus.

What we found fascinating were two findings:

1. Human gene retains function in worm.

2. Variants in human gene render the worm sensitized for pathogenicity assessment.

That second observation begs more explanation.  In Figure 3, you will see two types of phenotyping assays performed on three known pathogenic variants of STXBP1 (R292H, R406H and R388X). When we put these patient variants into the native locus at their sequence-conserved positions, we see that only R388X gives a big deviation of function.  In contrast, when we put these variants in the gene-swapped hSTXBP1 locus, we see that all variants exhibit a statistically significant deviation of function in both assays. As a result, it appears the gene-swap humanized locus is the better choice for modeling clinical variants.

“What about that first observation?” Let’s not overlook that one! This shows that for this particular protein, its physiological role is highly conserved between worm and man. She/he (worms are self-fertilizing hermaphrodites – more on that later…) is giving data that suggest modeling variants in humanized worms can be useful as a platform for variant biology discovery!

Figure 4. Assay differences and similarities within and between genes. A) Electrophysiology assays on STXBP1 exhibit gain-of-function and loss-of-function phenotypes. B) Locomotion assay detects some loss-of-function activity. C) Chemotaxis assay sees all variants as loss-of-function alleles. D) KCNQ2 electrophysiology sees gain of function when the kqt-1 homolog is removed.

We have found one more interesting result in our humanized strain studies.  The types of phenotyping assays used on a clinical variant will depend on the unique biology of that variant.  In Figure 4, we have three different activity assays.  What should jump out at you is that each assay has a unique response. The R292H and R406H variants have a gain-of-function response in electrophysiology, while R388X has lost function.  In the next assay, the animal locomotion measurement, we see wild-type activity in R292H, while R406H and R388X exhibit movement deficiencies.  In the last, highly sensitive chemotaxis assay, only R292H retains an ability to sniff out and go to food; the remaining two, R406H and R388X, are not capable of getting to food in the 1 hr test period of the assay.  One last observation for this figure: there is high similarity in electrophysiology response between the KCNQ2 knock-out and the R292H and R406H variants in STXBP1. All three (the KCNQ2 knock-out and the two STXBP1 variants) show a gain-of-function response.  This is expected if all these variants lead to M-current modulation of potassium channels. What we speculate, based on prior findings (Devaux 2017), is that R292H and R406H are defective in the chaperone function for syntaxin.  They no longer prevent syntaxin from being an inhibitor of potassium channels, and effectively these STXBP1 variants have a physiological mode of action leading to potassium channel inhibition. Prediction: ezogabine will be useful in patients with hSTXBP1(p.R292H) and hSTXBP1(p.R406H) as a treatment for reducing the level and intensity of their epileptic seizures, but will probably be contraindicated for hSTXBP1(p.R388X) clinical variants.

Biosensors

Figure 5. Genetically-encoded biosensors of transcriptional activity induced by patient variant.

The last component of this blog post is to discuss formats for using the animals in high-throughput drug screening. Loss of locomotion can be a powerful screen, but what might be more sensitive is the use of genetically encoded biosensors. The biosensor formats we favor most are the transcriptional-activation reporter systems (Figure 5). Typically these are GFP/RFP combos where one reporter is driven by a promoter that is either up-regulated or down-regulated in a particular genetic background. Loss-of-function alleles can have a myriad of gene transcriptional effects, and biosensor reporters have previously been used to dissect physiological pathway interactions in C. elegans (Urano 2002, Lea 2005, Kahn 2008).

Figure 6. Two biosensors 

We have used biosensors to probe drug response in nuclear hormone receptor signaling (Figure 6). We compared biosensor responses to assays of behavior. Perhaps what is most striking about the data is the sensitivity. In the dafadine assay we see nearly 10x higher sensitivity with the biosensor-based assay. From a drug discovery standpoint, this means 10x less compound is needed in order to see an effect. For the dafachronic acid assay, the gain is even larger: the biosensor is 30x more sensitive than the biological activity output. For those who do drug discovery, the application of biosensors to clinical-variant profiling will be worth the investment in biosensor construction, because cost savings will come from reduced consumption of the expensive compounds used in a drug library.

Apfeld J et al. What Can We Learn About Human Disease from the Nematode C. elegans? Methods Mol Biol. 2018;1706:53-75. doi: 10.1007/978-1-4939-7471-9_4. https://www.ncbi.nlm.nih.gov/pubmed/29423793

Bend E et al. NALCN channelopathies: Distinguishing gain-of-function and loss-of-function mutations. Neurology. 2016 Sep 13;87(11):1131-9. doi: 10.1212/WNL.0000000000003095. https://www.ncbi.nlm.nih.gov/pubmed/27558372

Devaux, J. et al. A possible link between KCNQ2- and STXBP1-related encephalopathies: STXBP1 reduces the inhibitory impact of syntaxin-1A on M current. Epilepsia. 2017 Dec;58(12):2073-2084. doi: 10.1111/epi.13927. https://www.ncbi.nlm.nih.gov/pubmed/29067685

Golden A. From phenologs to silent suppressors: Identifying potential therapeutic targets for human disease. Mol Reprod Dev. 2017 Nov;84(11):1118-1132. doi: 10.1002/mrd.22880. Epub 2017 Oct 3. https://www.ncbi.nlm.nih.gov/pubmed/28834577

Kahn NW et al. Proteasomal dysfunction activates the transcription factor SKN-1 and produces a selective oxidative-stress response in Caenorhabditis elegans. Biochem J. 2008 Jan 1;409(1):205-13. https://www.ncbi.nlm.nih.gov/pubmed/17714076

Leung CK et al. An ultra high-throughput, whole-animal screen for small molecule modulators of a specific genetic pathway in Caenorhabditis elegans. PLoS One. 2013 Apr 29;8(4):e62166. doi: 10.1371/journal.pone.0062166. Print 2013. https://www.ncbi.nlm.nih.gov/pubmed/23637990

Liu, N. et al. Functional variants in TBX2 are associated with a syndromic cardiovascular and skeletal developmental disorder. Hum Mol Genet. 2018 Jul 15;27(14):2454-2465. doi: 10.1093/hmg/ddy146. https://www.ncbi.nlm.nih.gov/pubmed/29726930

Lucanic M et al. A Simple Method for High Throughput Chemical Screening in Caenorhabditis Elegans. J Vis Exp. 2018 Mar 20;(133). doi: 10.3791/56892. https://www.ncbi.nlm.nih.gov/pubmed/29630057

Luo X et al. Clinically severe CACNA1A alleles affect synaptic function and neurodegeneration differentially. PLoS Genet. 2017 Jul 24;13(7):e1006905. doi: 10.1371/journal.pgen.1006905. eCollection 2017 Jul. https://www.ncbi.nlm.nih.gov/pubmed/28742085

Oláhová, M. et al. Biallelic Mutations in ATP5F1D, which Encodes a Subunit of ATP Synthase, Cause a Metabolic Disorder. Am J Hum Genet. 2018 Mar 1;102(3):494-504. doi: 10.1016/j.ajhg.2018.01.020. https://www.ncbi.nlm.nih.gov/pubmed/29478781

O’Reilly L et al. High-Throughput, Liquid-Based Genome-Wide RNAi Screening in C. elegans. Methods Mol Biol. 2016;1470:151-62. doi: 10.1007/978-1-4939-6337-9_12. https://www.ncbi.nlm.nih.gov/pubmed/27581291

Partridge F et al. An automated high-throughput system for phenotypic screening of chemical libraries on C. elegans and parasitic nematodes. Int J Parasitol Drugs Drug Resist. 2018 Apr;8(1):8-21. doi: 10.1016/j.ijpddr.2017.11.004. https://www.ncbi.nlm.nih.gov/pubmed/29223747

Rangaraju S et al. High-throughput small-molecule screening in Caenorhabditis elegans. Methods Mol Biol. 2015;1263:139-55. doi: 10.1007/978-1-4939-2269-7_11. https://www.ncbi.nlm.nih.gov/pubmed/25618342

Rea SL et al. A stress-sensitive reporter predicts longevity in isogenic populations of Caenorhabditis elegans. Nat Genet. 2005 Aug;37(8):894-8. Epub 2005 Jul 24. https://www.ncbi.nlm.nih.gov/pubmed/16041374

Shaye DD and Greenwald I. OrthoList: a compendium of C. elegans genes with human orthologs. PLoS One. 2011;6(5):e20085. doi: 10.1371/journal.pone.0020085. Epub 2011 May 25. https://www.ncbi.nlm.nih.gov/pubmed/21647448

Splinter K et al. Effect of Genetic Diagnosis on Patients with Previously Undiagnosed Disease. N Engl J Med. 2018 Oct 10. doi: 10.1056/NEJMoa1714458. https://www.ncbi.nlm.nih.gov/pubmed/30304647

Urano F et al. A survival pathway for Caenorhabditis elegans with a blocked unfolded protein response. J Cell Biol. 2002 Aug 19;158(4):639-46. Epub 2002 Aug 19. https://www.ncbi.nlm.nih.gov/pubmed/12186849

Wang J et al. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome. Am J Hum Genet. 2017 Jun 1;100(6):843-853. doi: 10.1016/j.ajhg.2017.04.010. Epub 2017 May 11. https://www.ncbi.nlm.nih.gov/pubmed/28502612

Wangler, M. F. et al. Model Organisms Facilitate Rare Disease Diagnosis and Therapeutic Research. Genetics. 2017 Sep;207(1):9-27. doi: 10.1534/genetics.117.203067. https://www.ncbi.nlm.nih.gov/pubmed/28874452

Zhou, P. et al. Novel mutations and phenotypes of epilepsy-associated genes in epileptic encephalopathies. Genes Brain Behav. 2018 Jan 4. doi: 10.1111/gbb.12456. https://www.ncbi.nlm.nih.gov/pubmed/29314583

KCNQ2 Cures Summit – Patient Family Passion and Involvement

The passion to find answers is inspiring.

I had the good fortune to attend the KCNQ2 Cure Summit 2018. Meeting the patient families and seeing their passion to find answers brings meaning and urgency to the work I have been doing to develop high-throughput drug screening platforms.

image Credit: Chris Hopkins with permission of KCNQ2 patient family gathering

What can we do to help these families?

The first step I have been taking is to gain more understanding. KCNQ2 Cure is a patient organization that strives to raise “research funds for KCNQ2 epileptic encephalopathy, a rare and catastrophic form of epilepsy beginning in the first days of life.” As the name implies, the foundation focuses on how DNA sequence variation in the KCNQ2 gene does or does not cause epilepsy. For those variations that cause disease (pathogenic), the foundation supports research to find new treatments and therapies. As I previously blogged, 21% of the 855 KCNQ2 variants in the NCBI database (https://www.ncbi.nlm.nih.gov/variation/view) are known or suspected to be pathogenic. About another 10% are known to be benign or likely benign. That leaves 69% of variants for which we are scratching our heads and feeling uncertain, or about which nothing is known. The question facing the expecting parent is “What is the likelihood I will be passing on a disease to my child?”

For the expecting parent, what becomes important is to know the status of the KCNQ2 gene being passed on to the child. There are two ways to pass the genetic disease to our children. Some of the KCNQ2 variants are recessive: both copies of the gene need to carry recessive pathogenic alleles in order for disease to manifest. It may be that the father provides a chromosome where the KCNQ2 gene is defective with a pathogenic variation. The mother’s copy of KCNQ2 may contain a variant of uncertain significance (a “VUS” allele). The chance that epilepsy will manifest in the child remains uncertain because of the uncertainty about the mother’s VUS allele.

Yet a majority of variants in KCNQ2 are autosomal dominant. Unlike a recessive allele, only one bad copy is needed for the disease to manifest. In these cases, the mutation is almost never detected at a high level in the parent. Instead of being passed down by the parent, the pathogenic variant is thought to have occurred by random chance and to become manifest at conception. These are considered to be de novo variations because they are not detected in the chromosomes of either parent: both mom and dad appear to be homozygous wild-type. In many of these cases, a random mutation occurred when either the dad’s sperm or the mother’s egg was formed. This happens because replication of DNA is not perfect. There is a very low frequency of DNA replication errors, close to 1 in every 1,000,000,000 base pairs (Pray 2008). With roughly 3 billion base pairs in a haploid genome, that means each gamete (sperm or egg) has at least 3 errors somewhere in the genome. Thankfully 98.8% of the genome is non-coding, so a change in coding sequence is quite rare. Add in that the genome contains about 20,000 genes, and it becomes very rare that the KCNQ2 gene picks up a mutation by chance.
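For readers who like to see the numbers, the back-of-the-envelope reasoning above can be written out as a short calculation. This is only a rough sketch and not a population-genetics model. The haploid genome size of 3 billion base pairs and the idea that coding changes are spread evenly across all genes are simplifying assumptions added here, and the function name is purely illustrative.

```python
# Rough estimate of how often a single gamete carries a new coding
# change in one particular gene (for example KCNQ2).
# Assumptions added for illustration: a haploid genome of 3e9 bp and
# coding changes spread evenly across all genes.

def de_novo_estimates(error_rate=1e-9, genome_bp=3e9,
                      coding_fraction=0.012, gene_count=20_000):
    errors_per_gamete = error_rate * genome_bp           # new errors per gamete
    coding_errors = errors_per_gamete * coding_fraction  # errors that land in coding sequence
    per_gene = coding_errors / gene_count                # errors that land in one given gene
    return errors_per_gamete, coding_errors, per_gene

errors, coding, per_gene = de_novo_estimates()
print(f"errors per gamete:        {errors:.1f}")
print(f"coding errors per gamete: {coding:.3f}")
print(f"per-gene chance:          about 1 in {round(1 / per_gene):,}")
```

With these inputs the estimate comes out near 3 new errors per gamete, and on the order of one gamete in half a million carrying a new coding change in any one given gene. Real genes differ in size and mutability, so treat this as an order-of-magnitude guide only.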

What are the chances of that!

image Credit: rosendahl at flikr

Mosaic in the parent

Yet some conditions assigned as de novo occur at a much higher frequency than they should. A very interesting phenomenon illustrated multiple times by the speakers at the KCNQ2 Cure Summit was mosaicism. If getting a de novo variation is an ultra-rare “roll-of-the-dice,” how is it that a surprising number of the KCNQ2 families with “de novo” origin have more than one affected sibling? That should just not happen by chance if DNA replication is as good as we have measured it to be in laboratory tests. A big component of the answer to that puzzle turns out to be mosaicism (Weckhuysen 2012, Milh 2015, Mulkey 2017). Let’s imagine a situation where the dad picked up a random pathogenic mutation shortly after he was conceived by his parents. If the error occurred during the 3rd cell division in utero, then 1/8th of his cells would carry the pathogenic variant. Now, as a grown man and expecting father, 1 in 16 of his sperm harbor the pathogenic variant, because each carrier cell has the variant on only one of its two chromosome copies. There is a significant (non-rare) risk he can pass the variant on to more than one of his children. Yet, because more than 80% of his cells do not have the variant, he remains dosage compensated and unaffected. His child’s variation appears “de novo,” but it is actually being inherited at a less than Mendelian frequency. Dr. Ingrid Scheffer, who is helping write the definitions of what epilepsy is (Scheffer 2017), gave a presentation suggesting that in nearly 8% of de novo assignments the proband’s mom and/or dad shows evidence of mosaicism. She suggests that high-depth exome sequencing should become more important to clinicians in helping them reach an accurate diagnosis.
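The mosaicism arithmetic in this example can be sketched in a few lines. The sketch assumes, as a simplification, that a mutation arising in one cell at a given division ends up in the same share of all later cells, germline included. Both function names are hypothetical and exist only for this illustration.

```python
# Sketch of the mosaic-father example, under the simplifying assumption
# that the share of carrier cells set at an early division is the same
# in every later tissue, including the germline.

def mosaic_fractions(division):
    cell_fraction = 1 / 2 ** division   # share of cells carrying the variant
    sperm_fraction = cell_fraction / 2  # carrier cells are heterozygous, so half their sperm get it
    return cell_fraction, sperm_fraction

def prob_at_least_two_affected(sperm_fraction, children):
    # Binomial chance that two or more of the children inherit the variant.
    p_none = (1 - sperm_fraction) ** children
    p_one = children * sperm_fraction * (1 - sperm_fraction) ** (children - 1)
    return 1 - p_none - p_one

cells, sperm = mosaic_fractions(3)
print(cells, sperm)  # 0.125 0.0625
print(f"{prob_at_least_two_affected(sperm, 3):.4f}")
```

For a family with three children, the chance of two or more inheriting the variant from a mosaic father like this one works out to roughly 1%. That is small, but it is many orders of magnitude above what two independent replication errors in the same gene would give.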

Enhancer/suppressor effect

Another phenomenon that could alter the presentation of a gene’s phenotype is a compensating variation somewhere else in the genome. This is the classic suppressor effect taught in genetics courses. Another gene, usually upstream in the functional pathway, can have a mutation that suppresses the effect of the variant in question, a condition referred to as epistasis. For instance, a KCNQ2 variant in one person may lead to a pathogenicity that is quite severe, while another person with the same KCNQ2 variant presents with a less severe phenotype due to a compensating mutation somewhere else in the KCNQ2 signaling pathway. Conceptually, if you had a Gain-of-Function (GOF) variant in KCNQ2 and a Loss-of-Function (LOF) variant in either SCN1A or STXBP1, the second variant could counter the bad effect and restore activity towards wild-type behavior. This effect in ion channels has recently been documented and described (Noebels 2017). Dr. Phillip Pearl, the Director of Epilepsy and Clinical Neurophysiology at Boston Children’s Hospital and William G. Lennox Chair and Professor of Neurology at Harvard Medical School, gave a lecture on variant types and how they fall into mild (BFNE) and severe (EE) variant groups. So, severity will be mediated not only by the molecular nature of the variation in the KCNQ2 gene, but also by the profile of background variants that enhance or suppress that variant’s expressivity and penetrance.

If you would like to learn more, see some of the KCNQ2 Cure speaker presentations here

Pray, L. DNA Replication and Causes of Mutation. Nature Education 2008 1(1):214 https://www.nature.com/scitable/topicpage/dna-replication-and-causes-of-mutation-409

Scheffer IE et al. ILAE classification of the epilepsies: Position paper of the ILAE Commission for Classification and Terminology. Epilepsia. 2017 Apr;58(4):512-521. doi: 10.1111/epi.13709. Epub 2017 Mar 8. https://www.ncbi.nlm.nih.gov/pubmed/28276062

Weckhuysen S et al. KCNQ2 encephalopathy: emerging phenotype of a neonatal epileptic encephalopathy. Ann Neurol. 2012 Jan;71(1):15-25. doi: 10.1002/ana.22644. https://www.ncbi.nlm.nih.gov/pubmed/22275249

Milh M et al. Variable clinical expression in patients with mosaicism for KCNQ2 mutations. Am J Med Genet A. 2015 Oct;167A(10):2314-8. doi: 10.1002/ajmg.a.37152. Epub 2015 May 10. https://www.ncbi.nlm.nih.gov/pubmed/25959266

Mulkey SB et al. Neonatal nonepileptic myoclonus is a prominent clinical feature of KCNQ2 gain-of-function variants R201C and R201H. Epilepsia. 2017 Mar;58(3):436-445. doi: 10.1111/epi.13676. Epub 2017 Jan 31. https://www.ncbi.nlm.nih.gov/pubmed/28139826

Noebels J. Precision physiology and rescue of brain ion channel disorders. J Gen Physiol. 2017 May 1;149(5):533-546. doi: 10.1085/jgp.201711759. Epub 2017 Apr 20. https://www.ncbi.nlm.nih.gov/pubmed/28428202

Genome Data Suppliers

When will you decide it is time to give a spit?

It will not take too much effort. You order a kit, contribute a batch of spittle into a receptacle cup, then send it away for DNA analysis.

The most common utility format for this type of DNA testing is ancestry analysis:

imagecredit: Molly K. McLaughlin for PC Reviews Magazine

Ancestry determinations have been the mainstay for early adoption of DNA sequence analysis technology.  Health testing has lagged, primarily due to regulatory concerns, but with the viewpoint starting to support the individual’s right to know, more people are getting DNA analysis for health impact profiling.  The conservative path to recommend in obtaining health-related DNA information is to consult a doctor. And then perhaps guide them to the service you seek.  Many providers may not be up to speed and you will be helping them navigate the path to preventative genomics utilization in their patient populations.  Be pleased when they respond that this is interesting, but they would like for you to consult with a genetic counselor prior to and after the data comes in.

Direct to Consumer

For the highly adventurous among us who want more than ancestry, you can follow the path taken by Tom Petch at Medgadget. In a detailed story of his endeavor to uncover his preventative genomics potential, Tom used whole-genome analysis kits supplied by Dante. He obtained close to 1 Gig of raw data files and a detailed report covering a range of dispositions. He found some intriguing idiosyncrasies explaining his predilection for coffee. This was followed by the more puzzling finding that he has variants that both increase and decrease his risk of Alzheimer’s.

Having the raw data files is highly intriguing to me. There is a treasure trove of information in there that will only be released as time ticks away and our understanding of disease biology increases. Some providers such as Helix offer a plan where you can park your DNA files with them and they will periodically “auto update” you as variant biology upgrades become available. Veritas is another company offering whole genome analysis, but it is not clear from their website whether I will be handed my VCF files on a flash drive.

For me, I want to keep the adventure heavily under my control. I want access to the raw data. I want to identify and catalog the full depth of my variant profile. The variations that don’t fit the norm are likely to number in the 1000s, and perhaps 10,000s if we include non-coding. To get some of my answers on variant pathogenicity, I will use NIH’s Variation Viewer. This will allow me to understand the significance of many of my variants. The ClinVar Miner and SNPedia databases might also be consulted if a novel variant is in question. Another profoundly interesting tool when you are searching for information on a specific variant is Mastermind by Genomenon. And for even greater detail, I may reach out to my Human Genetics contacts, who will have unique insights and database access to get me even greater resolution. Where I find variants binning to the Uncertain Significance category, I might be motivated enough to make an animal model with the variant in question installed. With the resulting variant avatar created as an animal model, I will then start a set of functional assays to see if my variant of uncertain significance exhibits a certain significant deviation of function.

Dealing with Uncertain Significance of the Genome

“Risky Business”

Most of us are wired to be risk averse. Yet, I have been giving serious contemplation to the “risky business” of having my whole genome be sequenced as a preventative medicine approach to my healthcare.

What will happen if I go there?

I find myself staring into the murky abyss from the edge of the data cliff. A creepy feeling urges in my belly from the depths of my ski-bum days… Should I….

“Huck-your-meat”

photocredit: Bradly J. Boner for Jackson Hole Magazine

Upon landing, I will need to “spoon-my-tracks” to make the data as interpretable and informative as possible.

photocredit: James Fagedes at Foothill freak

Mental Health

There is plenty of reason to be cautious. A recent publication in the Hastings Center Report urges a high level of caution (Johnston 2018). Although we can be somewhat dismissive of their dismissiveness (the cost of whole genome sequencing is dropping dramatically and phenotyping technology is rapidly improving), one observation is likely to hold true for a while:

“Given the psychosocial costs of predicting one’s own or one’s child’s future life plans based on uncertain [Genomic] testing results, we think the hope and optimism deserve to be tempered.”

So it is clear there will be quite a bit of uncertainty when one opens the Pandora’s box of the genome, but hope and optimism will remain. Whether there are clearly actionable results or frustrating uncertainty, the knowledge gained means there are things to be done, platforms to build, and cures to be discovered.

Not If, but When

Caution keeps us safe. But safe for how long? By not “going there” we might just be deluding ourselves about the inevitable. At some point, we will have a deep understanding of the consequences of genome variation. The first to fall into line will be variants delivering functional consequences on the monogenic side of the spectrum. These will be the easiest to model and to probe for biological consequence, because the variant will have a clear and sometimes deterministic effect on life quality and healthspan. More challenging will be the variants whose effects predominate in polygenic contexts. These are the more subtle “risk factor” effects, where the other variants in one’s genome are the influencers that either enhance or suppress the capacity of the risk factor variant to manifest. Adding to the challenge of understanding a risk factor is the influence of external factors, such as diet and exercise, and of more internal factors, such as genomic imprinting and gene methylation status.

Yet it is clear where we are heading. Much of the uncertainty will be resolved and we will soon be living in the genome-actionable era where medicine becomes highly personalized to the individual’s variant profile.  For a glimpse of what the future holds, and if you can make the time for an amazing Rob Reid interview of Dr. Robert Green from Harvard, put the headphones on for the following podcast:

 

1. Johnston J et al. Sequencing Newborns: A Call for Nuanced Use of Genomic Technologies. Hastings Cent Rep. 2018 Jul-Aug;48(2):S2-6. https://onlinelibrary.wiley.com/doi/full/10.1002/hast.874

Properties of Top 20 Epilepsy Genes

Epilepsy Genetics

1 out of 100 persons is living with active epilepsy (Zack and Kobau 2017, WHO 2018). For the subset that can be pinned down to having a genetic cause, there are about 70 or so genes involved in causing the illness (Lindy 2018). Until recently, the frequency of a gene’s association with epilepsy remained unclear. The GeneDX study by the Lindy team is helping bring clarity by analyzing the genetic underpinnings in 8565 patients with active epilepsy.

Positive Cases

From the GeneDX study, we can plot the rank of the top 20 genes in epilepsy (green graph at top of the figure). The SCN1A gene is by far the major source gene for genetically associated epilepsies. SCN1A accounts for 27% of all the positive cases in the top 20 epilepsy genes and 24% of those in all 70 known epilepsy genes. For instance, 322 positive cases occur in SCN1A out of a total of 1181 positive cases in the top 20 genes. Applying the data set to the larger population, the 1181 genetically caused cases out of 8565 (about 14%), combined with the 1-in-100 prevalence of active epilepsy, suggest that close to 1 in 1000 persons is living with gene-induced epilepsy. For SCN1A, the population estimate is 78,000 individuals in the USA, or 1.8 million worldwide, living with pathogenic lesions in their SCN1A gene.
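The population estimates in this section follow from a few multiplications, sketched below. The case counts come from the text above. The population sizes (325 million for the USA and 7.5 billion worldwide) are assumptions added here, chosen as round 2018-era figures, and the variable names are illustrative only.

```python
# Reproduces the prevalence arithmetic from the paragraph above.
# Population sizes are assumptions added for illustration (round 2018-era values).
US_POPULATION = 325_000_000
WORLD_POPULATION = 7_500_000_000

top20_positive = 1181    # positive cases in the top 20 genes
scn1a_positive = 322     # positive cases in SCN1A
patients_tested = 8565   # patients in the GeneDX study

scn1a_share_top20 = scn1a_positive / top20_positive  # about 0.27
genetic_fraction = top20_positive / patients_tested  # about 0.14
genetic_prevalence = 0.01 * genetic_fraction         # 1 in 100 persons has active epilepsy

# The text rounds the genetic prevalence to 1 in 1000 and applies the
# 24% SCN1A share across all 70 genes.
scn1a_us = US_POPULATION * 0.001 * 0.24
scn1a_world = WORLD_POPULATION * 0.001 * 0.24

print(f"SCN1A share of top-20 positives: {scn1a_share_top20:.1%}")
print(f"genetic fraction of tested patients: {genetic_fraction:.1%}")
print(f"genetic prevalence: about {genetic_prevalence * 1000:.1f} per 1000 persons")
print(f"SCN1A estimate, USA: {scn1a_us:,.0f}")
print(f"SCN1A estimate, world: {scn1a_world:,.0f}")
```

Note that the unrounded genetic prevalence lands a little above 1 per 1000, so the 78,000 and 1.8 million figures are on the conservative side of this simple model.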

Total Variants

Another way to look at the rank is to ask how many variants occur in each gene (blue graph). Data gathered from NCBI’s Variation Viewer (https://www.ncbi.nlm.nih.gov/variation/view) show a large difference in the number of observed variants per gene. For example, two genes of similar protein size (SCN1A and TSC2) show a 2.5-fold difference in their relative numbers of variants.

Pathogenic Variants

In another view derived from Variation Viewer, we can look at the number of variants known to be pathogenic. For the top 20 epilepsy genes, SCN1A comes out on top again, but the gene with the next-highest number of pathogenic variants is MECP2 (purple graph).

Pathogenic vs total

Finally, when we look at the ratio of pathogenic to total variants, we see some interesting findings. MECP2 shows major sensitivity, with a high pathogenic variant load. Similarly, UBE3A also harbors a high pathogenic variant ratio. Another gene that rates as sensitive is, as expected, SCN1A. But now we also get KCNQ2, CDKL5, STXBP1, SLC2A1, FOXG1, and ARX as variation-sensitive genes.

Favorite genes and an anomaly

A gene for which I hold high passion is STXBP1. This gene is ranked #6 as a likely genetic cause of epilepsy. STXBP1 has 47 pathogenic variants out of 349 (13%). Another gene on our list for humanization development is KCNQ2. This gene is at the #2 position as a frequent cause of epilepsy. KCNQ2 has 855 variants, of which 181 are pathogenic, so it has a pathogenicity load of 21%. The TSC2 gene has a strange result. There are 145 pathogenic variants out of 3445 total, which gives a 4% pathogenicity load. Why are there so many non-pathogenic variants for this gene? Perhaps this gene is highly flexible and can tolerate a high level of variant load. There is a slightly higher proportion of synonymous variants, at 27%. Population frequency in the measured variant pool exceeds 0.0001 for only 12% of the variants, so most of the TSC2 alleles are rare. My other suspicion is that TSC2 is like the BRCA1 gene. A very large number of researchers study each of these genes. This leads to more patients being examined for variants in TSC2 and BRCA1. As a result, these two genes have attained a higher sampling of the variant diversity occurring in the general population.
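The pathogenicity load used throughout this post is simply the number of pathogenic variants divided by the total number of variants for a gene. Here is a small sketch using only the three sets of counts quoted above. The dictionary layout and function name are illustrative choices, not part of any database API.

```python
# Pathogenicity load = pathogenic variants / total variants.
# Counts are the ones quoted in this post: (pathogenic, total).
variant_counts = {
    "STXBP1": (47, 349),
    "KCNQ2": (181, 855),
    "TSC2": (145, 3445),
}

def pathogenic_load(pathogenic, total):
    return pathogenic / total

loads = {gene: pathogenic_load(*counts) for gene, counts in variant_counts.items()}

# Print genes from the highest load to the lowest.
for gene, load in sorted(loads.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene}: {load:.1%}")
# KCNQ2: 21.2%
# STXBP1: 13.5%
# TSC2: 4.2%
```

Ranking genes this way makes the TSC2 anomaly easy to spot: it has by far the most recorded variants of the three, yet the lowest share of pathogenic ones.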

1. Zack MM and Kobau R. National and State Estimates of the Numbers of Adults and Children with Active Epilepsy – United States, 2015. CDC: MMWR Morb Mortal Wkly Rep. 2017 Aug 11;66(31):821-825. doi: 10.15585/mmwr.mm6631a1. https://www.ncbi.nlm.nih.gov/pubmed/28796763

2. (World Health Organization) Epilepsy (access 8/1/2018) http://www.who.int/mediacentre/factsheets/fs999/en/

3. Lindy AS et al. Diagnostic outcomes for genetic testing of 70 genes in 8565 patients with epilepsy and neurodevelopmental disorders. Epilepsia. 2018 May;59(5):1062-1071. doi: 10.1111/epi.14074. Epub 2018 Apr 14. https://www.ncbi.nlm.nih.gov/pubmed/29655203

C. elegans as Fast and Affordable System for Variant Phenotyping

Systems for Functional Studies

A variety of modeling systems can be used to explore variant function. Initially, many researchers turn to a computational approach to aid variant assessments (Eilbeck 2017). A recent bioinformatics study was used to refine the variant classification of voltage-gated sodium channels (SCNs) for their contributions to epilepsy (Hol 2017). Yet many of the SCN variants remained VUS alleles after the bioinformatic refinement, so alternative functional assays are needed to capture a full functional assessment. Techniques applied for obtaining functional data from clinical variants range from bacterial and yeast expression systems to mammalian cell models (Rodríguez-Escudero 2015, Woods 2016). In one example, bacterial expression of recombinant USP26 was used to detect pathological enzymatic activity in one of 18 clinical variants (Liu 2018). Recombinant protein assays can be effective for enzymatic genes, but for disease genes with complex interaction phenotypes, expression in bacteria or yeast removes the gene from its native context and prevents the exploration of pathologies that involve these complex interactions. A mouse or rat animal model is the “gold standard” for finding clear phenotypes from a clinical variant, yet the cost and time spent are more than 10x those of C. elegans animal models. Further, humanized rodent models are expensive to deploy early in drug development (McGonigle 2014). Disease modeling with Induced Pluripotent Stem Cells (iPSC) offers an exciting platform to study clinical variants (Csöbönyeiová 2016), but the removal of the cells from their native context of the intact animal regrettably removes the important effect of a tissue-based environment. 3D cell culturing techniques and organs-on-a-chip can be useful in restoring proper microenvironment context (Breslin 2013), but their ease of use for routine analysis of clinical variants has yet to evolve.

Alternative Animal Models

In an emerging approach used by the Undiagnosed Disease Network (Wang 2017), a variant observed in humans is homology modeled by installing the same amino acid change at a conserved position in the disease gene homologs. For instance, clinical variants in CACNA1A were installed in zebrafish and Drosophila and pathogenic activities were observed (Luo 2017). In a recent publication using C. elegans, CRISPR was used to install a sequence variant from a patient suspected of having CLIFAHDD syndrome (Congenital Contractures of the Limbs and Face with Hypotonia and Developmental Delay) due to defects in the NALCN gene (Bend 2016). These authors demonstrated that a gain-of-function pathogenicity assignment could be made for a patient V637F sequence variation. Similar results have been observed for variant installs in other C. elegans disease gene orthologs (Sorkaç 2016, Bulger 2017, Prior 2017, Canning 2018, Pierce 2018, Troulinaki 2018). These results reinforce the C. elegans animal model as a good platform for modeling gene function of pathological variants. The fast life-cycle and ease of genome engineering allow direct modeling of pathogenic alleles within the one-month timeline needed to create and analyze the transgenic animals.

1. Eilbeck K et al. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet. 2017 Oct;18(10):599-612. doi: 10.1038/nrg.2017.52. https://www.ncbi.nlm.nih.gov/pubmed/28804138

2. Hol et al. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy. Epilepsia. 2017 Jul;58(7):1190-1198. doi: 10.1111/epi.13798. Epub 2017 May 18. https://www.ncbi.nlm.nih.gov/pubmed/28518218

3. Rodríguez-Escudero I et al. Yeast-based methods to assess PTEN phosphoinositide phosphatase activity in vivo. Methods. 2015 May;77-78:172-9. doi: 10.1016/j.ymeth.2014.10.020. Epub 2014 Oct 25. https://www.ncbi.nlm.nih.gov/pubmed/25448481

4. Woods NT et al. Functional assays provide a robust tool for the clinical annotation of genetic variants of uncertain significance. NPJ Genom Med. 2016;1. pii: 16001. doi: 10.1038/npjgenmed.2016.1. Epub 2016 Mar 2. https://www.ncbi.nlm.nih.gov/pubmed/28781887

5. Liu YL et al. The impacts of nineteen mutations on the enzymatic activity of USP26. Gene. 2018 Jan 30;641:292-296. doi: 10.1016/j.gene.2017.10.074. Epub 2017 Oct 27. https://www.ncbi.nlm.nih.gov/pubmed/29111204

6. McGonigle P. Animal Models of CNS Disorders. Biochemical Pharmacology 87, no. 1 (January 1, 2014): 140–49. doi:10.1016/j.bcp.2013.06.016. https://www.ncbi.nlm.nih.gov/pubmed/23811310

7. Csöbönyeiová M et al. Recent Advances in iPSC Technologies Involving Cardiovascular and Neurodegenerative Disease Modeling. General Physiology and Biophysics 35, no. 1 (January 2016): 1–12. doi:10.4149/gpb_2015023. https://www.ncbi.nlm.nih.gov/pubmed/26492069

8. Breslin S and O’Driscoll L. Three-Dimensional Cell Culture: The Missing Link in Drug Discovery. Drug Discovery Today 18, no. 5–6 (March 2013): 240–49. doi:10.1016/j.drudis.2012.10.003. https://www.ncbi.nlm.nih.gov/pubmed/23073387

9. Wang J et al. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome. Am J Hum Genet. 2017 Jun 1;100(6):843-853. doi: 10.1016/j.ajhg.2017.04.010. Epub 2017 May 11. https://www.ncbi.nlm.nih.gov/pubmed/28502612

10. Luo X et al. Clinically severe CACNA1A alleles affect synaptic function and neurodegeneration differentially. PLoS Genet. 2017 Jul 24;13(7):e1006905. doi: 10.1371/journal.pgen.1006905. eCollection 2017 Jul. https://www.ncbi.nlm.nih.gov/pubmed/28742085

11. Bend EG et al. NALCN channelopathies: Distinguishing gain-of-function and loss-of-function mutations. Neurology. 2016 Sep 13;87(11):1131-9. doi: 10.1212/WNL.0000000000003095. Epub 2016 Aug 24. https://www.ncbi.nlm.nih.gov/pubmed/27558372

12. Sorkaç A et al. In Vivo Modelling of ATP1A3 G316S-Induced Ataxia in C. elegans Using CRISPR/Cas9-Mediated Homologous Recombination Reveals Dominant Loss of Function Defects. PLoS One. 2016 Dec 9;11(12):e0167963. doi: 10.1371/journal.pone.0167963. https://www.ncbi.nlm.nih.gov/pubmed/27936181

13. Bulger DA et al. Caenorhabditis elegans DAF-2 as a Model for Human Insulin Receptoropathies. G3 (Bethesda). 2017 Jan 5;7(1):257-268. doi: 10.1534/g3.116.037184. https://www.ncbi.nlm.nih.gov/pubmed/27856697

14. Prior H et al. Highly Efficient, Rapid and Co-CRISPR-Independent Genome Editing in Caenorhabditis elegans. G3 (Bethesda). 2017 Nov 6;7(11):3693-3698. doi: 10.1534/g3.117.300216. https://www.ncbi.nlm.nih.gov/pubmed/28893845

15. Canning P et al. CDKL Family Kinases Have Evolved Distinct Structural Features and Ciliary Function. Cell Rep. 2018 Jan 23;22(4):885-894. doi: 10.1016/j.celrep.2017.12.083. https://www.ncbi.nlm.nih.gov/pubmed/29420175

16. Pierce SB et al. De novo mutation in RING1 with epigenetic effects on neurodevelopment. Proc Natl Acad Sci U S A. 2018 Feb 13;115(7):1558-1563. doi: 10.1073/pnas.1721290115. https://www.ncbi.nlm.nih.gov/pubmed/29386386

17. Troulinaki K et al. WAH-1/AIF regulates mitochondrial oxidative phosphorylation in the nematode Caenorhabditis elegans. Cell Death Discov. 2018 Jan 29;4:2. doi: 10.1038/s41420-017-0005-6. https://www.ncbi.nlm.nih.gov/pubmed/29531799

Explosive Growth of Gene Variant Numbers

Genetic Testing

Clinical geneticists have an acute need to understand pathogenicity in the genomes of their patients (Figure 1). The cost of sequencing a human genome has now approached $1000 (Wetterstrand 2018). This affordable cost is allowing clinicians to start incorporating next-generation sequencing (NGS) technology into patient diagnosis.

Variant Diversity

The American College of Medical Genetics and Genomics and the Association for Molecular Pathology recommend that variants be classified into five groups (Pathogenic, Likely Pathogenic, Uncertain Significance, Likely Benign, Benign) (Richards 2015). Further, they suggest greater efforts are needed to resolve Variants of Uncertain Significance (VUS) into either Pathogenic or Benign status. The re-classification of a VUS is no small task. In a recent publication on the analysis of the clinical variant database (ClinVar), the number of known genetic variants in human disease was reported on 9/9/2016 to be 72,472 (Manolio 2017). About one year later (9/21/2017), a survey of ClinVar using an online database viewer reported 359,938 known variants (Henrie 2018). This roughly 5-fold increase in the number of known clinical variants in just over one year reflects the explosive application of whole genome sequencing in clinical diagnostics (Stavropoulos 2015, Ellingford 2016, Volk 2017).
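The fold change can be checked directly from the two ClinVar counts quoted above. The short Python sketch below is illustrative only: the variable names are made up for this example, and the counts and dates are simply the ones reported in the text.

```python
from datetime import date

# Counts and snapshot dates as reported in the text (Manolio 2017; Henrie 2018).
variants_2016 = 72_472
variants_2017 = 359_938
start, end = date(2016, 9, 9), date(2017, 9, 21)

# Ratio of the later count to the earlier one.
fold_increase = variants_2017 / variants_2016

# Time between the two snapshots, in days.
days_elapsed = (end - start).days

print(f"Fold increase: {fold_increase:.2f}")
print(f"Days between snapshots: {days_elapsed}")
```

The ratio comes out just under 5, over a window slightly longer than one calendar year.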

Need for Innovation

The daunting task now is to determine the significance of each new variant. Bioinformatics can provide some insight into variant pathogenicity (Oliver 2015). Unfortunately, the proportion of alleles classified as VUS has remained close to 40% year-on-year since 2015. As of July 2018, there are 192,843 identified VUS alleles. With such high numbers, there is a pressing need to quickly correlate genotype to phenotype and determine whether VUS alleles are benign or pathogenic (Cox 2015). Model systems that reconstitute mutations in a physiological context are a robust method to demonstrate variant pathogenicity (Eilbeck 2017). Traditionally, mouse models have been used to characterize defective function in VUS alleles, but the expanding universe of clinical variants is overwhelming the current capacity. Higher-throughput animal modeling is needed to address the growing demand.

1. Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP) Accessed July 12, 2018. https://www.genome.gov/sequencingcostsdata

2. Richards S et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015 May;17(5):405-24. doi: 10.1038/gim.2015.30. Epub 2015 Mar 5. https://www.ncbi.nlm.nih.gov/pubmed/25741868

3. Manolio TA et al. Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research. Cell. 2017 Mar 23;169(1):6-12. doi: 10.1016/j.cell.2017.03.005. https://www.ncbi.nlm.nih.gov/pubmed/28340351

4. Henrie A et al. ClinVar Miner: Demonstrating utility of a Web-based tool for viewing and filtering ClinVar data. Hum Mutat. 2018 May 23. doi: 10.1002/humu.23555. https://www.ncbi.nlm.nih.gov/pubmed/29790234

5. Stavropoulos DJ et al. Whole Genome Sequencing Expands Diagnostic Utility and Improves Clinical Management in Pediatric Medicine. NPJ Genom Med. 2016 Jan 13;1. pii: 15012. doi: 10.1038/npjgenmed.2015.12. https://www.ncbi.nlm.nih.gov/pubmed/28567303

6. Ellingford JM et al. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease. Ophthalmology. 2016 May;123(5):1143-50. doi: 10.1016/j.ophtha.2016.01.009. https://www.ncbi.nlm.nih.gov/pubmed/26872967

7. Volk AE and Kubisch C. The rapid evolution of molecular genetic diagnostics in neuromuscular diseases. Curr Opin Neurol. 2017 Oct;30(5):523-528. doi: 10.1097/WCO.0000000000000478. https://www.ncbi.nlm.nih.gov/pubmed/28665809

8. Oliver GR et al. Bioinformatics for clinical next generation sequencing. Clin Chem. 2015 Jan;61(1):124-35. doi: 10.1373/clinchem.2014.224360. https://www.ncbi.nlm.nih.gov/pubmed/25451870

9. Cox TC. Utility and limitations of animal models for the functional validation of human sequence variants. Mol Genet Genomic Med. 2015 Sep;3(5):375-82. doi: 10.1002/mgg3.167. https://www.ncbi.nlm.nih.gov/pubmed/26436102

10. Eilbeck K et al. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet. 2017 Oct;18(10):599-612. doi: 10.1038/nrg.2017.52. https://www.ncbi.nlm.nih.gov/pubmed/28804138

Beginning of the Journey to the End

This blog serves several purposes. It allows me to create content exploring genome biology in more depth than can be achieved in a mere 140 characters. But more ambitiously, I will be embarking on a journey of self-discovery, of the molecular and structural kind. I plan to acquire my own genome's data at high read depth. I expect to catalog over 1,000,000 variations from my genome when compared to the norm. Of these variations, I expect about 1,000 will be in the coding sequence of my genes. About 7% of these will be obviously deleterious to gene function. Yet, because we are diploid organisms with two copies of every gene, many of these "bad" genes will be backed up by a good copy. As a result, I will know my carrier status for bad alleles. Intriguingly, I will also uncover which important genes (or hopefully not so important genes) have taken two hits that leave them functionally defective.
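To put those expectations in concrete terms, here is a back-of-the-envelope sketch in Python. The inputs are the rough estimates quoted above, not measured values, and the variable names are invented for this illustration.

```python
# Rough estimates quoted in the post; none of these are measured values.
total_variants = 1_000_000     # variations expected versus the norm
coding_variants = 1_000        # of those, variations in coding sequence
deleterious_fraction = 0.07    # share of coding variants that are obviously deleterious

# Expected number of obviously deleterious coding variants.
expected_deleterious = round(coding_variants * deleterious_fraction)

# Fraction of all expected variations that land in coding sequence.
coding_share = coding_variants / total_variants

print(f"Expected deleterious coding variants: {expected_deleterious}")
print(f"Coding share of all variants: {coding_share:.1%}")
```

Under these assumptions, only a few dozen variants would be candidates for the "bad allele" list, and most of them would be expected to sit opposite a working copy.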