Application of Hypersensitivity Assays to Discovery of Therapeutics

Chemical hypersensitivity is a common method to probe gene dysfunction and can be deployed in drug screens to find new therapeutics. In this blog post, we focus on models of Inborn Errors of Metabolism (IEM) and describe how these genetic conditions can lead to hypersensitivity to a metabolite. By using humanization techniques, we take advantage of the ancient biology shared between humans and other organisms to create stand-ins – patient avatars – for drug screening studies. These genetically engineered model systems enable fast and affordable phenotypic screens in whole-organism format, allowing researchers to find molecules that alleviate the metabolic stress arising from an IEM deficiency. Ultimately, this report showcases how chemical hypersensitivity is a generalizable drug discovery tool applicable to many genetic disorders.

Clinical variants that disrupt a metabolic gene can lead to a build-up of toxic metabolites (Figure 1). This phenomenon can be used to create a functional assay in which the model system is hypersensitive to metabolites upstream of the gene’s function in a metabolic pathway.

Figure 1. Hypothetical metabolic pathway. When the ENZ 2 gene is defective, metabolite 2 can build up to toxic levels that lead to paralysis and death.

Chemical Hypersensitivity Due to Metabolic Block

In the above example, a defect in the second enzyme of the metabolic pathway, “ENZ 2,” is the cause of a genetic disease in which the enzymatic conversion of metabolite 2 into metabolite 3 is inhibited. For patients with this condition, exposure to metabolite 1 leads to a toxic build-up of metabolite 2 and production of reactive oxygen species, ultimately leading to paralysis and death. Because of this metabolic pathway blockade, patients can experience hypersensitivity upon exposure to metabolite 1.

Model Systems in the Fluidics Paradigm

A variety of model systems can be used to test for metabolite hypersensitivity. Starting with the simple model organisms, a human gene associated with disease can be installed as a gene replacement. In the case of metabolic disorders, the high degree of sequence conservation in these ancient genes often enables the human gene to rescue the function of the removed ortholog (the animal’s version of the disease gene). When we add in iPSC and then install clinical variants in the human gene locus, we get three types of model systems for modeling clinical variants: C. elegans nematode, zebrafish and iPSC (Figure 2). These models are advantageous for drug discovery because they fit the fluidics paradigm: the zebrafish and nematode can live their entire lifecycle in liquid, and differentiated iPSC can be hooked up in biocircuits with microfluidics. Because the models live in fluid, oral bioavailability can be assessed simply by adding drugs to the liquid growth media.

The first step is to create the wt-Avatars. In the nematode, the evolutionary distance often means that a gene-to-gene comparison shows low homology (sequence identity under 75%). As a result, only a portion of the pathogenic alleles can be modeled as amino acid substitutions in the nematode’s native locus. As a workaround, we use Whole Gene Humanization: CRISPR gene editing is used to remove the coding sequence at the native locus and replace it with the human gene coding sequence. When the human sequence restores normal function, we know we are looking at a high degree of conservation of biology.

In zebrafish, the orthologous gene often has a sequence identity of 75% or more with the human sequence. This often makes the zebrafish’s gene sufficient for modeling the patient condition as a single amino acid substitution, although occasionally whole-gene humanization will need to be deployed. For iPSC, we source a line from a healthy donor (reference line, “wt-Avatar”) and make the single amino acid change that models the patient variant condition (var-Avatar). The bottom line is that modeling a patient condition requires the creation of a wild-type humanized animal or the use of an unmodified line (wt-Avatar), followed by the insertion of the missense variant that models the genetic variant of the patient (var-Avatar). For IEM deficiencies, this system enables detection of phenotypic abnormalities that are often accentuated by metabolite exposure.

Figure 2. The three types of model systems are the zebrafish and C. elegans nematode animal models (most relevant when used in whole-gene humanized formats) and iPSCs. Humanizing mutations are made to recreate the same variation seen in the patient (var-Avatar), which is then compared to the wild-type control (wt-Avatar).

Detection of Hypersensitivity Phenotypes

To provide a phenotypic screen that is linked to a mechanism of action, the var-Avatar lines can be examined for their hypersensitivity to specific environmental stressors. Applied to a metabolic gene, the environmental stressor is exposure to an upstream metabolite. Using the ENZ 2 example of Figure 1, increasing doses of metabolite 1 can be monitored for their effect on paralysis in the var-Avatar model system. To measure activity in the nematode, the 96-well format of the wMicrotracker apparatus is used to track loss of locomotion. As the animal becomes more paralyzed, the rate of light beam disruption decreases. By exposing a nematode var-Avatar to different concentrations of metabolite, an LD50 curve can be generated (Figure 3).

Figure 3. wMicrotracker is used to generate LD50 curves for var-Avatar and wt-Avatar upon exposure to different concentrations of metabolite. As metabolite concentration increases, survival of animals decreases. The var-Avatar has an earlier drop in survival when compared to wt-Avatar. An intermediate concentration of metabolite (40 mM) can be used to discriminate hypersensitivity in the var-Avatar animals.

In the var-Avatar, the loss of enzyme function caused by ENZ 2 pathogenic variants creates a pathway blockage and allows build-up of metabolite 2. When metabolite 2 reaches toxic levels, the cell becomes oxidatively stressed, which leads to apoptosis and eventually to paralysis and death of the animal. In this example, 40 mM of metabolite 1 is enough to cause paralysis and lethality in the ENZ 2-deficient var-Avatar animals, but it is not enough to cause paralysis in wt-Avatar control animals. As a result, a drug screen can be developed that uses this 40 mM concentration as a cut-off for finding drugs that can alleviate the metabolic block and enable survival of the ENZ 2 var-Avatar (Figure 4).
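One simple way to choose such a cut-off concentration is to look for the dose at which wt-Avatar survival exceeds var-Avatar survival by the widest margin. The Python sketch below illustrates the idea only. The function name `best_discriminating_dose` is hypothetical, and the survival fractions are invented for illustration, not measured data.

```python
def best_discriminating_dose(doses_mM, wt_survival, var_survival):
    """Return (dose, gap) for the dose where wt survival most exceeds var survival."""
    if not (len(doses_mM) == len(wt_survival) == len(var_survival)):
        raise ValueError("all inputs must have the same length")
    # Survival gap at each tested dose (positive when wt survives better).
    gaps = [wt - var for wt, var in zip(wt_survival, var_survival)]
    best = max(range(len(doses_mM)), key=lambda i: gaps[i])
    return doses_mM[best], gaps[best]


# Invented survival fractions at each metabolite 1 concentration (assumption).
doses = [0, 10, 20, 40, 80, 160]
wt = [1.00, 1.00, 0.98, 0.95, 0.50, 0.05]
var = [1.00, 0.90, 0.60, 0.10, 0.02, 0.00]

dose, gap = best_discriminating_dose(doses, wt, var)
print(f"Best separation at {dose} mM (survival gap {gap:.2f})")
```

With these made-up numbers the widest gap falls at 40 mM, mirroring the hypothetical example in the text. In practice, replicate variability would also need to be taken into account before fixing a screening concentration.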

Figure 4. Rescue-of-hypersensitivity screen applied to a drug repurposing library. A “hit” is obtained when the drug rescues survival of the var-Avatar in 40 mM metabolite but does not influence survival of the wt-Avatar at its measured LD50 metabolite concentration.

Drug Screen

A drug screen can be performed on repurposing libraries to find hits for use in clinical trials. In the above example, 40 mM of metabolite 1 is used to detect hits, defined as var-Avatar animals that survive when exposed to a drug from the repurposing library. Additionally, the hit is considered valid only when the drug has minimal to no effect on the wt-Avatar control. For metabolic disorder screening, the first step (Phase 1) is a Stress Test to determine if upstream metabolites can lead to hypersensitivity. In the next step (Phase 2), prior to the library screening, testing for Metabolic Modulation is performed by observing if changes in hypersensitivity can be achieved with downstream metabolites, cofactors, oxidants and antioxidants (Figure 5). In the final step (Phase 3), a Library Screen is performed to find modulatory effects from FDA-approved drugs.

Figure 5. The wt-Avatar and var-Avatar animals are compared for their response to stress test compounds (upstream metabolites) to achieve optimal separation of wt-Avatar vs var-Avatar (dotted lines). Then a set of metabolic modulators (downstream metabolites, oxidants and antioxidants) is tested for the capacity to restore normal sensitivity in var-Avatar while causing only minor change in wt-Avatar sensitivity (dashed lines).

Phase 1 – Stress Testing to Detect Compound Hypersensitivity

In the first phase of the three-phase process, the model system is assessed for sensitivity to metabolites and ROS mediators. In this pilot screen, a small selection of chemicals is used to determine if metabolite sensitivity can be detected. First, the model system is exposed to upstream metabolites. Exposure will often lead to a build-up to toxic levels, which are reached sooner in the var-Avatar line than in the wt-Avatar line.

Scope Steps: Hypersensitivity Screen (upstream metabolites)

  1. Obtain wt-Avatar and var-Avatar: Generate humanized animals for modeling a metabolic gene deficiency – created as a prerequisite project.
  2. Obtain Stress-Test Compounds: Select upstream metabolites (1 to 10 molecules)
  3. Solubilize Stress-Test Compound: Resuspend the metabolites in 100% DMSO at 100 mM concentration to mimic the sample conditions of most drug screening libraries.
  4. Measure Activity: Determine suppression as retention of locomotion using wMicrotracker plate reader instrumentation.
  5. Generate LD50 curves: Expose animals at concentrations 0.01, 0.1, 1, 10, 100, and 1000 µM (1% DMSO)
  6. Select Optimal Separation: Examine LD50 curves for conditions creating a high degree of separation between wt-Avatar and var-Avatar.
  7. Repeat With Combinations (optional): If necessary, repeat with combinations of chemicals to attempt to create at least 4x separation in chemical sensitivity between wt-Avatar and var-Avatar.
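As an illustration of steps 5 to 7, the sketch below estimates an LD50 from a six-point dose series by linear interpolation of survival against log10(dose), then checks the wt-Avatar to var-Avatar ratio against the 4x target. This is a minimal sketch under stated assumptions: the interpolation method is a swapped-in simplification (a fitted sigmoidal model would be more typical for real data), the function name `ld50_log_interp` is hypothetical, and the survival fractions are invented.

```python
import math


def ld50_log_interp(doses, survival):
    """Estimate LD50 by interpolating survival against log10(dose).

    doses must be positive and ascending; survival is the fraction alive (0 to 1).
    Returns None if survival never crosses 0.5 within the tested range.
    """
    points = list(zip(doses, survival))
    for (d_lo, s_lo), (d_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= 0.5 >= s_hi and s_lo != s_hi:
            # Fraction of the way from d_lo to d_hi (on a log scale) where survival = 0.5.
            frac = (s_lo - 0.5) / (s_lo - s_hi)
            log_ld50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ld50
    return None


doses_uM = [0.01, 0.1, 1, 10, 100, 1000]
# Invented survival fractions (assumption, not measured data).
wt_survival = [1.0, 1.0, 1.0, 1.0, 0.9, 0.1]
var_survival = [1.0, 1.0, 0.9, 0.1, 0.0, 0.0]

wt_ld50 = ld50_log_interp(doses_uM, wt_survival)
var_ld50 = ld50_log_interp(doses_uM, var_survival)
fold_separation = wt_ld50 / var_ld50
meets_target = fold_separation >= 4  # the 4x separation goal from step 7

print(f"wt LD50 ~ {wt_ld50:.1f} uM, var LD50 ~ {var_ld50:.2f} uM")
print(f"separation ~ {fold_separation:.0f}x, meets 4x target: {meets_target}")
```

Because the dose series is spaced by decades, interpolating on the log scale avoids overweighting the upper end of each interval.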

Phase 2 – Metabolic Modulation Effects

Once metabolic hypersensitivity is established, a second set of molecules is examined for the ability to alleviate the metabolic defect and restore normal activity. Downstream metabolites and cofactors can be added to the system to determine if they can restore normal metabolite sensitivity. Additionally, in many metabolic disorders the build-up of metabolite intermediates leads to disruption of normal levels of Reactive Oxygen Species (ROS) (doi: 10.1155/2018/1246069). As a result, two other groups of molecules can be screened for their effect on metabolite hypersensitivity. Various antioxidants (resveratrol, glutathione, metformin) can be tested for their ability to decrease sensitivity to metabolite build-up. On the other hand, some metabolic intermediates may attenuate ROS, which at low levels are necessary for homeostasis signaling. In that case, various oxidants (paraquat, juglone and AAPH (2,2′-azobis-2-methyl-propanimidamide)) may restore normal low levels of ROS and alleviate metabolic hypersensitivity.

Scope Steps: Metabolic Modulation (downstream metabolites, cofactors, oxidants, antioxidants)

  1. Obtain Metabolic-Modulator Compounds: Select downstream metabolites, cofactors, oxidants and antioxidants (5 to 10 chemicals)
  2. Solubilize Metabolic-Modulator Compound: Resuspend the compounds in 100% DMSO at 100 mM concentration to mimic the sample conditions of most drug screening libraries.
  3. Measure Activity: Determine suppression as retention of locomotion using wMicrotracker plate reader instrumentation.
  4. Generate EC50 curves: Using an upstream metabolite at a concentration that leads to a strong deficiency (i.e., death), add metabolic-modulator at concentrations 0.01, 0.1, 1, 10, 100, and 1000 µM (1% DMSO)
  5. Identify Modulator Candidates: Examine the EC50 curves for the modulator’s effect on wt-Avatar and var-Avatar.
  6. Repeat With Combinations (optional): If necessary, repeat with combinations of chemicals to attempt to decrease the separation between wt-Avatar and var-Avatar.
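The dose series used in Phases 1 and 2 follows from simple dilution arithmetic: if compound is always delivered so that DMSO makes up 1% of the well volume, each DMSO working stock must be 100 times the desired final concentration. The sketch below works through that calculation. The function name `dmso_working_stocks` is hypothetical, and the sketch assumes every dose is delivered at the same 1% DMSO.

```python
def dmso_working_stocks(final_uM, dmso_percent=1):
    """Working-stock concentration (uM, in 100% DMSO) needed for each final dose.

    Delivering the stock at `dmso_percent` of the well volume dilutes it
    100 / dmso_percent fold, so stock = final * 100 / dmso_percent.
    """
    return [c * 100 / dmso_percent for c in final_uM]


final_doses_uM = [0.01, 0.1, 1, 10, 100, 1000]
stocks_uM = dmso_working_stocks(final_doses_uM)

for final, stock in zip(final_doses_uM, stocks_uM):
    print(f"final {final:>7} uM  <-  {stock / 1000:g} mM working stock in DMSO")
```

The highest dose (1000 µM) corresponds to the undiluted 100 mM stock, and each lower dose to a further 10-fold dilution of that stock in DMSO, which is why 1000 µM is the ceiling of the series at 1% DMSO.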

Phase 3 – Library Screening Approach

For speed to clinical trials, a good choice is to use repurposing libraries. These consist of FDA-approved drugs that are well vetted for ADMET issues, which makes them a safe choice for therapeutic consideration. We apply the assay developed in Phase 1 at scale across a commercially sourced drug screening library. A variety of sources for compound libraries of FDA-approved drugs are available (TargetMol, ApexBio, Chembridge, Microsource, Prestwick, Selleckchem, etc., typically about 2000-3000 molecules (mlcs)). As an example, the metabolite hypersensitivity assay is applied to the var-Avatar line (patient variant animal) at 10 µM concentrations of compounds from the ApexBio FDA-approved library of 2726 compounds. The top hits (up to 200) are counter-screened on wt-Avatar animals and the results are scored for minimal alteration of wild-type metabolite sensitivity. A rank is developed that balances a strong response in the var-Avatar line against a strong response in the wt-Avatar line. The top 20 hits are examined at a range of doses to find their EC50 values in suppressing metabolite hypersensitivity. A data package is then prepared and delivered to the client.

Scope Steps: Examine Rescue of metabolite Sensitivity in var-Avatar line with 2000+ Chemical Library

  1. Obtain library: Multiple repurposing libraries are available – for example, ApexBio library of 2726 FDA-approved chemicals.
  2. Drug Exposure: Using Stress-Test chemical concentrations optimized in Phase 1, test the var-Avatar line against the library of 2726 molecules at 10 µM concentration for their ability to suppress metabolite hypersensitivity.
  3. Measure Activity: Determine suppression as retention of locomotion using wMicrotracker plate reader instrumentation.
  4. Perform Primary Screen: Score compounds as positive or negative for ability to suppress metabolite hypersensitivity by retaining locomotion.
  5. Identify Preliminary Hits: Select up to 200 compounds (≤200 mlcs) from the positive category for use in repeat screen on wt-Avatar animals.
  6. Perform Counter Screen: Using metabolite concentration optimized in Phase 1, test the ≤200 mlcs at 10 µM concentration for their ability to NOT suppress metabolite sensitivity in wt-Avatar animals.
  7. Rank Hits: Select up to 20 compounds (≤20 mlcs) that are positive in the var-Avatar for suppression of metabolite hypersensitivity and negative for suppression of metabolite sensitivity in the wt-Avatar.
  8. Characterize Hits: Perform EC50 assays on var-Avatar line for the 20 compounds.
  9. Report Out: Send a report to the client.
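The triage logic in steps 4 to 7 can be sketched as a two-stage ranking: keep the strongest rescuers from the var-Avatar primary screen, then re-rank them by penalizing any shift seen in the wt-Avatar counter screen. The sketch below is illustrative only. The continuous scores, the "rescue minus wt shift" ranking rule, the compound names and the function name `triage_hits` are all assumptions, since the text specifies only that the rank balances the two responses.

```python
def triage_hits(var_rescue, wt_shift, n_primary=200, n_final=20):
    """Rank compounds through a primary screen and a wt counter screen.

    var_rescue: compound -> rescue score in the var-Avatar (higher = more locomotion retained)
    wt_shift:   compound -> change in wt-Avatar metabolite sensitivity (lower = better);
                only needed for compounds that pass the primary screen
    """
    # Primary screen: keep the strongest rescuers in the var-Avatar line.
    primary = sorted(var_rescue, key=var_rescue.get, reverse=True)[:n_primary]
    # Counter screen: favor strong var rescue with little effect on wt animals.
    ranked = sorted(primary, key=lambda c: var_rescue[c] - wt_shift[c], reverse=True)
    return ranked[:n_final]


# Invented scores for four hypothetical compounds.
var_rescue = {"cmpd_A": 0.90, "cmpd_B": 0.80, "cmpd_C": 0.20, "cmpd_D": 0.85}
wt_shift = {"cmpd_A": 0.60, "cmpd_B": 0.10, "cmpd_D": 0.05}

top_hits = triage_hits(var_rescue, wt_shift, n_primary=3, n_final=2)
print(top_hits)
```

In this toy example cmpd_A rescues the var-Avatar most strongly but also shifts wt-Avatar sensitivity, so it drops below cmpd_D and cmpd_B in the final rank.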

RWE – Real World Evidence Applications

A real-world example occurs in the propionate metabolic pathway. Defects in the ECHS1 gene alter propionate metabolism, shunting propionate away from Acetyl-CoA production so that only Succinyl-CoA production remains (Figure 6).

Figure 6. Defects in ECHS1 block propionate metabolism via the acetyl-CoA pathway. Instead, only the Succinyl-CoA pathway remains.

Succinyl-CoA production can be hampered by the vitamin B12 dependency of the last enzyme in the Succinyl-CoA production pathway. When B12 is limiting and Acetyl-CoA production is blocked by an ECHS1 defect, hypersensitivity to propionate occurs. Since the propionate pathway is highly conserved between the nematode and humans, creating a deficiency in the nematode equivalent of ECHS1 (the ech-6 gene) can create a propionate hypersensitivity (Figure 7).

Figure 7. Differential effects of propionate exposure in C. elegans nematode models. Upon propionate exposure, the wild-type (N2) control shows a sensitivity with an LD50 near 90 mM. When RNAi-mediated knockdown of the ech-6 gene is performed, a propionate hypersensitivity ensues and the LD50 drops to near 10 mM propionate. The result is a nearly 10x (about 9-fold) difference in propionate sensitivity when the ech-6 locus is blocked from expressing the nematode’s version of ECHS1. Addition of vitamin B12 partially rescues the propionate sensitivity. (data adapted from Watson et al. Elife. 2016 Jul 6;5:e17670)

Loss of function of ech-6 creates propionate hypersensitivity

RNAi was used to knock down expression from the ech-6 gene (Watson et al. Elife. 2016 Jul 6;5:e17670). The result was hypersensitivity to propionate: a roughly 9-fold decrease in LD50 (from near 90 mM to near 10 mM) upon exposure to the metabolite. When B12 was added to the system, partial rescue of hypersensitivity occurred. In this dataset, where the screen was performed at 30 mM propionate, survival was 4-fold higher with B12 exposure. 40-50 mM propionate appears to be the ideal range for detecting chemicals capable of rescuing propionate hypersensitivity in ech-6 null animals (KO). Inserting the human ECHS1 as a gene replacement of ech-6 and observing restoration of normal propionate sensitivity will establish that the animal model (wt-Avatar) can be used as a background system for modeling human disease. Next, installing variants in the humanized strain will create a model system for specifically exploring a patient’s genetic condition (var-Avatar) (Figure 8). When good separation between var-Avatar and wt-Avatar is observed, a screen for rescue of propionate hypersensitivity can be performed, with B12 used as a positive control.

Figure 8. Propionate sensitivity for four types of animal models. The KO null does not express ech-6 gene and is expected to be highly sensitive to propionate.  The var-Avatar is a whole-gene humanized wt-Avatar containing a patient coding-sequence variant.  The wt-Avatar is a humanized animal created as a whole-gene replacement of the ech-6 coding sequence. The wild-type N2 is the unmodified animal model commonly used as a control animal for comparison to gene-modified animals.

B3GAT3 Drug Target for Modulating Propionate Hypersensitivity

Reassuringly, other genes involved in propionate metabolism are known to cause hypersensitivity when they are made defective. For instance, the product of the PCCA gene, in combination with that of PCCB, forms a heteromeric enzyme responsible for converting propionyl-CoA to D-methylmalonyl-CoA, which is an intermediate step in the metabolism of propionate to succinyl-CoA. When loss of function is created in the nematode ortholog pcca-1, propionate hypersensitivity occurs and these animals do not survive above 50 mM propionate (Watson et al. Elife. 2016 Jul 6;5:e17670). This same group used the propionate sensitivity assay to determine that loss-of-function mutations in glct-3 (an ortholog of B3GAT3) create a propionate-resistant animal (Na et al., PLoS Genet. 2020 Aug 28;16(8):e1008984). As a result, it appears that inhibition of B3GAT3 may be useful in offsetting the hypersensitivity caused by defects within the propionate pathway. In silico approaches can be used to dock molecules to the B3GAT3 structure (1FGG, 1KWS, 3CU0) (Figure 9). Hits found can then be validated by testing in a humanized animal model.

Figure 9. Molecular dynamics screening of large libraries is used to find top hits for validation in an animal model.

Zebrafish Model for Study of RPE65 Defects

Switching to zebrafish models, vitamin A metabolite toxicity can be used to probe defects in the RPE65 gene. First, morpholinos or crispants can be used to dramatically reduce expression of this gene in the fish and produce loss-of-function phenotypes. Next, the addition of an mRNA to the injection mix can be used to rescue the loss of function and restore normal metabolic activity. With regard to the loss-of-function step, morpholinos in zebrafish work by shutting down translation of a gene transcript. In the alternative approach, crispants work by disrupting the gene’s coding sequence in a large proportion of cells (>90%). Often the effect with either approach is a 10x or greater decrease in gene expression, which effectively models a loss-of-function variant. It is expected that loss of function of the zebrafish isoforms of RPE65 will result in metabolite hypersensitivity.

In humans, RPE65 encodes retinoid isomerohydrolase, an enzyme of the visual cycle involved in catalytic recycling of retinyl palmitate to 11-cis-retinol. Loss of this enzyme is expected to render RPE65-deficient animals hypersensitive to retinyl palmitate and its precursors. All-trans-retinal (atRAL) is a precursor with established toxicity in the visual system (Chen et al. J Biol Chem. 2012 Feb 10; 287(7): 5059–5069; Gao et al., J Biol Chem. 2018 Sep 14; 293(37): 14507–14519). The RPE65 enzyme is critical in the last step of the visual cycle, creating the 11-cis-retinol needed by opsins to convert light into a biochemical signal. Deficiencies in this enzyme are likely to render the animals prone to light-induced blindness. In zebrafish, there are three genes (rpe65a, rpe65b, and rpe65c) that cover the role played by RPE65 in humans (Ward et al., Front Cell Dev Biol. 2018; 6: 37). A crispant (CRISPR-based somatic knockout) strategy can be employed in which three sgRNAs targeting the three isoforms are used to remove retinoid isomerohydrolase from nearly all tissues (~95%) in a zebrafish embryo. Exposing the resulting larvae to bright light and atRAL is likely to create a hypersensitivity that manifests as a high rate of blindness.

To rescue the blindness, an mRNA encoding human retinoid isomerohydrolase (hRPE65) is added to the crispant mix. This allows the visual cycle to remain active and toxic levels of atRAL are avoided. To determine if a variant in hRPE65 is pathogenic, any of the 177 missense variants identified in the clinical population database of ClinVar can be made as an mRNA and then be examined for their pathogenic potential (lack of rescue). The mRNA effectively becomes a var-Avatar model for testing a variant’s hypersensitivity to atRAL.  For the variants that exhibit accelerated blindness, small molecules can be explored for their ability to restore normal sensitivity to atRAL exposure.

An iPSC Model for Study of NOTCH3 Deficiencies

As an evolving model system, induced pluripotent stem cells (iPSCs) can be differentiated into tissues that are used to populate microphysiological systems (MPS). Some MPS use iPSC-derived tissue that models the vascular system. An important class of disease to model in vascular formats is cerebrovascular disorders. CADASIL is one of the vascular diseases with an established genetic cause: variations in the extracellular domain of the NOTCH3 gene result in small blood vessel defects in the brain. Although NOTCH3 variations causative for CADASIL are rare, the disease is considered an ideal model for cerebrovascular disease, which affects 25% of all stroke patients and 45% of all dementias. Further, we know there is a therapeutic antibody binding site on the extracellular side of the NOTCH3 protein that modulates the disease by promoting extracellular cleavage and turnover of the NOTCH3 protein. Loss or gain of cysteines is the most common type of pathogenic variant in NOTCH3. A reference iPSC line can be modified by CRISPR to contain the patient variant (var-Avatar) and be compared to the unmodified cells (wt-Avatar). In CADASIL patients, oxidative stress occurs (Neves et al., JCI Insight. 2019 Dec 5; 4(23): e131344). Specifically, CADASIL patients exhibit Nox5-induced oxidative stress as measured by a lucigenin-enhanced chemiluminescence assay of NADPH-dependent ROS production. So, although NOTCH3 is not considered a metabolic gene, disruptions in its function behave like many metabolic deficiencies and lead to ROS imbalance. For a hypersensitivity screen, it is likely that iPSC-derived MPS models of CADASIL patients will exhibit hypersensitivity to oxidative stressors such as paraquat.

Summary

Stressor hypersensitivity can be used to detect favorable drug effects. Applied to Inborn Errors of Metabolism, an upstream metabolite can serve as the stressor to which the model is hypersensitive. Patients with defects in an enzyme of a metabolic pathway experience unhealthy build-up of metabolite intermediates. For the genes involved in metabolism, we can take advantage of the ancient biology shared between humans and other organisms by creating the patient’s genetic defect in the model system, either nematode or zebrafish (or iPSC). This humanized animal (or modified iPSC) then becomes the patient avatar for use in drug screening studies. For metabolic deficiencies, the animal can be used for phenotypic screens to find molecules that alleviate the metabolic stress arising from the deficiency. Ultimately, the approach is likely to be generalizable to many genetic disorders.

The Path to Affordable Therapeutics in Rare Disease – Tackling Congenital Disorder of Glycosylation in the PMM2 gene.

Authors: Hannah Huston, Alexandra Narin, and Chris Hopkins

By definition, there are not a lot of patients for any given Rare Disease. In the United States, a disease is only considered rare (or “orphan”) if it affects fewer than 200,000 people (NIH). Often, a rare genetic condition caused by a malfunction of a gene is even more infrequent. Maybe only a small handful of people on the planet will have genetic lesions in a given gene. Despite this, rare disease as a group is actually quite common. There are over 7,000 genes in everyone’s genome that can harbor a gene variation which leads to a disease. One way to conceptualize the variants in a rare disease is to imagine an inverted image of a galaxy, where black dot clusters of rare variants have different levels of phenotypic severity (Figure 1). The variant clusters in the outer fringes are near normal activity, but in the center is the black null of lethality.

CREDIT: Adapted from the European Space Agency

Although a pair of individuals with similar variations in a given gene is not often found, the likelihood that any two individuals will have a defective variation in any one of the 7,000 genes is much higher. The result is that about 1 in 15 persons are afflicted with a rare disease condition. So, in aggregate, it turns out that rare diseases are a highly common health care issue that suppresses quality of life for a large proportion of the planet’s population.

The challenge of today’s personalized medicine is to develop therapeutic approaches that can treat these rare disease populations in a cost-effective way. In the traditional therapeutic development path, bringing a chemical entity through the challenges of toxicity and efficacy and onto the market has cost billions of dollars. This sizable sum of money can be challenging to justify when only a dozen or so individuals stand to benefit from such a high cost. As a society, we must get inventive and find more affordable approaches to bring rare disease therapeutics to market. Answering this call to action are small biotech companies, such as Perlara.

Perlara, located in the San Francisco Bay Area, was founded in February 2014 by Ethan Perlstein. The company is “on a mission to accelerate the discovery of cures for rare genetic diseases and uncover underlying mechanisms that enable the development of treatments that work across a range of diseases and individuals” (Perlara Website). Through their PerlQuests™, Perlara partners with families, researchers, and patients to find treatments for these otherwise forgotten, yet very real, rare diseases. One such PerlQuest, focusing on PMM2 (Phosphomannomutase 2 deficiency), holds a special place in the heart of Dr. Sangeetha Iyer, the Director of Preclinical Development at Perlara.

Dr. Iyer (currently a senior PM at Pfizer) has a background in neurodegenerative disorders and rare genetic diseases. She received her Ph.D. in molecular pharmacology from the University of Pittsburgh and then went on to do her postdoctoral research at the University of Texas, Austin. Sangeetha has worked with a plethora of models, starting her career with mouse models, and working with Xenopus oocytes, Drosophila, and C. elegans. Not only does she have a wide understanding of model organisms, but Dr. Iyer also has over 10 years of experience in model and assay development and drug screening for human genetic disorders.

Dr. Iyer developed nematode models of rare diseases and conducted several successful screening campaigns for rare diseases, one of which was for the aforementioned PMM2 (Phosphomannomutase 2 deficiency). These efforts led to a clinical trial with one patient, Maggie (n=1). The trial outcome was a success! Maggie is doing quite well and the trial is currently being expanded to include other patients. Recently, these findings were published in a paper titled: Repurposing the aldose reductase inhibitor and diabetic neuropathy drug epalrestat for the congenital disorder of glycosylation PMM2-CDG.

What is PMM2?

PMM2-CDG, formerly known as congenital disorder of glycosylation type 1a, is a rare multisystem disorder that involves a normal, but complex, chemical process known as glycosylation (PMM2-CDG 2015).

PMM2-CDG is caused by mutations of the phosphomannomutase-2 (PMM2) gene and is inherited as an autosomal recessive condition. The variation reduces the function of the PMM2 enzyme and leads to improper levels of glycosylation. The disease can affect any part of the body, though most cases have an important neurological component. PMM2-CDG is associated with a broad and highly variable range of symptoms and can vary in severity from mild cases to severe, disabling or life-threatening cases. Most cases appear in infancy or early childhood, as in Maggie’s case, and this patient population became the focus of Perlara’s PerlQuest.

Maggie’s Quest

The PMM2 PerlQuest came about because, prior to meeting Maggie, Dr. Iyer had started working on lysosomal storage disorders. The research team had just completed some work on a glycosylation disorder involving NGLY-1, whose loss of function leads to developmental delay and seizures. In presenting this work, they were introduced to Maggie’s parents, as PMM2 deficiency is one of the most common causes of glycosylation disorders. Sangeetha and the team at Perlara felt this was a good candidate for using humanizing mutations to create an animal model of the disease in a simple model organism.

“Some model systems work, but not a whole lot and we felt that it fit very well with our platform of model organisms. Perlara was working with yeast model systems, drosophila as well as C. elegans along with the basic model organism pipeline and on the other hand, we had patient fibroblast, which were also available for PMM2. So PMM2 basically came onto our radar because of Maggie, the girl who has PMM2 disorder. And after meeting with her parents and having some conversations about the utility of our platform, we decided to go ahead and model some of our mutations and see if we could conduct a drug screening campaign. That’s how that program came to be.” (Iyer, Podcast: 17 minutes of Science, 2020)

To pursue PMM2 treatment options, Sangeetha and her team at Perlara decided to go down the path of drug repurposing. This involves testing already known drugs and compounds in alternative ways to see if they are viable treatment options for other diseases outside of their initial purpose. One of the main benefits of drug repurposing is that it speeds up the traditional drug discovery timeline as the compounds are already known, and often have vast amounts of information already available for researchers to use. As a result, drug development can be achieved while cutting costs immensely: instead of the typical hundreds of millions it takes to reach clinical trials, drug repurposing trials can be conducted with millions or even sub-millions.

The process:

To begin their repurposing campaign, researchers at Perlara initially started testing with yeast models of PMM2. They then wanted to move into C. elegans, but hit a roadblock. C. elegans already had one PMM2 model, but it was lethal, which meant it would not be a viable tool for their campaign. At this point, Perlara approached InVivo Biosystems for help building a new C. elegans model to use. “With InVivo Biosystems’ help, we were able to model another, a different patient mutation, one that does not have this severe lack of enzyme activity” (Iyer, Podcast: 17 minutes of Science, 2020). The new worm model that InVivo Biosystems built for Perlara is based on the worm ortholog of PMM2, whose protein is 54% identical to the human protein. This may not sound like a lot, but it is very significant. Additionally, and possibly most importantly, the mutation sites were conserved between humans and C. elegans. Because of this conservation, Perlara was able to use C. elegans models engineered with a specific point mutation that modeled the exact mutation seen in the patient.

“One of the reasons we believed in the power of model organisms specifically for rare monogenic diseases was because when you have a single gene ortholog and one that has high similarity to what one might encounter in humans, you can model the same mutation as you see in humans, in those model organisms.” (Iyer, Podcast: 17 minutes of Science, 2020). 

Dr. Iyer performed a drug screen using a 2560-compound Microsource Spectrum library consisting of FDA-approved drugs, bioactive tool compounds and natural products. The top 20 hits were found to be either antidiabetic or antioxidant molecules. Remarkably, most of the hits were antioxidant flavonoids with known utility as aldose reductase inhibitors (ARIs). Next, the team checked the activity of the leads in yeast and patient-derived fibroblasts to see whether cross-species conservation of activity could be observed. The result was identification of α-cyano-4-hydroxycinnamic acid (CHCA) as the compound with the most cross-species ARI bioactivity. The structure-activity relationship of the CHCA molecule was used to develop a pharmacophore profile, which allowed identification of a set of commercially available ARIs (tolrestat, ranirestat, imirestat, zopolrestat, sorbinil, ponalrestat, alrestatin, fidarestat and epalrestat). Testing these new molecules in worms and patient fibroblasts revealed epalrestat as the most active lead.
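As a quick back-of-the-envelope check on the scale of this screening funnel, the short Python sketch below computes the hit rate from the figures quoted above. The function name `hit_rate` and the variable names are illustrative choices, not part of any published pipeline.

```python
# Figures quoted in the text above.
LIBRARY_SIZE = 2560  # compounds in the Microsource Spectrum library
TOP_HITS = 20        # top hits reported from the screen

# Commercially available ARIs named in the text as follow-up candidates.
FOLLOW_UP_ARIS = [
    "tolrestat", "ranirestat", "imirestat", "zopolrestat", "sorbinil",
    "ponalrestat", "alrestatin", "fidarestat", "epalrestat",
]


def hit_rate(hits: int, library_size: int) -> float:
    """Fraction of the screened library that was carried forward as hits."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return hits / library_size


print(f"Top-hit rate: {hit_rate(TOP_HITS, LIBRARY_SIZE):.2%}")
print(f"Follow-up ARIs tested: {len(FOLLOW_UP_ARIS)}")
```

In other words, fewer than 1 in 100 library compounds were carried forward, and a single lead (epalrestat) emerged from the nine follow-up ARIs.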

Aldose reductase is an enzyme that shunts glucose down the polyol pathway by converting glucose into sorbitol. Inhibiting this enzyme has two favorable effects. First, it leads to an increase in glucose-1,6-bisphosphate production, which activates PMM2. Second, it prevents activation of the polyol pathway, which generates Reactive Oxygen Species (ROS) and leads to Advanced Glycation End-products (“AGE”). The AGE are especially nasty because they create abnormal protein glycosylation and cause a normal protein to be recognized by the immune system as a “foreign” protein. A cascade of auto-inflammatory responses is then initiated.

It is likely that PMM2 deficiency sets in motion quite a few cellular stress responses. Not only does it reduce normal levels of N-linked glycosylation, it also increases the shunting of glucose through the polyol pathway. This leads to high levels of sorbitol, which readily alkylates the amines in the body’s proteins, making the proteins appear “foreign” to the immune system so that inflammation results. This ripple effect of cellular stress leads to the clinical presentations of the disease.

“Until the time that we did this work, nobody had discovered that you could boost PMM2 enzyme activity through some other artificial shunt pathway, which is essentially what our model organism screens were telling us, that there was another way to increase PMM2 activity.” (Iyer, Podcast: 17 minutes of Science, 2020).

Although the team had discovered another way to increase PMM2 activity in both their yeast and worm models, they were not certain how this would translate to the human enzyme. At this point, Dr. Iyer and her team at Perlara brought in human fibroblasts known to have defective enzyme activity. The team used both generic fibroblasts from Coriell, a non-profit repository for patient fibroblasts, and fibroblasts from Maggie. In both cases, the fibroblasts showed that PMM2 enzyme activity was in fact increased when exposed to epalrestat. According to Dr. Iyer, “that’s where the final piece of the puzzle came together.”

Maggie’s Cure

One of the benefits of drug repurposing over developing new compounds is that these drugs are already on the market and have had a wealth of safety data collected. While not always the case, this typically makes it easier to get approval to use the drugs. Although epalrestat had never been approved for use in the US, it had been on the market for over 20 years in other countries and was no longer under patent protection.

At this point, Perlara started talking with Maggie’s family and Dr. Eva Morava from the Mayo Clinic about treatment opportunities for Maggie using epalrestat. Dr. Morava, along with Maggie’s parents and Ethan Perlstein, put together an n=1 IND application, which they submitted to the FDA in order to gain approval for the use of epalrestat. Thanks to the trove of safety data available for epalrestat and the body of data generated by Perlara substantiating that epalrestat increased PMM2 enzyme activity in a variety of model systems, they were able to gain approval for the trial.

Maggie has now been on the drug for over a year (the interview was conducted when Maggie had been on the drug for about 10 months). She has gained weight and can have conversations, something she was unable to do pre-treatment. Her motor skills and coordination have skyrocketed; she can even ride a bike now. Maggie is continuing to take epalrestat, and her team is now working to expand the trial to a larger group in the hopes of helping others. Dr. Iyer credits the success of this drug repurposing study to the model organisms, which generated the data the team needed quickly, efficiently, and affordably in order to gain FDA approval.

“To see a drug have an impact, a positive impact, on the child, in a child was just incredibly powerful” (Iyer, Podcast: 17 minutes of Science, 2020)

References

  1. Sangeetha Interview: https://invivobiosystems.com/17-minutes-of-science/from-bench-to-bedside-using-model-organisms-to-find-rare-disease-treatments/?highlight=sangeetha
  2. PMM2-CDG. (2015, August 06). Retrieved April 28, 2021, from https://rarediseases.org/rare-diseases/pmm2-cdg/#:~:text=Summary,chemical%20process%20known%20as%20glycosylation.
  3. Iyer, S., Sam, F. S., DiPrimio, N., Preston, G., Verheijen, J., Murthy, K., . . . Perlstein, E. (2019, November 11). Repurposing the aldose reductase inhibitor and diabetic neuropathy drug epalrestat for the congenital disorder of glycosylation PMM2-CDG. Retrieved from: https://journals.biologists.com/dmm/article/12/11/dmm040584/223279/Repurposing-the-aldose-reductase-inhibitor-and
  4. Aldose reductase inhibitors for the treatment of Diabetic polyneuropathy. (n.d.). Retrieved April 28, 2021, from https://www.cochrane.org/CD004572/NEUROMUSC_aldose-reductase-inhibitors-for-the-treatment-of-diabetic-polyneuropathy#:~:text=Aldose%20reductase%20inhibitors%20are%20a,or%20reverse%20progression%20of%20neuropathy

The Good, Bad and the Ugly of Reactive Oxygen Species

Like Clint Eastwood, I have always enjoyed the liberation of doing things a bit differently from the standard way, but when it comes to the Free Radicals (peroxides, superoxide, hydroxyl radical, singlet oxygen and alpha-oxygen), one needs to question the intent and effect.

Intriguingly, a little bit of free radical action can extend lifespan (Wang and Hekimi 2015), as demonstrated by C. elegans work where, at under 0.1 mM paraquat, low levels of superoxide anion lead to longer lifespan (Yang and Hekimi 2010; Yee, Yang, and Hekimi 2014). In fact, these longevity findings led this group to speculate:

“Superoxide generation acts as a signal in young mutant animals to trigger changes of gene expression that prevent or attenuate the effects of subsequent aging.”

Yet it is clear that in C. elegans, when the paraquat dose reaches 0.4 mM, the animals can no longer reach adulthood (Senchuk, Dues, and Van Raamsdonk 2017). Above this concentration (at either 2-4 mM chronic exposure for days, or > 20 mM acute exposure for a few hours), the effects are lethal. So, perhaps a bit like exercise, a little can do some good, but way too much will bring you down. There appears to be an optimal level of ROS in the cell (“the Good”), where ROS signalling supports growth, survival and apoptosis; when ROS is pushed above this low level, toxicity to the animal starts to prevail. This helps explain the strange “inverted U effect” seen in C. elegans, where antioxidants at low doses may DECREASE lifespan while oxidants at low doses can INCREASE lifespan. Yet at higher doses the effect is reversed: oxidants compromise lifespan while antioxidants tend to extend it (Desjardins et al. 2017).
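The dose bands described above can be summarized as a small lookup. The Python sketch below is only a reading aid built from the concentrations quoted in this post; the function name `paraquat_regime` and the exact boundary handling are assumptions, and ranges the text does not describe are reported as unspecified rather than guessed.

```python
def paraquat_regime(conc_mm: float, exposure: str = "chronic") -> str:
    """Map a paraquat concentration (mM) to the outcome described in the text.

    exposure: "chronic" (days) or "acute" (a few hours).
    Boundary handling is an assumption; the text gives only approximate bands.
    """
    if conc_mm < 0:
        raise ValueError("concentration cannot be negative")
    if exposure == "acute":
        # The text only describes acute exposure above 20 mM.
        return "lethal" if conc_mm > 20 else "not specified in the text"
    if exposure != "chronic":
        raise ValueError("exposure must be 'chronic' or 'acute'")
    if conc_mm == 0:
        return "untreated baseline"
    if conc_mm < 0.1:
        return "lifespan extension reported"
    if conc_mm < 0.4:
        return "not specified in the text"
    if conc_mm < 2:
        return "animals fail to reach adulthood"
    return "lethal"


for dose in (0.05, 0.4, 3.0):
    print(dose, "mM chronic:", paraquat_regime(dose))
print(25, "mM acute:", paraquat_regime(25, "acute"))
```

The gap between 0.1 mM and 0.4 mM is left open on purpose, since the post does not say where benefit turns into harm.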

In a recent comprehensive review (Shields, Traa, and Van Raamsdonk 2021), this paradox is acknowledged but the general trend is:

“Interventions that increase ROS tend to decrease lifespan, while interventions that decrease ROS tend to increase lifespan.”

Further, this trend provides support for:

“The free radical theory of aging (FRTA), proposes that oxidative damage caused by reactive oxygen species (ROS) is the primary cause of aging.”

Gene Disruption can Manifest as Shorter Lifespan

Progeria (Hutchinson-Gilford Syndrome) is an extremely rare, progressive genetic disorder that causes children to age rapidly, starting in their first two years of life. The LMNA gene is the sole cause of this syndrome, although there are at least a dozen other gene dysfunctions that lead to progeroid syndromes of shortened life (POLR3A, BANF1, WRN, PYCR1, PDGFRB, RECQL2, B4GALT7, SLC25A24, BSCL2, GORAB, ERCC8, ERCC4) or a prematurely aged appearance (LTBP4, FBLN5, ATP6V0A2, ATP6V1A, EFEMP2, ALDH18A1, ELN). Many of these genes are lethal when disrupted as a homozygous knockout (KO), but some genes are not essential, and their removal from the genome leads to premature aging in certain organisms. One such gene is DYRK1A. Although the homozygous knockout in mice is lethal (Fotaki et al. 2002), removing this gene in C. elegans yields an animal that survives but has a shortened lifespan.

Specifically, in the loss-of-function KO of the DYRK1A ortholog (mbk-1), the lifespan was found to be about 30% shorter than wild type. When a human coding sequence for DYRK1A was inserted as gene replacement of mbk-1 (hDYRK1A), an intermediate level of restored lifespan was achieved. When a clinical variant was installed into the humanized locus (R467Q), a partial loss-of-function defect was observed (shorter lifespan). One of our clinical partners at the Mayo Clinic, Dr Tom Caulfield, was able to use the assessment to indicate the clinical variant has abnormal DYRK1A function and is potentially pathogenic.

Does exposure to antioxidant conditions lead to longer lifespan?

In laboratory studies on genetically-modified model organisms, two generalities prevail. The overexpression of antioxidant enzymes tends to extend lifespan (yeast, flies and worms), while their elimination tends to shorten lifespan (yeast, worms, flies and mice) (Shields, Traa, and Van Raamsdonk 2021). For “chemically-modified” model organisms, these same authors note that a variety of antioxidants (N-Acetyl Cysteine, Vitamin C, and Vitamin E) are known to extend lifespan in some organisms (worms, flies and mice), but it remains controversial whether there is a benefit in humans (Bjelakovic et al. 2012; Bjelakovic, Nikolova, and Gluud 2013). Even more contradictory, exposure to low levels of ROS-generating chemicals (2-deoxy-d-glucose, juglone, paraquat, plumbagin, menadione, rotenone, arsenite, metformin and d-glucosamine) leads to increased lifespan (yeast, worms, and mice) (Shields, Traa, and Van Raamsdonk 2021). So in animals “chemically modified” by supplementation, both oxidants and antioxidants can help increase lifespan. Understanding the key effects on healthspan and lifespan will require attention to mixture and dose. As a result, to get answers for any given compound or mixture, a bit of trial-and-error testing is what is needed (and perhaps a “fist full of dollars”).

Bjelakovic, Goran, Dimitrinka Nikolova, and Christian Gluud. 2013. “Antioxidant Supplements to Prevent Mortality.” JAMA: The Journal of the American Medical Association.

Bjelakovic, Goran, Dimitrinka Nikolova, Lise Lotte Gluud, Rosa G. Simonetti, and Christian Gluud. 2012. “Antioxidant Supplements for Prevention of Mortality in Healthy Participants and Patients with Various Diseases.” Cochrane Database of Systematic Reviews , no. 3 (March): CD007176.

Desjardins, David, Briseida Cacho-Valadez, Ju-Ling Liu, Ying Wang, Callista Yee, Kristine Bernard, Arman Khaki, Lionel Breton, and Siegfried Hekimi. 2017. “Antioxidants Reveal an Inverted U-Shaped Dose-Response Relationship between Reactive Oxygen Species Levels and the Rate of Aging in Caenorhabditis Elegans.” Aging Cell 16 (1): 104–12.

Fotaki, Vassiliki, Mara Dierssen, Soledad Alcántara, Salvador Martínez, Eulàlia Martí, Caty Casas, Joana Visa, Eduardo Soriano, Xavier Estivill, and Maria L. Arbonés. 2002. “Dyrk1A Haploinsufficiency Affects Viability and Causes Developmental Delay and Abnormal Brain Morphology in Mice.” Molecular and Cellular Biology 22 (18): 6636–47.

Senchuk, Megan M., Dylan J. Dues, and Jeremy M. Van Raamsdonk. 2017. “Measuring Oxidative Stress in Caenorhabditis Elegans: Paraquat and Juglone Sensitivity Assays.” Bio-Protocol 7 (1). https://doi.org/10.21769/BioProtoc.2086.

Shields, Hazel J., Annika Traa, and Jeremy M. Van Raamsdonk. 2021. “Beneficial and Detrimental Effects of Reactive Oxygen Species on Lifespan: A Comprehensive Review of Comparative and Experimental Studies.” Frontiers in Cell and Developmental Biology 9 (February): 628157.

Wang, Ying, and Siegfried Hekimi. 2015. “Mitochondrial Dysfunction and Longevity in Animals: Untangling the Knot.” Science 350 (6265): 1204–7.

Yang, Wen, and Siegfried Hekimi. 2010. “A Mitochondrial Superoxide Signal Triggers Increased Longevity in Caenorhabditis Elegans.” PLoS Biology 8 (12): e1000556.

Yee, Callista, Wen Yang, and Siegfried Hekimi. 2014. “The Intrinsic Apoptosis Pathway Mediates the pro-Longevity Response to Mitochondrial ROS in C. Elegans.” Cell 157 (4): 897–909.

Hey People … it’s SSSH! time – Getting Easier Integration via Transgenesis Innovation

Contrary to what the librarian says to you when you are part of a loud conversation, SSSH! time here refers to the Self-Selecting Safe Harbor (SSSH) tool invented by Zach Stevenson and crew in the Patrick Phillips Lab at the U of Oregon.

There is a back story here. We are proud of our people at InVivo Biosystems (IVB).  Some, like me, have been hanging around with IVB for quite a long time. Others, like Zach, come and go, but still leave their mark.

Zachary joined us when we were pre-merger Knudra Transgenics. He was fairly new to genome engineering, but Zach was a quick study. He became a master of CRISPR-based transgenesis, which he leveraged in his next career move: getting into graduate school at the University of Oregon. Zach and the team at Knudra had tasked themselves with finding better tools for detecting genome integration. We needed efficient systems that identify only the animals that have experienced genomic integration. Better yet, the tool would be most effective if only the desired genome-integrated strains could survive exposure to a toxic compound. During Zach’s time at Knudra, the idea floated around a bit, but it never got the legs of an experimental implementation to demonstrate its feasibility.

Once in graduate school, Zach teamed up with Megan J. Moerdyk-Schauwecker and Brennen Jamison in the Phillips Lab to get the real-world evidence that the idea can work. Their team chose the hygR gene to determine whether a split-hygR gene could be harnessed as a tool to identify integration in a safe harbor locus (Stevenson et al. G3. 2020).

The principle is simple: chop the hygR gene into two parts. Put the long part into the genome of C. elegans and put the other part in your transgene plasmid. Zach did this at the MosSCI ttTi5605 safe harbor locus. This transgenic target strain contains most of the hygR gene but is missing a critical segment needed to create a functional hygromycin B phosphotransferase. Next, their transgene of interest was made in a plasmid that also contains the missing hygR part. The trick now is to have the same sgRNA site in the plasmid and in the edited safe harbor site. When the plasmid is injected with CRISPR reagents, its interaction with the genome provides a region of overlap of about 700 bp on each end of the insertion cargo, which allows homology repair to do its magic. When designed right, you only need one sgRNA to initiate the DNA cuts that trigger efficient homologous recombination repair. This technique works great in C. elegans transgenesis. Add hygromycin B to the growth plates and only the genomic-integrated animals can survive. Whether it can work in embryo injections with other organisms remains to be determined.
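The selection logic of the split marker can be sketched as a toy model in Python. This is not a simulation of the actual biology: the part labels, the `Worm` class and the `inject` helper are all hypothetical, and the sketch assumes (as the design above implies) that neither hygR fragment confers resistance on its own.

```python
from dataclasses import dataclass

# Hypothetical labels for the two halves of the split marker.
GENOMIC_PART = "hygR_long_fragment"     # pre-installed at the safe harbor locus
PLASMID_PART = "hygR_missing_segment"   # carried on the transgene plasmid


@dataclass(frozen=True)
class Worm:
    name: str
    integrated_parts: frozenset


def inject(target: Worm, integrated: bool) -> Worm:
    """Return the post-injection animal.

    The plasmid fragment joins the genomic fragment only if homology repair
    actually integrated the cargo at the safe harbor site.
    """
    parts = set(target.integrated_parts)
    if integrated:
        parts.add(PLASMID_PART)
    suffix = "+cargo" if integrated else ""
    return Worm(target.name + suffix, frozenset(parts))


def survives_hygromycin(worm: Worm) -> bool:
    """Resistance requires both fragments reconstituted in the genome."""
    return {GENOMIC_PART, PLASMID_PART} <= worm.integrated_parts


target = Worm("target_strain", frozenset({GENOMIC_PART}))
progeny = [inject(target, integrated=flag) for flag in (True, False, False)]
survivors = [w.name for w in progeny if survives_hygromycin(w)]
print(survivors)
```

The point of the toy model is that survival on hygromycin B and integration of the cargo are the same event, so the drug does the screening for you.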

At IVB we are building on this by using our fast and easy CRISPR-sdm technique to place the small split-hygR fragment at any locus of the genome. This will allow us to drive large constructs of 5 to 10 kb (and perhaps even 20-100 kb) into any native locus.

Bottom line: Getting some SSSH! time with this split-hygR technique can calm the frustration of the aggravated C. elegans researcher.

Zebrafish Modeling of SARS-CoV-2 susceptibility in Rare Disease

Five months into the COVID-19 pandemic, the world is at 29,119,433 confirmed cases of SARS-CoV-2 infection, including 925,965 deaths (6 August 2020 https://covid19.who.int/), which is under 0.4% of the planet’s human population. Of all the persons confirmed to be infected with the virus, 3.2% have died. If the COVID-19 pandemic follows the trajectory of the 1918 influenza, one third of the world population will become infected [1], and nearly 300 million deaths will occur in the next few years. To put it in perspective, that is more than twice the number of military and civilian casualties of World Wars I and II combined. The medical community is challenged in optimizing its response by the highly diverse severity of infection in the human population. Some individuals are asymptomatic but test positive for SARS-CoV-2 infection, while other exposed individuals experience severe, sometimes fatal COVID-19 infection [2–4]. This heterogeneity in disease severity is in part due to underlying health conditions or comorbidities such as hypertension and diabetes, which we believe share a common pathophysiology involving the renin-angiotensin system (RAS) [5]. An estimated 1.7 billion persons (22%) have an underlying condition that increases their risk of severe COVID-19 [6]. As a result, there is an urgent need to understand the common molecular and cellular pathophysiologic basis shared by patients with diverse comorbidities and to identify those with a rare disease who are at high risk for clinical deterioration. A rare human disease may therefore be the perfect physiologic model to better understand COVID-19 and to generate more individualized therapeutic responses and positive outcomes for higher-risk COVID-19 groups.
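The proportions implied by these counts are easy to check. The Python sketch below uses the case and death counts quoted above; the world population of 7.8 billion is an assumption added here (it is not stated in the post), as is the use of exactly 300 million for the projected deaths.

```python
# Counts quoted in the text above.
CONFIRMED_CASES = 29_119_433
DEATHS = 925_965

# Assumption: approximate 2020 world population, not given in the post.
WORLD_POPULATION = 7.8e9

# Share of confirmed cases that were fatal, and share of people confirmed infected.
case_fatality = DEATHS / CONFIRMED_CASES
share_infected = CONFIRMED_CASES / WORLD_POPULATION

# Scenario from the text: one third of the world infected, ~300 million deaths.
projected_infected = WORLD_POPULATION / 3
PROJECTED_DEATHS = 300e6  # "nearly 300 million", taken as an upper figure
implied_fatality = PROJECTED_DEATHS / projected_infected

print(f"Deaths among confirmed cases: {case_fatality:.1%}")
print(f"Confirmed cases as share of population: {share_infected:.2%}")
print(f"Fatality implied by the 1918-style scenario: {implied_fatality:.1%}")
```

Note that the 1918-style projection implies a considerably higher fatality among the infected than the ratio observed in the confirmed-case counts, so it is best read as a worst-case scenario under these assumptions.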

Some Rare Disease groups may be hyper susceptible to COVID-19 infection

There is emerging evidence that rare disease (RD) patients have a higher COVID-19 infection risk than the general population. For example, patients with deficiencies in cellular chloride transport due to CFTR variants associated with Cystic Fibrosis (CF) are more prone to viral and bacterial infection [7]. Yet, because COVID-19 is a newly emergent disease, data correlating SARS-CoV-2 infection with outcomes in CF patients are extremely limited [8,9], and concern remains high for these patient groups [10]. Another RD group that is likely to be negatively affected by SARS-CoV-2 infection is patients with CADASIL. CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy) is caused by genetic lesions in the extracellular domain of NOTCH3 [11]. Like CF, CADASIL appears to be linked to accelerated disease progression upon influenza virus infection [12]. Yet the association of SARS-CoV-2 infection in carriers of NOTCH3 pathogenic variants with advancing CADASIL presentation is only speculative [13]. Clear evidence is needed before a pathological association between COVID-19 infection and CADASIL can be supported or refuted. With this knowledge, appropriate therapeutic medical responses can be identified.

The Renin-Angiotensin System (RAS) is at the center for controlling severity of COVID-19 infection.

SARS-CoV-2 binds ACE2 to gain entry into host tissues [14]. The resulting internalization and attenuation of ACE2 enzymatic activity is potentially a significant contributor to COVID-19 disease severity. ACE2 is a critical negative regulator of angiotensin activity. Overactive RAS signaling is implicated in higher risk for cardiovascular [15], renal [16,17], and neurodegenerative [18] disease and diabetes [19]. The Angiotensin I (Ang I) signaling peptide is produced from the angiotensinogen precursor by the proteolytic activity of renin. In the canonical RAS signaling pathway, Ang I is converted into Ang II by the ACE dipeptidyl carboxypeptidase. The Ang II peptide can bind two GPCRs, AT1R and AT2R, with opposing effects. Ang II, a high-affinity agonist of the AT1R receptor, promotes inflammation, apoptosis and vasoconstriction, which often results in hypertension in the patient. The agonist activity of Ang II at AT2R receptors has the opposite effect, being anti-inflammatory, anti-apoptotic and vasodilating, activities that lead to lower blood pressure. The expression levels of these receptors, as well as the presence of Ang II metabolites, have a profound influence on hypertension in tissues. Ang II is converted into Ang III by an aminopeptidase that removes the N-terminal aspartate, yielding a molecule with increased affinity for the AT2R receptor that thus promotes anti-hypertensive activity [20]. Additional control of the hypertensive state is achieved via the ACE2 carboxypeptidase, which converts Ang I into Ang 1-9 and Ang II into Ang 1-7. Ang 1-9 acts as an agonist of the AT2R receptor. Similar to the anti-hypertensive activity of Ang 1-9, the Ang 1-7 peptide also promotes hypotension, but through agonist activity on a different GPCR, the MasR receptor [21].
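The peptide processing steps described above can be followed as simple string operations. The Python sketch below is illustrative only: the one-letter sequence of human angiotensin I is supplied here (it is not given in this post), the function names are invented for the example, and each enzyme is reduced to the single trimming step relevant to this pathway.

```python
# Human angiotensin I (Ang 1-10), one-letter amino acid code.
# Supplied here for illustration; not stated in the post itself.
ANG_I = "DRVYIHPFHL"


def ace(peptide: str) -> str:
    """ACE removes the C-terminal dipeptide (Ang I -> Ang II)."""
    return peptide[:-2]


def ace2(peptide: str) -> str:
    """ACE2 removes a single C-terminal residue (Ang I -> Ang 1-9, Ang II -> Ang 1-7)."""
    return peptide[:-1]


def aminopeptidase(peptide: str) -> str:
    """Removes the N-terminal residue (Ang II -> Ang III)."""
    return peptide[1:]


ang_ii = ace(ANG_I)
ang_iii = aminopeptidase(ang_ii)
ang_1_9 = ace2(ANG_I)
ang_1_7 = ace2(ang_ii)

# Receptors named in the text for each peptide.
RECEPTORS = {
    "Ang II": ["AT1R", "AT2R"],
    "Ang III": ["AT2R"],
    "Ang 1-9": ["AT2R"],
    "Ang 1-7": ["MasR"],
}

for name, seq in [("Ang II", ang_ii), ("Ang III", ang_iii),
                  ("Ang 1-9", ang_1_9), ("Ang 1-7", ang_1_7)]:
    print(f"{name:8s} {seq:10s} -> {', '.join(RECEPTORS[name])}")
```

Laid out this way, it is easy to see that only Ang II engages the pro-hypertensive AT1R, while every other product of the cascade signals through AT2R or MasR.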

Small Vessel Disease (SVD) is a major health issue.

SVD, a disease of the “perforating arteries, arterioles, capillaries, and venules”, is currently associated with 20% of strokes and 40% of dementia cases [22]. ACE polymorphisms have been associated with stroke-associated white matter hyperintensities [23] and ischemic stroke [24]. CADASIL is one of the most common single-gene disorders causing cerebral SVD [25]. Hypertension is a risk factor for SVD [26] and leads to vascular remodeling [27]. Since Ang II leads to vascular remodeling via VSMC dedifferentiation [28,29], it becomes plausible that CADASIL variants in NOTCH3 participate in the generalizable VSMC remodeling of SVD via altered RAS signaling activity, which may predispose these patients to hypersensitivity to COVID-19 infection. Yet direct evidence is lacking that CADASIL-associated NOTCH3 variants alter RAS signaling activity in a way that leads to higher viral infectivity and/or RAS stress response.

(adapted from U.S. National Library of Medicine)

The zebrafish animal model is well suited to modeling CVD and CADASIL.

The zebrafish is one of the fastest growing animal models. Walcot and Peterson have proposed that zebrafish are a good model for cardiovascular disease due to “its morphological and physiological similarity to human cerebral vasculature, its ability to be genetically manipulated, and its fecundity allowing for large-scale, phenotype-based screens” [30]. For instance, a Tg(flk1:GFP) reporter can be expressed in blood vessels and visualized by fluorescence microscopy. Although iPSCs allow for a more native context, the ability of zebrafish to reproducibly make microvascular structures makes them highly attractive for SVD modeling. Further, expertise and skill in gene editing are allowing the rapid creation of gene knock-outs and knock-ins throughout the zebrafish genome. We now have the ability to humanize either by installing patient gene variations in the zebrafish version of the gene or by swapping out the entire gene for the human gene coding sequence. The end result is a well-controlled system for examining the effects of clinical variants on gene function.

CRISPR Engineering can be used to install variants into zebrafish

The creation of precision gene edits in zebrafish allows accurate measurement of the functional consequence of a clinical variant. In CRISPR (clustered-regularly-interspaced-short-palindromic-repeats) editing, a guide RNA targets the Cas9 nuclease to a specific genomic locus, and this has become a common method for directing DNA cleavage near a clinical variant target site. The CRISPR method has become quite routine for genomic locus disruption through Non-Homologous-End-Joining (NHEJ) activity, which creates gene-disrupting indels at a cleaved locus [31,32]. More challenging is the use of Homology-Directed-Repair (HDR) to create precision genome edits at a target locus [33]. Often a donor-homology DNA is used to instruct the cell’s natural DNA repair mechanisms to insert a specific sequence of DNA at the cleaved locus. Some groups have developed ways to use donor homology sequences to guide precision insertion of content into the genome of iPSCs and create precision deletions [34] or precision insertion of reporter genes [35,36]. Yet interference from NHEJ-mediated indels in attempts at precision editing can pose a problem when the researcher wants to isolate a line with only precision edits. Often biallelic editing occurs in an HDR attempt and results in a complex heterozygote. One allele may be edited as desired, but often the other allele carries an undesirable indel. Methods that suppress indel formation by avoiding NHEJ activity can be useful for achieving a biallelic conversion in which both alleles carry the desired HDR-mediated edit.

A zebrafish model system for assessing COVID-19 viral uptake sensitivity is set up by first humanizing appropriate genes (NOTCH3) and then installing CADASIL-associated variants.

Humanization increases the data relevance of animal models. In this project idea, we can create a humanized zebrafish expressing the hNOTCH3 coding sequence inserted in the first exon of the zebrafish notch3 gene. In a two-step procedure, we first use HDR-directed CRISPR gene editing to insert a phiC31 integrase acceptor sequence (attB sequence: CGGTGCGGGTGCCAGGGCGTGCCCTTGGGCTCCCCGGGCGCGTACTCCAC) to disrupt the notch3 locus with early stop codon insertion. Next we rescue the null with phiC31-mediated insertion of an hNOTCH3 expression cassette, which is a proven method for inserting large cargo at site-specific loci in zebrafish. The cassette cargo contains a self-cleaving T2A peptide prior to the human hNOTCH3 coding sequence. This enables expression of the human transgene to be driven by the promoter elements of the zebrafish notch3 gene while avoiding chimeric protein formation with the vestigial notch3 coding fragment. To create a clinical variant model, the same plasmid used for hNOTCH3 insertion is modified to contain a clinical variant (C174Y). Heterozygous animals are expected to be generated (transgene/+). Both the attB-containing animal (“knock-out”) and the hNOTCH3(C174Y) animal are not expected to be isolated as homozygotes due to the essential nature of NOTCH3, while hNOTCH3(wt) is hoped to remain viable as a homozygote. Once these goals and expectations are met, the animals can be used to explore CADASIL-associated pathologies. For instance, normal and variant zebrafish can be exposed to viral particles expressing the S-peptide receptor binding domain to monitor for different rates of viral entry.
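When designing an insertion like this, it is worth sanity-checking the insert sequence before ordering a donor. The Python sketch below runs a few basic checks on the attB sequence quoted above: its length, whether that length would shift the reading frame, its GC content, and whether the sequence itself contains a stop codon in any forward frame. The helper names are illustrative, and the sketch does not model the rest of the donor design (homology arms, any added stop cassette), which this post does not specify.

```python
# attB sequence as quoted in the text above.
ATTB = "CGGTGCGGGTGCCAGGGCGTGCCCTTGGGCTCCCCGGGCGCGTACTCCAC"

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def stop_codons_by_frame(seq: str) -> dict:
    """Positions (0-based) of stop codons in each of the three forward frames."""
    hits = {}
    for frame in range(3):
        hits[frame] = [
            i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
    return hits


length = len(ATTB)
shifts_frame = length % 3 != 0  # an insert not divisible by 3 shifts the frame
forward_stops = stop_codons_by_frame(ATTB)

print(f"Length: {length} nt")
print(f"Shifts reading frame when inserted alone: {shifts_frame}")
print(f"GC content: {gc_content(ATTB):.0%}")
print(f"Forward-frame stop codons: {forward_stops}")
```

A check like this makes it clear whether a premature stop would have to come from a frameshift or from additional sequence in the donor rather than from the attB site itself.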

The uses of a humanized zebrafish model range from diagnostic to drug discovery applications.

Researchers at various institutions can use the animals to determine which human pathologies are conserved. For mechanism-directed drug development, researchers can use in silico methods to discover small molecules that can be screened for specifically restoring the cysteine balance and promoting normal gene function. For pathway-directed drug development, researchers can use the animals in RNA-seq experiments to discover biomarkers consistent with disease, and then use the genes with a high expression response to create fluorescent reporters of pathogenic activity. The end result of this project funding is a platform for rapid assessment and drug discovery in CADASIL-associated disease.

  1. 1918 Pandemic (H1N1 virus). 16 Jun 2020 [cited 28 Aug 2020]. Available: https://www.cdc.gov/flu/pandemic-resources/1918-pandemic-h1n1.html
  2. Paces J, Strizova Z, Smrz D, Cerny J. COVID-19 and the immune system. Physiol Res. 2020;69: 379–388.
  3. Gao Z, Xu Y, Sun C, Wang X, Guo Y, Qiu S, et al. A Systematic Review of Asymptomatic Infections with COVID-19. J Microbiol Immunol Infect. 2020. doi:10.1016/j.jmii.2020.05.001
  4. CDC COVID-19 Response Team. Preliminary Estimates of the Prevalence of Selected Underlying Health Conditions Among Patients with Coronavirus Disease 2019 – United States, February 12-March 28, 2020. MMWR Morb Mortal Wkly Rep. 2020;69: 382–386.
  5. Onweni CL, Zhang YS, Caulfield T, Hopkins CE, Fairweather DL, Freeman WD. ACEI/ARB therapy in COVID-19: the double-edged sword of ACE2 and SARS-CoV-2 viral docking. Crit Care. 2020;24: 475.
  6. Clark A, Jit M, Warren-Gash C, Guthrie B, Wang HHX, Mercer SW, et al. Global, regional, and national estimates of the population at increased risk of severe COVID-19 due to underlying health conditions in 2020: a modelling study. Lancet Glob Health. 2020;8: e1003–e1017.
  7. Bucher J, Boelle P-Y, Hubert D, Lebourgeois M, Stremler N, Durieu I, et al. Lessons from a French collaborative case-control study in cystic fibrosis patients during the 2009 A/H1N1 influenza pandemy. BMC Infect Dis. 2016;16: 55.
  8. Colombo C, Burgel P-R, Gartner S, van Koningsbruggen-Rietschel S, Naehrlich L, Sermet-Gaudelus I, et al. Impact of COVID-19 on people with cystic fibrosis. Lancet Respir Med. 2020;8: e35–e36.
  9. Cosgriff R, Ahern S, Bell SC, Brownlee K, Burgel P-R, Byrnes C, et al. A multinational report to characterise SARS-CoV-2 infection in people with cystic fibrosis. J Cyst Fibros. 2020;19: 355–358.
  10. Manti S, Parisi GF, Papale M, Mulè E, Aloisio D, Rotolo N, et al. Cystic Fibrosis: Fighting Together Against Coronavirus Infection. Front Med. 2020;7: 307.
  11. Wang MM. CADASIL. Handb Clin Neurol. 2018;148: 733–743.
  12. Mizutani K, Sakurai K, Mizuta I, Mizuno T, Yuasa H. Multiple Border-Zone Infarcts Triggered by Influenza A Virus Infection in a Patient With Cerebral Autosomal Dominant Arteriopathy Presenting With Subcortical Infarcts and Leukoencephalopathy. Journal of Stroke and Cerebrovascular Diseases. 2020. p. 104701. doi:10.1016/j.jstrokecerebrovasdis.2020.104701
  13. Williams OH, Mohideen S, Sen A, Martinovic O, Hart J, Brex PA, et al. Multiple internal border zone infarcts in a patient with COVID-19 and CADASIL. Journal of the Neurological Sciences. 2020. p. 116980. doi:10.1016/j.jns.2020.116980
  14. Zhang H, Penninger JM, Li Y, Zhong N, Slutsky AS. Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target. Intensive Care Med. 2020;46: 586–590.
  15. Paz Ocaranza M, Riquelme JA, García L, Jalil JE, Chiong M, Santos RAS, et al. Counter-regulatory renin-angiotensin system in cardiovascular disease. Nat Rev Cardiol. 2020;17: 116–129.
  16. Ichihara A, Kobori H, Nishiyama A, Navar LG. Renal renin-angiotensin system. Contrib Nephrol. 2004;143: 117–130.
  17. Nishiyama A, Kobori H. Independent regulation of renin-angiotensin-aldosterone system in the kidney. Clin Exp Nephrol. 2018;22: 1231–1239.
  18. Almeida-Santos AF, Kangussu LM, Campagnole-Santos MJ. The Renin-Angiotensin System and the Neurodegenerative Diseases: A Brief Review. Protein Pept Lett. 2017;24: 841–853.
  19. Henriksen EJ, Prasannarong M. The role of the renin-angiotensin system in the development of insulin resistance in skeletal muscle. Mol Cell Endocrinol. 2013;378: 15–22.
  20. Bouley R, Pérodin J, Plante H, Rihakova L, Bernier SG, Maletínská L, et al. N- and C-terminal structure-activity study of angiotensin II on the angiotensin AT2 receptor. Eur J Pharmacol. 1998;343: 323–331.
  21. Cha SA, Park BM, Kim SH. Angiotensin-(1-9) ameliorates pulmonary arterial hypertension angiotensin type II receptor. Korean J Physiol Pharmacol. 2018;22: 447–456.
  22. Coupland K, Lendahl U, Karlström H. Role of NOTCH3 Mutations in the Cerebral Small Vessel Disease Cerebral Autosomal Dominant Arteriopathy With Subcortical Infarcts and Leukoencephalopathy. Stroke. 2018;49: 2793–2800.
  23. Paternoster L, Chen W, Sudlow CLM. Genetic determinants of white matter hyperintensities on brain scans: a systematic assessment of 19 candidate gene polymorphisms in 46 studies in 19,000 subjects. Stroke. 2009;40: 2020–2026.
  24. Zhang Z, Xu G, Liu D, Fan X, Zhu W, Liu X. Angiotensin-converting enzyme insertion/deletion polymorphism contributes to ischemic stroke risk: a meta-analysis of 50 case-control studies. PLoS One. 2012;7: e46495.
  25. Choi JC. Genetics of cerebral small vessel disease. J Stroke Cerebrovasc Dis. 2015;17: 7–16.
  26. Cuadrado-Godia E, Dwivedi P, Sharma S, Ois Santiago A, Roquer Gonzalez J, Balcells M, et al. Cerebral Small Vessel Disease: A Review Focusing on Pathophysiology, Biomarkers, and Machine Learning Strategies. J Stroke Cerebrovasc Dis. 2018;20: 302–320.
  27. Renna NF, de Las Heras N, Miatello RM. Pathophysiology of vascular remodeling in hypertension. Int J Hypertens. 2013;2013: 808353.
  28. Xu H, Du S, Fang B, Li C, Jia X, Zheng S, et al. VSMC-specific EP4 deletion exacerbates angiotensin II-induced aortic dissection by increasing vascular inflammation and blood pressure. Proc Natl Acad Sci U S A. 2019;116: 8457–8462.
  29. Mondaca-Ruff D, Riquelme JA, Quiroga C, Norambuena-Soto I, Sanhueza-Olivares F, Villar-Fincheira P, et al. Angiotensin II-Regulated Autophagy Is Required for Vascular Smooth Muscle Cell Hypertrophy. Front Pharmacol. 2018;9: 1553.
  30. Walcott BP, Peterson RT. Zebrafish models of cerebrovascular disease. J Cereb Blood Flow Metab. 2014;34: 571–577.
  31. Ma Y, Zhang L, Huang X. Genome modification by CRISPR/Cas9. FEBS J. 2014;281: 5186–5193.
  32. Zhang J-H, Adikaram P, Pandey M, Genis A, Simonds WF. Optimization of genome editing through CRISPR-Cas9 engineering. Bioengineered. 2016;7: 166–174.
  33. Pawelczak KS, Gavande NS, VanderVere-Carozza PS, Turchi JJ. Modulating DNA Repair Pathways to Improve Precision Genome Engineering. ACS Chem Biol. 2018;13: 389–396.
  34. Deneault E, White SH, Rodrigues DC, Ross PJ, Faheem M, Zaslavsky K, et al. Complete Disruption of Autism-Susceptibility Genes by Gene Editing Predominantly Reduces Functional Connectivity of Isogenic Human Neurons. Stem Cell Reports. 2018;11: 1211–1225.
  35. Soldner F, Laganière J, Cheng AW, Hockemeyer D, Gao Q, Alagappan R, et al. Generation of isogenic pluripotent stem cells differing exclusively at two early onset Parkinson point mutations. Cell. 2011;146: 318–331.
  36. Deneault E, Faheem M, White SH, Rodrigues DC, Sun S, Wei W, et al. or human iPSC-derived neurons from individuals with autism develop hyperactive neuronal networks. Elife. 2019;8. doi:10.7554/eLife.40092

How to Model Splice Variation in Animal Models

I frequently get asked:

How do you model splice variations in your animal model systems?

Splice variation is an important consideration in genomic analysis of patient variations, and it is often overlooked (PMID: 29680930).  It is estimated that 15%–60% of human disease mutations are due to splicing defects (PMID: 29304370). So, with close to 40% of disease-causing variation likely being attributable to splicing defects, it becomes an important kind of variation to be able to model in functional studies to determine if the variant is pathogenic.

But let’s first look at the process of splicing and what is known.

image credit (Abramowicz and Monika, J Appl Genet. 2018; 59(3): 253–268. PMID: 29680930)

This complex process is tightly regulated.  Certain cell types will favor one form of splicing, while other tissues will select other forms. This natural variation in splice isoforms gives us more than just the number of genes in the genome to control biological output. In fact, this is part of the explanation for why a C. elegans nematode or a zebrafish, with roughly the same number of genes as a human, has such a different level of output complexity. Currently the number of functional isoforms in humans is thought to be an order of magnitude more than what occurs in the nematode. Furthermore, the ways splice variation can take place get bewildering quickly.

Image credit: (Park et al., Am J Hum Genet. 2018 Jan 4; 102(1): 11–26. PMID: 29304370)

How is splicing observed in the patient?

Layer on top of this the aberrant splice variations that can cause disease, and we have a tough interpretation problem. Thankfully RNAseq is providing a huge amount of diagnostic discovery for splice variation. We can compare the splicing patterns in healthy populations with those of a patient suspected of a genetic disease and visualize where the splicing is going wrong (PMID: 28424332).

image credit: (Cummings et al., Sci Transl Med. 2017 Apr 19;9(386):eaal5209 PMID: 28424332)

Modeling Splice Variation in Animal Models

To reduce the complexity of biology and yet bring more comparative biology relevance, we can often take a human cDNA sequence and use it to rescue the function of the animal’s version of the gene.  To do this, we use CRISPR to remove the animal’s version of the gene (a gene “knock out”).  Next we take a human cDNA sequence optimized for expression in the animal and either replace the deleted locus or express the sequence in trans (at a safe harbor site, using either the promoter endogenous to the removed gene or a promoter well established for appropriate tissue expression).  In C. elegans, we have been pleasantly surprised that, for orthologs of at least 30% identity, more than half the time we get significant rescue of the loss of function seen in the knock out.  In zebrafish, we have started applying the same techniques of gene replacement. The result is a set of gene-humanized animals where the conservation of biology means we are looking at functional outputs that are highly similar to those in humans.

Missense variations are conceptually easy to model.  An amino acid change that is pathogenic (e.g., R235Q in STXBP1) is installed with CRISPR using a simple donor homology that instructs the cell’s HDR to alter the DNA coding for R (arginine) into a code for Q (glutamine) in our “wildtype” humanized locus.
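
As a rough illustration of what such a donor has to encode, the sketch below lists the single-nucleotide edits that turn an arginine codon into a glutamine codon under the standard genetic code. The codon actually used at position 235 of the humanized STXBP1 sequence is not given here, so the code just enumerates the possibilities; the function name is made up for this example.

```python
# Standard genetic code entries for arginine (R) and glutamine (Q).
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]
GLN_CODONS = ["CAA", "CAG"]


def single_base_edits(from_codons, to_codons):
    """Return (source, target, position) for codon pairs differing at exactly one base."""
    edits = []
    for src in from_codons:
        for dst in to_codons:
            diffs = [i for i in range(3) if src[i] != dst[i]]
            if len(diffs) == 1:
                edits.append((src, dst, diffs[0] + 1))  # 1-based position in the codon
    return edits


for src, dst, pos in single_base_edits(ARG_CODONS, GLN_CODONS):
    print(f"{src} -> {dst} (base {pos} of the codon)")
```

Only the CGA and CGG arginine codons can reach glutamine with one base change (G to A at the middle position); any other arginine codon would need the donor to change two or more bases.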

But how do we mimic a splice variation?

It is actually quite simple. We create a donor homology that makes any splice form of interest.  We are not interested in the mechanism to answer “if” it occurs – RNAseq already answers that. We are after functional consequence.  We want to answer “does a particular splice form in question have a measurable defect compared to the normal splicing.” 

Let’s look at one of the patient examples in detail.

image credit: (Cummings et al., Sci Transl Med. 2017 Apr 19;9(386):eaal5209 PMID: 28424332)

In red we have 4 patients with a collagen gene splice defect suspected of involvement in their diagnosis of Ullrich Congenital Muscular Dystrophy. Since all persons have two copies of the COL6A1 gene, we can see that one copy is splicing normally while the other copy is defective and its splicing brings in a pseudoexon. “The resulting inclusion of 24 amino acids occurs within the N-terminal triple-helical collagenous G-X-Y repeat region of the COL6A1 gene, the disruption of which has been well established to cause dominant-negative pathogenicity in a variety of collagen disorders” (PMID: 28424332).

Creating Knock-in for Animal Model of Disease

For disease modeling of splice variations, we use a cDNA rescue approach.  The variation seen in the patient is made as a plasmid coding for expression of a modified cDNA.  This cDNA contains the human gene sequence that is suspected of creating an aberrant splice variation. Using CRISPR techniques, the segment coding for human DNA is inserted into the genome, typically at the orthologous locus of the animal.
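
To make the idea of a modified cDNA concrete, here is a minimal sketch of splicing a pseudoexon into a cDNA string and checking that it keeps the reading frame. The sequences and the junction position are placeholders invented for illustration, not real COL6A1 sequence; the only number taken from the patient example is the 24-amino-acid (72 nucleotide) insertion size.

```python
def insert_pseudoexon(cdna, junction, pseudoexon):
    """Insert a pseudoexon into a cDNA at a 0-based exon-exon junction index."""
    if not 0 <= junction <= len(cdna):
        raise ValueError("junction lies outside the cDNA")
    return cdna[:junction] + pseudoexon + cdna[junction:]


def keeps_reading_frame(pseudoexon):
    """An insertion preserves the downstream frame only if its length is a multiple of 3."""
    return len(pseudoexon) % 3 == 0


# Placeholder sequences (hypothetical, NOT real COL6A1 sequence).
wildtype_cdna = "ATG" + "GGTCCTAAA" * 4 + "TAA"  # start codon, four G-X-Y style repeats, stop codon
pseudoexon = "GCA" * 24                          # 72 nt, i.e. 24 extra codons
junction = 3 + 9 * 2                             # hypothetical exon boundary after two repeats

variant_cdna = insert_pseudoexon(wildtype_cdna, junction, pseudoexon)
print(len(wildtype_cdna), len(variant_cdna), keeps_reading_frame(pseudoexon))
```

Because the insertion is a whole number of codons, the downstream sequence stays in frame. The protein gets longer rather than truncated, which fits a dominant-negative rather than a simple loss-of-function picture.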

Modeling in the C. elegans nematode. 

To model the COL6A1 variant, we would first seek to understand the phenotype from loss of function of the animal’s ortholog of the human gene. For COL6A1, this is the C16E9.1 gene in C. elegans. This gene is not well studied in the nematode, but does show high expression in the alternative life state of dauer.

The first step is to make a gene knock-out to remove the C16E9.1 gene from the worm genome. Next, a series of functional assays are run to determine if a functional defect can be detected for the C16E9.1 knock-out as a loss-of-function allele. For essential genes, the ultimate manifestation of loss of function is lethality as a homozygote. Other critical genes will often show a functional defect only after a battery of functional screens is performed. Once a defect in activity is observed, human cDNA can be introduced to see if rescue of function can be obtained. When rescue is obtained with human cDNA, we know we are looking at conserved biology for gene function between the animal and humans.

Once we have rescue of function, the fun begins. We can use CRISPR to put in the exact content that RNAseq indicates is occurring in the human gene. The pseudoexon seen in one copy of the patient’s chromosome pair can be made in the animal. Often the patient variant is problematic from a loss-of-function perspective, where haploinsufficiency drives disease. When such a defect is made as a homozygote in the animal, the effect is usually a severe phenotype (often lethal) and is similar to what is seen in the gene knockout. Yet in the specific case from above with the pseudoexon in COL6A1, we are dealing with a dominant negative effect, so the defective splice not only disrupts this protein, it also causes the good copy to fail to function properly. Animals homozygous for the pseudoexon defect may actually have a weaker defect phenotype than animals made as heterozygotes. Creation of the patient’s heterozygous condition is achieved by crossing the splice-variant-containing humanized animal model into the wild type humanized animal model and examining the cross progeny for defects in activity.

Modeling in the Zebrafish. 

We can do similar modeling in zebrafish using the Tol2 system. In zebrafish there is one ortholog for COL6A1. The col6a1 zebrafish gene has 55% sequence identity and 70% sequence similarity to the human gene. Like the work in the C. elegans nematode, we can remove the native gene and look for functional consequences. CRISPR techniques are used to create a knockout by inserting a stop codon early in the gene. If designed right, this results in loss of all expression for col6a1. Next we can measure the functional consequence of the gene knock-out by first trying to see if the animal can be made homozygous. If it is not lethal, the animal can be screened by a battery of assays to determine if a functional defect exists. Finding either lethality as a homozygote, or observing a functional defect, allows testing for the capacity of human COL6A1 cDNA to rescue function. A gene insertion approach using Tol2 is used to bring in the cDNA with an appropriate tissue-specific promoter.  Rescue of function in specific tissues, for instance with the use of the 195 bp unc45b promoter for skeletal muscle expression (PMID: 27295336), will help elucidate the important roles of COL6A1 in dystrophy diseases.

The pseudoexon insertion defect seen in COL6A1 is a dominant negative variation. So, when a single copy of this gene is brought into the animal, it will have the capacity to suppress the activity of the unmodified copy of the gene.  By inserting the cDNA with the patient variant into a safe harbor site we create a pseudo heterozygote.  The dosage of the cDNA comes from two chromosomal positions while the wildtype locus provides expression of two copies of the normal gene.  If the cDNA is dominant negative in its effect on the zebrafish gene, then a defect of gene function will manifest.

Recap of Splice defect Modeling in Animal Models

In summary, splice variants are modeled at the cDNA level. A modified cDNA rescue construct containing the human gene of interest is designed and compared across three forms:

Positive Control (blue):  The humanized wildtype cDNA provides a reference of the normal gene seen in healthy individuals.  

Negative Control (red): A knockout deletion of the animal’s gene provides reference for full loss of function of the gene. 

Test (yellow): A variant is tested for its functional activity. A range of activities is expected and depends on the pathogenic variant’s mechanistic role in disease pathology.  It may be a dominant negative that creates a pathology worse than the loss-of-function allele because it binds to and causes bad behavior from the remaining good copy of the gene.  Alternatively, the variant may cause loss of function. This will be either recessive and manifest as a homozygote, or dominant and manifest by haploinsufficiency as a heterozygote.  Finally, the variant of interest may cause a gain of function, which typically manifests in the heterozygote.

COVID19 Pandemic: Problems and Solutions – What Can We Do?

Like many responsible citizens of the USA, I am hunkered down in self quarantine.  Trying to come to grips with new uncertainty in the world and how it will impact my life.  I find myself wanting to help. To do what I can to help stop this beast we call COVID19. But how?

Collective Behavior Modification is Part of the Answer.

What we are dealing with here mutually is a form of trauma.  We are experiencing the Kübler-Ross grief cycle. Each of us is oscillating somewhere along its spectrum.


image credit: adapted from psycom.net/depression.central.grief

Many of us are stuck in denial.  Although as of late, I seem to be bouncing from anger to bargaining to depression and back again (note, this entire piece for the blog is an elaborate form of bargaining!!!) 

I want to get to acceptance and start doing constructive things.

Shedding the denial by attempting to understand the size of the problem.

If you somehow have been in a timewarp, or you have been tunneling under a rock for the past few months, COVID19 is a raging pandemic that is threatening to kill our parents/grandparents. And it is causing major economic crises throughout the world. Some of our leadership say this will pass in just a few weeks from now, but when you dive into the data, that suggestion is just plain ludicrous fantasy. It will take many months of concerted efforts by everyone to tackle the COVID19 problem. If we choose to ignore what we need to do, the ramifications are immense.

Some are calling it a war, so let’s look at it from that lens.

The Civil War was an extremely traumatic event for my home country of the USA.  We are still dealing with its baggage to this day and its death toll was immense. World War II was a big effort: industries stopped what they were doing and channeled vast resources to the war effort. Many a dad and mum did not come home, but the concentrated effort of the population probably helped reduce loss of life.  When we look at the loss of life from World War II we have a significant 290 deaths per 100,000. Everyone knew someone who did not come home. Yet that death rate pales in comparison to what COVID19 could do to us in the coming months if left unchecked. More than twice as many dead if we do nothing.

Is All This Stuff Real?  – Yes, ….This is Getting Real, …Real Quick

A recent article by a very respected group of researchers paints an alarming picture. Ferguson et al released a study on March 16 that describes 3 scenarios, each with different consequences for the health system, measured in critical care beds occupied per 100,000 population.


image credit: adapted from Ferguson et al.

Scenario 1. Do Nothing. 

In this scenario (black line) we get the 2.2 million dead in the USA. The surge capacity of hospital beds is overwhelmed 25x.  Many die that could have been saved, if we had the resources.
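
To put the 2.2 million figure on the same scale as the 290 deaths per 100,000 quoted above for World War II, here is a quick back-of-the-envelope conversion. The US population value is my own round-number assumption (about 330 million in 2020), not a number taken from the Ferguson study.

```python
US_POPULATION = 330_000_000    # assumption: approximate 2020 US population
PROJECTED_DEATHS = 2_200_000   # "do nothing" scenario quoted above
WWII_RATE_PER_100K = 290       # World War II figure quoted earlier in this post

rate_per_100k = PROJECTED_DEATHS / US_POPULATION * 100_000
ratio_to_wwii = rate_per_100k / WWII_RATE_PER_100K

print(f"{rate_per_100k:.0f} deaths per 100,000")  # 667 deaths per 100,000
print(f"{ratio_to_wwii:.1f}x the WWII rate")      # 2.3x the WWII rate
```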

Scenario 2. Mitigation (“soft”)

In the mitigation scenario we have multiple options. We can close schools and universities. This surprisingly does not have much effect (green line). The algorithm takes into account “Household contact rates for student families increase by 50% during closure. Contacts in the community increase by 25% during closure”, which offsets the gains of social distancing achieved with school closure.  Case isolation by itself has a stronger effect (orange line). For Case Isolation, symptomatic cases are asked to stay home for 7 days. This reduces “non household contacts by 75% for this period. Household contacts remain unchanged. Assumes 70% of households comply with the policy.” Add to Case Isolation a Home Quarantine where the family sequesters for 14 days (assume 50% comply), and we get a boost that more than halves the lethality rate. Finally, add in social distancing for those older than 70, where this group sequesters themselves away from close contact by avoiding crowd gatherings, maintaining 6 feet distance from strangers, avoiding restaurants and the like, and we get an additional decrease in hospitalization (blue line). Yet even this multi-step mitigation effort is not enough. It is too soft to have the needed impact. Even with the multiple mitigation measures in place, the capacity of the health care system is still overwhelmed by more than 8x. More severe and hard measures are needed.

Scenario 3. Suppression (“hard”)

In mitigation, we are creating a decrease in the transmission number (R0 = R “naught”), which is the average number of persons infected by a person who is actively shedding the COVID19 virus.  The do-nothing R0 number is 2.4. This means an infectious person spreads COVID19 to an average of 2.4 persons. Mitigation decreases the R0 number, but does not drive it down to 1 or below. To get R0 below 1, where the infected are infecting less than one person, bigger steps need to be taken by the entire population.  To implement effective suppression, we do the Case Isolation, as seen in mitigation, but now we add in General Social Distancing. We ask everyone to avoid getting together in groups exceeding 10 persons. Each of us should maintain a 6 foot distance when talking to others. We should capture a sneeze or cough in the elbow or a tissue, and wash hands way more frequently than usual. Further, we should avoid touching our hands to our faces as much as possible.  It is expected that “all households reduce contact outside the household, school or workplace by 75%. School contact rates are unchanged, workplace contact rates reduced by 25%. Household contact rates are assumed to increase by 25%.” Yet this is still not enough to get R0 down to 1, so we explore two more steps as options.

image credit: adapted from Ferguson et al.
  • Option 1. (Case Isolation and General Social Distancing) + Household Quarantine
  • Option 2. (Case Isolation and General Social Distancing) + School and University Closure

We can see that adding in Household Quarantine, during the 5 months that suppression measures are in place, gets us just to the hospital bed surge capacity (red line). Yet in contrast, School and University Closure has an even more pronounced effect.  During the 5 month period, we are well below the stress capacity of the medical services. We will know the success of our suppression approaches when there is no capacity problem being detected.

Herd Immunity – Resisting the Urge to Celebrate Too Soon.

Unfortunately, when the suppression measures are lifted, COVID19 is due to come roaring back this fall. The advantage of option 1 (Household Quarantine) is it allows for better development of Herd Immunity.  From Wikipedia we have the definition of Herd Immunity as occurring “when a large percentage of a population has become immune to an infection, whether through previous infections or vaccination, thereby providing a measure of protection for individuals who are not immune”.  With an R0 of 2.4 for COVID19, somewhat more than half of a population needs to become immune, either by recovering from infection or through use of an effective vaccine, before the transmission number is driven down to 1 or less.
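
The immune fraction needed for a given R0 can be estimated with the textbook relation threshold = 1 - 1/R0. The sketch below applies it to the R0 of 2.4 quoted above. It assumes a well-mixed population and fully protective immunity, which are simplifications on my part and not claims from the Ferguson study; the function names are made up for this example.

```python
def herd_immunity_threshold(r0):
    """Fraction of the population that must be immune for the effective R to reach 1."""
    return 1.0 - 1.0 / r0


def effective_r(r0, immune_fraction):
    """Effective reproduction number when a fraction of contacts are immune."""
    return r0 * (1.0 - immune_fraction)


r0 = 2.4
threshold = herd_immunity_threshold(r0)
print(f"Threshold for R0 = {r0}: {threshold:.0%}")               # 58%
print(f"Effective R at 50% immune: {effective_r(r0, 0.5):.2f}")  # 1.20
```

Under these assumptions, exactly half the population being immune still leaves each case infecting about 1.2 others, so the threshold sits closer to 58%.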

Pharmaceutical Interventions – what can we do?

Chloroquine. Currently there are no approved vaccines or drug therapies against COVID19. Yet, if we had them in place, we might be able to more quickly get control of this infectious disease and be spared the estimated 12 to 18 months of various mitigation and suppression techniques – approaches that slowly build herd immunity with the fewest deaths. Scientists and the pharmaceutical industry are rallying to quickly get pharmaceutical interventions in place.  And they have made some interesting findings. Hydroxychloroquine, less toxic than chloroquine, has undergone a small clinical trial on COVID19 patients with very encouraging results. Gautret et al observed a dramatic 2-3 fold faster clearance of SARS-CoV-2 (the “official name” of the COVID19 virus) relative to untreated controls. The results became even more dramatic for another small group of patients that received both hydroxychloroquine and azithromycin – complete clearance in all patients was observed at day 4 of the 6 day observation window.  Yet the numbers tested are tiny. So repeating this in larger populations will tell us if they are onto something. Nevertheless, it is a promising start. The current hypothesis is that this antimalarial drug blocks viral envelope fusion by altering the pH of the endosome and thereby slowing down the activity of the acid proteases present (cathepsins or possibly TMPRSS2).  Yet it is important to consult a doctor first, because self medication has resulted in unnecessary death.

image credit: adapted from PLoS Pathog, 10 (11), e1004502 2014

The action of chloroquine may be multimodal.  In 2005, it was demonstrated that chloroquine in a SARS-CoV infection of a cell line caused incomplete glycosylation of ACE2, and that its “antiviral effect during pre- and post-infection conditions suggest that it is likely to have both prophylactic and therapeutic advantages.”

Camostat mesilate. In a drug targeting approach, Hoffmann et al monitored the classic endosomal-lysosomal entry for coronaviruses with an endosomal fusion assay. They found entry into the cytoplasm to be mediated by the activity of TMPRSS2 and cathepsin proteases. First, these authors made a comparison between SARS-CoV (the coronavirus causing the outbreak of 2002–2003) and SARS-CoV-2 (the coronavirus causing the current COVID19 pandemic) and found the S protein in SARS-CoV-2 is more prone to cleavage at the S1/S2 site. Next, they looked at inhibitors of cathepsin (E-64d) and TMPRSS2 (camostat) and found that, depending on cell type, inhibition of either protease could interfere with early fusion. When they looked at a lung cell line (Calu-3), they found that camostat could strongly inhibit early fusion. The site of cleavage for the TMPRSS2 protease is the same site as for furin, and the ability to cleave the S protein at this site is critical for viral entry into the cell. Further, camostat is already established for efficacy and safety in humans for treating pancreatitis.  In summary, these researchers found “SARS-CoV-2 can use TMPRSS2 for S protein priming and camostat mesylate, an inhibitor of TMPRSS2, blocks SARS-CoV-2 infection of lung cells.”   So we have two very good candidate molecules for use in suppressing COVID19.

image credit: adapted from Hoffmann 2020

Smoking Gun – But Where is the Expression?

Although camostat can have a dramatic impact on early fusion and it appears to be acting on TMPRSS2 serine protease, what has puzzled me is the tissue specificity. If the primary mode of infection is the lung, then it stands to reason that lung tissue should have high expression of the protease. Yet when one looks at the tissue-specific expression profiles on the Human Protein Atlas (HPA), the expression of TMPRSS2 is absent in the lung.

image credit: https://www.proteinatlas.org/

TMPRSS2 is related to TMPRSS11D (55% sequence similarity and 40% identity). The Pharos Drug Database indicates the TMPRSS11D gene may also be involved in coronavirus fusion in the cell.  Like TMPRSS2, the Pharos database indicates the TMPRSS11D protease “plays a role in the proteolytic processing of ACE2.” Intriguingly, the TMPRSS11D gene has an expression signature in lung tissue via Human Protein Atlas (HPA) (dark green bar). Further, examining ligands for these two genes via the Pharos Drug Database indicates both proteins share two ligands (compound 5 and its derivative CHEMBL1809251). Since these shared molecules have similar binding affinities between the proteins, it may be that the topography of their active sites is similar. Although they both can cleave ACE2, it remains to be shown if they also have similar activities on the S protein. Yet if true, then we have two enzymatic targets for therapeutic development.

image credit: https://www.proteinatlas.org/

ACE2 expression also puzzled me. Its protein expression overview in Human Protein Atlas (HPA) shows no lung expression, but expression in the gut is off-the-charts. Digging into two papers indicates that its expression does occur in the lung, but only in a small subset of lung cells. These references indicate many other tissues have high expression of ACE2 (gut and throat). Further examination of tissue localization for SARS-CoV-2 indicates the following tissues exhibit high infection. We have our expected lung (alveolar epithelial cells), but we also have gut (mucosal enterocytes of the intestine), stomach, trachea/bronchus, distal convoluted renal tubule, sweat gland, parathyroid, pituitary, pancreas, adrenal gland, liver and cerebrum, and many of these tissues also express ACE2.

Using Molecular Dynamics Modeling to Dock Candidate Compounds

image credit: adapted from PyMOL rendering

If an enzyme has a good crystal structure, one can do simulated docking of compounds to come up with categories of interacting molecules that can be used as hits for exploring their capacity to block viral entry into target cells. Regrettably, crystal structures are lacking for both TMPRSS2 and TMPRSS11D. This is not true for ACE2: since the finding of its association with SARS in 2003, multiple structures have become available (6LZG, 6M0J, 6M17, 6VW1). In one recent structure, we can see the binding interface between SARS-CoV-2 and ACE2. One approach might be to design an interference molecule that “cloaks” the ACE2 molecule. Ideally, it would interfere with S protein binding but not block normal enzymatic activity.

image credit: adapted from PyMOL rendering

Another possible target for drug development is the main protease (“Mpro”). The main protease (also called “3CLpro”) is a protease that helps process the long protein polymer made from the viral genome into functional fragments.  The structure of the main protease has been solved at the molecular level. Peptidomimetic α-ketoamides have been designed as broad-spectrum inhibitors, and if safety profiling indicates they have low toxicity against human proteases, screening other drug candidates for interaction could prove to be therapeutically useful.

Time is of the essence – the evolving landscape

Finding new therapeutic approaches quickly is important because as COVID19 spreads, the virus is able to explore diversity through mutation. Increasing levels of heterogeneity can be expected for a single stranded RNA virus – errors in the genome during the viral replication cycle will accumulate.  A recent study was made available on the web that looks at genetic diversity among COVID19 strains.  Especially disconcerting is that as the virus spreads, different strains are arising with different mutational lineages.  Mutations in the S-peptide binding to ACE2 or in the S1/S2 cleavage site could render a new lineage that is more virulent.  This is important because the binding affinity of SARS-CoV-2 for ACE2 is already nearly 5x higher than that of SARS-CoV.


image credit: nextstrain.org/ncov

Fight Viral Diversity with Human Diversity

The natural diversity of human populations might offer some defense against viral evolution.  The gnomAD database is a good resource for examining the diversity in humans. If we look at the binding interface between S-peptide and human ACE2, we can see if there are humans with variations at the binding interface which may disrupt COVID19 infections. There are multiple structures to reference for examining the binding interface (6LZG, 6M0J, 6M17, 6VW1). Using the 6M17 structure, Yan et al showed the binding interface is in close proximity to many residues in ACE2 (Gln24, Asp30, His34, Tyr41, Gln42, Met82, Lys353, and Arg357).


image credit: Yan et al. Science  04 Mar 2020:

Many gene variants seen in human populations exist at this binding interface. One intriguing variant is Lys26Arg. This genetic variation exists in about 600 of every 100,000 persons. The change may not seem too drastic, since a positively charged amino acid is substituted with a similarly charged amino acid. Yet the new residue is arginine, and according to a prior blog post, the arginine amino acid holds special privilege for its involvement in population variation analysis.

We can float two hypotheses regarding this Lys26Arg variant:

Hypothesis A – Resistance: Persons with the Lys26Arg human variant might have an ACE2 protein with disrupted interaction to the S-protein spike. The S-protein becomes highly compromised for tricking this ACE2 protein into helping it get inside the cell.  


Hypothesis B – Sensitivity: Persons with the Lys26Arg human variant might have an ACE2 protein exhibiting stronger interaction to the S-protein spike. The S-protein can use this ACE2 protein to get inside the cell more efficiently.

The Problem of Heterogeneity

We are diploid organisms, which means we have two copies of every gene.  In regards to hypothesis A, this means most of the persons carrying the Lys26Arg variant have only one copy.  The other copy is the common natural variant (“wildtype”). Because they harbor a wild type variant, these persons at best would be about 50% less susceptible. Thus hypothesis A for COVID19 resistance would be a subtle effect.  If, on the other hand, the variant had 10x more binding to S-protein, then persons carrying this variant could be more susceptible by a greater than 2 fold effect. Systems that could measure these binding effects in diploid animal formats would help reveal which hypothesis dominates for a given variant.
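
The two cases can be sketched with a toy additive model, in which a person's relative susceptibility is just the average of the relative S-protein binding of their two ACE2 copies. This linear model, the equal expression of both copies, and the function name are my simplifying assumptions for illustration only; real susceptibility need not scale this way.

```python
def relative_susceptibility(allele_a, allele_b):
    """Mean relative S-protein binding of two gene copies (wild type defined as 1.0)."""
    return (allele_a + allele_b) / 2.0


WILDTYPE = 1.0

# Hypothesis A, best case: the variant copy does not bind the S-protein at all.
print(relative_susceptibility(WILDTYPE, 0.0))   # 0.5

# Hypothesis B, example from the text: the variant copy binds 10x more strongly.
print(relative_susceptibility(WILDTYPE, 10.0))  # 5.5
```

Under this toy model a fully resistant copy can at most halve susceptibility in a heterozygote, while a 10x stronger-binding copy raises it well beyond 2 fold, which is why a sensitizing variant would be the easier effect to detect.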

In summary, there are promising targets for developing therapeutics and vaccines. One target is the interaction of SARS-CoV-2 with ACE2. Another target is the activity of the human proteases (TMPRSS2, cathepsins and possibly TMPRSS11D) that process and cleave the S2 fragment of the spike (S-protein) allowing it to have easier access into the cell. And finally, the third target is the main protease, the enzyme that processes the polypeptide made from the mRNA transcript of the COVID19 genome.

Arg…What up with that?!! Arginine is Enriched in Pathogenic Variants

You know how it goes: a hunch seems to get reinforced over and over again, and then your mind starts treating it as a fact.

!Danger! Will Robinson… it’s time for a serious fact check.

My hunch was that the amino acid arginine (aka “Arg” or “R”) seems to be showing frequent association with pathogenicity. It started with the observation that many of the established pathogenic variants in the coding sequence of STXBP1 seem to involve a preference for arginine. Extracting from ClinVar the missense variants that are pathogenic or likely pathogenic gives the following table:

Indeed, arginine (R) is disproportionately represented. If all amino acids were treated as equals, there should be 4.3 for each amino acid. The amino acids that are disproportionately low mostly make sense. Methionine (M) has only one codon (ATG) that instructs for its insertion in a sequence. Similarly, tryptophan (W) also has only one codon (TGG). These two amino acids should be represented below the average. A little oddly, we see similarly low levels for lysine (K), phenylalanine (F) and glutamine (Q), which each have two codons. If codon dosage were key to variant proportioning, then these should have been seen at least 2x more often than M and W, so perhaps something more than codon dosage mediates amino acid choice in creating pathogenic variations.
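The codon counts quoted here can be checked against the standard genetic code. The sketch below builds the standard table from its usual compact encoding and counts codons per amino acid. The variable names are illustrative, and the "codon share" it prints is only the naive codon-dosage expectation discussed above, not a model of how mutations actually arise.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid.
codon_table = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Count sense codons per amino acid, ignoring the three stop codons.
codon_counts = Counter(aa for aa in codon_table.values() if aa != "*")
sense_total = sum(codon_counts.values())

for aa in "RLSKFQMW":
    share = 100 * codon_counts[aa] / sense_total
    print(f"{aa}: {codon_counts[aa]} codons, {share:.1f}% of sense codons")
```

If codon dosage alone set the proportions, R, L and S would each sit near 10% and K, F and Q near 3%, which is the comparison the paragraph above is making.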

Arginine has 6 codons, which could still drive its outsized proportion in the graph. Yet serine (S) and leucine (L) also have 6 codons, and they are involved in only 7 and 3 of the pathogenic variants, respectively. Only mighty arginine accounts for 13 of the 43 pathogenic variants in STXBP1 (30%). Tempering my enthusiasm is the observation that at 3 amino acid positions, R292, R406 and R451, multiple different changes are called pathogenic. No other amino acid in the STXBP1 pathogenics has this changeling capacity. So why is arginine at such a high proportion in the assigned pathogenics? Perhaps it is just a consequence of investigator bias specific to STXBP1, with attention fixed on the repeating de novo clinical variants at positions 292, 406 and 451.

Is arginine involved in fragility elsewhere in the genome?

To normalize for possible investigator bias and find a method that can be applied to other portions of the genome, I took advantage of the Ensembl database to list and rank a gene’s coding sequence variants by bioinformatics analysis. CADD scores were used to rank protein coding sequence variations by their predicted severity.

Ensembl allows us to identify which variations are theoretically likely to be disruptive of protein function. Ranking by CADD (which stands for Combined Annotation-Dependent Depletion) lets us use a sophisticated algorithm that avoids investigator bias, because it intentionally avoids using “known” pathogenicity databases when it creates its ranking. A key test is to see if CADD can independently observe the pathogenicity known to exist in STXBP1. To construct the test, we compare the top scoring CADD variants with the lowest scoring CADD variants.

With CADD, we get an independent call for possible pathogenicity that still picks up what you might expect. Nearly half the calls in the Top-30 CADD pull up known pathogenicity and no benign calls are found. In the Bottom-30 CADD we get one known benign call and no pathogenics.

Healthy population data is also consistent. STXBP1 is autosomal dominant. That means only one of your two chromosomal copies needs to be defective for disease to occur. Selection pressure has been very tight on autosomal dominant genes. Pathogenic variants cannot occur in healthy populations at a frequency higher than the known frequency of the disease in the population. The published frequency of STXBP1-caused early-infantile epileptic encephalopathy is 1/90,000. The largest healthy population database is gnomAD. At 141,456 individuals, and given that the disease frequency must be distributed across at least 43 pathogenic alleles, the likelihood of even one pathogenic variant being in healthy populations is pretty close to zero. Some of our Top-30 CADD variants are seen 1x or more in healthy populations. Most of these are unassigned. For these unassigned variants that are seen 1x or more, the disease frequency argument strongly implies that they are benign.
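The disease frequency argument can be put into numbers with a rough sketch. It uses only the figures quoted above and assumes, for simplicity, full penetrance, an even split of the disease frequency across the 43 known alleles, and a cohort that was not selected for health. A healthy cohort should be depleted of affected individuals, so the real expectation would be lower still.

```python
GNOMAD_INDIVIDUALS = 141_456     # cohort size quoted in the text
DISEASE_FREQUENCY = 1 / 90_000   # published STXBP1 disease frequency quoted in the text
KNOWN_PATHOGENIC_ALLELES = 43    # pathogenic missense alleles counted earlier in the post

# Assumptions for this sketch: every carrier is affected, the disease frequency is
# shared evenly across alleles, and the cohort is an unselected population sample.
expected_carriers_total = GNOMAD_INDIVIDUALS * DISEASE_FREQUENCY
expected_carriers_per_allele = expected_carriers_total / KNOWN_PATHOGENIC_ALLELES

print(f"Expected carriers of any pathogenic allele: {expected_carriers_total:.2f}")
print(f"Expected carriers of one given allele:      {expected_carriers_per_allele:.3f}")
```

With well under one expected carrier per allele even in an unselected sample, a variant that shows up once or more in the healthy cohort is hard to reconcile with dominant pathogenicity.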

So CADD is not perfect: the top scoring hits are a mix of known pathogenic and probably benign variants. But the bottom scoring CADD variants seem to be more efficient at pulling out benign calls. In the Bottom-30 CADD, only one variant, I271V, is labeled Likely Benign by ClinVar, yet nearly every one of these alleles (27 of 30) is seen in healthy populations, so they too are probably benign.

At this point in the analysis, we can pinpoint an anomaly. Y264C is labeled in ClinVar as Likely Pathogenic. But from the population frequency argument, this assignment is highly unlikely. Y264C has been observed to occur in healthy populations. So at a bare minimum, it should be downgraded to a VUS, but it should probably be called Likely Benign for causing early-infantile epileptic encephalopathy.

Finding Arginine-associated Fragility Throughout the Genome

This Top-30 / Bottom-30 approach was applied to a large set of genes. As a form of internal control, isoleucine (I) was added to the screen. With less conviction, I have felt this amino acid was associating with benign variants. If true, it should show an enrichment in the Bottom-30 CADD scores. So in my gene set experiment, I measured 4 bins: 2 bins for how many arginine and isoleucine variants are in the Top-30, and 2 bins for how many are in the Bottom-30.
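The four-bin tally described above can be sketched as follows. This is a minimal illustration, not the actual analysis pipeline: the function name, the input format and the example variants with their scores are all hypothetical, and it assumes a variant "involves" a residue when that residue is either the reference or the substituted amino acid.

```python
from typing import Dict, List, Tuple


def top_bottom_bins(variants: List[Tuple[str, float]], n: int = 30,
                    residues: str = "RI") -> Dict[str, Dict[str, int]]:
    """Count variants involving each residue among the n highest and n lowest CADD scores.

    variants: (change, cadd_score) pairs, with change written like "R10H".
    If fewer than 2 * n variants are supplied, the two bins will overlap.
    """
    ranked = sorted(variants, key=lambda v: v[1], reverse=True)
    bins = {"top": ranked[:n], "bottom": ranked[-n:]}
    return {
        name: {aa: sum(1 for change, _ in subset if aa in (change[0], change[-1]))
               for aa in residues}
        for name, subset in bins.items()
    }


# Made-up variants and scores, for illustration only.
example = [("R10H", 32.0), ("K20R", 30.5), ("A30T", 22.0),
           ("L40P", 15.0), ("I50V", 6.0), ("V60I", 4.5)]
result = top_bottom_bins(example, n=2)
print(result)
```

Dividing each count by n gives the per-bin fraction that is compared against the baseline in the rest of this post.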

30% of top 30 CADD scoring variants contain arginine???!!!

An assumption of even distribution of amino acids, combined with an even more absurd assumption of an average 3.05 codons per amino acid, gives us 4.3% as the average amino acid fraction per each 30 (dashed line). Arginine is 7.2x more than this average number. Yet we need to account for the fact that arginine has about 2x more codons than the average amino acid. As a result, the arginine bias in the Top-30 is about 3.5x more than expected. For isoleucine, the enrichment in the Bottom-30 appears to be about 2x more than expected.
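The codon correction above is simple arithmetic, sketched here with the rounded figures from the text. Because it starts from the rounded 30% headline number rather than the underlying counts, the raw fold change comes out slightly below the 7.2x quoted above.

```python
OBSERVED_ARG_FRACTION = 0.30   # rounded Top-30 arginine fraction from the text
BASELINE_FRACTION = 0.043      # average per-amino-acid fraction quoted in the text
ARG_CODONS = 6                 # arginine codons in the standard genetic code
AVERAGE_CODONS = 3.05          # average codons per amino acid quoted in the text

raw_enrichment = OBSERVED_ARG_FRACTION / BASELINE_FRACTION
codon_factor = ARG_CODONS / AVERAGE_CODONS           # about 2x the average codon count
corrected_enrichment = raw_enrichment / codon_factor

print(f"Raw enrichment:             {raw_enrichment:.1f}x")
print(f"Codon usage factor:         {codon_factor:.2f}x")
print(f"Codon-corrected enrichment: {corrected_enrichment:.1f}x")
```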

Test dataset – 30% arginine in Top-30 CADD prevails

The noisiest data in the Top-30 CADD appears to be the arginine data. A cumulative trending plot was used to see how many genes were needed before the trend to 30% becomes apparent. After assessing 7 genes, the trend starts to stabilize. A new set of 7 genes was then chosen, this time from the Undiagnosed Disease Network (UDN). The UDN recently listed 54 genes as in desperate need of animal modeling to provide gene function studies. A sub-selection of these was identified as having good sequence similarity to genes in the animal models we hold dear to our heart and expertise (zebrafish and C. elegans). The Top-30 / Bottom-30 CADD selection was applied to these genes and plotted for Arg and Ile enrichment. 30% prevails for arginine: among the top CADD variants, those predicted to be most sensitive to substitution, it occurs at least 3.5x more often than expected.

This all assumes that the representation of amino acids is uniform across all proteins. But that is a reach. Louis Gross at the University of Tennessee, Knoxville, has observed that the amino acid distribution in vertebrates has some anomalies.

The most notable anomaly is arginine. 6 codons are used by arginine, but the observed frequency is low at 4.2%. To illustrate how low, they calculated the expected frequency for each amino acid, biasing only for the GC richness of vertebrate genomes.

The expected frequency for arginine is quite high at about 10.5% due to the GC richness of its codons. Yet the actual observed frequency is quite low at about 4%. Based on this observed frequency, we bounce back: we now assess that we are observing arginine in the Top-30 at 8x more than expected. There is no explanation for the anomaly, and it just became more pronounced!

Taking a different approach, we can ask what percentage of ALL known pathogenic and likely pathogenic variants in a gene involve arginine substitution. With 7 genes analyzed, we get the same 30% for arginine. Yet the calculations say it should be below 4%. 8x more than expected prevails.

Are your arginines special too?

This analysis has uncovered a unique phenomenon. It appears everyone’s arginines are special. Exactly why arginine has this special status is not entirely clear. It is highly likely that random incorporation of arginine has been strongly selected against during evolution. As a result of this strong negative selection (much more than what is happening for all other amino acids), arginine’s frequency in all proteins is much lower than predicted. The observed pathogenic sensitivity may be a readout of this hyperselectivity of evolution. Basically, arginine’s use in any given protein is very particular. A possible driver for this is arginine’s amazing capacity to bring high order to neighboring side chains in most protein structures. When it is gone, chaos reigns. When it is introduced where it should not be, chaos still reigns.

Arginine is special. I suggest we need to ditch Douglas Adams’s “42”.

Instead, we make like a pirate and just say:

“Arrrrrrrrg”

VUS at 44% in ClinVar Assessments and Growing

How prevalent are Variants of Uncertain Significance?

The ClinVar database for variant interpretation was analyzed for its levels of ACMG-AMP assessments. With help from the data dumps from ClinVar Miner, the yearly distribution of assessments was plotted. Since 2016, shortly after the ACMG-AMP guidelines came out in 2015, the number of assessments assigned to the VUS category has grown rapidly. These are the variants that clinical genetics researchers have examined but cannot yet call pathogenic or benign.

How big will the VUS problem get?

To estimate how large the VUS problem will become, we must first understand how big the human genome is. Controversy abounds, but current estimates are that there are 21,306 protein coding genes and 21,856 non-coding genes. To be conservative, and for simplicity’s sake, let us use 20,000 genes as the number. The next question is how many of these are disease associated. When we look in ClinVar at the number of “genes with variants specific to one protein-coding gene”, we get 7221 genes. More conservatively, we can look to ClinVar’s “gene_condition_source_id”, which lists 4242 genes as being associated with a diagnostic condition. This lower number is reinforced by OMIM, in which the “Total number of genes with phenotype-causing mutation” is 4162 genes. These lists have been growing rather steadily at 5% per year, so in a few years the likely number of gene-disease associations will probably approach 5000 genes, or roughly 1/4 of the human genome.
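As a quick check on "a few years", the projection can be sketched under the assumption of steady compound growth at the quoted 5% per year, starting from the ClinVar count of 4242 genes. The constant names are illustrative.

```python
import math

CURRENT_GENES = 4242    # ClinVar gene-condition count quoted in the text
GROWTH_RATE = 0.05      # approximate yearly growth quoted in the text
TARGET_GENES = 5000     # round number the text expects the lists to approach
GENOME_GENES = 20_000   # simplified gene count used in the text

# Assumption: the list keeps compounding at the same rate every year.
years_to_target = math.log(TARGET_GENES / CURRENT_GENES) / math.log(1 + GROWTH_RATE)
fraction_of_genome = TARGET_GENES / GENOME_GENES

print(f"Years to reach {TARGET_GENES} genes: {years_to_target:.1f}")
print(f"Fraction of the genome: {fraction_of_genome:.2f}")
```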

VUS problem may eventually approach 7 Million variants

A recent attempt has been made to preload the human genome with pathogenicity assessment potential. The InterVar database applied ACMG-AMP guidelines to ~80,000,000 amino acid positions in the genome to provide a database for easier variant interpretation. Since at least 20% of these positions are likely to be in genes with known disease association, there are roughly 16,000,000 variants that will eventually occur in patient-derived genome sequencing. If the current trend of 44% VUS translates across that number, then there will be close to 7,000,000 variants in need of functional studies to resolve their pathogenicity.
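The estimate above is a chain of two multiplications, sketched here with the figures from the text. It assumes the 44% VUS rate seen in current ClinVar assessments carries over unchanged to variants that have not been assessed yet.

```python
INTERVAR_POSITIONS = 80_000_000   # approximate InterVar coverage quoted in the text
DISEASE_GENE_FRACTION = 0.20      # share of positions in disease-associated genes
VUS_FRACTION = 0.44               # current share of ClinVar assessments that are VUS

variants_in_disease_genes = INTERVAR_POSITIONS * DISEASE_GENE_FRACTION
projected_vus = variants_in_disease_genes * VUS_FRACTION

print(f"Variants in disease genes: {variants_in_disease_genes:,.0f}")
print(f"Projected VUS:             {projected_vus:,.0f}")
```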

A novel animal model system for rapid variant interpretation

The team at Nemametrix just produced a wonderful set of preliminary data that we showed at the recent American Society of Human Genetics meeting. It shows it is possible to use a training set of known benign and pathogenic alleles in a gene to “teach” an ML algorithm to determine if pathogenicity is present in a VUS. When applied to the STXBP1 gene, a set of 5 benign and 5 pathogenic alleles was sufficient to train for segregation in an LDA plot, and the Y75C variant was assessed as pathogenic.

Once this type of system is trained with a set of known pathogenic and benign variants, the assessment of pathogenicity can be achieved in as soon as 10 days from the start of a VUS transgenesis project.

Total Domination – Uncovering the Phenomenon of 1:2 Dominant vs Recessive ratio for Variation in the Genome

In a prior blog post, the presence of dominant alleles in my genome gave me pause when trying to interpret the data from sequencing my DNA. Dominant alleles can cause disease when a pathogenic variation occurs in only one gene copy of the chromosome pair. Contrast this to a recessive allele, where you must have a defect in both chromosome copies of the gene to cause disease. In the recessive condition, if you have only one defective copy, you can expect to remain healthy, but you are a carrier of a disease allele. With the lack of immediate consequence to carrier status, many more individuals should be walking around with variations that are recessive towards disease. In fact, the CFTR gene variation (p.Arg117His) for Cystic Fibrosis that was highlighted for me in my Veritas Genomic sequencing report is quite common. It occurs globally at 1 per 2,500 persons, and that increases to close to 1 per 1,000 for northern Europeans, which is a dominant portion of my ancestral genomic composition. In contrast, the CACNA1S variant (p.Arg419His) that most concerns me in my genome has a prevalence of 1 in 25,000. That is low enough to count as a Rare Disease in Europe, but still probably way too high for disease manifestation rates.

Rare domination in CACNA1S needs to be rare enough to cause Hypokalemic Periodic Paralysis.

Dominant disease causality with the Arg419His variation in CACNA1S is unlikely because it is too frequent for the 1 per 100,000 population frequency of Hypokalemic Periodic Paralysis. Yet there are two variations known to be causative in CACNA1S, Arg528His and Arg1239His. Arg528His occurs at close to 1/100,000, while Arg1239His has yet to be detected in healthy populations. Clearly Arg1239His has a low enough population frequency to be causative for Hypokalemic Periodic Paralysis. Yet for my Arg419His, the frequency is too high for it to be causative. A variant effect that is Autosomal Dominant (AD) is extremely unlikely for my lone Arg419His allele.
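The reasoning above amounts to a frequency filter, sketched below with the numbers from the text. The rule is deliberately simplified: it assumes full penetrance and ignores the fact that the disease frequency is shared among several causative variants, so it is a first-pass screen rather than a classification method. The function name is hypothetical.

```python
DISEASE_PREVALENCE = 1 / 100_000   # Hypokalemic Periodic Paralysis frequency quoted in the text

# Population frequencies as quoted in the text. "Close to 1/100,000" is treated as
# exactly that value, and 0.0 stands in for "not yet detected in healthy populations".
variant_frequency = {
    "Arg419His": 1 / 25_000,
    "Arg528His": 1 / 100_000,
    "Arg1239His": 0.0,
}


def compatible_with_dominant_cause(frequency: float,
                                   prevalence: float = DISEASE_PREVALENCE) -> bool:
    """Simplified rule: a fully penetrant dominant variant cannot be more common
    in the population than the disease it causes."""
    return frequency <= prevalence


verdicts = {name: compatible_with_dominant_cause(freq)
            for name, freq in variant_frequency.items()}
for name, ok in verdicts.items():
    print(f"{name}: {'compatible' if ok else 'too frequent'}")
```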

If dominant alleles need to be rare in the population, how frequent is dominant status for variants of a disease?

The frequency of Autosomal Dominance (AD) for any given disease gene appears to be quite high. It is estimated that there are about 7000 Rare Diseases. If we assume the Online Mendelian Inheritance in Man (OMIM) database already represents most of these genes, then rare disease variants will map to the 4346 gene entries in OMIM with published allelic variations. Next, I listed these variations in blocks of 100 to reveal the number of genes that are known to be exclusively Autosomal Dominant (AD) or Autosomal Recessive (AR), or some kind of hybrid.

When one runs down the inheritance patterns and tabulates them per gene, the first 100 variants have about twice as many genes in the AR category as in the AD category.

Running through another 400 variants in blocks of 100 shows the trend continues: dominance of a genetic condition occurs for about 1/3rd of the disease genome.

Axiom for the individual : “I am not very dominating, but there are lots out there who are.”

So on an individual basis, it appears the AD status of pathogenic or likely pathogenic variants in your genome is very rare. Yet, at a population level, a large proportion of Rare Disease is caused by Autosomal Dominant variation. Rare disease is calculated to occur at about 1 per 15 persons. So, for about 1 in 50 persons (150 million persons), the disease-causing variation is likely to be Autosomal Dominant.
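The closing estimate combines the 1 in 15 rare disease rate with the roughly 1/3 dominant share found above. A small sketch of that arithmetic follows. It assumes the two fractions are independent, and the world population it prints is only what the quoted 150 million figure implies, not a number from the text.

```python
from fractions import Fraction

RARE_DISEASE_RATE = Fraction(1, 15)   # rare disease occurrence quoted in the text
DOMINANT_SHARE = Fraction(1, 3)       # AD share of disease genes estimated above

# Assumption: the dominant share of genes carries over directly to affected persons.
dominant_rare_disease_rate = RARE_DISEASE_RATE * DOMINANT_SHARE

# Population implied by "1 in 50" corresponding to 150 million persons.
implied_population = 150_000_000 * 50

print(f"Dominant rare disease rate: 1 in {1 / dominant_rare_disease_rate}")
print(f"Implied world population:   {implied_population:,}")
```

The exact product is 1 in 45, which the text rounds to about 1 in 50.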