Naert et al., 2020 - Maximizing CRISPR/Cas9 phenotype penetrance applying predictive modeling of editing outcomes in Xenopus and zebrafish embryos. Scientific Reports 10:14662 Full text @ Sci. Rep.

Figure 1

Theoretical models of how gRNA-specific efficiencies and frameshift gene editing outcome probabilities influence the cellular composition and the percentage of protein knockout cells in a mosaic F0 animal model. (A) There is a non-linear relationship between the gRNA-specific probability of obtaining a frameshift gene editing outcome (x-axis) and the probability of obtaining a biallelic frameshift gene editing outcome in a single cell (y-axis). E.g. at a gRNA-specific frameshift frequency of 80%, the probability that a single biallelically edited cell is a biallelic frameshift mutant is 64% (0.80 × 0.80; grey demarcation). (B) Examples of theoretical outcomes of gene editing (presuming 100% on-target efficiency) in an F0 mosaic when varying one parameter: the gRNA-specific probability of frameshift editing. (C) Examples of theoretical outcomes of gene editing in an F0 mosaic when varying two parameters: the gRNA-specific probability of frameshift editing and the gRNA-specific on-target efficiency. E.g. for a 100% efficient gRNA with an 80% gRNA-specific probability of frameshift editing, we expect 64% of the cells to be biallelic frameshift mutants (see grey demarcation in A). Please note that blue circles represent cells that are biallelically gene edited but retain at least one in-frame mutation, and therefore cannot be considered complete protein knockouts. (D) Flowchart representing the pipeline for investigating the correlations between experimentally observed in vivo gene editing outcomes and gene editing outcomes projected by computational prediction models.
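
The arithmetic behind panels A to C can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the two alleles are assumed to be repaired independently, and on-target efficiency is assumed (the caption does not state this) to be the fraction of cells that are biallelically edited, so that it scales the result linearly.

```python
def biallelic_frameshift_probability(frameshift_freq: float) -> float:
    """Probability that both alleles of a biallelically edited cell are frameshifted.

    Assumes the two alleles are repaired independently (panel A).
    """
    if not 0.0 <= frameshift_freq <= 1.0:
        raise ValueError("frameshift_freq must be between 0 and 1")
    return frameshift_freq ** 2


def expected_knockout_fraction(frameshift_freq: float, efficiency: float = 1.0) -> float:
    """Expected fraction of cells in the mosaic that are biallelic frameshift mutants.

    Assumption (not stated in the caption): efficiency is the fraction of cells
    that are biallelically edited, so it enters as a linear scaling factor.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return efficiency * biallelic_frameshift_probability(frameshift_freq)


if __name__ == "__main__":
    # The example from panels A and C: 80% frameshift frequency, 100% efficiency.
    print(round(expected_knockout_fraction(0.80), 2))        # 0.64
    # Same gRNA at 50% efficiency, under the linear-scaling assumption above.
    print(round(expected_knockout_fraction(0.80, 0.50), 2))  # 0.32
```

Because the frameshift frequency is squared, a modest drop in frameshift frequency costs more knockout cells than the same drop in efficiency would under this simplified model.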

Figure 2

The InDelphi prediction model, trained in mESCs, accurately predicts CRISPR/Cas9 gene editing outcomes and outperforms several other prediction models in X. tropicalis embryos. (A) Scatter plot with model-predicted cumulative frameshift gene editing frequencies correlated to experimentally observed cumulative frameshift gene editing frequencies, for each sgRNA (n = 28) separately, in X. tropicalis embryos. Black demarcated lines show the perfect correlation r = 1. Light-grey areas show the standard error of the best-fit linear regression line. (B) Scatter plot with model-predicted INDEL patterns correlated to experimentally observed INDEL patterns, for all gRNAs simultaneously. Black lines show linear regression models of all correlations. Black demarcated lines show the perfect correlation r = 1. (C) Correlations between model-predicted and experimentally observed INDEL patterns, for each gRNA separately. Error bars represent mean ± SD. (***p < 0.001; **p < 0.01; *p < 0.05; ns = not significant; Shapiro–Wilk (p > 0.05); Levene (p < 0.05); one-way Welch ANOVA to adjust for unequal variances (p < 0.001), with Games-Howell multiple comparisons) (Table S2). (D) Violin plots of the residuals (predicted frequency − observed frequency) between the model-predicted and experimentally observed frequency of the +1 insertion gene editing outcome. (E) The SEM of the mean residual difference (predicted frequency − observed frequency) between the model-predicted and experimentally observed frequency of all deletion variants modeled.
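
The quantities plotted in this figure (a correlation between predicted and observed frequencies, residuals defined as predicted minus observed, and the SEM of those residuals) can be computed as in the sketch below. The helper names are hypothetical and the numbers in the demo are placeholders for illustration only, not data from the paper; the statistical tests named in panel C are not reproduced here.

```python
import math
from statistics import mean, stdev


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mp, mo = mean(predicted), mean(observed)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_o = sum((o - mo) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


def residuals(predicted, observed):
    """Residuals as defined in panels D and E: predicted minus observed."""
    return [p - o for p, o in zip(predicted, observed)]


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Placeholder frameshift frequencies (percent); NOT values from the paper.
    predicted = [72.0, 55.0, 81.0, 64.0]
    observed = [70.0, 50.0, 85.0, 60.0]
    res = residuals(predicted, observed)
    print("r =", pearson_r(predicted, observed))
    print("mean residual =", mean(res), "SEM =", sem(res))
```

A positive mean residual indicates that the model overestimates the observed frequency on average; a negative one indicates underestimation.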

Figure 3

The InDelphi-mESC model accurately predicts CRISPR/Cas9 gene editing outcomes in X. tropicalis, X. laevis and zebrafish embryos, which can be exploited to identify high-frameshift frequency gRNAs. (A–F) Scatter plot with InDelphi-mESC-predicted cumulative frameshift gene editing frequencies correlated to experimentally observed cumulative frameshift gene editing frequencies, for each sgRNA separately, in X. tropicalis (n = 14) (Panel A), in X. laevis (n = 6) (Panel B) and in zebrafish (n = 15) embryos (Panel C). Scatter plot with InDelphi-mESC-predicted INDEL patterns correlated to experimentally observed INDEL patterns, for all gRNAs simultaneously, in X. tropicalis (n = 14) (Panel D), in X. laevis (n = 6) (Panel E) and zebrafish (n = 15) (Panel F) embryos. Black demarcated lines show the perfect correlation r = 1. Light-grey areas show the standard error of the best-fit linear regression line. Black lines show the linear regression model. (G) Correlations between model-predicted INDEL patterns and experimentally observed INDEL patterns, for each gRNA separately. Correlations for X. tropicalis embryos (n = 14) (dark blue) and X. laevis embryos (n = 6) (middle blue) analyzed by Sanger sequencing and sequence trace decomposition. Correlations for zebrafish embryos analyzed by targeted amplicon sequencing (TAS) (n = 15) (light blue). (H) Using the distribution of the expected probability of frameshift frequency for a large dataset of SpCas9 human target sites in mESCs from Shen et al. 2018 (ref. 27; black line, monoallelic), we draw the derivative distribution of the probability of a randomly designed gRNA to generate biallelic frameshift editing. This distribution is shown for different editing efficiencies within the F0 mosaic animal: 100%, 50% and 25% (in reducing intensities of blue; 100 circles, each circle representing one cell within a mosaic of 100 cells). E.g. the probability of a randomly designed gRNA yielding more than 80% biallelic frameshift mutant cells in a developing mosaic, assuming 100% efficiency, is the area under the curve highlighted in pink and represents only a 3.24% probability.
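
The reasoning in panel H can be sketched as follows. Under the squared-frequency model, more than 80% biallelic frameshift cells at 100% efficiency requires a monoallelic frameshift frequency above sqrt(0.80), roughly 0.894, and the quoted probability is the share of gRNAs in the reference distribution that exceed that value. The code below is a sketch with hypothetical function names; it assumes efficiency scales the knockout fraction linearly, and it operates on whatever list of predicted frameshift frequencies the reader supplies, since the Shen et al. dataset is not reproduced here.

```python
import math


def min_frameshift_for_target(target_ko_fraction: float, efficiency: float = 1.0) -> float:
    """Smallest frameshift frequency f with efficiency * f**2 >= target.

    Returns math.inf when the target cannot be reached at this efficiency.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    ratio = target_ko_fraction / efficiency
    if ratio > 1.0:
        return math.inf
    return math.sqrt(ratio)


def fraction_of_grnas_above(frameshift_freqs, target_ko_fraction, efficiency=1.0):
    """Share of gRNAs whose expected biallelic frameshift fraction exceeds the target.

    frameshift_freqs: predicted monoallelic frameshift frequencies (0-1), one per gRNA.
    """
    if not frameshift_freqs:
        raise ValueError("frameshift_freqs must not be empty")
    hits = sum(1 for f in frameshift_freqs if efficiency * f ** 2 > target_ko_fraction)
    return hits / len(frameshift_freqs)


if __name__ == "__main__":
    # Frameshift frequency needed for >80% knockout cells at 100% efficiency.
    print(round(min_frameshift_for_target(0.80), 3))  # 0.894
    # Placeholder frequencies for illustration only; NOT the Shen et al. data.
    demo = [0.50, 0.70, 0.90, 0.95]
    print(fraction_of_grnas_above(demo, 0.80))  # 0.5
```

Under the linear-efficiency assumption, an 80% target is unreachable at 50% or 25% efficiency regardless of the gRNA, which is why `min_frameshift_for_target` returns infinity in those cases.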

Figure 4

Integrating CRISPRscan and the InDelphi-mESC model allows identification of efficient high-frameshift frequency gRNAs in X. tropicalis. (A) Scatterplot with marginal histograms demonstrating, for 339,693 gRNAs across the coding sequence of 4,860 X. tropicalis genes, the relationships between calculated CRISPRscan score, InDelphi-mESC-predicted frequency of MMEJ repair and InDelphi-mESC-predicted knockout score (KO-score). KO-score is defined as the predicted percentage of cells with biallelic out-of-frame mutations within the pool of all mutant cells (i.e. in-frame and out-of-frame; mono- and bi-allelic) in the mosaic mutant embryo and is calculated as the square of the frameshift frequency predicted by InDelphi-mESC. For each gene (n = 4,860), the gRNA with the highest predicted KO-score (highest-in-class) is highlighted in blue, while the gRNA with the lowest predicted KO-score (lowest-in-class) is highlighted in orange. Demarcations illustrate the quadrants in which gRNAs satisfy certain cutoff thresholds. Ideally, designed gRNAs fall within the aquamarine demarcation (high predicted KO-score, high CRISPRscan score), but not the orange (low predicted KO-score, high CRISPRscan score) or purple demarcation (high predicted KO-score, low predicted CRISPRscan score). (B) Violin plot illustrating that highest-in-class gRNAs and lowest-in-class gRNAs have a higher predicted percentage of repair by microhomology-mediated end joining than a random selection of guides. (****p < 0.001; Table S2). (C) No distinct difference in calculated CRISPRscan scores between highest-in-class gRNAs, lowest-in-class gRNAs and a random selection of gRNAs. (D) Comparison of three pairs of gRNAs targeting the second exon of the tyrosinase gene responsible for pigmentation in X. tropicalis. As these three pairs of guides have very similar genome editing efficiencies, as determined by targeted amplicon sequencing, the impact of differential predicted KO-scores on phenotypic penetrance is revealed. (D, E) Phenotypic scoring is based on retinal pigmentation at Nieuwkoop-Faber stage 38, and a trend is observed where guides with higher predicted KO-scores yield a higher phenotypic score under very similar genome editing efficiencies.
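
The selection logic described for panel A can be sketched as below: compute a KO-score as the squared predicted frameshift frequency, pick the highest and lowest scoring guide per gene, and check a guide against both cutoffs. The class, function names and cutoff values are hypothetical placeholders (the caption does not give the thresholds), and the frameshift frequencies and CRISPRscan scores are assumed to have been obtained beforehand from the respective tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    gene: str
    name: str                # placeholder identifier for the gRNA
    crisprscan_score: float  # CRISPRscan score (assumed roughly on a 0-100 scale)
    frameshift_freq: float   # predicted frameshift frequency as a fraction (0-1)

    @property
    def ko_score(self) -> float:
        """KO-score in percent: the square of the predicted frameshift frequency."""
        return 100.0 * self.frameshift_freq ** 2


def best_and_worst_per_gene(guides):
    """Return {gene: (highest_ko_guide, lowest_ko_guide)}."""
    by_gene = {}
    for guide in guides:
        by_gene.setdefault(guide.gene, []).append(guide)
    return {
        gene: (
            max(group, key=lambda g: g.ko_score),
            min(group, key=lambda g: g.ko_score),
        )
        for gene, group in by_gene.items()
    }


def passes_cutoffs(guide, min_ko_score=50.0, min_crisprscan=50.0):
    """True if the guide clears both cutoffs.

    The default thresholds are illustrative placeholders, not values from the paper.
    """
    return guide.ko_score >= min_ko_score and guide.crisprscan_score >= min_crisprscan


if __name__ == "__main__":
    # Made-up guides for illustration only.
    guides = [
        Guide("tyr", "g1", crisprscan_score=60.0, frameshift_freq=0.9),
        Guide("tyr", "g2", crisprscan_score=70.0, frameshift_freq=0.5),
    ]
    best, worst = best_and_worst_per_gene(guides)["tyr"]
    print(best.name, round(best.ko_score, 1), passes_cutoffs(best))     # g1 81.0 True
    print(worst.name, round(worst.ko_score, 1), passes_cutoffs(worst))  # g2 25.0 False
```

Guide g2 illustrates the orange quadrant of panel A: a higher CRISPRscan score than g1, yet a low KO-score, so it would be expected to give fewer full knockout cells at comparable editing efficiency.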

Acknowledgments:
ZFIN wishes to thank the journal Scientific Reports for permission to reproduce figures from this article. Please note that this material may be protected by copyright.