FIGURE SUMMARY
Title

A conserved role for the ALS-linked splicing factor SFPQ in repression of pathogenic cryptic last exons

Authors
Gordon, P.M., Hamid, F., Makeyev, E.V., Houart, C.
Source
Full text @ Nat. Commun.

SFPQ regulates the formation of cryptic last exons (CLEs).

a, b Scatter plots showing expression values of genes in sfpq−/− and siblings, analyzed using Cufflinks (a) or Whippet (b) pipelines. FPKM fragments per kilobase of transcript per million, TpM transcripts per million. c Alternative last exons are a major category of SFPQ-controlled events. CE cassette exon, FE first exon, LE last exon, SD splice donor, SA splice acceptor, RI retained intron. d The majority of SFPQ-regulated last exon events are cryptic. e Sashimi plots showing example CLE formation in nbeaa and epha4b. Top tracks: plot of read coverage from siblings (upper) and sfpq−/− (lower). Bottom tracks: isoforms discovered for each gene. f Genes expressing CLE-containing isoforms tend to be downregulated in sfpq−/−. Two-sided Fisher’s exact test was performed. g Normal long isoforms (annotated isoforms) from CLE-expressing genes tend to be downregulated in sfpq−/−. Two-sided Fisher’s exact test was performed.
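As a reading aid for panels f and g, the following minimal sketch shows how a two-sided Fisher’s exact test on a 2×2 contingency table can be computed from hypergeometric probabilities. The function name and the example counts are hypothetical and are not taken from the paper; the authors' actual software is not stated in this caption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))

# Hypothetical counts: rows = CLE-expressing / other genes,
# columns = downregulated / not downregulated.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With real data, each cell would hold the number of genes in the corresponding category.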

CLEs are cleaved and polyadenylated.

a, b Representative coverage plots from RNA-seq (light blue) and 3′ mRNA-seq (red) experiments showing cryptic last and constitutive last exons. Only clustered reads are shown in the 3′ mRNA-seq profile and consensus polyadenylation signals (PASs) are marked by red arrowheads. c Four-way dot plot representing the change in relative cleavage site (CS) usage (sfpq vs sib) for CLEs (x-axis) against their corresponding constitutive last exons (y-axis; control). A positive value on each axis represents increased CS usage in the sfpq−/− null mutant. Genes showing significantly changing CS usage in both CLE and control (FDR < 0.05) are colored red and the total number of significantly regulated genes is indicated for each quadrant. d Metaplot of the normalized change in 3′ mRNA-seq coverage within regions surrounding CLEs (red) and constitutive last exons (blue; control). e Sanger sequencing of 3′RACE PCR products of CLE isoforms. PAS hexamers are shown within red boxes.

Molecular properties of CLEs.

a, b CLE-containing (n = 109) and non-CLE-containing (n = 1557) introns from CLE-expressing genes were scored by their relative position (a) and relative length (b), and the distributions of these scores were plotted. Note that CLE-containing introns show no gene position bias but tend to be among the longest introns in the gene. The box bounds represent the first and third quartiles and the black lines at the middle of the boxes show the medians. Top and/or bottom whiskers represent 1.5× the range between the third and the first quartiles (interquartile range). Circles represent outliers. Two-sided Wilcoxon rank-sum test was performed. c CLE-containing introns (n = 109) are longer than all other introns in the zebrafish transcriptome (n = 209,012). The box bounds represent the first and third quartiles and the black lines at the middle of the boxes show the medians. Top and/or bottom whiskers represent 1.5× the range between the third and the first quartiles (interquartile range). Circles represent outliers. Two-sided Wilcoxon rank-sum test was performed. d CLEs tend to occur relatively close to the 5′ end of their host introns. e CLEs are found within 10 kb of the upstream constitutive exon. f Metaplot showing the conservation score of sequences surrounding conserved (blue) and non-conserved (red) CLEs. In all, 280 bp of surrounding intron/CLE junction sequence (250 bp intron and 30 bp exon) were binned into 10-bp windows and the mean PhastCons score for each bin is shown ±SEM.
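The binning described for panel f can be sketched as follows. This is an illustrative sketch only: it assumes each CLE contributes one 280-value per-base score profile, and that the SEM is taken across CLEs after averaging within each 10-bp window. The function names are hypothetical and the caption does not state how the authors computed the SEM.

```python
from math import sqrt
from statistics import mean, stdev

def bin_profile(scores, window=10):
    """Average per-base scores in consecutive non-overlapping windows."""
    return [mean(scores[i:i + window]) for i in range(0, len(scores), window)]

def metaplot(profiles, window=10):
    """Return one (mean, SEM) pair per window across equal-length profiles."""
    binned = [bin_profile(p, window) for p in profiles]
    summary = []
    for values in zip(*binned):
        # SEM = sample standard deviation / sqrt(number of profiles).
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        summary.append((mean(values), sem))
    return summary

# Toy input (not real PhastCons data): two 280-bp profiles give 28 windows.
toy_profiles = [[0.0] * 280, [1.0] * 280]
toy_summary = metaplot(toy_profiles)
```

The same approach would apply to any window size that divides the region length evenly.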

SFPQ directly binds to CLE-adjacent RNA sequences.

a Metaplot showing the distribution of predicted SFPQ-binding sites surrounding CLEs (red) and constitutive last exons of each CLE-containing gene (blue). In total, 200 bp of surrounding intron/CLE junction sequence (150 bp intron and 50 bp exon) were binned into 50-bp windows and the mean number of predicted motifs is shown ±SEM. b–d Top: location of SFPQ-binding motifs predicted using the MEME suite. Bottom: RT-qPCR quantitation showing the relative enrichment of SFPQ-interacting regions surrounding CLEs. RT-qPCR primer pairs were designed for each 100-bp sequence window demarcated by alternating gray and white areas. Abundance of SFPQ- or IgG (control)-crosslinked RNAs was normalized to input and the mean value from three replicates is shown ±SD. Source data are provided as a Source Data file.

CLEs have functional impacts.

a Deletion of the b4galt2 CLE using CRISPR/Cas9 rescues expression of downstream exons. Left: cut sites of the b4galt2 sgRNAs. CLE is indicated by capital letters. Center: PCR verification of Cas9 cleavage after injection of sgRNAs. Representative image; experiment performed five times. Right: RT-qPCR quantitation of the relative expression of the downstream b4galt2 exons in sfpq−/− embryos compared to siblings (±SD); n = 3 biologically independent replicates. Two-tailed unpaired t-test was performed. b In-situ hybridization of the epha4b CLE at 24 hpf, displaying strong expression in the midbrain and hindbrain of sfpq−/− embryos. c In-situ hybridization of rfng shows rhombomere boundary defects at 22ss after injection into WT embryos of the epha4b cryptic transcript or a mutated transcript with an early stop codon. Loss of boundaries seen in 8/10 embryos. d Left: in-situ hybridization of rfng shows that rhombomere boundary defects of sfpq−/− embryos are rescued by injection of the epha4b cryptic splice junction morpholino but not a mismatch morpholino. Rhombomere boundaries are numbered. Right: quantification of staining in rhombomeres in three lateral view samples for each condition. Representative images; defect seen in 13/15 embryos. e Left: in-situ hybridization of DeltaA shows a loss of discrete neuronal clusters in sfpq−/−, which is rescued by injection of the epha4b cryptic splice junction morpholino but not a mismatch morpholino. Right: quantification of the number of DeltaA clusters in each condition. Each data point represents one embryo; embryos examined over two independent experiments. Two-tailed t-test was performed, ***p = 0.0005. c–e Upper: lateral view. Lower: dorsal view. Source data are provided as a Source Data file.

The CLE-repressing function of SFPQ is conserved in mouse and human.

a Bar plot of the proportion of unannotated (cryptic) last exons from mouse cortical neurons grouped based on their regulation by Sfpq. Two-sided Fisher’s exact test was performed. b Introns from mouse CLE-expressing genes were scored by their relative length and the distribution of these scores was plotted. The box bounds represent the first and third quartiles and the black lines at the middle of the boxes show the medians. Top and/or bottom whiskers represent 1.5× the range between the third and the first quartiles (interquartile range). Circles represent outliers. Two-sided Wilcoxon rank-sum test was performed. c Metaplot of the mean number of Sfpq CLIP peaks (±SEM) in the regions surrounding cryptic last exons or constitutive last exons of the same gene (control). d, e CLIP-seq peak distribution (top) and RNA-seq coverage plots (mid) from representative CLE-containing introns. The Y-axis scale of the read coverage plots is optimized for CLE/intronic reads. Sequence homology of flanking exons from orthologous CLE-containing genes is compared and the relative position of zebrafish CLEs is shown at the bottom. f Heatmap illustrating increased inclusion (ΔPSI) of 68 CLEs upregulated in an ALS-mutant background at different stages of neuronal differentiation. ΔPSI values of non-significant events are set to 0. Induced pluripotent stem cells (iPSC); neural precursors (NPC); “patterned” precursor motor neurons (ventral spinal cord; pMN); post-mitotic but electrophysiologically immature motor neurons (MN); electrophysiologically mature MNs (mMN). g Representative RNA-seq coverage plots from ALS-derived iPSC dataset of CLEs upregulated in VCPmu samples.
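The masking rule stated for panel f (ΔPSI of non-significant events set to 0) can be illustrated with a short sketch. It assumes a generic read-count definition of PSI (inclusion reads over inclusion plus exclusion reads); the pipeline used in the paper may define PSI differently, and the function names, the toy values, and the way significance is passed in as a flag are all hypothetical.

```python
def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in as a fraction in [0, 1]; NaN when there is no coverage."""
    total = inclusion_reads + exclusion_reads
    return inclusion_reads / total if total else float("nan")

def masked_delta_psi(psi_mutant, psi_control, significant):
    """Delta PSI (mutant minus control), set to 0 for non-significant events."""
    return psi_mutant - psi_control if significant else 0.0

# Toy values for one CLE across two stages (not data from the paper).
stage_values = [
    masked_delta_psi(psi(50, 50), psi(20, 80), significant=True),
    masked_delta_psi(psi(30, 70), psi(25, 75), significant=False),
]
```

In a heatmap built this way, zero-valued cells mark events that did not pass the significance filter at that stage rather than events with no change in inclusion.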

Acknowledgments
This image is the copyrighted work of the attributed author or publisher, and ZFIN has permission only to display this image to its users. Additional permissions should be obtained from the applicable author or publisher of the image.