FIGURE SUMMARY
Title

Differential allelic representation (DAR) identifies candidate eQTLs and improves transcriptome analysis

Authors
Baer, L., Barthelson, K., Postlethwait, J.H., Adelson, D.L., Pederson, S.M., Lardelli, M.
Source
Full text @ PLoS Comput. Biol.

Transcriptome analysis of homozygous mutants compared to their wild-type siblings and the impact of non-isogenic genetic backgrounds on gene expression.

A) Experimental selection of progeny homozygous for a mutant allele of a gene-of-interest (GOI, mutation-bearing chromosome indicated in red) necessarily involves increased homozygosity for alleles of genes linked to that mutation (i.e. on the same chromosome). The rates of transcription or transcript degradation for these alleles may differ significantly from their corresponding alleles on the homologous wild-type chromosomes (shaded differentially to illustrate that these wild-type chromosomes are not isogenic). B) Differential expression of alleles of a linked bystander gene, LG, between wild-type and mutant chromosomes due to a functional interaction between the GOI and the LG. C) eQTL-driven differential expression of LG between wild-type and homozygous mutant chromosomes in the absence of a functional interaction between the GOI and the LG. The expression of LG differs independently of the GOI genotype when LG’s different alleles are eQTLs. D) Breeding to produce genotype groups (e.g., homozygous GOI mutants for comparison to wild-type) selects for differential representation between those groups for the alleles of neighbouring LGs. When those alleles are eQTLs, they can show differential expression between the groups that is not a phenotypic effect of the mutation but can, mistakenly, be inferred as such. Zebrafish icons used in this image were obtained from https://bioicons.com and have been modified from DBCLS https://togotv.dbcls.jp/en/pics.html, licensed under CC-BY 4.0 Unported https://creativecommons.org/licenses/by/4.0/.

Manhattan plots highlighting CC-DEGs from differential expression testing in: A) Brain transcriptomes from a comparison of zebrafish psen1W233fs/+ vs. psen1+/+ siblings at 6 months of age. psen1W233fs is a dominant fAI-like frameshift allele. B) 3-month-old cerebral cortex transcriptomes from a comparison of homozygous male mice bearing humanised APOE2 or APOE3 alleles. Non-random accumulations of DEGs are supported by Bonferroni-adjusted Fisher’s exact test p-values for enrichment of DE genes on the mutant chromosome, A) psen1W233fs/+ vs. psen1+/+: p = 7.41e-7, B) APOE2/2 vs. APOE3/3: p = 2.03e-10. Genes are plotted along the x-axis by genomic position, with chromosomes distinguished by alternating shades of grey. Genes on the chromosomes containing the mutations are highlighted in red. The approximate locations of the mutated genes are indicated with small black arrows on the x-axis. The raw p-values are plotted along the y-axis on a -log10 scale such that the most significant genes appear at the top of the plot. The cut-off for gene differential expression (FDR-adjusted p-value < 0.05) is indicated by a dashed horizontal line. Genes classified as differentially expressed under this criterion are represented as diamonds with a black outline.
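
The enrichment test named in this caption can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test (a hypergeometric upper tail) on a 2x2 table of DE status by chromosome membership, and a Bonferroni factor equal to the number of chromosomes tested. The caption does not state the sidedness or the number of tests, so both are assumptions, and the function names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(de_on_chr, genes_on_chr, de_total, genes_total):
    """One-sided Fisher's exact test p-value, P(X >= de_on_chr).

    X is the number of DE genes landing on the chromosome of interest when
    de_total DE genes are drawn at random from genes_total expressed genes.
    """
    # The count of DE genes on the chromosome cannot exceed either margin.
    upper = min(genes_on_chr, de_total)
    denom = comb(genes_total, de_total)
    tail = sum(
        comb(genes_on_chr, k) * comb(genes_total - genes_on_chr, de_total - k)
        for k in range(de_on_chr, upper + 1)
    )
    return tail / denom


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment, capped at 1 (n_tests is an assumed input)."""
    return min(1.0, p_value * n_tests)
```

With toy counts, `fisher_enrichment_p(3, 4, 3, 10)` asks how likely it is that all 3 DE genes fall among the 4 genes on one chromosome out of 10 genes in total.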

Computational workflow for the calculation of DAR starting with raw RNA-seq short read data.

Raw RNA-seq reads must consist of at least two experimental groupings to allow the calculation of DAR between them.
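
The need for two groups can be made concrete with a small sketch of a per-variant DAR calculation. The caption does not give the formula, so this assumes DAR is the Euclidean distance between the two groups' allele-proportion vectors, divided by the square root of 2 so that values lie between 0 and 1. The function names and the dictionary input format are hypothetical.

```python
from math import sqrt

ALLELES = ("A", "C", "G", "T")


def allele_proportions(counts):
    """Convert per-group allele counts at one variant into proportions."""
    total = sum(counts.get(allele, 0) for allele in ALLELES)
    if total == 0:
        raise ValueError("no alleles counted for this group at this variant")
    return [counts.get(allele, 0) / total for allele in ALLELES]


def dar(counts_group_a, counts_group_b):
    """Assumed DAR definition: scaled Euclidean distance between groups."""
    p = allele_proportions(counts_group_a)
    q = allele_proportions(counts_group_b)
    distance = sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))
    # Dividing by sqrt(2) maps the maximum possible distance to 1.
    return distance / sqrt(2)
```

Under this definition, two groups with identical allele proportions give a DAR of 0, and two groups fixed for different alleles give a DAR of 1.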

The relationship between DE genes and DAR along the entirety of mouse Chromosome 7 between male APOE2/2 and APOE3/3 mouse cerebral cortices (at 3 months of age).

The plot contains four sets of information represented by separate tracks horizontally. Track A represents the axis of Chromosome 7. The position of the APOE gene is marked and labelled in bold red. Track B displays differentially expressed genes according to their positions along the chromosome. 99 of 1126 total genes (8.79%) on Chromosome 7 that were expressed in the dataset were classified as DE (FDR < 0.05). Track C shows the trend in DAR as a connected scatterplot with each point in black representing the DAR value at a single nucleotide variant position (elastic sliding window size = 11 variants). Positions of the DE genes shown in track B are indicated by light red lines.
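
The smoothing behind the DAR trend line can be sketched as a moving average over a fixed number of variants. This assumes that "elastic" means the window is defined by variant count rather than base pairs, so its genomic width varies with variant density. Truncating the window at chromosome ends is a simplification chosen here, not something the caption states, and `smooth_dar` is a hypothetical name.

```python
def smooth_dar(dar_values, window=11):
    """Moving average of DAR over `window` position-ordered variants.

    dar_values must be ordered by genomic position. Windows are centred on
    each variant and truncated at the ends of the chromosome (an assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(dar_values)):
        lo = max(0, i - half)
        hi = min(len(dar_values), i + half + 1)
        chunk = dar_values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```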

The relationship between DE genes and DAR along the entirety of zebrafish Chromosome 17 in brain transcriptome comparisons of 6-month-old wild type fish against sibling fish heterozygous for either of two different psen1 mutations.

Track A represents the axis of Chromosome 17. The position of psen1 is marked and labelled in bold red. Tracks B and C show the results from the EOfAD-relevant psen1T428del/+ vs. psen1+/+ comparison, while tracks D and E show the fAI-relevant psen1W233fs/+ vs. psen1+/+ comparison. Tracks B and D display DE genes from their respective comparisons, while tracks C and E show the trend in DAR as a connected scatterplot with each point representing the DAR value at a single nucleotide variant position (elastic sliding window, n = 11 variants). The positions of DE genes are highlighted on the DAR tracks in a light red colour.

The relationship between DE genes and DAR along Chromosome 24 between nagluA603Efs/A603Efs and naglu+/+ 7 dpf larval zebrafish.

Track A represents the axis of Chromosome 24. The position of naglu is marked and labelled in bold red. Track B displays DE genes according to their positions along the chromosome. Track C shows the trend in DAR as a connected scatterplot with each point representing the DAR value at a single nucleotide variant position (elastic sliding window, n = 11 variants). Positions of the DE genes shown in track B are indicated by light red lines.

The impact of gene exclusion by DAR thresholding on the outcomes of functional enrichment analysis using KEGG gene sets.

Panel A displays the outcomes of ROAST, while panel B displays the outcomes of GSEA. Gene sets are displayed only if they were found in the top ten most significant gene sets for at least one DAR threshold. The relative ranking between the displayed gene sets is represented along the y-axis for each threshold indicated on the x-axis. Filled dots indicate that the gene set was classified as significantly enriched (FDR-adjusted p-value < 0.05) at the respective threshold. The numbers inside the dots show the overall ranking of the gene set among all 186 KEGG gene sets tested. Panel C displays the proportion of DE (grey) and non-DE (black) genes that were removed at each threshold. The number of genes this equates to is displayed above each bar of the chart.
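
Gene exclusion by DAR thresholding, and the removed-gene counts summarised in panel C, can be sketched as follows. This assumes each gene has already been assigned a single DAR value and that genes with DAR strictly above the threshold are excluded. The caption does not say whether the boundary is inclusive, and the function name is hypothetical.

```python
def exclude_by_dar(gene_dar, de_genes, threshold):
    """Drop genes whose DAR exceeds `threshold` and tally what was removed.

    gene_dar maps gene ID -> DAR value; de_genes is the set of DE gene IDs.
    Returns (kept_gene_ids, n_de_removed, n_non_de_removed).
    """
    kept = {gene for gene, value in gene_dar.items() if value <= threshold}
    removed = set(gene_dar) - kept
    n_de_removed = len(removed & set(de_genes))
    n_non_de_removed = len(removed) - n_de_removed
    return kept, n_de_removed, n_non_de_removed
```

Repeating the call over a range of thresholds gives the per-threshold counts that a chart like panel C would display.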

Comparison of GSEA results for KEGG gene sets that achieved significance before and/or after use of DAR to weight the gene-level ranking statistic in the nagluA603Efs/A603Efs vs. naglu+/+ 7 dpf sibling larval transcriptome dataset. An asterisk indicates that a pathway was determined to be significantly enriched (FDR-adjusted p-value < 0.05).
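
One way to weight a gene-level ranking statistic by DAR is sketched below. The caption does not give the weighting function, so multiplying each statistic by (1 - DAR) is an assumption made for illustration, as is treating genes without a DAR value as having a DAR of 0. The function name is hypothetical.

```python
def dar_weighted_ranking(ranking_stats, gene_dar):
    """Down-weight ranking statistics for genes with high DAR.

    ranking_stats maps gene ID -> signed ranking statistic.
    gene_dar maps gene ID -> DAR in [0, 1]; missing genes are treated as 0.
    Returns (gene, weighted_stat) pairs sorted from highest to lowest.
    """
    weighted = {
        gene: stat * (1.0 - gene_dar.get(gene, 0.0))
        for gene, stat in ranking_stats.items()
    }
    return sorted(weighted.items(), key=lambda item: item[1], reverse=True)
```

Under this scheme, a gene with a DAR of 1 contributes a statistic of 0 and moves to the middle of the ranked list, while genes with a DAR of 0 keep their original statistic.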

Comparison of over-representation analysis results for GO terms that achieved significance when using either transcript length or DAR as bias data for GOseq analysis of the nagluA603Efs/A603Efs vs. naglu+/+ 7 dpf sibling larval transcriptome dataset.

All three displayed GO terms showed statistical significance when transcript length or DAR was used as bias data. However, DAR showed greater significance for all terms.

Acknowledgments
This image is the copyrighted work of the attributed author or publisher, and ZFIN has permission only to display this image to its users. Additional permissions should be obtained from the applicable author or publisher of the image.