sim2

The sim2 data set consists of synthetic human paired-end 100 bp reads from two conditions, with three samples per condition. The simulation is based on human chromosome 1 from Ensembl GRCh37.71. We introduce differential gene expression, differential transcript expression, and differential isoform usage.
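As a minimal sketch of the design described above (two conditions with three samples each), the sample table can be laid out as follows. The condition labels `A` and `B`, the sample naming scheme, and the function name `build_sample_table` are hypothetical and are not taken from the analysis itself.

```python
from itertools import product


def build_sample_table(conditions=("A", "B"), n_replicates=3):
    """Return one record per sample for a balanced multi-group design.

    The condition labels and the sample naming scheme are illustrative
    assumptions, not the names used in the sim2 analysis.
    """
    table = []
    for cond, rep in product(conditions, range(1, n_replicates + 1)):
        table.append(
            {"sample": f"sample{cond}{rep}", "condition": cond, "replicate": rep}
        )
    return table


samples = build_sample_table()

# Print a tab-separated overview: sample name, condition, replicate number.
for row in samples:
    print(row["sample"], row["condition"], row["replicate"], sep="\t")
```

A table of this shape is what a downstream design matrix would typically be built from, with `condition` as the grouping factor.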

Preparation

Reference directories and packages

Reference file preparation

Reference file subsetting

Generation of incomplete annotation

Preparation of RSEM files for simulation

Generation of STAR index for alignment

Simulation

Metadata definition

Definition of paths to reference files

Definition of paths to generated files

Definition of gene-to-transcript mapping

Summarization of Salmon results and offset estimation

Complete annotation

Incomplete annotation

Estimation of coefficients of variation for all samples

Generation of exon bin counts with DEXSeq

Accuracy of abundance estimates

All features

Only features with positive true and estimated TPMs

All features, after replacing missing estimates with 0

Consider groups of paralogous genes only

Variability of transcript and gene TPMs

Correlation between gene count estimates

All genes

Only genes with positive estimates with all methods

Example of genes with a big difference between simple sum and scaled TPM

Differential transcript expression (DTE) analysis on gene and transcript level

Comparison of DTE performance on gene and transcript level

Exploration of power difference between gene- and transcript-level analyses

Evaluation of logFC estimation accuracy (transcripts)

All transcripts

Only transcripts with true logFC != 0 and true logFC != Inf

Differential gene expression analysis (edgeR)

Diagnostics

Comparison of significant genes found with different matrices

Comparison to truth

Comparison of logFC estimates

All methods

simplesum vs scaledTPM

Evaluation of logFC estimation accuracy (genes)

All genes

All genes, stratified by presence of differential isoform usage

Only genes with true logFC != 0

Only genes with true logFC != 0, stratified by presence of differential isoform usage

Comparison of DTE analysis aggregated on gene level to DGE

DTU analysis with DEXSeq

Application of DEXSeq to Salmon transcript counts

DTU analysis on exon bin counts, with DEXSeq

Comparison to truth

Comparison of significant genes from DTE, DTU, DGE

Differential gene expression analysis (DESeq2)

Diagnostics

Comparison of significant genes found with different matrices

Comparison to truth

Comparison of logFC estimates

All methods

simplesum vs scaledTPM

Evaluation of logFC estimation accuracy (genes)

All genes

All genes, stratified by presence of differential isoform usage

Only genes with true logFC != 0

Only genes with true logFC != 0, stratified by presence of differential isoform usage

Differential gene expression analysis (limma-voom)

Comparison of significant genes found with different matrices

Comparison to truth

Help functions

Session info