A two-parameter generalized Poisson model to improve the analysis of RNA-seq data
- PMID: 20671027
- PMCID: PMC2943596
- DOI: 10.1093/nar/gkq670
Abstract
Deep sequencing of RNAs (RNA-seq) has been a useful tool to characterize and quantify transcriptomes. However, there are significant challenges in the analysis of RNA-seq data, such as how to separate signals from sequencing bias and how to perform reasonable normalization. Here, we focus on a fundamental question in RNA-seq analysis: the distribution of the position-level read counts. Specifically, we propose a two-parameter generalized Poisson (GP) model for the position-level read counts. We show that the GP model fits the data much better than the traditional Poisson model. Based on the GP model, we can better estimate gene or exon expression, perform a more reasonable normalization across different samples, and improve the identification of both differentially expressed genes and differentially spliced exons. The usefulness of the GP model is demonstrated by applications to multiple RNA-seq data sets.
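To make the comparison between the Poisson and GP models concrete, the following is a minimal sketch, not the authors' implementation. It assumes Consul's form of the generalized Poisson distribution, P(X = x) = θ(θ + xλ)^(x-1) e^(-θ - xλ) / x!, restricted to θ > 0 and 0 ≤ λ < 1, where λ = 0 recovers the ordinary Poisson. The parameter names `theta` and `lam`, the helper names, the moment-based estimator (swapped in here for simplicity and not necessarily the estimation procedure used in the paper), and the example counts are all illustrative assumptions.

```python
import math
from statistics import mean, pvariance


def gp_logpmf(x, theta, lam):
    """Log-probability of count x under a generalized Poisson GP(theta, lam).

    Assumes Consul's parameterization with theta > 0 and 0 <= lam < 1.
    With lam = 0 this reduces to the ordinary Poisson with mean theta.
    """
    if theta <= 0 or not (0 <= lam < 1):
        raise ValueError("this sketch requires theta > 0 and 0 <= lam < 1")
    return (math.log(theta)
            + (x - 1) * math.log(theta + x * lam)
            - theta - x * lam
            - math.lgamma(x + 1))


def gp_moment_fit(counts):
    """Method-of-moments estimates of (theta, lam).

    Uses mean = theta / (1 - lam) and variance = theta / (1 - lam) ** 3.
    Falls back to the Poisson case (lam = 0) when the counts are not
    overdispersed, because this sketch does not handle negative lam.
    """
    m = mean(counts)
    v = pvariance(counts)
    if m <= 0:
        raise ValueError("need at least one non-zero count")
    if v <= m:
        return m, 0.0
    lam = 1.0 - math.sqrt(m / v)
    return m * (1.0 - lam), lam


def log_likelihood(counts, theta, lam):
    """Total log-likelihood of independent counts under GP(theta, lam)."""
    return sum(gp_logpmf(x, theta, lam) for x in counts)


# Hypothetical position-level read counts for one gene (made up for illustration).
counts = [0, 0, 1, 0, 3, 12, 0, 2, 7, 0, 1, 25]

theta_hat, lam_hat = gp_moment_fit(counts)
ll_gp = log_likelihood(counts, theta_hat, lam_hat)
ll_poisson = log_likelihood(counts, mean(counts), 0.0)  # lam = 0 is the Poisson

print(f"GP fit: theta={theta_hat:.3f}, lam={lam_hat:.3f}, log-lik={ll_gp:.2f}")
print(f"Poisson fit: mean={mean(counts):.3f}, log-lik={ll_poisson:.2f}")
```

In this sketch the second parameter `lam` absorbs the extra variability of the counts, so the fitted mean `theta / (1 - lam)` stays equal to the sample mean while the variance is allowed to exceed it, which a one-parameter Poisson model cannot do.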
Figures
![Figure 1.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2943596/bin/gkq670f1.gif)
![Figure 2.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2943596/bin/gkq670f2.gif)
![Figure 3.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2943596/bin/gkq670f3.gif)
![Figure 4.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2943596/bin/gkq670f4.gif)
![Figure 5.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2943596/bin/gkq670f5.gif)
Similar articles
- Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads. BMC Genomics. 2015;16 Suppl 7(Suppl 7):S14. doi: 10.1186/1471-2164-16-S7-S14. Epub 2015 Jun 11. PMID: 26099631. Free PMC article.
- deGPS is a powerful tool for detecting differential expression in RNA-sequencing studies. BMC Genomics. 2015 Jun 13;16(1):455. doi: 10.1186/s12864-015-1676-0. PMID: 26070955. Free PMC article.
- Identifying differentially spliced genes from two groups of RNA-seq samples. Gene. 2013 Apr 10;518(1):164-70. doi: 10.1016/j.gene.2012.11.045. Epub 2012 Dec 8. PMID: 23228854.
- Overview of available methods for diverse RNA-Seq data analyses. Sci China Life Sci. 2011 Dec;54(12):1121-8. doi: 10.1007/s11427-011-4255-x. Epub 2012 Jan 7. PMID: 22227904. Review.
- RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009 Jan;10(1):57-63. doi: 10.1038/nrg2484. PMID: 19015660. Free PMC article. Review.
Cited by
- Exploring the fragmentation efficiency of proteins analyzed by MALDI-TOF-TOF tandem mass spectrometry using computational and statistical analyses. PLoS One. 2024 May 3;19(5):e0299287. doi: 10.1371/journal.pone.0299287. eCollection 2024. PMID: 38701058. Free PMC article.
- Artifacts and biases of the reverse transcription reaction in RNA sequencing. RNA. 2023 Jul;29(7):889-897. doi: 10.1261/rna.079623.123. Epub 2023 Mar 29. PMID: 36990512. Free PMC article. Review.
- A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-Cell RNA Sequencing Studies. Genes (Basel). 2021 Dec 2;12(12):1947. doi: 10.3390/genes12121947. PMID: 34946896. Free PMC article.
- A New ℓ0-Regularized Log-Linear Poisson Graphical Model with Applications to RNA Sequencing Data. J Comput Biol. 2021 Sep;28(9):880-891. doi: 10.1089/cmb.2020.0558. Epub 2021 Aug 10. PMID: 34375132. Free PMC article.
- Alternative splicing: Human disease and quantitative analysis from high-throughput sequencing. Comput Struct Biotechnol J. 2020 Dec 24;19:183-195. doi: 10.1016/j.csbj.2020.12.009. eCollection 2021. PMID: 33425250. Free PMC article. Review.