Science. 2008 Jun 6;320(5881):1344-9.
doi: 10.1126/science.1158441. Epub 2008 May 1.

The transcriptional landscape of the yeast genome defined by RNA sequencing

Ugrappa Nagalakshmi et al. Science.

Abstract

The identification of untranslated regions, introns, and coding regions within an organism remains challenging. We developed a quantitative sequencing-based method called RNA-Seq for mapping transcribed regions, in which complementary DNA fragments are subjected to high-throughput sequencing and mapped to the genome. We applied RNA-Seq to generate a high-resolution transcriptome map of the yeast genome and demonstrated that most (74.5%) of the nonrepetitive sequence of the yeast genome is transcribed. We confirmed many known and predicted introns and demonstrated that others are not actively used. Alternative initiation codons and upstream open reading frames also were identified for many yeast genes. We also found unexpected 3'-end heterogeneity and the presence of many overlapping genes. These results indicate that the yeast transcriptome is more complex than previously appreciated.
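The headline figure in the abstract (the share of nonrepetitive sequence that is transcribed) is, in essence, a per-base coverage calculation over mapped reads. The following is a minimal sketch of that idea, not the authors' pipeline: the function names, the 0-based half-open read intervals, and the `min_depth` threshold are all assumptions made for illustration, and the reads in the usage note are toy values.

```python
def per_base_coverage(genome_length, reads):
    """Count how many reads overlap each base.

    `reads` is an iterable of (start, end) intervals, 0-based and
    half-open (an assumed convention, not one stated in the paper).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the difference array gives the depth at each base.
    coverage = []
    depth = 0
    for delta in diff[:genome_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def fraction_transcribed(coverage, min_depth=1):
    """Fraction of bases covered by at least `min_depth` reads.

    `min_depth` is a hypothetical threshold; the paper's actual
    criterion for calling a base transcribed is not given here.
    """
    if not coverage:
        return 0.0
    covered = sum(1 for depth in coverage if depth >= min_depth)
    return covered / len(coverage)
```

For example, two toy reads `(0, 4)` and `(2, 6)` on a 10-base sequence cover six bases, so `fraction_transcribed(per_base_coverage(10, [(0, 4), (2, 6)]))` evaluates to `0.6`. A real analysis would also need to mask repetitive sequence before computing the fraction, which this sketch does not attempt.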

Figures

Figure 1. Flowchart of the experimental and informatics pipelines of the RNA-Seq method
A) RNA-Seq experimental pipeline. B) Informatics pipeline. C) A snapshot of the mapped RNA-Seq reads showing no expression in a deleted gene (LEU2) and an expressed neighboring gene (YCL017C).
Figure 2. Extensive expression of the yeast genome revealed by RNA-Seq
A) The genome distribution of transcribed regions. Colors represent different transcription levels for each base (log2 tag count). B) Distribution of transcribed regions on chromosome VI. C) Histogram of transcribed bases. D) A summary of the transcription level of the transcriptome.
Figure 3. Analyses and mapping of 5′ and 3′ gene boundaries
A) Size differences of 5′-UTRs between RNA-Seq and our RACE data (top left) and of 3′-UTRs between RNA-Seq and cDNA sequencing data (7) (bottom left). Distributions of the sizes of 5′-UTRs (top right) and 3′-UTRs (bottom right) are also shown. B) A comparison of the 5′-UTR determined by RNA-Seq or by 5′-RACE for gene YKL004W. C) 3′-UTRs determined by RNA-Seq based on end tags for genes YDR460W, YDR004W, and YDR461-C; the 3′-UTR of YDR004W was also determined by cDNA sequencing (7). Endtag_W and Endtag_C represent RNA-Seq reads that contain polyA tails on the Watson and Crick strands, respectively. D) 3′-UTR determined by RNA-Seq based on a sharp decrease in expression, compared with cDNA data (7). End-tag information was not used in this case because of low scores. UTR, untranslated region; RACE, rapid amplification of cDNA ends.
Figure 4. Precise annotation of UTRs using RNA-Seq
New annotations of the UTRs in a previously well-annotated region on chrVI (A) and a relatively poorly annotated region on the same chromosome (B). In the new annotation, ORFs are denoted by dotted lines, and arrows denote transcription direction. UTRs are denoted by green shaded boxes flanking the ORFs. cDNA transcripts in red are high-confidence ones and those in blue are low-confidence ones (7).
Figure 5. Annotation of upstream ATG, uORF and novel transcribed regions
A) RNA-Seq reveals genes that may have an upstream start codon (uATG, in red) relative to the existing annotated ATG (blue). B) Some genes have ORFs (uORFs) upstream of the major annotated ORF. GO analysis revealed that they are significantly enriched in DNA binding (molecular function) and anatomical structure and development (biological process). P-values are False Discovery Rate adjusted. C) An example of a uORF (boxed and in red). D) Size distribution of novel transcribed regions. E) Percentage of novel transcribed regions that have been covered by cDNA sequencing (7). F) An example of a novel transcribed region with a polyA signal (shaded in red).
Figure 6. Comparison of RNA-Seq data with qPCR, tiling array, and gene expression microarray data
A) Comparison of the transcription level for 34 ORFs determined by RNA-Seq or quantitative PCR (qPCR). B) Comparison of the transcription level for 4,846 ORFs determined by RNA-Seq with published tiling array data (16). C) Comparison of the transcription level for 4,422 ORFs determined by RNA-Seq with published gene expression microarray data (15). Pearson linear correlation coefficients (corr) are shown in A–C. D) Transcription level distribution for 5,099 ORFs by RNA-Seq.
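
The Pearson linear correlation coefficient reported in Figure 6A–C can be illustrated with a short sketch. This is a generic implementation of the standard formula, not the authors' code. The `log2_expression` helper and its pseudocount are assumptions: Figure 2 mentions log2 tag counts, but whether and how values were transformed before the Figure 6 comparisons is not stated here.

```python
import math


def log2_expression(counts, pseudocount=1.0):
    """Log2-transform raw counts.

    The pseudocount (an assumed value) avoids taking log2 of zero.
    """
    return [math.log2(count + pseudocount) for count in counts]


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two paired observations")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Covariance and variance sums (the 1/n factors cancel in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

A comparison in the spirit of panel A would pass per-ORF values from the two platforms, for example `pearson(log2_expression(rnaseq_counts), qpcr_levels)`, where `rnaseq_counts` and `qpcr_levels` are hypothetical lists ordered by the same ORFs.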

References

1. Snyder M, Gerstein M. Science. 2003;300:258.
2. Gerstein MB, et al. Genome Res. 2007;17:669.
3. Adams MD, et al. Nature. 1995;377:3.
4. Kapranov P, et al. Science. 2002;296:916.
5. Bertone P, et al. Science. 2004;306:2242.
