Mol Syst Biol. 2009;5:266.
doi: 10.1038/msb.2009.21. Epub 2009 Apr 28.

Adaptable gene-specific dye bias correction for two-channel DNA microarrays

Thanasis Margaritis et al. Mol Syst Biol. 2009.

Abstract

DNA microarray technology is a powerful tool for monitoring gene expression or for finding the location of DNA-bound proteins. DNA microarrays can suffer from gene-specific dye bias (GSDB), causing some probes to be affected more by the dye than by the sample. This results in large measurement errors, which vary considerably for different probes and also across different hybridizations. GSDB is not corrected by conventional normalization and has been difficult to address systematically because of its variance. We show that GSDB is influenced by label incorporation efficiency, explaining the variation of GSDB across different hybridizations. A correction method (Gene- And Slide-Specific Correction, GASSCO) is presented, whereby sequence-specific corrections are modulated by the overall bias of individual hybridizations. GASSCO outperforms earlier methods and works well on a variety of publicly available datasets covering a range of platforms, organisms and applications, including ChIP on chip. A sequence-based model is also presented, which predicts which probes will suffer most from GSDB; this is useful for microarray probe design and for the correction of individual hybridizations. Software implementing the method is publicly available.
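The idea of a gene-specific correction that is scaled per hybridization can be illustrated with a minimal Python sketch. This is not the authors' GASSCO implementation: the function names, the least-squares slope used as the slide factor, and the 5% tail cutoff are assumptions chosen for illustration, and the input is assumed to be a balanced set of dye-swap or self versus self hybridizations so that sample effects cancel when M-values are averaged per probe.

```python
from statistics import mean


def estimate_gene_bias(m):
    """Per-probe bias estimate: mean M-value across hybridizations.

    m is a list of hybridizations, each a list of M-values (log2 Cy5/Cy3),
    one per probe. Assumes a balanced dye-swap or self-self design, so that
    true sample effects average out and only the dye effect remains.
    """
    return [mean(column) for column in zip(*m)]


def slide_factor(m_slide, gene_bias, tail=0.05):
    """Overall bias of one hybridization, relative to the per-probe estimates.

    Hypothetical choice: a least-squares slope through the origin, fitted
    only on the probes with the most extreme bias estimates (both tails).
    """
    ranked = sorted(range(len(gene_bias)), key=lambda i: gene_bias[i])
    k = max(1, int(len(ranked) * tail))
    extreme = ranked[:k] + ranked[-k:]
    numerator = sum(m_slide[i] * gene_bias[i] for i in extreme)
    denominator = sum(gene_bias[i] ** 2 for i in extreme)
    return numerator / denominator if denominator else 0.0


def correct(m_slide, gene_bias, factor):
    """Subtract the gene-specific bias, scaled by the slide factor."""
    return [x - factor * b for x, b in zip(m_slide, gene_bias)]
```

In this sketch a hybridization with little overall bias gets a factor near 0 and is left almost unchanged, whereas a strongly biased hybridization gets a larger factor and a proportionally larger per-probe correction.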

Figures

Figure 1
Gene-specific dye bias and its correction. The degree of GSDB varies from one hybridization to another. Example scatterplots of a reference wt (green) versus another wt (red) hybridization showing very little GSDB (A) or a large degree of GSDB (B). Each dot represents a single probe from the microarray. Green and red dots belong to the 5th and 95th percentiles of the iGSDB, respectively. The numbers along the axes represent normalized fluorescent intensities. The solid black lines mark two-fold up, no change and two-fold down. Boxplots of M-values (log2-ratio Cy5/Cy3) of the probes that suffer from the highest degree of GSDB, before (C) and after (D) applying the correction method. The results of five different wt versus reference wt hybridizations are shown that suffer from increasing degrees of GSDB (low–high). These boxplots are derived from hybridizations with different dye orientations, showing that the outliers depend on the dye, rather than on the sample. From left to right the common reference wt sample was labelled with Cy5, Cy3, Cy5, Cy3 and Cy3, respectively, and is indicated with an asterisk (Cy3). The genes represented in these boxplots are identical to those coloured red and green in (A) and (B). Boxplots before (E) and after (F) GSDB correction derived from self versus self hybridizations, whereby only the degree of fluorescent label incorporation was varied for both dyes in each hybridization. A labelling percentage of 1 indicates that both Cy5 and Cy3 were incorporated at a determined efficiency of 1 fluorescent dye per 100 bases of amplified RNA. The correction applied to these arrays is derived from the independent set of 12 hybridizations also used to correct the data shown in (C, D). Scatterplots of self versus self hybridizations labelled at 3% efficiency before (G) and after (H) GSDB correction, and at 2% efficiency before (I) and after (J) GSDB correction. These scatterplots are from two of the hybridizations depicted in (E) and (F). The coloured dots represent probes from four different external controls, whose RNAs were spiked in to achieve a two-fold molar difference between channels. Each external control is represented by multiple probes on the arrays.
Boxplots of M-values before (K) and after (L) applying three different correction methods. Performance of the methods is measured as the change in variance of M-values compared with averaging. Averaging: simple averaging of dye swaps. VERA: the method of Kelley et al (2008), which actually results in an overall 3% increase in variance compared with averaging, although the variance of the most extremely affected probes does decrease. GASSCO: the method described here, which results in a 25% decrease in variance.
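The quantities named in this caption (the M-value, dye-swap averaging, and performance as a change in variance relative to averaging) can be written out as a short Python sketch. The function names are hypothetical, and the use of population variance and a percentage scale are assumptions; the caption does not specify how the variance was computed.

```python
import math
from statistics import pvariance


def m_value(cy5, cy3):
    """M-value as defined in the caption: log2 ratio of Cy5 to Cy3 intensity."""
    return math.log2(cy5 / cy3)


def dye_swap_average(m_forward, m_reverse):
    """Simple averaging of a dye-swap pair for one probe.

    m_forward: M-value with the sample of interest in Cy5.
    m_reverse: M-value with the sample of interest in Cy3 (dyes swapped).
    The sample effect changes sign between orientations and the dye effect
    does not, so half the difference estimates the sample effect.
    """
    return (m_forward - m_reverse) / 2.0


def variance_change_percent(m_method, m_averaging):
    """Percentage change in variance of M-values relative to averaging.

    Negative values mean the method reduced the variance.
    Assumption: population variance across probes.
    """
    baseline = pvariance(m_averaging)
    return 100.0 * (pvariance(m_method) - baseline) / baseline
```

Under this convention, the 25% variance decrease reported for GASSCO would correspond to a value of -25, and the 3% increase reported for VERA to a value of +3.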
Figure 2
GSDB correction for small-scale experiments without additional controls. (A) Cluster diagram of uncorrected expression profiles derived from five yeast strains carrying whole gene deletions of MED2, MED3, MED18, MED20 and a carboxy-terminal truncation mutation of MED8 (Lariviere et al, 2008), all of which encode subunits of the transcription complex Mediator. Med2 protein (Med2p) and Med3p form part of the Tail submodule within this complex. Med18p, Med20p and Med8p form part of the Head submodule. Loss of MED2 results in loss of Med3p from the complex and vice versa, resulting in similar expression profiles. Likewise, loss of MED18 results in loss of Med20p from the complex and vice versa. Loss of part of Med8p (med8c) also results in loss of both Med18p and Med20p. Together, these physical relationships underlie the similarity of the ensuing expression profiles (Lariviere et al, 2008). Two biological replicates of each of the five strains were hybridized to a common reference wt RNA sample. G and R refer to the dye orientation, with G indicating that the mutant sample was labelled with Cy3. In this cluster diagram, dye-swap replicates of individual strains do not cluster together. (B) As (A) but after applying GSDB correction based only on these 10 hybridizations. Now the replicates of the individual strains do cluster together. (C, D) Scatterplot of the self versus self hybridization shown in Figure 1G before (C) and after (D) applying the GSDB correction that is derived from the 10 hybridizations shown in the cluster diagrams.
Figure 3
Correcting GSDB in previously published studies. Examples of GSDB determined from earlier studies that included dye swaps or self versus self hybridizations. Graphs as in Figure 1; left: before correction; right: after correction. The left panel lists the relevant publication, the organism and the platform. The right panel lists the number of hybridizations corrected and the average and maximum performance achieved.
