Pathway-Centric Tools and Technology™

Understanding Microarray Performance Parameters
Part Two: Amount of Labeled Sample in Microarray Hybridization

Introduction:

The microarray is a popular gene expression profiling tool that allows the analysis of many genes in a single experiment. After choosing a microarray platform, the researcher needs to optimize several experimental parameters before performing an experiment. Microarray manufacturers optimize conditions specific to their platform, such as hybridization time, temperature, and washing stringency. However, the amount of labeled RNA used in the experiment depends not only on the array itself but also on the genes of interest in the study. This article discusses some considerations for deciding how much labeled sample to hybridize with a microarray, using the Oligo GEArray® from SABiosciences as an example.

Generation of a Microarray Signal:

In a microarray experiment, the intensity of a hybridization signal is taken to be directly proportional to the relative level of expression of the corresponding gene. High hybridization intensity at a given microarray spot indicates that the original sample contains a relatively high abundance of the corresponding transcript. Low hybridization intensity means the transcript is relatively rare. Two different steps in the microarray protocol contribute to the intensity of each signal on a microarray: the hybridization of the labeled sample to the microarray element and the detection of the labeled sample on the array.

Microarray manufacturers pre-optimize any chemical reactions involved in detection, and standard imaging devices (such as a fluorescent laser scanner, chemiluminescent CCD camera, or X-ray film) are specifically designed to capture and record the array images. Therefore, careful control of exposure times as well as instrument or software parameters easily optimizes the detection of the signal without requiring a repeat experiment. Optimizing hybridization conditions proves more difficult because determining the results of changing one condition requires a complete repetition of the microarray experiment. Therefore, it is often helpful to perform a pilot experiment by hybridizing replicate microarrays with different amounts of labeled sample.

The hybridization reaction involves two substrates: the nucleic acid on the array and the labeled nucleic acid in solution. The amount and distribution of the nucleic acid present on the array are fixed by the array platform. The researcher controls the amount of labeled sample incubated with the array. Choosing that amount means balancing two opposing considerations.

  1. On one hand, the hybridization of more labeled sample to the microarray maximizes the detection of less abundantly expressed genes, increases the present call, and decreases the number of false negatives.

  2. On the other hand, increasing the amount of labeled sample causes signal saturation, decreases the ability to resolve changes in the expression of more abundantly expressed genes, and increases the number of false positives.

To illustrate how microarray results change with increasing amounts of labeled sample, the data from identical microarrays hybridized with different amounts of sample are directly compared to one another in the following experiments. Hundreds to thousands of hybridization reactions are measured simultaneously on the same microarray surface. Every RNA sample contains thousands of transcripts ranging from low to high copy number. Thus, a wide range of hybridization signal intensities is observed on any given microarray. To determine how well the results agree, the intensity values are simply plotted against one another. If the data in these types of plots agree, the data can be fit to a straight line. Incidentally, when comparing microarray data from two different experimental samples, points deviating from a line with a slope of one indicate changes in gene expression.
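The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fit_line` and `residuals` and the intensity values are hypothetical and do not come from the experiments in this article. The sketch fits an ordinary least-squares line to the intensities of one array plotted against another and reports how far each spot lies from that line.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x: Sequence[float], y: Sequence[float],
              slope: float, intercept: float) -> List[float]:
    """Vertical distance of each spot from the fitted line."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]


# Hypothetical spot intensities for two arrays (not data from Figure 1).
low_amount = [100.0, 200.0, 400.0, 800.0]
high_amount = [200.0, 400.0, 800.0, 1600.0]

slope, intercept = fit_line(low_amount, high_amount)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(residuals(low_amount, high_amount, slope, intercept))
```

Spots with large residuals are the ones that disagree between the two arrays. When intensities span several orders of magnitude, fitting on log-transformed values is a common alternative, but that is a choice the article does not specify.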

Example One: Higher Density Array with Smaller Spot Sizes

Figure 1 displays results from a higher density array (480 genes) with smaller array spots. When the data from a microarray hybridized with a small amount of sample (2 micrograms) are compared with larger amounts (4, 6, or 8 micrograms), the signal intensities increase for each gene with increasing sample as expected. At very high amounts of sample (particularly 6 or 8 micrograms), the intensity of some array spots becomes saturated; that is, their intensity reaches a maximum and does not increase any further. Therefore, the curve fit deviates from a straight line at higher intensity values and levels off instead. Saturated microarray signals confound the interpretation of microarray data. They mask differences in the expression of the corresponding genes between experimental conditions. Saturated signals also skew the expression profiles of the other genes on the microarray when using the median value to normalize or standardize the microarray data. Choose an amount of labeled sample that minimizes the number of saturated signals.
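As a rough sketch of how saturated spots might be flagged and how median normalization is applied, consider the following Python fragment. The detector ceiling, the 5 percent tolerance, the function names, and the intensity values are all assumptions made for illustration; the article does not state the ceiling of the imaging system used.

```python
from statistics import median
from typing import List, Sequence


def saturated_spots(intensities: Sequence[float], ceiling: float,
                    tolerance: float = 0.05) -> List[int]:
    """Indices of spots whose intensity lies within `tolerance` of `ceiling`."""
    limit = ceiling * (1.0 - tolerance)
    return [i for i, value in enumerate(intensities) if value >= limit]


def median_normalize(intensities: Sequence[float]) -> List[float]:
    """Divide every intensity by the array median."""
    m = median(intensities)
    if m == 0:
        raise ValueError("median intensity is zero; cannot normalize")
    return [value / m for value in intensities]


# Hypothetical intensities; 52000 is an assumed detector ceiling.
spots = [100.0, 400.0, 900.0, 30000.0, 52000.0]
flagged = saturated_spots(spots, ceiling=52000.0)
print("saturated spot indices:", flagged)
print("fraction saturated:", len(flagged) / len(spots))
print("median-normalized:", median_normalize(spots))
```

The normalized value of a flagged spot is capped by the ceiling rather than by the true transcript abundance, so it should be treated as a lower bound when arrays are compared.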

Figure 1: More labeled sample saturates the signals from more genes.
XpressRef™ Human Universal Reference Total RNA (Catalog Number GA-004) was converted to labeled cRNA target using the TrueLabeling-AMP™ Linear RNA Amplification Kit (Catalog Number GA-010). Different amounts (2, 4, 6, and 8 µg) of biotinylated cRNA were hybridized with separate Oligo GEArray Human Hematology / Immunology Microarrays (Catalog Number OHS-801). The signal intensities from each spot on each microarray are plotted versus the signal intensities for the microarray hybridized with 2 µg of labeled cRNA target.

 

On the other hand, the higher amount of sample increases the number of genes determined to be expressed (or present) from 53.5 to 79 percent, and decreases the number determined to be not expressed (or absent) from 46.5 to 21 percent, as defined and seen in Table 1. Absent calls cannot be interpreted because the expression of the corresponding genes could either be non-existent or could lie below the limit of detection of the method. Microarray users refer to the latter situation as a false negative. Using larger amounts of labeled sample allows the detection of less abundant messages and reduces false negatives. However, microarray experiments often generate non-specific label. Hybridizing larger amounts of labeled sample also increases the exposure of the microarray to this non-specific label. Microarray users refer to any signal intensity contributed to microarray spots by non-specific label as a false positive. Therefore, using more labeled sample may also increase the rate of false positive signals. Choose an amount of labeled sample that maximizes the present call (minimizes false negatives) while trying not to introduce too many false positives.

Table 1: But more labeled sample also increases the present call.
The maximum, total, average, and median signal intensities for each microarray result in Figure 1 are displayed. The average background and standard deviation in the background value are also displayed. To define the number and percent of present (expressed) genes and absent (unexpressed) genes, a threshold was defined as the average background plus three standard deviations. Signals above this threshold are present calls; signals below, absent calls.

Labeled cRNA hybridized                   2 µg        4 µg        6 µg        8 µg
Maximum intensity                     52546.00    49122.00    51301.00    51729.00
Total intensity                      961229.00  1351152.00  1833587.00  2110141.00
Average intensity                      2002.56     2814.90     3819.97     4396.13
Median intensity                        396.00      564.50      761.00      894.50
Average background                      270.88      370.38      437.63      504.63
Standard deviation of background         35.56       29.14       44.91       27.08
Cutoff (background + 3 SD)              377.55      457.79      572.36      585.85
Present calls                              257         295         325         378
Absent calls                               223         185         155         102
Percent present                         53.54%      61.46%      67.71%      78.75%
Percent absent                          46.46%      38.54%      32.29%      21.25%
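The present and absent calls in Table 1 follow from a simple rule, which the Python sketch below reproduces. The function names are hypothetical, and treating a signal exactly equal to the cutoff as absent is an assumption, since the caption only covers signals above and below the threshold.

```python
from typing import Sequence, Tuple


def call_cutoff(mean_background: float, sd_background: float,
                n_sd: float = 3.0) -> float:
    """Threshold = average background plus n_sd standard deviations."""
    return mean_background + n_sd * sd_background


def count_calls(intensities: Sequence[float], cutoff: float) -> Tuple[int, int]:
    """Return (present, absent) counts; signals above the cutoff are present."""
    present = sum(1 for value in intensities if value > cutoff)
    return present, len(intensities) - present


# Background values from the 2 µg column of Table 1.
cutoff = call_cutoff(270.88, 35.56)
print(f"cutoff = {cutoff:.2f}")

# Present and absent counts from the same column.
present, absent = 257, 223
print(f"percent present = {100 * present / (present + absent):.2f}%")
```

With the rounded inputs shown in the table, the cutoff comes out at about 377.56 rather than the tabulated 377.55; the small difference presumably reflects rounding of the displayed background values.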

 

Example Two: Lower Density Array with Larger Spot Sizes

The amount of labeled material required for the experiment also depends on the array itself. The optimal amount of sample for one array may not be the same for another array. To illustrate this point, Figure 2 displays the results from a lower density array (112 genes) with larger array spots. The spots on these microarrays contain more nucleic acid distributed over a larger area, allowing each spot to bind more labeled sample. As a result, no obvious signal saturation occurs for the more abundantly expressed genes at high levels of labeled sample. The curve fit does not deviate from a straight line as dramatically as it does for the higher density, smaller spot array, despite the significantly larger amounts of labeled sample (6 to 20 micrograms) used in the hybridization. This array platform seems to tolerate larger amounts of sample without causing undesirable effects such as signal saturation, which gives the researcher more latitude in choosing the optimal amount.

Figure 2: More labeled sample does not saturate larger spot sizes as easily.

XpressRef™ Mouse Universal Reference Total RNA (Catalog Number GA-005) was converted to labeled cRNA target using the TrueLabeling-AMP™ Linear RNA Amplification Kit (Catalog Number GA-010). Different amounts (6, 10, 15, and 20 µg) of biotinylated cRNA were hybridized with separate Oligo GEArray Mouse Cell Cycle Microarrays (Catalog Number OMM-020). The signal intensities from each spot on each microarray are plotted versus the signal intensities for the microarray hybridized with 6 µg of labeled cRNA target.

As seen in Table 2, the present call still increases overall (from 47 to 77 percent) and the absent call still decreases (from 53 to 23 percent) with an increasing amount of sample hybridized to the lower density, larger spot microarray, although the trend is not strictly monotonic (the present call dips from 70 to 65 percent between 10 and 15 micrograms). At the same time, the number of false negatives should also decrease, but the number of false positives would still increase, as suggested for the higher density, smaller spot microarray. This analysis cannot tell whether the lower density microarray produces more or fewer false negatives and false positives than the higher density microarray. At the very least, these results suggest that different array platforms may require a different balance of these considerations.

Table 2: But more labeled sample still increases the present call on larger spot sizes.

The maximum, total, average, and median signal intensities for each microarray result in Figure 2 are displayed. The average background and standard deviation in the background value are also displayed. To define the number and percent of present (expressed) genes and absent (unexpressed) genes, a threshold was defined as the average background plus three standard deviations. Signals above this threshold are present calls; signals below, absent calls.

Labeled cRNA hybridized                   6 µg       10 µg       15 µg       20 µg
Maximum intensity                     47059.00    46656.00    47109.00    47676.00
Total intensity                      504099.00   635073.00   701454.00   821343.00
Average intensity                      3938.27     4961.51     5480.11     6416.74
Median intensity                        686.00     1203.50     1356.50     2076.50
Average background                      617.71      811.71     1044.57     1549.29
Standard deviation of background         32.11       64.69       32.56       42.64
Cutoff (background + 3 SD)              714.05     1005.78     1142.24     1677.22
Present calls                               60          89          83          99
Absent calls                                68          39          45          29
Percent present                         46.88%      69.53%      64.84%      77.34%
Percent absent                          53.13%      30.47%      35.16%      22.66%

Summary:

The amount of labeled material used for microarray hybridization strikes a balance between minimizing the number of saturated signals and maximizing the present call without introducing false positives. When deciding on the amount of labeled target to be used, perform a preliminary experiment such as the one described in this article. Start with the recommended range of amounts provided by the manufacturer of the microarray platform, but also take into account the nature of the genes of interest. Genes expressed at low levels require more sample to yield a signal above the limit of detection. On the other hand, genes expressed at high levels require a smaller amount of sample to avoid saturation of their signals.
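One possible way to turn the results of such a pilot experiment into a choice is sketched below in Python. The selection rule (the highest present call among amounts whose saturated fraction stays under a tolerated limit), the names `PilotResult` and `pick_amount`, the 2 percent limit, and every number in the example are assumptions for illustration only; none of them are measurements from this article.

```python
from typing import NamedTuple, Optional, Sequence


class PilotResult(NamedTuple):
    amount_ug: float           # labeled sample hybridized, in micrograms
    percent_present: float     # present call for that array
    saturated_fraction: float  # fraction of spots judged saturated


def pick_amount(results: Sequence[PilotResult],
                max_saturated_fraction: float) -> Optional[PilotResult]:
    """Highest present call among arrays with tolerable saturation."""
    acceptable = [r for r in results
                  if r.saturated_fraction <= max_saturated_fraction]
    if not acceptable:
        return None  # every amount tested saturates too many spots
    return max(acceptable, key=lambda r: r.percent_present)


# Placeholder pilot results; all values are made up.
pilot = [
    PilotResult(2.0, 50.0, 0.00),
    PilotResult(4.0, 60.0, 0.01),
    PilotResult(6.0, 70.0, 0.04),
    PilotResult(8.0, 80.0, 0.08),
]
choice = pick_amount(pilot, max_saturated_fraction=0.02)
print(choice)
```

If the genes of interest are mostly abundant, a stricter saturation limit would be appropriate; if they are mostly rare, a looser one may be acceptable.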

The results of such an experiment readily identify saturated signals and measure the percentage of present and absent calls. However, only methods that verify microarray data, such as RT-PCR, can identify the frequency of false negatives and false positives. Their precise numbers cannot be predicted by the experimental analysis described here. RT-PCR may be performed for a subset of the genes on the microarray to completely optimize the amount of labeled sample for your experiment. However, such work generally exceeds the scope of a preliminary optimization of a microarray. A good balance between avoiding saturation and maximizing the present call usually maximizes the number of true positives and true negatives at the same time. Just remember to always verify any interesting microarray result with a more rigorous gene-specific assay before continuing with your study or submitting the results for publication.
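If a subset of genes is checked by a gene-specific assay, the bookkeeping could look like the following Python sketch. It treats the reference assay (for example RT-PCR) as the truth, which is an assumption of this illustration; the function name and the gene labels are hypothetical.

```python
from typing import Dict


def compare_with_reference(array_present: Dict[str, bool],
                           reference_detected: Dict[str, bool]) -> Dict[str, int]:
    """Tally microarray calls against a reference assay, gene by gene."""
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for gene, detected in reference_detected.items():
        if gene not in array_present:
            continue  # gene is not on the array; nothing to compare
        called = array_present[gene]
        if called and detected:
            counts["true_positive"] += 1
        elif called and not detected:
            counts["false_positive"] += 1
        elif not called and detected:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Hypothetical calls for four genes checked by the reference assay.
array_calls = {"gene_a": True, "gene_b": True, "gene_c": False, "gene_d": False}
reference = {"gene_a": True, "gene_b": False, "gene_c": True, "gene_d": False}
print(compare_with_reference(array_calls, reference))
```

Repeating this tally for each amount of labeled sample tested would show how the false positive and false negative counts shift, which the present and absent calls alone cannot reveal.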

Related Products:

Oligo GEArray® Microarrays for Human, Mouse, and Rat
Oligo GEArray focused DNA microarrays are carefully designed to provide gene expression information relevant to biological or disease pathways quickly and simply at a cost every laboratory can afford. Because of the focused design, data handling is straightforward and your research project can progress more rapidly with information from well-characterized genes.

TrueLabeling-AMP™ Linear RNA Amplification Kit
Our TrueLabeling-AMP™ Linear RNA Amplification Kit employs an extremely convenient one-tube protocol that saves a day over conventional oligo array labeling protocols. In addition to being fast and convenient, the TrueLabeling-AMP™ Linear RNA Amplification Kit provides the accuracy and sensitivity you need for reliable gene expression profile analysis.

XpressRef™ Universal Reference RNA from Human, Mouse, and Rat
XpressRef Universal Reference Total RNA is a standardized sample of RNA designed to help streamline and optimize your gene expression studies using microarrays or RT-PCR. The high quality of the RNA ensures the successful synthesis of microarray probe or PCR template every time. The broad representation of genes in these RNA samples makes them useful for studying nearly every gene in the human, mouse, or rat genome.
