Improving RNA Amplification


RNA amplification is often required for producing sufficient amounts of labeled target nucleic acid for detection by array hybridization. The original T7 RNA polymerase linear amplification method described and patented by Van Gelder and Eberwine [1-3] is the basis for sample preparation for most commercial microarray platforms, including Affymetrix GeneChips. Recognizing that there is more to improving RNA amplification than simply increasing yields of labeled cRNA, we set out to ensure that any observed improvements in amplification coincide with accepted standards for high quality expression profiling.



RNA amplification enhancements
Ambion's MessageAmp aRNA Amplification Kit was the first commercially available Eberwine-based aRNA amplification kit. In developing MessageAmp II, Ambion's scientists focused on the reverse transcription and second strand cDNA synthesis steps, both of which are critical for generating high yields of full-length labeled cRNA. This led to the development of ArrayScript, an engineered M-MLV reverse transcriptase, and to the optimization of the second strand cDNA synthesis reaction to synergize with the improvements gained with ArrayScript. One of the principal benefits conferred by ArrayScript and the optimized second-strand synthesis step is that MessageAmp II will generate sufficient labeled cRNA in a single round of amplification for array analysis from as little as 100 ng of total RNA (Figure 1).

Together, the RT and second strand cDNA reactions result in an increase in the conversion of mRNA into longer double-stranded cDNA templates. This increase allows both shortened in vitro transcription (IVT) reaction incubation times and more reliable access to expression information from total RNA samples less than 1 μg. The increase also translates through to two rounds of amplification, allowing more robust amplification of total RNA from 10 ng to as low as 100 pg.

Expectations for yields
Factors such as tissue source, mRNA content (0.1-3% of total RNA), starting amount of total RNA, and RNA quality all affect the yield of cRNA. Collectively, these factors can lead to up to 4-fold variation in yields. Figure 2 provides typical yields of cRNA amplified from six different sources using the MessageAmp II aRNA Amplification Kit, with RNA inputs ranging from 50 to 3000 ng tested for each source. From Figure 2 one can calculate the level of amplification obtained from different tissues at different inputs. The figure is also a useful benchmark for the amount of cRNA to expect from different total RNA inputs and sources.
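The level of amplification can be estimated from a measured cRNA yield. The following is a minimal sketch, not part of the kit protocol. It assumes that amplification is expressed relative to the mRNA fraction of the total RNA input; the function name `fold_amplification` and the default mRNA fraction of 2% are illustrative choices within the 0.1-3% range quoted above. The 1000 ng input and 53 μg yield in the example are the HeLa S3 figures reported later in this article.

```python
def fold_amplification(total_rna_ng, crna_yield_ug, mrna_fraction=0.02):
    """Estimate fold amplification of the mRNA in a total RNA sample.

    mrna_fraction is an assumed value; the article quotes 0.1-3%
    (0.001-0.03) depending on the RNA source.
    """
    if total_rna_ng <= 0 or crna_yield_ug < 0:
        raise ValueError("input must be positive and yield non-negative")
    if not 0 < mrna_fraction <= 1:
        raise ValueError("mrna_fraction must be in (0, 1]")
    mrna_ng = total_rna_ng * mrna_fraction  # estimated mRNA mass in the input
    crna_ng = crna_yield_ug * 1000.0        # convert micrograms to nanograms
    return crna_ng / mrna_ng


if __name__ == "__main__":
    # Show how strongly the estimate depends on the assumed mRNA content.
    for fraction in (0.001, 0.01, 0.02, 0.03):
        fold = fold_amplification(1000, 53, fraction)
        print(f"mRNA fraction {fraction:.1%}: about {fold:,.0f}-fold")
```

Because the mRNA content of a sample is rarely known precisely, the estimate is best read as a range rather than a single number.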

One or two round amplification?
Based on specific experimental constraints, researchers must decide whether one or two rounds of amplification are appropriate. Some microarray studies demand a high level of amplification (10⁶-fold or greater), which can be achieved by two successive rounds of MessageAmp II reactions. Determining the lowest level of total RNA input that generates enough cRNA for array hybridization will depend on the amount of cRNA required for a particular microarray platform, and on whether extra cRNA is desired for replicates, follow-up validation with RT-PCR, or archiving samples.

For consistency within an expression experiment, it is important to decide at the outset whether one or two rounds of amplification will be used throughout the study. Yields of total RNA may vary within a collection of samples: some samples may yield 1 μg of total RNA, such that only one round of amplification is needed, whereas others may yield only 50 ng, requiring two rounds of amplification. Figure 3 may aid in this decision for samples below 100 ng, where the cRNA yields after one round of amplification may not be sufficient for most arrays.

Obtaining sufficient cRNA for a microarray hybridization depends not only on the source of the RNA being amplified but also on two reaction variables: the amount of input total RNA and the length of the IVT incubation. The yield of cRNA increases with increasing RNA input and IVT time. As a rule, most samples between 100 and 1000 ng, incubated overnight (14 hr IVT), will produce sufficient cRNA for any microarray platform (at least 10 μg). However, when we investigated a range of IVT incubation times (4-14 hr), we found that 4 hr IVT incubations generate sufficient cRNA from most total RNA inputs greater than 200 ng (Figure 4). A 4 hr IVT can also be sufficient for 100 ng samples, and these conditions were used in the microarray studies presented below. (Note that mRNA-poor samples will require longer IVT incubations of 6-14 hr.) While we recommend a minimum of 4 hr for the IVT incubation, yields obtained from shorter incubations (2 or 3 hr) can be tested and should be adequate for input amounts above 500 ng (e.g., 1000 ng of HeLa S3 RNA converted to cDNA and in vitro transcribed for 2 hr yielded 53 μg).
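The rules of thumb in this section can be collected into a small planning helper. This is only a sketch under stated assumptions: the function names are hypothetical, the thresholds (100 ng, 500 ng, and the 4-14 hr and 6-14 hr IVT ranges) are taken from the text but treated here as hard cutoffs, which they are not, and the text gives no IVT guidance for two-round amplification, so none is returned in that case.

```python
def suggest_amplification_plan(input_ng, mrna_poor=False):
    """Return a starting-point plan for one sample; verify against real yields."""
    if input_ng <= 0:
        raise ValueError("input_ng must be positive")
    if input_ng < 100:
        # Below 100 ng, one round may not yield enough cRNA for most arrays.
        return {"rounds": 2, "ivt_hours": None, "shorter_ivt_testable": False}
    # One round: 4-14 hr IVT, or 6-14 hr for mRNA-poor samples.
    ivt_hours = (6, 14) if mrna_poor else (4, 14)
    # Incubations of 2 or 3 hr may be worth testing above 500 ng input.
    shorter = input_ng > 500 and not mrna_poor
    return {"rounds": 1, "ivt_hours": ivt_hours, "shorter_ivt_testable": shorter}


def rounds_for_study(sample_inputs_ng):
    """Use one setting for the whole study: the most demanding sample decides."""
    return max(suggest_amplification_plan(ng)["rounds"] for ng in sample_inputs_ng)


if __name__ == "__main__":
    print(suggest_amplification_plan(1000))
    print(suggest_amplification_plan(200, mrna_poor=True))
    print("Rounds for a study with 1000 ng and 50 ng samples:",
          rounds_for_study([1000, 50]))
```

Taking the maximum across samples reflects the consistency point above: if any sample needs two rounds, the whole study is processed with two rounds.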

Expression analysis
Three approaches were used to define the reproducibility and consistency of the MessageAmp II protocol. First, the relative consistency of the MessageAmp II protocol was measured by comparing replicate arrays, where amplified samples originated from the same RNA source, either inferior temporal cortex from human brain (brain) or HeLa S3 cells (HeLa). These cRNA samples were hybridized to GeneChip Human Genome Focus Arrays following the Affymetrix guidelines. For a reference series we used the Affymetrix GeneChip 3'-Amplification Reagents One-Cycle cDNA Synthesis Kit and IVT Labeling Reagents (from here on referred to as the Affymetrix protocol) with duplicate 1000 ng reactions of the same RNA. The Affymetrix quality assessment guidelines were met for all arrays in this study. The metrics assessed included 3'/5' ratios (under 2.0), percent Present calls (50-53%), scaling factors, and average background, as well as visual inspection of the images. This assessment indicated that all arrays were of high quality.

MessageAmp II reproducibility. Signal values for each probe set were converted from .CEL files into a summary of expression values measured for each of the 8793 probe sets (genes). Expression values were estimated using Robust Multi-array Average (RMA) [4-6] as implemented in Bioconductor (www.bioconductor.org). The correlation established by this comparison shows that amplification using these protocols with 1000 ng of input total RNA is highly reproducible (Figure 5).

Differential Gene Expression. We also addressed differential gene expression between two different tissue sources (brain and HeLa). To measure the consistency of the MessageAmp II and the Affymetrix standard protocols, we plotted the replicate brain/HeLa ratios against each other for both protocols. Since there are two replicates of input RNA from brain and two from HeLa, each protocol gives 4 possible ratios, which form 2 sets of 2 ratios in which no array is used twice. We compared the consistency of MessageAmp II and Affymetrix ratio estimates in Figure 6. The correspondence of one replicate ratio with another is generally good, with higher correlation in the log2 ratios for the MessageAmp II arrays than for the Affymetrix arrays.
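The pairing of replicate ratios can be made concrete with a short sketch. It assumes that expression values are already on a log2 scale, as RMA produces, so a log2 ratio is a simple difference. The array names, the function names, and the three-gene toy values are illustrative only and are not data from this study.

```python
def log2_ratio(brain_log2, hela_log2):
    """Per-gene log2(brain/HeLa); inputs are log2 values, so subtract."""
    if len(brain_log2) != len(hela_log2):
        raise ValueError("arrays must cover the same probe sets")
    return [b - h for b, h in zip(brain_log2, hela_log2)]


def ratio_sets_without_shared_arrays(brain_names, hela_names):
    """With 2 brain and 2 HeLa replicates there are 4 ratios; group them
    into the 2 sets of 2 ratios in which no array appears twice."""
    (b1, b2), (h1, h2) = brain_names, hela_names
    return [[(b1, h1), (b2, h2)], [(b1, h2), (b2, h1)]]


# Toy log2 expression values for three genes (illustrative only).
arrays = {
    "brain_1": [8.0, 5.5, 10.0],
    "brain_2": [8.2, 5.4, 9.9],
    "hela_1": [6.0, 7.5, 10.1],
    "hela_2": [6.1, 7.7, 10.0],
}

ratio_sets = ratio_sets_without_shared_arrays(
    ("brain_1", "brain_2"), ("hela_1", "hela_2")
)
for ratio_set in ratio_sets:
    for brain, hela in ratio_set:
        ratios = log2_ratio(arrays[brain], arrays[hela])
        print(brain, "vs", hela, [round(r, 2) for r in ratios])
```

Within each set, the two ratio vectors come from entirely separate arrays, so correlating them measures how reproducibly a protocol estimates the brain/HeLa difference.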

These two analyses (Figures 5 and 6) indicate that the two protocols are reproducible and comparisons made between the protocols do not appear to have any systematic differences above the observed variation between replicates within a protocol.

Present/Absent call concordance. We used two comparisons to evaluate Present/Absent call concordance: 1) the correspondence of Present (P) and Absent (A) calls between the MessageAmp II and Affymetrix standard protocols, to address consistency between protocols, and 2) the correspondence within the MessageAmp II and within the Affymetrix protocols. The P and A calls are based on the Wilcoxon signed rank-based Present/Absent detection algorithm first implemented in the Affymetrix Microarray Suite, Version 5. The comparisons were made separately for each RNA type (brain and HeLa). We paired arrays in all possible combinations and summarized the average performance. When comparing two arrays, the percent shared Present (PSP) calls is the number of genes called Present in both arrays divided by the total number of (unique) genes called Present in either array. An equivalent statistic was used for the percent shared Absent (PSA) calls; for this analysis we ignored Marginal (M) calls. Also included was total concordance, calculated using all call types by dividing the number of concordant calls (P-P, A-A, or M-M) by the total number of calls (or genes = 8793). The results, shown in Figure 7, demonstrate again that the correspondence between the MessageAmp II and Affymetrix protocols was equivalent to that seen within Affymetrix replicates.
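The three concordance statistics can be sketched in a few lines. This is an illustration rather than the analysis code used in the study. It assumes one reading of "ignored Marginal calls": any gene called M on either array is dropped before PSP and PSA are computed, while total concordance uses every gene. The function name and the six-gene toy calls are hypothetical.

```python
def call_concordance(calls_a, calls_b):
    """Compare detection calls ('P', 'A', 'M') from two arrays, gene by gene."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    pairs = list(zip(calls_a, calls_b))

    # Assumption: genes with a Marginal call on either array are left out
    # of the PSP and PSA calculations.
    no_m = [(a, b) for a, b in pairs if a != "M" and b != "M"]

    def shared(call):
        both = sum(1 for a, b in no_m if a == call and b == call)
        either = sum(1 for a, b in no_m if a == call or b == call)
        return both / either if either else float("nan")

    # Total concordance counts P-P, A-A, and M-M over all genes.
    total = sum(1 for a, b in pairs if a == b) / len(pairs)
    return {"PSP": shared("P"), "PSA": shared("A"), "total": total}


# Toy calls for six genes on two arrays (illustrative only).
array_1 = ["P", "P", "A", "A", "M", "P"]
array_2 = ["P", "A", "A", "P", "M", "M"]
result = call_concordance(array_1, array_2)
print({name: round(value, 3) for name, value in result.items()})
```

PSP and PSA defined this way are intersection-over-union measures, so a gene called Present on only one array lowers PSP even though it never counts as shared.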

IVT incubation time vs. Microarray Signals
While a shortened IVT time may produce sufficient cRNA for microarray hybridization, the signal characteristics of the labeled cRNA on a microarray are the critical factor for determining functionality. A series of three IVT time points (4, 6, and 14 hr) using total RNA from brain and HeLa was tested to measure whether any difference was observed between IVT times. Total RNA input amounts used were 100, 200, and 1000 ng. These samples were amplified and labeled as above, then hybridized to GeneChip Human Genome Focus Arrays. Again, for a reference series we used the Affymetrix protocol with duplicate 1000 ng reactions of the same RNA. Fourteen arrays were scanned and signal intensities were analyzed for similarity.

We first measured the simple correlation of the log2 expression of each array versus the chosen standard (1000 ng input; 4 hr IVT). Figure 5b lists the correlation and the standard deviation (SD) of the differences from the standard. These data are also shown as simple plots of log2 (expression) of each array versus the standard (x axis) (Figure 8). These comparisons reveal that the magnitude of the variation among the different IVT times using 1000 ng RNA inputs and the MessageAmp II protocol is similar to or less than that expected in replicates using the Affymetrix protocol. Additionally, arrays with cRNA from 200 ng total RNA have more under-expressed genes (a broader tail on the left of each plot) for all times measured. This range of differences appears to be consistent with that also seen in Affymetrix replicates (Figure 5b). While the SD of the differences for the 200 ng RNA inputs appears twice as large, combining replicates with simple averaging should decrease the SD.
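The two summary statistics used in this comparison, the correlation with the standard and the SD of the differences from the standard, can be computed as follows. This is a minimal sketch using only the Python standard library. It assumes Pearson correlation and the sample SD, neither of which the text specifies, and the four-gene toy values are illustrative only.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length lists of log2 expression values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    return sxy / math.sqrt(sxx * syy)


def sd_of_differences(array, standard):
    """Sample SD of the per-gene differences (array minus standard)."""
    if len(array) != len(standard):
        raise ValueError("arrays must cover the same probe sets")
    return statistics.stdev(a - s for a, s in zip(array, standard))


# Toy log2 expression values for four genes (illustrative only).
standard = [1.0, 2.0, 3.0, 4.0]
test_array = [1.1, 1.9, 3.2, 3.8]
r = pearson(test_array, standard)
sd = sd_of_differences(test_array, standard)
print(f"correlation = {r:.4f}, SD of differences = {sd:.4f}")
```

The two numbers are complementary: correlation is dominated by the wide spread of expression levels across genes, while the SD of the differences reflects the typical per-gene disagreement in log2 units.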

Expression Analysis at Lower RNA Inputs
Lower RNA inputs (below 100 ng), which generally require two rounds of amplification, suffer both a decrease in the percent Present calls and an increase in noise. Several studies have characterized and compared single vs. double round amplification [7, 8]. While possible, interpretation of expression profiles between samples that have been processed differently (single vs. double round amplification) is more problematic. Nevertheless, we attempted to measure the difference observed as the input levels are lowered to 10 ng of total RNA (approximately 1000 cell equivalents). This experiment consisted of two variables: RNA input (1000, 100, and 10 ng), and amplification protocol (MessageAmp II vs. Affymetrix). For this analysis, both the 100 and 10 ng samples underwent two rounds of amplification.

Note that while the goal of the study was to compare array data generated after two rounds of amplification, MessageAmp II yields from one round of amplification with 100 ng inputs were actually sufficient for array hybridization. In this case, a single round of amplification would be preferred, as it would improve correlation.

Again, we used 1000 ng input RNA as the standard to which we compared the other RNA input data (Figure 9A). Scatter plots of the log2 (expression) for both 100 and 10 ng RNA versus 1000 ng (within a protocol) are shown in Figure 9B. Clearly, the 10 ng input has much lower fidelity (correlation) with the standard for both MessageAmp II and Affymetrix.

As done previously, for comparison we used the standard deviation of the differences between two replicates of the standard (0.15 and 0.17). The results suggest that fidelity decreases with decreasing RNA input, but that at the 100 ng input in particular, the correlation (and standard deviation of differences) is better for the MessageAmp II protocol than for the Affymetrix standard protocol (0.3 vs. 0.55). At 10 ng inputs fidelity is lower. Nevertheless, the signal correlation indicates that meaningful data can still be obtained (0.91 vs. 0.93). Again, in some cases two rounds of amplification are not needed (e.g., for 100 ng inputs with the MessageAmp II procedure), and in this case a comparison between the 1000 ng standard and a 100 ng single-round amplification is significantly more accurate and applicable.

Conclusion
The critical criteria for microarray sample preparation methods are that they must be highly reproducible, conserve the original mRNA profile, and be applicable across a reasonable range of total RNA inputs. While it is important that both amplification and labeling methods conserve the original expression profile, we realize that all methods will have some inherent bias [8]. Some of this bias can be minimized with a robust protocol and optimized reagents. The MessageAmp II aRNA Amplification Kit provides such a protocol and a novel reverse transcriptase (ArrayScript) for this purpose. We monitored correlation and standard deviation across two important factors, amplification protocol and RNA input. The data indicate that the MessageAmp II protocol is highly reproducible and can be meaningfully compared to the Affymetrix standard protocol and the original MessageAmp protocol (see sidebar, MessageAmp II Concordance with Original MessageAmp Kit). We hope to have provided some guidelines for amplification expectations, and confidence that a shortened IVT incubation time that produces enough cRNA for a microarray will not significantly affect expression data (compared to longer IVT times). The above examples have been an important part of our research, production, and quality control development for procedures that we use to maintain the highest quality reagents for microarray analysis. We plan to continue such studies both for new product development and for improving our line of RNA amplification products.

References
1. The MessageAmp II aRNA Kit is covered by US patents 5256555, 6586218, 6586219, and a US patent pending.
2. Van Gelder, R.N., von Zastrow, M.E., Yool, A., Dement, W.C., Barchas, J.D. and Eberwine, J.H. Amplified RNA synthesized from limited quantities of heterogeneous cDNA. Proc. Natl. Acad. Sci. USA 87:1663-1667 (1990).
3. Eberwine, J., Yeh, H., Miyashiro, K., Cao, Y., Nair, S., Finnell, R., Zettel, M., Coleman, P. Analysis of gene expression in single live neurons. Proc. Natl. Acad. Sci. USA 89:3010-3014 (1992).
4. Bolstad, B.M., Irizarry, R.A., Astrand, M. and Speed, T.P. A comparison of normalization methods for high-density oligonucleotide array data based on variance and bias. Bioinformatics 19: 185-193 (2003).
5. Irizarry, R.A., Bolstad, B.M., Collin, F., Cope, L.M., Hobbs, B. and Speed, T.P. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research 31(4):e15 (2003).
6. Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U., Speed, T.P. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4:249-264 (2003).
7. Wilson, C.L., Pepper, S.D., Hey, Y., Miller, C.J. Amplification protocols introduce systematic but reproducible errors into gene expression studies. Biotechniques 36(3):498-506 (2004).
8. Gold, D., Coombes, K., Medhane, D., Ramaswamy, A., Ju, Z., Strong, L., Koo, J.S., Kapoor, M. A comparative analysis of data generated using two different target preparation methods for hybridization oligonucleotide microarrays. BMC Genomics 5:2 (2004).


 


© 2008 Advantage Business Media All rights reserved.