Bioscience Technology  

Capital Genomix

9290 Gaither Rd.
Gaithersburg, MD, 20877






Expression Profiling Technology Facilitates Unbiased Differential Expression Analysis And Correlative Gene Discovery

by J.M. Ray, T. Heiland, I. Carey, A.M. Smith, L. Do and W.G. Hearl


Figure 1: Flowchart of GS320 Process
Gene expression profiling has become a valuable tool and standard protocol in most laboratories. A variety of technologies have evolved for differential gene expression profiling. Microarray hybridization,(1-2) reverse transcriptase PCR (RT-PCR),(3) in vitro transcription and Northern blotting require knowledge of the sequences to be analyzed. RT-PCR, in vitro transcription and Northern blotting are low-throughput techniques, while microarrays, MPSS(4) and SAGE(5) are used to screen a large number of genes. MPSS and SAGE require global analysis and months of work before producing interpretable data and are limited by the small amount of sequence information obtained for each gene. Differential display,(6) representational difference analysis (RDA)(7) and amplified fragment length polymorphism (AFLP)(8) require cloning and sequencing to identify differentially expressed genes.

The GeneSystem320™ (GS320) gene expression analysis system was developed to generate a comprehensive, high-confidence picture of the state of gene expression in a cell type or sample in a short period of time. Based upon technology developed at the MD Anderson Cancer Center (University of Texas) by Dr. Michael MacLeod, the system enables the analysis of virtually any gene or set of genes using a defined set of reagents and a panel of 320 PCR primers. It follows a simple protocol that yields useful data on mRNA expression profiles within a week in the lab.

The original design parameters included the potential for analyzing any chosen gene without extensive development of new primers or alteration of reaction conditions; sensitivity; direct assessment of the statistical significance of measurements; and high potential for gene discovery. The original description of a method satisfying these parameters was published in 1999,(9) and a U.S. patent covering the technique was issued in 2001. The technology is currently available in kit or service form from Capital Genomix, Inc.

Technology overview
The strategy employed in GS320 is to construct a library of short, defined fragments, called RAGEtags, one for each actively transcribed mRNA present in a population. These RAGEtags are used as templates in subsequent PCR reactions. Primers are selected combinatorially from a small, predesigned set so that they specifically amplify the RAGEtag of a chosen gene based on the sequence of the RAGEtag fragment. The relative concentration of the corresponding mRNA in the original population is inferred from the level of product formation after PCR amplification. Amplified products that are in GenBank are identified immediately using dedicated software that predicts the size of the amplimers. Amplimers that do not match genes in the database are flagged as unknowns. As with other quantitative PCR methods, care must be taken to ensure that product formation is linear with input mRNA. GS320 results do not routinely give an absolute concentration of a specific message; instead, relative amounts of any given mRNA can be compared between biological samples.

RAGEtag fragment libraries are generated in two orientations, denoted as A/B and B/A, to ensure inclusion of virtually every actively transcribed gene. The orientation refers to the relative location of the 3' most paired Hsp92II and DpnII restriction enzymes' four base recognition sites to each other and the poly(A) tail sequence on the reverse transcribed mRNA (Figure 2). The 4-base recognition sequences occur at random approximately once every 256 bp. This results in RAGEtags that average about 128 bp in length; over 90% of RAGEtags are smaller than 500 bp.

The fragment library of cDNAs isolated by digestion with Hsp92II and DpnII, followed by ligation to the A and B linkers, is referred to as the A/B RAGEtag library. Because these enzymes are frequent cutters, most mRNAs contain at least one of each kind of site; empirically, we find that about 5-10% of known genes lack one or both kinds of restriction site, or have closely spaced or overlapping sites that do not give a specific amplimer. However, about half of all mRNAs do not have a DpnII site to the right of the 3'-most Hsp92II site, and therefore the A/B RAGEtag fragment library prepared using the protocol outlined in Figure 2 will not represent this half of the transcriptome. The second orientation of RAGEtag fragment library addresses cDNA species in which the Hsp92II recognition site is the one closest to the poly(A) tail sequence. A B/A RAGEtag fragment library is obtained by reversing the order of restriction digestion. Use of both libraries therefore allows the researcher to assay approximately 90-95% of the transcriptome.
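
The orientation rule described above can be sketched in a few lines of Python. This is only an illustration, not part of the GS320 software: the function name is hypothetical, the input is assumed to be the sense-strand cDNA written 5' to 3' with the poly(A) tail at the right, and the recognition sequences CATG (Hsp92II) and GATC (DpnII) are the standard ones for these enzymes rather than values stated in this article. Closely spaced or overlapping sites are not handled.

```python
from typing import Optional

HSP92II_SITE = "CATG"  # enzyme "A" (assumed standard recognition sequence)
DPNII_SITE = "GATC"    # enzyme "B" (assumed standard recognition sequence)

def library_orientation(cdna: str) -> Optional[str]:
    """Guess which RAGEtag library should represent a transcript.

    Returns "A/B" when a DpnII site lies 3' of the 3'-most Hsp92II site,
    "B/A" when an Hsp92II site lies 3' of the 3'-most DpnII site, and
    None when either kind of site is missing.
    """
    seq = cdna.upper()
    last_a = seq.rfind(HSP92II_SITE)  # 3'-most "A" site, -1 if absent
    last_b = seq.rfind(DPNII_SITE)    # 3'-most "B" site, -1 if absent
    if last_a == -1 or last_b == -1:
        return None
    return "A/B" if last_b > last_a else "B/A"

# Toy sequences, invented for illustration only
print(library_orientation("AAACATGTTTTGATCAAAA"))  # A/B
print(library_orientation("AAAGATCTTTTCATGAAAA"))  # B/A
```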

The steps used in GS320 RAGEtag fragment library generation are outlined in Figure 1 and illustrated in Figure 2. Total RNA is isolated from cells or tissue samples using standard protocols. cDNA is synthesized from the mRNA using reverse transcriptase and a biotinylated oligo(dT) primer. The anchored cDNA is immobilized on streptavidin magnetic beads. Two arbitrary genes are diagrammed in Figure 2. To generate an A/B fragment library, the immobilized cDNAs are cleaved with enzyme "A", leaving only the 3'-most "A" fragment attached to the beads, and the cleaved 5'-fragments are washed off and discarded. The immobilized fragments are then cleaved from the beads with enzyme "B", along with other 3'-fragments, and collected. Note that at this point in the preparation, only the RAGEtag fragments contain sticky ends derived from enzyme A; the other fragments of cDNA that contaminate the preparation have "B" restriction cut sites at both ends.

Taking advantage of the unique "sticky" ends left by the "A" and "B" enzymes, the RAGEtags are then ligated to two unique linkers that distinguish the "A" and "B" ends. The linkers are composed of common PCR primer annealing sites and four-base overhangs complementary to the ends created by restriction with either Hsp92II (A-linker) or DpnII (B-linker). These linkers provide common "A" and "B" primer binding sites for subsequent PCR analysis. Currently used linkers are 16 nucleotides in length. In generation of A/B RAGEtag fragment libraries, the "A" end linker is biotinylated so that binding to streptavidin magnetic beads can purify the RAGEtag fragments in the next step, eliminating the unwanted cDNA fragments that contain only "B" ends. For generation of B/A RAGEtag fragment libraries, the B linker is biotinylated. The library of immobilized RAGEtag fragments can then be used as template in reactions designed to amplify particular gene products. The production of a RAGEtag fragment library from cDNA can be accomplished in less than five hours, and multiple libraries can easily be prepared in parallel. One RAGEtag fragment library is usually sufficient to analyze 8,000 to 12,000 specific genes.


Figure 2: Illustration of GeneSystem320 process.
Specificity in the PCR amplification step is provided by using primers that extend past the "A" or "B" restriction sites into the gene-specific portion of the isolated RAGEtag fragments. The "A"-end primers contain the "A"-end linker sequence, including the 4-base restriction enzyme recognition sequence, and extend 4 nucleotides into the gene-specific portion of the RAGEtag. The set of "B"-end primers contain the "B"-end linker sequence and extend 3 nucleotides into the RAGEtag fragments from the opposite direction. Thus, the total set of primers needed for GS320 analysis is 4^4, or 256, "A"-end primers and 4^3, or 64, "B"-end primers, for a total of 320.
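
As a quick check on the primer counts, the gene-specific extensions can be enumerated directly. The short Python sketch below generates only the 4-nucleotide and 3-nucleotide extensions; the linker sequences themselves are not given in this article, so complete primers are not constructed, and the function name is ours.

```python
from itertools import product

BASES = "ACGT"

def extensions(length):
    """All possible gene-specific extensions of the given length."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]

a_extensions = extensions(4)  # one per "A"-end primer
b_extensions = extensions(3)  # one per "B"-end primer

print(len(a_extensions), len(b_extensions),
      len(a_extensions) + len(b_extensions))  # 256 64 320
```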

For any particular known gene, the sequence of the corresponding RAGEtag can be determined from GenBank mRNA entries, and the sequence of the specific "A"-end and "B"-end primers that will amplify this RAGEtag can be inferred. Thus, PCR amplification of the RAGEtag fragment library with the specific "A"- and "B"-end primers should give rise to a product (the amplimer) of a known size. This size is defined by the distance between the 3'-most "A" restriction site in the gene's cDNA and the closest "B" restriction site in the 3' direction (A/B orientation). In the B/A orientation, the size is defined by the distance between the 3'-most "B" restriction site and the closest "A" restriction site in the 3' direction. These sizes can be predicted from the mRNA sequence.
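
A minimal sketch of this prediction for the A/B orientation follows. It is our illustration rather than the vendor's algorithm, and it rests on several assumptions: the standard recognition sequences CATG (Hsp92II) and GATC (DpnII); a sense-strand mRNA sequence written 5' to 3'; a "B" extension read inward from the B end (the reverse complement of the three sense-strand bases preceding the site); and a reported span that runs from the first base of the "A" site to the last base of the "B" site, without the linker lengths that a real amplimer would include. Transcripts with fewer than 7 bases between the two sites are skipped as too closely spaced.

```python
from typing import NamedTuple, Optional

A_SITE = "CATG"  # Hsp92II (assumed standard recognition sequence)
B_SITE = "GATC"  # DpnII (assumed standard recognition sequence)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

class Prediction(NamedTuple):
    a_ext: str  # 4 nt read inward from the "A" site
    b_ext: str  # 3 nt read inward from the "B" site
    span: int   # site-to-site length in bp, linkers not included

def predict_ab_amplimer(mrna: str) -> Optional[Prediction]:
    """Predict primer extensions and span for the A/B orientation."""
    seq = mrna.upper()
    a_pos = seq.rfind(A_SITE)  # 3'-most "A" site
    if a_pos == -1:
        return None
    b_pos = seq.find(B_SITE, a_pos + len(A_SITE))  # closest "B" site 3' of it
    if b_pos == -1:
        return None
    insert = seq[a_pos + len(A_SITE):b_pos]  # gene-specific portion
    if len(insert) < 7:  # assumption: too short for both extensions
        return None
    a_ext = insert[:4]
    b_ext = insert[-3:].translate(_COMPLEMENT)[::-1]
    span = b_pos + len(B_SITE) - a_pos
    return Prediction(a_ext, b_ext, span)

# Toy sequence, invented for illustration only
print(predict_ab_amplimer("TTCATGACGTAAACCAGATCTTAAAA"))
```

The B/A case would follow the same pattern with the roles of the two sites exchanged.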

The GeneSystem320 Database Search Engine was developed to provide mRNA sequence data specific to GS320. The program can be used in two ways: to predict the pair of primers and the RAGEtag fragment library orientation to be used for amplification of each specific gene, and to determine the identity of GS320 amplimers after combinatorial GS320 analysis. The program utilizes sequence data collected from the NCBI Entrez GenBank and UniGene databases, which are processed to extract the data appropriate to GS320. Because the data depend on the integrity of the 3' end of the mRNA, the program includes a validation system to assess sequences for the likelihood that the last nucleotides in the sequence do in fact represent the 3' end. A verification process is employed to select a representative sequence from each UniGene cluster that is most likely to correspond to a mature mRNA including the 3' end. The database includes data for all eukaryotic mRNAs. The database is updated monthly to add new and revised GenBank sequences and to maintain current UniGene data.
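
The second use, identifying an amplimer from the primer pair and the observed size, amounts to a keyed lookup. The Python sketch below shows one way such a lookup could be organized; it does not describe the actual GeneSystem320 Database Search Engine, whose schema is not given here. The record layout, function names, gene labels and the 2 bp size tolerance are all hypothetical.

```python
from collections import defaultdict

def build_index(records):
    """Index predicted amplimers by (orientation, A extension, B extension).

    records: iterable of (gene, orientation, a_ext, b_ext, size) tuples,
    a hypothetical layout chosen for this sketch.
    """
    index = defaultdict(list)
    for gene, orientation, a_ext, b_ext, size in records:
        index[(orientation, a_ext, b_ext)].append((gene, size))
    return index

def identify(index, orientation, a_ext, b_ext, observed_size, tolerance=2):
    """Return genes whose predicted size matches, or ["unknown"]."""
    candidates = index.get((orientation, a_ext, b_ext), [])
    hits = [gene for gene, size in candidates
            if abs(size - observed_size) <= tolerance]
    return hits or ["unknown"]

# Invented records for illustration only
index = build_index([
    ("geneX", "A/B", "ACGT", "TGG", 118),
    ("geneY", "A/B", "ACGT", "TGG", 240),
    ("geneZ", "B/A", "TTTT", "AAA", 95),
])
print(identify(index, "A/B", "ACGT", "TGG", 119))  # ['geneX']
print(identify(index, "A/B", "ACGT", "TGG", 300))  # ['unknown']
```

A band that returns "unknown" corresponds to the case described above in which an amplimer does not match any gene in the database.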

GS320 in practice
The GS320 technology has been used to examine differential gene expression between normal breast tissue and malignant human breast cancer cell lines. Directed analysis of the tissue inhibitor of metalloproteinase genes (TIMPs) is used as an example of the types of data that can be generated using the technology. Analyses of reactions on polyacrylamide gels with SYBR® Green fluorescent staining are shown in Figure 3. PCR reactions were run with pairs of GS320 primers using RAGEtag fragment libraries prepared from normal breast tissue and malignant breast cell lines of low and high metastatic potential as template, and reaction products were analyzed in adjacent lanes. RAGE primers specific for TIMP-1, TIMP-2 and TIMP-3 (Figure 3B), genes whose expression is modulated in the tissue and cell lines, were used in the reactions. In each case, an arrow pointing to the appropriate band indicates the predicted amplimer. In the case of the ribosomal protein control genes S26 and L29, major bands of equal intensity corresponding in size to the expected product were seen with all three templates (Figure 3A). In addition, differential expression was observed for all three TIMP genes. The TIMP-1 and TIMP-2 genes are slightly down-regulated in the weakly metastatic breast cancer line relative to normal tissue and highly metastatic cell lines, while TIMP-3 is down-regulated in both cancer cell lines relative to the normal breast tissue. These results have been confirmed using RT-PCR and Affymetrix array analysis (data not shown). Expression profiles have been validated with at least three different RNA, cDNA and RAGEtag fragment library preparations.


Figure 3. Analysis of gene expression in human breast tissue and breast cancer cell lines by GS320. GS320 PCR analysis was performed, using GS320 fragment libraries prepared from normal breast tissue (normal), or low and high metastatic potential human cell lines (low and high). Reactions contained primers chosen to amplify specific genes and the expected amplimers are indicated in the Figure by arrows. A. Control genes. B. Directed analysis.
GS320 facilitates gene discovery
In the small number of directed analyses performed to examine the expression profiles of the TIMP genes, other amplimers with a variety of molecular weights are seen (Figure 3B). Some of these bands represent known genes (e.g., HIG-2), while other amplimers correspond to genes that only have homology in GenBank to clones or open reading frames (unknowns). These provide a rich and easy source for gene discovery.

This study has been expanded for further gene discovery. Combinatorial analysis, where one of the 5' "B" GS320 primers is used pairwise with all 64 3' or "A" GS320 primers, enables searches for differentially expressed genes without bias. In just over 300 PCR reactions, nine differentially expressed amplimers were found that map to putative genes or clones of unknown function (data not shown). Six of the newly identified genes were found not to populate the U133A or U133B Affymetrix array.

Conclusions
To completely assay the entire transcriptome, each GS320 expression library would have to be amplified with each possible combination of GS320 primers. The total number of unique amplification reactions is therefore (2 libraries) × (256 "A"-end primers) × (64 "B"-end primers), or 32,768. Assuming the total number of genes in a typical mammalian genome is in the range of 60,000, each unique GS320 amplification reaction will produce about 2 specific amplimers, on average. However, in most cases the size of the amplimers produced from two different genes will be distinguishable using standard electrophoretic separation techniques. In addition, GS320 may be made high throughput using fluorescent amplification primers. After amplification, reactions run with different fluorescent dyes may be pooled and analyzed via capillary electrophoresis.
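
The arithmetic behind these figures can be reproduced in a few lines of Python. The gene count of 60,000 is the assumption stated above, and the variable names are ours.

```python
libraries = 2        # A/B and B/A orientations
a_primers = 4 ** 4   # 256 "A"-end primers
b_primers = 4 ** 3   # 64 "B"-end primers

reactions = libraries * a_primers * b_primers
assumed_genes = 60_000  # assumption stated in the text
amplimers_per_reaction = assumed_genes / reactions

print(reactions)                         # 32768
print(round(amplimers_per_reaction, 2))  # 1.83
```

The average of roughly 1.8 is what the text rounds to 2 specific amplimers per reaction.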

In most cases a GS320 reaction designed to assay a particular gene-of-interest will simultaneously provide information on one or more other transcripts that may or may not be previously identified. This feature gives GS320 an enhanced potential for gene discovery. Since the average GS320 amplimer is approximately 128 bp in length, sequencing unknown amplimers after gel purification almost always gives enough information to uniquely identify corresponding ESTs.

While GeneSystem320 does not provide absolute quantitation of differentially expressed genes, this technology provides a rapid, comprehensive and reproducible system for differential gene expression profiling. The method facilitates quick analysis, confirmation and identification of results of both known and novel genes.

About the authors
J.M. Ray, T. Heiland, I. Carey, A.M. Smith, L. Do and W.G. Hearl are all part of the Capital Genomix Inc. research team.
More information is available from Capital Genomix Inc.

References
1. Schena, M., Shalon, D., Davis, R.W. and Brown, P.O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270:467-470 (1995).
2. Lockhart, D.J. et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol. 14:1675-1680 (1996).
3. Myers, T.W. and Gelfand, D.H. Reverse transcription and DNA amplification by a Thermus thermophilus DNA polymerase. Biochemistry 30:7661-7666 (1991).
4. Brenner, S. et al. In vitro cloning of complex mixtures of DNA on microbeads: Physical separation of differentially expressed cDNAs. Proc. Natl. Acad. Sci. 97:1665-1670 (2000).
5. Velculescu, V. et al. Serial analysis of gene expression. Science 270:484-487 (1995).
6. Liang, P. and Pardee, A.B. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257:967-970 (1992).
7. Hubank, M. and Schatz, D.G. Identifying differences in mRNA expression by representational difference analysis of cDNA. Nucleic Acids Res. 22:5640-5648 (1994).
8. Bachem, C.W.B. et al. Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: Analysis of gene expression during potato tuber development. Plant J. 9:745-753 (1996).
9. Wang, A. et al. Rapid analysis of gene expression (RAGE) facilitates universal expression profiling. Nucleic Acids Res. 27:4609-4618 (1999).





Advantage business Media © Copyright 2008 Advantage Business Media