Transcription Regulatory Network
What is a Transcription Regulatory Network?
Transcriptional regulation of gene expression is essential to the survival and development of all organisms. For example, bacteria living in an environment that is high in lactose but lacks glucose (their preferred “food source”) must alter the expression of certain genes to produce a protein that breaks lactose down into glucose and galactose in order to survive and reproduce. Similarly, a cell in the human body must be able to respond to environmental stimuli such as harmful toxins by controlling its gene expression, for example by shutting off the production of certain proteins so that the cell does not proceed to cell division. Cellular specialization in a multicellular organism also requires genes to be turned “on” and “off” in a coordinated way at specific times, so that different cells become specialized according to the proteins they synthesize. These regulatory tasks are carried out by collections of regulatory proteins and their interactions with specific sequences in the promoter regions of target genes. A “Transcriptional Regulatory Network” describes these regulatory proteins and their interactions.
Figure: A Transcription regulatory network. (1)
Transcriptional regulatory networks in viruses and prokaryotes
Transcription regulatory proteins, often known as “activators” (which activate gene transcription) or “repressors” (which repress gene transcription), function even in the relatively simple genomes of viruses and prokaryotes. For example, the bacteriophage lambda alters its gene expression depending on whether it is entering the lytic cycle or the lysogenic cycle.
Although the phage lambda genome is only 48,502 base pairs long, the regulatory network that allows the phage to choose and follow one of the two cycles is already quite complex and has yet to be fully elucidated. (2)
Figure: The phage lambda genome. (2)
Figure: The phage lambda regulatory network. (2)
In bacteria, transcription regulatory networks have been most extensively studied in E. coli. As mentioned above, the bacteria need to metabolize lactose in the absence of glucose. In E. coli, this process is regulated by the lac operon, a stretch of DNA that contains a promoter region, an operator region and three genes. The lac operon is perhaps the simplest transcription regulatory mechanism: it involves the operator sequence and a repressor protein, which is encoded by the lacI gene and binds to the operator region. The presence of lactose inactivates the repressor and thus allows the three genes to be transcribed. Still, this mechanism is only one part of a larger E. coli metabolic regulatory network, which may consist of as many as 149 genes, including genes that code for 16 regulatory proteins and 73 enzymes! (3)
Figure: The transcription regulatory network for metabolism in E. coli. (3)
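The repressor logic of the lac operon described above can be written down as a tiny Boolean model. This is only a sketch of the single mechanism mentioned here (the repressor and the operator); the function name and its arguments are illustrative assumptions, and the rest of the metabolic regulatory network is ignored.

```python
def lac_genes_transcribed(lactose_present: bool, repressor_functional: bool = True) -> bool:
    """Boolean sketch of lac operon repression (hypothetical helper).

    The repressor blocks transcription while it is bound to the operator.
    Lactose inactivates the repressor, so it can no longer bind.
    """
    repressor_bound = repressor_functional and not lactose_present
    return not repressor_bound


for lactose in (False, True):
    state = lac_genes_transcribed(lactose)
    print(f"lactose present: {lactose} -> genes transcribed: {state}")
```

Setting repressor_functional=False mimics a cell without a working lacI product, in which case the model transcribes the genes whether or not lactose is present.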
Transcription regulation networks in eukaryotes
As you can imagine, transcription regulatory networks in eukaryotes are even more complicated. Besides promoters, activators and repressors, eukaryotic transcription involves additional regulatory components, including inducers and cis-regulatory elements such as enhancers. The best-studied eukaryotic model for transcriptional regulation is the yeast Saccharomyces cerevisiae. Even in this “simple” eukaryote, more than 200 regulatory proteins control the transcription of its >6000 genes. (4) A variety of experimental and computational tools are used to map these transcription regulatory networks.
Figure: An artistic rendition of the yeast gene regulatory network. (5)
Deciphering Transcription Regulatory Networks
In deciphering transcription regulatory networks, we want to know how each gene is controlled by transcription regulation. We can think of this as a reverse engineering problem. We see the end results of the working of a system (for example, we can observe the physiological changes in the development of an organism), but we don’t know how the system works. Since the expression of a gene is largely determined by transcription factors (TFs) and their binding sites, we also want to know which gene is regulated by which TF. Developments in genomic tools and computational algorithms have allowed us to better study these questions, illuminating the workings of transcription regulatory networks.
Gene expression analysis
To understand transcription regulatory networks, we first have to identify which genes are transcribed. Microarray expression analysis allows us to define the targets of transcription factors by deleting or overexpressing TF-encoding genes in the cells under study. Then, using singular value decomposition analysis, we can find patterns in the expression data and try to reconstruct the actions of the TFs. (For more details on this approach, see 6, 7, 8.) Alternatively, researchers have used Bayesian belief networks to analyze expression data. (For experiments using this approach, see 9, 10.) Still, microarray expression data alone cannot give the whole picture of a transcription regulatory network. First, the observed changes in gene expression could be due to secondary effects that are not directly attributable to the TF. Second, each gene can be activated or repressed by a number of TFs, so TF functions can be redundant, and the loss of one TF may be compensated for and leave no trace in the expression data. Third, transcription regulation can be condition-dependent. For example, depending on the growth conditions, a TF may be inactive, or it may regulate a different number of genes or a different set of genes.
Figure: Transcription regulation is condition-specific. (11)
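As a rough illustration of the first step described above (comparing a TF-deletion strain with wild type to find candidate targets), the sketch below flags genes whose expression changes by at least two-fold. It is deliberately simpler than the singular value decomposition and Bayesian network analyses cited in the text; the gene names, intensities, threshold and function name are all made up for the example.

```python
import math


def candidate_targets(wild_type, deletion, min_abs_log2=1.0):
    """Return {gene: log2(deletion / wild_type)} for genes whose expression
    changes by at least min_abs_log2 in the TF-deletion strain."""
    hits = {}
    for gene, wt_level in wild_type.items():
        del_level = deletion.get(gene)
        if del_level is None or wt_level <= 0 or del_level <= 0:
            continue  # no usable measurement in both strains
        log_ratio = math.log2(del_level / wt_level)
        if abs(log_ratio) >= min_abs_log2:
            hits[gene] = log_ratio
    return hits


# Made-up intensities for four hypothetical genes.
wild_type = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0, "geneD": 80.0}
tf_deletion = {"geneA": 25.0, "geneB": 210.0, "geneC": 200.0, "geneD": 0.0}

hits = candidate_targets(wild_type, tf_deletion)
for gene, log_ratio in sorted(hits.items()):
    print(gene, round(log_ratio, 2))
```

A negative log ratio (expression drops when the TF is deleted) is what we would expect if the TF activates the gene, and a positive one if it represses the gene. As the caveats above make clear, a gene on this list may still be responding through a secondary effect rather than through direct regulation.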
ChIP-chip
The ChIP-chip technology is also used to study transcription regulatory networks. The ChIP-chip methodology is described in detail elsewhere, but briefly: it involves chromatin immunoprecipitation of the DNA fragments bound by the transcription factor of interest, followed by hybridization of those fragments to a DNA microarray. The following figure shows a schematic of ChIP-chip analysis compared with genome-wide expression analysis.
Figure: (a) cDNA microarray genome-wide expression analyses. (b) ChIP-chip. (4)
The advantage of ChIP-chip data is that they provide a direct measure of TF-DNA binding, thus eliminating the possible secondary effects seen in regular gene expression analysis. For example, ChIP-chip studies have revealed that although the DNA sequence targeted by a certain TF may occur at many sites in the genome, the TF actually binds to only a subset of these sites in vivo. The study of the transcriptional activator Rap1 by Lieb et al. is a classic example. They found that although the consensus binding sequence of Rap1, ACACCCRYACAYM (remember HWA2?), matches many sites in the yeast genome, Rap1 preferentially binds only at sites close to promoters. (12) Thus, ChIP-chip allows for a “location” analysis of gene regulation by TFs.
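To see why a consensus such as ACACCCRYACAYM can match many sites, it helps to expand its IUPAC codes (R = A or G, Y = C or T, M = A or C) and scan a sequence for matches. The sketch below does this with a regular expression; the helper names and the demo sequence are invented for illustration, and only the given strand is scanned.

```python
import re

# IUPAC codes used by this consensus, written as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "M": "[AC]"}

RAP1_CONSENSUS = "ACACCCRYACAYM"


def find_sites(sequence, consensus):
    """Return 0-based start positions of all matches, overlapping ones included."""
    core = "".join(IUPAC[code] for code in consensus)
    # Wrapping the pattern in a lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=" + core + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Invented sequence containing two different instances of the consensus.
demo = "GG" + "ACACCCATACATC" + "TT" + "ACACCCGCACACA" + "G"
print(find_sites(demo, RAP1_CONSENSUS))
```

A scan like this reports every sequence match equally; it cannot tell which of the matched sites are actually bound in vivo, which is the information that ChIP-chip adds.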
ChIP-chip analyses have been used in many studies of transcription regulatory networks, including a seminal study of 106 TFs in yeast by Dr. Richard Young’s lab at MIT. (13) In this study, Young and colleagues obtained the distribution of TF binding, i.e. how many promoter regions each TF binds and how many TFs bind each promoter region. This not only gives us a glimpse into the complexity of the transcription regulatory network, but also allows us to begin constructing the network itself. Still, one caveat is that ChIP-chip data may be condition-dependent as well. For example, a TF in an inactive state may not bind its target promoters. (14) Recent studies have therefore investigated TF binding profiles under different growth conditions. (Reviewed in 14.)
Figure: (left) Number of transcription regulators bound per promoter region. The red circles reflect results from actual data, whereas the white circles represent originally predicted results. (right) Distribution of bound promoter regions per transcription regulator. (13)
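The two distributions described above can be computed directly from a table of binding calls. The following is a minimal sketch that assumes the calls are already available as a mapping from each TF to the set of promoter regions it binds; the TF and promoter names are placeholders, not data from the study.

```python
from collections import Counter

# Hypothetical binding calls: TF -> promoter regions it was found to bind.
binding = {
    "TF1": {"pA", "pB", "pC"},
    "TF2": {"pB"},
    "TF3": {"pB", "pC"},
}


def promoters_per_tf(binding):
    """Number of promoter regions bound by each TF."""
    return {tf: len(promoters) for tf, promoters in binding.items()}


def tfs_per_promoter(binding):
    """Number of TFs bound at each promoter region."""
    counts = Counter()
    for promoters in binding.values():
        counts.update(promoters)
    return dict(counts)


def distribution(counts):
    """Map each count value to the number of items that have that value."""
    return dict(Counter(counts.values()))


print(promoters_per_tf(binding))
print(tfs_per_promoter(binding))
print(distribution(tfs_per_promoter(binding)))
```

Plotting the output of distribution() for each of the two count tables would give histograms of the kind shown in the figure.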
Promoter elements
Another approach to identifying transcription regulatory targets is to search promoters for the motifs that TFs recognize and bind. The computational algorithms used to search for TF binding sites (TFBSs) have been described in detail elsewhere and will not be repeated here. By identifying the sequence, location and orientation of TF motifs, we can potentially predict how gene expression is regulated. (17) Of course, simply identifying TFBSs in promoter elements is insufficient to decipher transcription regulatory networks: a motif match is only ever probabilistic and, as mentioned above, we do not know whether a matched motif actually functions as a TFBS in cells.
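To make the idea of a probabilistic motif match concrete, the sketch below scores every window of a promoter sequence against a position frequency matrix using log-odds against a uniform background, and records the location and orientation of each hit. The matrix, the threshold and the sequence are invented for illustration; this is one common way of scoring motifs, not necessarily the algorithm used in the work cited here.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif (consensus ACGG).
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
BACKGROUND = 0.25  # assumes all four bases are equally likely
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def log_odds(site):
    """Sum of log2(motif probability / background probability) over positions."""
    return sum(math.log2(column[base] / BACKGROUND)
               for column, base in zip(MOTIF, site))


def scan(sequence, threshold=4.0):
    """Return (start, strand, score) for windows scoring at or above threshold.

    The sequence must contain only the upper-case letters A, C, G and T.
    """
    hits = []
    width = len(MOTIF)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = log_odds(site)
            if score >= threshold:
                hits.append((start, strand, round(score, 2)))
    return hits


promoter = "TTACGGTTCCGTAA"  # invented sequence
hits = scan(promoter)
print(hits)
```

Lowering the threshold admits weaker matches, which illustrates the trade-off in the text: any cutoff is a judgment about probability, and a high-scoring site may still not be bound in the cell.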
Epigenetics
We have to realize that there are epigenetic factors in transcription regulatory networks that cannot be elucidated by looking at gene expression, ChIP-chip and promoter element data. For example, as mentioned above, DNA sequences that match TF motifs may be found at many sites in the genome; however, only a fraction of these motifs are actually bound by TFs. The mechanism that allows TFs to bind selectively to these specific motifs is still unknown, but it has been proposed to involve nucleosome position and histone modification. A recent study by a group at the Harvard Bauer Center for Genomics Research has further linked nucleosome position to TF binding, showing that most functional TFBSs are "devoid of nucleosomes," which makes these motifs accessible to TF binding. (15)
Figure: Transcription factor motifs that are actually bound by transcription factors (i.e. functional) are mostly found in the linker regions of the chromatin, which are more accessible than the nucleosome regions. (15)
Figure: Histone modification can regulate transcription by allowing or disallowing TFs to bind to DNA through acetylation and deacetylation.
Network Features
Besides identifying the regulatory factors in transcription and their interactions with DNA, we also need to organize and assemble this information to produce the full picture of a transcription regulatory network. Although each transcription regulatory network is unique to its participating genes and proteins, transcription regulatory networks share some common architectural features. The following sub-sections focus on what scientists have observed in the yeast Saccharomyces cerevisiae, but one can imagine that similar coordination and network motifs are found in other organisms as well, including humans.
Coordination in regulation
Coming soon
Network motifs
In a transcription regulatory network, certain patterns of interaction among the regulatory proteins recur often. These regulatory patterns are called “network motifs” (not to be confused with DNA binding motifs). We can think of them as the basic units that make up the network architecture. The frequency with which cells use a motif may also reflect evolutionary selection for that regulatory strategy. Common network motifs include:
Autoregulation
In autoregulation, a regulatory protein binds to the promoter of its own gene and thereby regulates its own transcription. The autoregulation motif is found for yeast regulatory proteins ARO80, NRG1, RAP1, RCS1, SMP1, STE12, SUM1, SWI4, YAP6 and ZAP1. (16)
Feedforward Loop (FFL)
In a feedforward loop, a regulatory protein regulates a second regulatory protein, and both proteins bind and regulate a common target gene. In yeast, the Young lab found 39 regulatory proteins involved in 49 FFLs, which control about 240 genes. (16)
Multi-Component Loop
In a multi-component loop, two or more regulatory proteins form a closed regulatory circuit. For example, one regulatory protein can bind to the promoter of the gene for a second regulatory protein and induce its synthesis, and the second protein in turn regulates the first by binding to its promoter. (16)
Regulator Chain
A regulator chain consists of a series of three or more regulatory proteins in which each one regulates the next, like links in a chain. (16)
Single-input Motif
In a single-input motif, one regulatory factor binds to a set of target genes under specific conditions. (16)
Multi-input Motif
On the other hand, in a multi-input motif, a set of regulatory proteins binds together to a set of target genes with the same binding pattern. (16)
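Once binding data have been assembled into a directed graph (regulator to target), several of the motifs above can be found by simple graph searches. The sketch below looks for autoregulation, feedforward loops and the simplest two-protein multi-component loops. It assumes that each regulator is named after its own gene; the toy network and the function names are hypothetical and are not taken from the yeast data.

```python
# Hypothetical toy network: regulator -> set of genes whose promoters it binds.
network = {
    "R1": {"R1", "R2", "g1"},
    "R2": {"g1", "g2"},
    "R3": {"R4"},
    "R4": {"R3"},
}


def autoregulators(network):
    """Regulators that bind the promoter of their own gene."""
    return sorted(reg for reg, targets in network.items() if reg in targets)


def feedforward_loops(network):
    """Triples (X, Y, Z) where X regulates Y and both X and Y regulate Z."""
    loops = []
    for x, x_targets in network.items():
        for y in x_targets:
            if y == x or y not in network:
                continue  # Y must be a different gene that encodes a regulator
            for z in network[y]:
                if z not in (x, y) and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


def two_component_loops(network):
    """Pairs of regulators that each bind the other's promoter."""
    pairs = set()
    for x, targets in network.items():
        for y in targets:
            if y != x and x in network.get(y, ()):
                pairs.add(tuple(sorted((x, y))))
    return sorted(pairs)


print(autoregulators(network))
print(feedforward_loops(network))
print(two_component_loops(network))
```

Counting how often each pattern occurs in a real network, compared with how often it would occur by chance, is what allows statements about which motifs cells favor.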
Summary
References
(1) U.S. Department of Energy Genomics:GTL Program, http://genomicsgtl.energy.gov.
(2) Dodd, Ian B., Shearwin, Keith E., and Egan, J. Barry. Revisited gene regulation in bacteriophage lambda. 2005. Current Opinion in Genetics & Development. 15 (2): 145-152.
(3) Covert, Markus W., and Palsson, Bernhard O. Transcriptional Regulation in Constraints-based Metabolic Models of Escherichia coli. 2002. J. Biol. Chem. 277(31): 28058-28064.
(4) Wyrick, John J., and Young, Richard A. Deciphering gene expression regulatory networks. 2002. Current Opinion in Genetics & Development. 12:130-136.
(5) http://array.mbb.yale.edu/yeast/transcription/
(6) Alter, O., Brown, P.O., and Botstein, D. Singular value decomposition for genome-wide expression data processing and modeling. 2000. PNAS. 97:10101-6.
(7) Holter, N.S., Maritan, A., Cieplak, M., Fedoroff, N.V., and Banavar, J.R. Dynamic modeling of gene expression data. 2001. PNAS. 98:1693-98.
(8) Holter, N.S., Mitra, M., Maritan, A., Cieplak, M., Banavar, J.R. and Fedoroff, N.V. Fundamental patterns underlying gene expression profiles: simplicity from complexity. 2000. PNAS. 97: 8409-8414.
(9) Gifford, D.K. Blazing pathways through genetic mountains. 2001. Science. 293:2049-51.
(10) Friedman, N., Linial, M., Nachman, I., and Pe'er, D. Using Bayesian networks to analyze expression data. 2000. Journal of Computational Biology. 7:601-620.
(11)
(12) Lieb, J.D., Liu, X., Botstein, D., and Brown, P.O. Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. 2001. Nature Genetics. 28:327-334.
(13) Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I., Zeitlinger, J., Jennings, E.G., Murray, H.L., Gordon, D.B., Ren, B., Wyrick, J.J., Tagne, J., Volkert, T.L., Fraenkel, E., Gifford, D.K., and Young, R.A. Transcriptional Regulatory Networks in Saccharomyces cerevisiae. 2002. Science. 298:799-804.
(14) Chua, G., Robinson, M.D., Morris, Q., and Hughes, T.R. Transcriptional networks: reverse-engineering gene regulation on a global scale. 2004. Current Opinion in Microbiology. 7:638-646.
(15) Yuan, G.C., Liu, Y.J., Dion, M.F., Slack, M.D., Wu, L.F., Altschuler, S.J., and Rando, O.J. Genome-Scale Identification of Nucleosome Positions in S. cerevisiae. 2005. Science. 309: 626-630.
(16) http://web.wi.mit.edu/young/regulator_network/ (Website in support of: Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I., Zeitlinger, J., Jennings, E.G., Murray, H.L., Gordon, D.B., Ren, B., Wyrick, J.J., Tagne, J., Volkert, T.L., Fraenkel, E., Gifford, D.K., and Young, R.A. Transcriptional Regulatory Networks in Saccharomyces cerevisiae. 2002. Science. 298:799-804.)
(17) Beer, Michael A., and Tavazoie, Saeed. Predicting gene expression from sequence. 2004. Cell. 117(2):185-198.