The most detailed molecular information about the transcription cycle is available in bacterial systems. The synthesis of RNA is initiated at the promoter sequence by the enzyme RNA polymerase. A single RNA polymerase type is responsible for the synthesis of messenger, transfer, and ribosomal RNAs.
When isolated from bacteria, prokaryotic RNA polymerase has two forms: the core enzyme and the holoenzyme. The core enzyme is a tetramer whose composition is given as α₂ββ′ (two alpha subunits, one beta subunit, and one beta‐prime subunit). Core RNA polymerase is capable of faithfully copying DNA into RNA but does not initiate at the correct site in a gene. That is, it does not recognize the promoter specifically. Correct promoter recognition is the function of the holoenzyme form of RNA polymerase.
The RNA polymerase holoenzyme contains another subunit, σ (sigma), in addition to the subunits found in the core enzyme. Holoenzyme, α₂ββ′σ, is capable of correct initiation at the promoter region of a gene. Sigma thus must be involved in promoter recognition. Sigma subunits are related but distinct in different forms of RNA polymerase holoenzyme. These specialized σ subunits direct RNA polymerase to promoter sequences for different classes of genes. For example, bacteria exposed to high temperatures synthesize a set of protective proteins called heat‐shock proteins. The genes for the heat‐shock proteins have special promoter sequences that are recognized by an RNA polymerase holoenzyme with a specific σ subunit. The σ discussed here is the major σ of the common bacterium E. coli, about which most is known.
RNA polymerase holoenzyme starts by recognizing the promoter of a gene. The promoter isn't copied into RNA, but it is, nonetheless, an important piece of genetic information. The information in a promoter was determined by lining up a large number of promoters and counting how many times a particular base appeared at a given position in the various promoter sequences. The consensus sequence is given by the statistically most probable base at each point—the bases that appear most often in the promoter collection. Very few, if any, naturally occurring promoters match the consensus sequence exactly, but the “strength” of a promoter (how actively RNA polymerase initiates at it) correlates well with the degree of consensus match. For example, the promoters of genes for ribosomal RNA match the consensus well, while the promoters for the mRNA encoding some regulatory proteins match the consensus poorly. This correlates with the relative amounts of each gene product that are needed at any one time: many ribosomes, and only a few regulatory proteins.
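The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `consensus` and `match_score` are hypothetical, the input sequences are assumed to be already aligned and of equal length, and the toy hexamers are made up for the example rather than taken from real promoters.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent base at each position of aligned sequences."""
    if not aligned_sequences:
        raise ValueError("need at least one sequence")
    length = len(aligned_sequences[0])
    if any(len(s) != length for s in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    # zip(*...) walks the alignment column by column.
    for column in zip(*aligned_sequences):
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


def match_score(sequence, consensus_sequence):
    """Count the positions at which a sequence agrees with the consensus."""
    return sum(a == b for a, b in zip(sequence, consensus_sequence))


# Made-up hexamers for illustration only; not real promoter data.
toy_promoters = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATAAA"]
toy_consensus = consensus(toy_promoters)
toy_scores = [match_score(p, toy_consensus) for p in toy_promoters]
```

In this toy set only the first sequence matches the consensus at every position, which mirrors the point that few natural promoters match exactly. A simple match count is a crude stand-in for promoter strength; it treats every position as equally important.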
The consensus sequence for an E. coli promoter has two conserved regions near positions ‐35 and ‐10 relative to the transcription start site. That is, the template‐directed synthesis of RNA begins about 35 base pairs downstream of the first consensus region and about ten base pairs downstream of the second. The ‐35 consensus is:
TTGACA
The ‐10 consensus is:
TATAAT
Two points about the consensus are important. First, not all bases in the consensus are conserved to the same degree. The bases marked with bold type and underlined are more conserved than the others, and the ‐10 region is more conserved overall than is the ‐35 region. Second, the promoter sequence is asymmetrical; that is, it reads differently in one direction than in the other. (Compare this to the recognition sequence for the restriction enzyme BamHI, GGATCC, which reads the same on both strands.) This asymmetry means that RNA polymerase gets directional information from the promoter in addition to information about the starting point for transcription.
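The asymmetry argument can be checked mechanically: a site is symmetrical in this sense when it equals its own reverse complement. The sketch below is illustrative only. The helper names are hypothetical, and the hexamers TTGACA and TATAAT are the commonly cited E. coli ‐35 and ‐10 consensus sequences, used here as an assumption.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True when a site reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


bamhi_symmetric = is_palindromic("GGATCC")     # BamHI recognition site
minus10_symmetric = is_palindromic("TATAAT")   # assumed -10 consensus
minus35_symmetric = is_palindromic("TTGACA")   # assumed -35 consensus
```

The BamHI site is its own reverse complement, so it carries no orientation. Neither promoter hexamer is, which is why a bound polymerase can tell which way to transcribe.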
The transcription process
RNA polymerase moves in only one direction from a promoter, and only one strand of DNA is used as a template at any one time. To expose this template strand, the initiation of transcription involves a short unwinding of the DNA double helix. This is accomplished in two steps. First, RNA polymerase binds to the promoter to form the closed complex, which is relatively weak. Then, the double‐stranded DNA goes through a conformational change to form the much stronger open complex through opening of the base pairs at the ‐10 sequence, as shown in Figure 2.
The initiator nucleotide binds to the complex and the first phosphodiester bonds are made, accompanied by release of σ. The remaining core polymerase is now in the elongation mode. Several experimental observations support the picture presented in the next figure, including the fact that fewer than one σ exists per core enzyme in each cell.
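The template relationship underlying initiation, in which one DNA strand is read and the RNA is built by base pairing against it, can be sketched as follows. This is a minimal illustration: the function name `transcribe` is hypothetical, the template is assumed to be written 3′ to 5′ so that the RNA comes out 5′ to 3′, and the six‐base template is a made‐up example.

```python
# Pairing rules for RNA synthesis: template A pairs with U in the RNA.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5):
    """Return the RNA (5'->3') copied from a template strand written 3'->5'."""
    return template_3_to_5.upper().translate(DNA_TO_RNA)


# Made-up template, written 3'->5'.
rna = transcribe("TACGGT")
```

Because the template is supplied 3′ to 5′, no reversal is needed; the RNA is simply the base‐by‐base complement, with U in place of T.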
Elongation is the function of the RNA polymerase core enzyme. RNA polymerase moves along the template, locally “unzipping” the DNA double helix. This allows transient base pairing of the incoming nucleotide and the newly synthesized RNA with the DNA template strand. As it is made, the RNA transcript forms secondary structure through intra‐strand base pairing. The average speed of transcription is about 40 nucleotides per second, much slower than DNA synthesis by DNA polymerase. Other protein factors may bind to polymerase and alter the rate of transcription, and some specific sequences are transcribed more slowly than others. Eventually, RNA polymerase must come to the end of the region to be transcribed.
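The quoted average rate gives a quick feel for how long elongation takes. The sketch below assumes a constant rate of 40 nucleotides per second, which ignores the pausing and factor effects just mentioned. The function name and the 1,200‐nucleotide transcript length are hypothetical choices for the example.

```python
def transcription_time_seconds(transcript_length_nt, rate_nt_per_s=40.0):
    """Estimate elongation time at a constant average rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return transcript_length_nt / rate_nt_per_s


# Hypothetical 1,200-nucleotide transcript at the average rate quoted above.
estimated_seconds = transcription_time_seconds(1200)
```

Under these assumptions a 1,200‐nucleotide transcript takes about 30 seconds to elongate.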
Termination of transcription in vitro is classified according to its dependence on the protein factor rho (ρ). Rho‐independent terminators have a characteristic structure, which features (a) a strong G‐C‐rich stem and loop and (b) a sequence of 4–6 U residues in the RNA, which are transcribed from a corresponding stretch of As in the template. Rho‐factor‐dependent terminators are less well defined, as shown in Figure 4.
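The two features of a rho‐independent terminator lend themselves to a simple pattern search. The following is a deliberately naive sketch, not an established terminator‐prediction method. It looks only for a perfectly paired stem of fixed length, followed by a short loop, the complementary stem arm, and a run of U residues. The function names, the stem length, the loop range, the G‐C threshold, and the test RNA are all assumptions made for illustration. Real terminators tolerate mismatches and G‐U pairs that this search ignores.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(seq):
    """Return the sequence that would pair with seq in an antiparallel stem."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C residues in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_terminator_candidates(rna, stem_len=6, loop_range=(3, 8),
                               min_gc=0.6, min_u_run=4):
    """Return (stem_start, loop_length) for each perfect G-C-rich hairpin
    that is followed immediately by at least min_u_run U residues."""
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left = rna[start:start + stem_len]
        if gc_fraction(left) < min_gc:
            continue
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop_len
            right = rna[right_start:right_start + stem_len]
            tail_start = right_start + stem_len
            tail = rna[tail_start:tail_start + min_u_run]
            # Slices that run off the end are too short and fail both tests.
            if right == rna_reverse_complement(left) and tail == "U" * min_u_run:
                hits.append((start, loop_len))
    return hits


# Made-up RNA: 2-base leader, 6-base stem, GAAA loop, paired stem, U run.
toy_rna = "CC" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUU" + "AA"
candidates = find_terminator_candidates(toy_rna)
```

For the toy RNA the search reports a single candidate whose stem begins at index 2 and encloses a four‐base loop. Rho‐dependent terminators have no comparably simple signature, so a pattern search of this kind would not find them.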