NUCLEIC ACIDS
This text is divided into seven major sections:
- Organization of Genetic Material
- Semiconservative Nature of DNA Synthesis
- The Chemistry of DNA Synthesis
- The Proteins of DNA Synthesis
- DNA Repair
- Properties of the Three Major RNA Species
- RNA Synthesis
Reasons for organization.
- There is a great deal of it. A single human chromosome (there are 46 of them) contains 50-250 x 10^6 base pairs (bp), and would be 1.7-8.5 cm long if fully extended. DNA is very fragile and subject to shear damage if unprotected.
- Not all of it is (or ought to be) available for expression; some is packaged in a manner that makes its genetic information inaccessible.
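The length figures above follow from the base-pair counts if one assumes the usual B-DNA spacing of about 0.34 nm per base pair. That spacing is a textbook value, not something stated in this text, and the function name is only illustrative. A minimal sketch of the arithmetic:

```python
RISE_NM_PER_BP = 0.34  # assumed B-DNA rise per base pair, in nanometers


def contour_length_cm(base_pairs):
    """Return the fully extended length of a DNA duplex in centimeters."""
    length_nm = base_pairs * RISE_NM_PER_BP
    return length_nm * 1e-7  # 1 nm = 1e-7 cm


for bp in (50_000_000, 250_000_000):
    print(f"{bp:,} bp -> {contour_length_cm(bp):.1f} cm")
```

Under that assumption the two ends of the quoted range come out at about 1.7 cm and 8.5 cm.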
There are several successive layers of organization in eukaryotic DNA.
- The nucleosome consists of eight histone molecules (two each of H2A, H2B, H3 and H4) with a strand of double helical DNA wrapped around them.
The histones are proteins which are
- highly conserved across species lines (implies great importance of the structure).
- basic -- they contain many arginine and lysine residues (positively charged), which interact with the negatively charged DNA via salt bridges.
- The chromatosome is a nucleosome plus a fifth histone (H1) which holds the structure together. (Some people call this whole thing a nucleosome.)
The DNA consists of
- core DNA, which is wrapped around the histones
- linker DNA, which is the DNA between the individual nucleosomes.
The structure looks like beads on a string.
- Nucleosomes form higher order structures, such as the 100 Angstrom fiber and 300 Angstrom fiber (solenoid).
- These fold further, along with various nonhistone proteins, to form, ultimately, the chromosome.
- The result of all this organization -- chromatin, the name given to the complex of genetic material in eukaryotes.
- Composition:
- DNA
- Protein -- histones and nonhistone proteins
- RNA in small amounts
- Functionality in gene expression -- DNA that is highly condensed can't be expressed.
- Euchromatin -- uncondensed, can be transcribed. (Yes, DNA can be read through the nucleosomes without unwinding them.)
- Heterochromatin -- condensed (300 Angstrom), cannot be transcribed.
- The chromosome is one very large DNA molecule -- linear in eukaryotes. It has three essential elements.
- centromere (spindle attachment site during cell division)
- telomeres (ends)
- one or (in eukaryotes) many origins of replication (Autonomously Replicating Sequences). If the DNA were very large, and there were only one origin of replication, it would take too long to complete the process.
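The argument for many origins can be put in numbers. The sketch below assumes that each origin fires two forks moving in opposite directions, that all origins fire at once and are evenly spaced, and that a fork copies about 50 bp per second. The fork rate and the 150 x 10^6 bp example chromosome are illustrative assumptions, not figures from this text.

```python
def replication_time_hours(base_pairs, origins, fork_rate_bp_per_s=50.0):
    """Estimate the time to copy a chromosome from evenly spaced origins.

    fork_rate_bp_per_s is an assumed order-of-magnitude value.
    """
    forks = 2 * origins  # each origin sends out two replication forks
    seconds = base_pairs / (forks * fork_rate_bp_per_s)
    return seconds / 3600


chromosome_bp = 150_000_000  # hypothetical mid-sized human chromosome
for n_origins in (1, 100, 1000):
    hours = replication_time_hours(chromosome_bp, n_origins)
    print(f"{n_origins:5d} origin(s): {hours:8.2f} h")
```

With these assumed numbers a single origin would need roughly 417 hours, while a thousand origins bring the estimate under half an hour.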
In prokaryotes DNA is also organized.
It is a double-stranded circular supercoil in a compact structure called the nucleoid, which contains various proteins (some histone-like) and RNA.
The intracellular location of the chromosome differs between eukaryotes and prokaryotes.
In prokaryotes it is attached to the inner side of the plasma membrane, but is in contact with the cytoplasm.
In eukaryotes it is in the nucleus, isolated from the cytoplasm.
This has important consequences for how RNA and protein are synthesized in the two types of organisms.
DNA
Definitions:
- Conservative: Two parent strands stay together, and two daughter strands stay together.
- Dispersive: parental and daughter material are mixed on each strand.
- Semiconservative: One parent strand and one daughter strand appear in the final product.
New DNA is made by using the original DNA as a template.
The strands separate in a certain region.
Bases for the new strand line up on the parent strand according to the rules for Watson-Crick base pairing.
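The semiconservative model makes a simple, countable prediction about how parental strands are distributed over generations. The sketch below is a toy model: each duplex is just a pair of strand labels ("old" for an original parental strand, "new" for anything synthesized later), and the function names are illustrative.

```python
from collections import Counter


def replicate(duplexes):
    """One semiconservative round: each strand templates one new partner."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters


def census(duplexes):
    """Count duplexes by strand composition, ignoring strand order."""
    return Counter("-".join(sorted(d)) for d in duplexes)


population = [("old", "old")]  # one fully parental duplex
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(census(population)))
```

In this model the number of hybrid (new-old) duplexes stays at two in every generation while fully new duplexes accumulate, which is the pattern that separates the semiconservative scheme from the conservative and dispersive ones.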
Requirements:
dATP, dCTP, dGTP, dTTP -> DNA polymer + PPi
- (Hydrolysis of pyrophosphate, PPi + H2O -> 2 Pi, pulls the reaction toward polymer formation.)
Also required: DNA polymerase enzyme, DNA template, Mg++ and a primer.
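Mechanism: the 3'-OH of the sugar at the end of the growing chain attacks the innermost (alpha) phosphate, attached to the 5' carbon, of an incoming deoxynucleoside triphosphate, releasing PPi.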
Each successive nucleotide residue is added to the 3' end of the nucleic acid.
RESULT: chain growth is always in the 5' -> 3' direction!!! This is called 5' -> 3' polymerase activity.
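The directionality rule can be made concrete with strings. In this sketch the template is written 3' -> 5' and the primer 5' -> 3', so that position i of one faces position i of the other; that convention and the function name are choices made for the illustration only.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}  # Watson-Crick partners


def extend_primer(template_3to5, primer_5to3):
    """Extend a primer along a template, adding residues at the 3' end only."""
    strand = list(primer_5to3)
    for base in template_3to5[len(strand):]:
        strand.append(PAIR[base])  # each new residue joins the free 3'-OH
    return "".join(strand)


template = "TACGGATC"   # written 3' -> 5'
primer = "ATG"          # written 5' -> 3', already paired to the template
print(extend_primer(template, primer))
```

Because residues are only ever appended to the right-hand (3') end of the list, the new strand can grow in the 5' -> 3' direction and no other.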
Enzymology -- this is an enzyme-catalyzed reaction, so there are several special features of it that need to be noted. Most of the features of DNA synthesis have been worked out in prokaryotes (particularly E. coli), and eukaryotic systems are assumed to be much the same.
- There are two important bacterial DNA polymerases, I and III. (They are numbered in order of discovery; polymerase II is not required for replication and appears to act in DNA repair.)
- DNA polymerases require a primer; they can elongate an existing polynucleotide, but cannot start a new one.
To start a fresh strand, an RNA polymerase lays down an RNA primer. (Why? Because RNA polymerases don't require primers.)
Or to elongate an existing DNA strand, the existing strand serves as the primer.
- The important DNA polymerases (I and III) have 3' -> 5' exonuclease activity.
- Definition: hydrolyses nucleotides FROM the 3'-end TOWARD the 5'-end. Works ONLY at the end of the chain.
- Effect: to undo what has just been synthesized.
- Why? To correct errors. "Proofreading" capability. (These errors may arise from incorporation of tautomeric forms of the DNA bases into incorrect places, where the tautomers can hydrogen bond.)
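Proofreading as described above acts only on the residue at the 3' end of the growing chain. The following sketch models just that property, using the same string convention as before (template written 3' -> 5', new strand 5' -> 3'); it is an illustration, not a model of the enzyme's kinetics.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def proofread(template_3to5, new_strand_5to3):
    """Remove mismatched residues from the 3' end of the new strand.

    Returns the trimmed strand and the number of residues removed.
    Mismatches buried inside the chain are not touched.
    """
    strand = list(new_strand_5to3)
    removed = 0
    while strand and PAIR[template_3to5[len(strand) - 1]] != strand[-1]:
        strand.pop()  # hydrolyze the residue at the 3' end
        removed += 1
    return "".join(strand), removed


print(proofread("TACGGATC", "ATGCA"))  # wrong base at the 3' end
print(proofread("TACGGATC", "ATTCC"))  # wrong base in the interior
```

In the first call the mispaired 3' residue is removed so that synthesis can resume from a correctly paired end; in the second nothing is removed, because the exonuclease in this model never reaches an interior error.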
Many proteins are involved in the many mechanical steps associated with the unwinding of DNA prior to synthesis, the synthesis itself and rewinding of the new DNA.
- Some of the problems to be overcome during DNA replication:
- Where to start?
- Separation of the two DNA strands.
- Keep them apart.
- Make a primer.
- Make DNA -- this alone requires several enzymes.
- Release strain in the parental double stranded DNA as it unwinds during replication.
Identification of a replication origin.
- In E. coli there is one, and it is called the OriC location.
The OriC location contains four sites where a protein called the dnaA protein can bind and start the assembly of the necessary proteins (the final assembly is called a replisome).
Opening of the double strand then occurs in an AT-rich region.
- In eukaryotes there are many replication origins -- different numbers under different conditions of cell division -- and the signal (if any) for identifying one is not known.
SUBSEQUENT STEPS ARE FOR THE E. COLI SYSTEM:
Further assembly of the replisome.
- Other proteins add, including one with helicase (helix-unwinding) activity. This is the dnaB protein.
- Topoisomerase II uses ATP to relieve strain in the double helix at either end of the "bubble."
- SSB (single-strand binding) proteins stabilize the single strands, so they don't zip back together. (They also prevent formation of hairpin loops in the single strands.)
AT THIS POINT THE COMPLEX IS CALLED A PREPRIMING COMPLEX, AND IS READY FOR SYNTHESIS OF A PRIMER.
- Primase (an RNA polymerase) and other proteins are added, forming a PRIMOSOME.
Primase begins to make a short piece of RNA on each DNA strand, using the DNA as a template.
- DNA polymerase III joins the complex (to extend the RNA primer with DNA) along with the rep protein (to unwind the double stranded parental DNA), and the replisome is complete.
Synthesis of DNA
The leading strand -- this is the strand where DNA synthesis started. Synthesis continues in an uncomplicated way, 5' -> 3', as the rep protein (and helicase II) unwinds the parental DNA double strand.
The resulting double helical DNA product is, of course, antiparallel.
The other -- lagging -- strand
- Primase binds to a signal on the lagging strand that is uncovered by the action of the rep protein.
- Primase makes a piece of RNA, which serves as a primer for a new DNA polymerase III molecule.
- Primase moves down the lagging strand, and does this again.
- The result is a series of short pieces of DNA, each with an RNA primer at the 5' end. These are called OKAZAKI FRAGMENTS.
Joining of the fragments of newly synthesized DNA involves two enzymes.
DNA polymerase I removes the RNA (5' -> 3' exonuclease activity), and fills in with DNA (5' -> 3' polymerase activity).
DNA ligase then seals the nicks that remain between adjacent DNA fragments.
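The net effect of the two joining enzymes can be sketched with strings in which lowercase letters stand for RNA primer and uppercase letters for DNA. This only models the end result: in the cell the DNA that replaces a primer is made by extending the 3' end of the neighboring fragment, which the sketch does not track. The fragment sequences are made up.

```python
def pol_I_replace_primer(fragment):
    """Swap the RNA primer (lowercase) for the equivalent DNA (uppercase)."""
    return "".join(
        ("T" if ch == "u" else ch.upper()) if ch.islower() else ch
        for ch in fragment
    )


def ligate(fragments):
    """Join adjacent DNA fragments into one continuous strand."""
    return "".join(fragments)


# Hypothetical Okazaki fragments, each written 5' -> 3' with its primer first.
okazaki = ["augGCCTTA", "ccuAGGTCA", "gaaTTCGGA"]
dna_only = [pol_I_replace_primer(f) for f in okazaki]
print(dna_only)
print(ligate(dna_only))
```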
In EUKARYOTES: The process seems to be much the same.
Synthesis of bulk DNA
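- DNA polymerase alpha, which is associated with primase, appears to start each new strand by making a short RNA-DNA primer.
- DNA polymerase delta may synthesize most of the lagging strand.
- DNA polymerase epsilon, which is ubiquitous and also essential, may synthesize most of the leading strand.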
How the ends of linear DNA molecules are synthesized: telomerase.
- Structure of the ends (telomeres): a TTAGGG tandem repeat (hundreds of copies).
- The problem: at the end of the lagging strand there is no room for an RNA primer to be synthesized.
- One answer: (Sci. Am. August, 1991)
- the 3' end of the parent double stranded DNA is longer than the 5' end, and is curled under (by non-Watson-Crick base pairing).
- During replication it uncurls, leaving room for a primer, and the lagging daughter strand can be synthesized.
- The problem is now to extend the leading strand on the other DNA molecule.
- Telomerase does this, using RNA to base pair to the 3' end of the new strand.
It serves as a template for extension of the 3' end.
And it catalyzes the addition of the new nucleotides, moving up the new DNA as needed.
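The extension step can be sketched as string manipulation. Here the enzyme's RNA template is represented only by the DNA repeat it specifies (TTAGGG, from the text above); the function name and the way the register of the repeat is found are illustrative assumptions.

```python
REPEAT = "TTAGGG"  # telomeric repeat specified by the telomerase RNA


def telomerase_extend(strand_5to3, repeats_to_add):
    """Add telomeric repeats to the 3' end, staying in register."""
    # Find how much of a repeat the existing 3' end already contains.
    for offset in range(len(REPEAT), 0, -1):
        if strand_5to3.endswith(REPEAT[:offset]):
            break
    else:
        offset = 0  # 3' end does not match the start of a repeat
    # Finish the partial repeat, then add whole repeats.
    addition = REPEAT[offset:] + REPEAT * repeats_to_add
    return strand_5to3 + addition


print(telomerase_extend("ACGTTAGGGTTA", 2))
```

The loop mimics the base pairing between the RNA and the 3' end of the DNA: the enzyme first completes the repeat that is already started, then moves along and adds further copies.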
DNA repair is essential to correct errors and to fix damage. DNA is the only macromolecule that is repaired when damaged instead of being replaced.
Postreplicative repair removes errors missed by polymerase proofreading (known to occur in prokaryotes).
- Double stranded DNA is methylated. This methylation is involved in gene expression, and it takes time for the methylation enzymes to carry out the normal methylation of a new strand of DNA.
- The mismatch repair system detects distortions caused by any mismatched bases.
The mismatched base is excised on the new (presumably defective) strand, which is identified by its lack of methylation.
- DNA polymerase I fills in the gap.
- DNA ligase closes the last phosphodiester link.
- This pattern of repair is called excision repair, and is general, but different types of damage involve different enzyme systems to excise the damaged DNA.
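The logic of methyl-directed mismatch repair, in which the methylated parent strand is trusted and the new strand is corrected, can be sketched as follows. The strands are written in pairing register (position i of one faces position i of the other), and the excision, fill-in and ligation steps are collapsed into a single substitution for simplicity.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_repair(methylated_parent, new_daughter):
    """Correct the unmethylated daughter strand against the parent.

    Returns the repaired daughter and the positions that were fixed.
    """
    repaired = []
    fixed_positions = []
    for i, (p, d) in enumerate(zip(methylated_parent, new_daughter)):
        if PAIR[p] != d:            # distortion: the bases do not pair
            fixed_positions.append(i)
            d = PAIR[p]             # excise and refill using the parent
        repaired.append(d)
    return "".join(repaired), fixed_positions


print(mismatch_repair("TACGGATC", "ATGCTTAG"))
```

The key design point is the asymmetry: without a mark such as methylation, the system would have no way to tell which of the two mismatched bases is the error.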
There are two types of mutations
- Base substitution (point) mutations, where one base is substituted for another.
Modification of a base, resulting in incorrect base pairing in the next round of replication, can arise from
- chemical reactions with oxidizing and alkylating agents.
- damage by ultraviolet- or x-irradiation.
- Frameshift mutations, where one or more bases are added or removed. These can be caused by an aromatic compound inserting between the stacked bases of DNA, which is called intercalation. Acridines and ethidium bromide do this.
Either type of mutation, if not repaired by one of the repair systems, will be retained in subsequent rounds of replication. This may lead to
defective behavior of affected somatic cells in the individual (e.g., cancer) or
an inherited change (if germ cells are affected).
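The difference between the two mutation types shows up clearly when a sequence is read in triplets. The sequence below is made up, and the sketch only counts how many triplets change; it does not translate them.

```python
def codons(seq):
    """Split a sequence into complete triplets, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def changed_codons(original, mutant):
    """Count aligned triplets that differ between two sequences."""
    return sum(1 for a, b in zip(codons(original), codons(mutant)) if a != b)


original = "ATGGCCTTAGACGGT"
point = "ATGGCATTAGACGGT"        # one base substituted
frameshift = "ATGGCTTAGACGGT"    # one base deleted

print(codons(original))
print(codons(point), changed_codons(original, point))
print(codons(frameshift), changed_codons(original, frameshift))
```

The substitution alters a single triplet, whereas the deletion shifts every triplet downstream of it.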
RNA
Summary of RNA species
- Transfer RNA (tRNA) is a small (65-110 nucleotides) molecule designed to carry activated amino acids to the site of protein synthesis, the ribosome. It is long-lived (stable).
- Ribosomal RNA (along with various proteins) forms the ribosome, the site of protein synthesis, and one rRNA is the catalyst for formation of the peptide bond (Science, June 5, 1992). Various species range in size from 4700 bases to about 120 bases. Eukaryotic and prokaryotic rRNAs are distinctly different. rRNA is also long-lived (stable).
- Messenger RNA (mRNA) is the carrier of genetic information on the primary structure of proteins from DNA, along with special features that allow it to attach to ribosomes and function in protein synthesis. Its size depends on the size of the protein for which it codes. It tends to be relatively short-lived, and its lifetime varies from molecular species to molecular species (depending on the biological role of the protein).
Transfer RNA (tRNA)
Because tRNA is the amino acid carrier, there must be at least one tRNA for every amino acid used in protein synthesis (i.e., 20). Actually there are at least 56 in any cell.
Each recognizes a different codon for an amino acid. (Obviously there must be more than one of these per amino acid.) The different tRNAs that accept a given amino acid are called isoacceptors.
Each carries only one amino acid.
Structure: a "cloverleaf" consisting of a stem and three loops. A small "extra arm" may also exist.
The anticodon loop contains a triplet that
- base pairs to mRNA during protein synthesis. This triplet is called the anticodon.
- plays a role in specifying which amino acid becomes attached to the tRNA
The stem
ends in the sequence ...CCA, which is the attachment site for the amino acid.
It contains additional determinants of which amino acid is attached to the tRNA
tRNA contains many unusual bases, which arise by modification after transcription.
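The pairing between anticodon and codon described above is antiparallel, which is easy to get backwards on paper. The sketch below assumes strict Watson-Crick pairing at all three positions and ignores the modified bases; real tRNAs are more tolerant than this, so treat it as a simplification.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5to3):
    """Return the strictly complementary anticodon, written 5' -> 3'."""
    return "".join(RNA_PAIR[base] for base in reversed(codon_5to3))


def pairs_with(codon_5to3, anticodon_5to3):
    """True if the anticodon pairs with the codon at all three positions."""
    return anticodon_for(codon_5to3) == anticodon_5to3


print(anticodon_for("AUG"))
print(pairs_with("AUG", "CAU"), pairs_with("AUG", "UAC"))
```

Note that the sequence obtained by complementing without reversing (UAC for AUG) is the anticodon read 3' -> 5', not 5' -> 3'.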
Ribosomal RNA
Ribosomal RNA (rRNA) differs in eukaryotes vs. prokaryotes
Eukaryotes contain 28, 18, 5.8 and 5 S rRNAs. S is the sedimentation coefficient, and is a measure of relative size.
Prokaryotes contain 23, 16 and 5 S rRNAs.
These differences are reflected in differences between the ribosomes.
- Eukaryotic ribosomes are large (80S), consisting of 40S and 60S subunits. Note that the S-values are not additive.
- Prokaryotic ribosomes are smaller (70S), consisting of 30S and 50S subunits.
Mitochondria also have ribosomes, which resemble prokaryotic ribosomes more than eukaryotic.
Messenger RNA
All messenger RNA (mRNA) contains regions that do not code for protein (untranslated regions). mRNA differs in eukaryotes vs. prokaryotes.
Conventions: upstream and downstream. mRNA contains a start signal for protein synthesis (usually the base triplet AUG). Positions toward the 5' end of the mRNA are referred to as being upstream of the start signal, and positions toward the 3' end are downstream. This is because protein synthesis flows in the 5' -> 3' direction.
Prokaryotic mRNA has a purine-rich upstream sequence called the Shine-Dalgarno sequence (or SD sequence).
The SD sequence is a consensus sequence.
A consensus sequence is an idealized sequence of bases
- whose real counterparts appear in various places in a polynucleotide and perform the same function in each,
- but with minor deviations of the real sequence from the ideal.
The notion of consensus sequence represents relative (as opposed to absolute) specificity for a nucleotide sequence.
The more a real sequence resembles the consensus, the better it performs the specified function.
The SD sequence of mRNA pairs with bases in 16S rRNA during an early phase of prokaryotic protein synthesis. This is how the mRNA attaches to the ribosome.
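The idea of relative specificity can be expressed as a score: the fraction of positions at which a real site agrees with the consensus. The consensus AGGAGG used here is a commonly quoted form of the SD sequence and is an assumption, not a value given in this text; the example sites are made up.

```python
SD_CONSENSUS = "AGGAGG"  # assumed consensus, for illustration only


def consensus_score(site, consensus=SD_CONSENSUS):
    """Fraction of positions at which a site matches the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must be the same length")
    matches = sum(1 for s, c in zip(site, consensus) if s == c)
    return matches / len(consensus)


for site in ("AGGAGG", "AGGAGA", "AAGAGA"):
    print(site, round(consensus_score(site), 2))
```

A simple match fraction treats every position as equally important, which is a simplification; it is enough to show how a site can be "more" or "less" like the consensus rather than simply present or absent.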
Eukaryotic mRNA has special structures at both ends.
The 5' untranslated region contains
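a 5' cap, consisting of 7-methylguanosine joined through a 5'-5' triphosphate bridge to the first nucleotide of the transcript (m7G(5')ppp(5')N), and then sometimes one or two nucleotides with methyl groups on the riboses. The first base is sometimes a methylated purine.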
This is followed by 30 to several hundred bases, sometimes with some secondary structure.
The 3' untranslated region consists of
50-300 nucleotides, perhaps with secondary structure,
followed by a poly(A) tail, 20-200 nucleotides long, which probably stabilizes the molecule.
RNA synthesis -- transcription -- is similar in prokaryotes and eukaryotes, but substantial differences between the two systems are known.
Requirements:
ATP, GTP, UTP, CTP -> RNA polymer + PPi
- (Hydrolysis of pyrophosphate, PPi + H2O -> 2 Pi, pulls the reaction toward polymer formation.)
Also required: RNA polymerase enzyme, DNA template, but no primer is needed.
The mechanism is the same as for DNA synthesis.
Direction: RNA synthesis, too, proceeds in the 5' -> 3' direction, antiparallel to the template strand.
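A short sketch of template-directed RNA synthesis, with the DNA template written 3' -> 5' so that the RNA comes out 5' -> 3'. The template sequence and function name are illustrative.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Build RNA 5' -> 3' from a DNA template strand written 3' -> 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


template = "TACGGATC"  # template strand, written 3' -> 5'
print(transcribe(template))
```

No primer argument appears here, in contrast to DNA synthesis: the first nucleotide is simply the partner of the first template base.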
There are three stages to RNA synthesis
- initiation at a specific site
- elongation
- termination at a specific site
These stages are considered as distinct entities here, unlike with DNA synthesis, because during transcription only specific, selected pieces of RNA are made, and the specificity requires a carefully controlled process.
Prokaryotic transcription (E. coli) was studied first, and is simpler and better understood.
E. coli RNA polymerase is a multisubunit enzyme, with different functions ascribed to different subunits. The subunits include the following.
- 2 alpha -- initiation
- beta -- phosphodiester bond formation
- beta' -- binds DNA template
These four subunits are the core enzyme; they alone will carry out transcription, but cannot initiate rapidly at specific sites.
- sigma -- recognizes the promoter -- binding specificity. The core enzyme plus sigma factor is called the holoenzyme.
- omega -- unknown function
Initiation occurs at a promoter sequence in DNA.
The conventions for numbering locations in DNA relative to a gene that is being discussed are as follows.
- Position 1 is the first base to be transcribed into RNA.
- Positions toward the 3' end of the template strand (the 5' end of the coding strand) are called "upstream," and are numbered negatively; there is no position 0. Bases with negative numbers are not transcribed into RNA.
- Positions toward the 5' end of the template strand are called "downstream," and are numbered positively. These are transcribed into RNA.
The promoter consists of a -35 consensus sequence and a -10 consensus sequence (the Pribnow box, containing the consensus sequence, TATAAT).
Step 1: formation of a closed complex -- RNA polymerase holoenzyme binds to double stranded DNA at a promoter site.
Selection of which RNA to synthesize is achieved in part by the fact that different promoters prefer different sigma factors,
so control of sigma factor availability influences which RNA will be synthesized.
Step 2: formation of an open complex -- the double stranded DNA separates at the Pribnow box to form a bubble of about 10 base pairs.
This is easier when the box contains all AT base pairs; hence deviations from the consensus make a promoter weaker.
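One crude way to see why an all-AT box opens more easily is to count hydrogen bonds, taking two per AT pair and three per GC pair. Hydrogen bond count is only a rough proxy for ease of strand separation, and the variant sequences are made up.

```python
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # bonds per base pair


def hydrogen_bonds(box):
    """Total hydrogen bonds holding a short duplex segment together."""
    return sum(H_BONDS[base] for base in box)


for box in ("TATAAT", "TATGAT", "GCGCGC"):
    print(box, hydrogen_bonds(box))
```

The consensus TATAAT has the minimum possible count for six base pairs; every substitution of G or C adds one bond that must be broken to form the open complex.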
The open bubble allows RNA to base pair with the DNA template.
Step 3: The first nucleoside triphosphate binds.
Elongation of the RNA chain occurs
by successive addition of nucleotides.
Sigma drops off the holoenzyme when, or soon after, the first phosphodiester bond forms. As soon as this happens, the system is in elongation mode.
Termination can occur by one of two mechanisms.
- Rho-dependent termination involves a protein with ATPase activity, and is poorly understood.
- Rho-independent termination occurs at a region where
- a GC-rich palindrome containing 7-10 base pairs, which forms a hairpin, is followed in the RNA by
- an oligo U region 4-8 bases long.
This sequence disrupts the base pairing of newly synthesized RNA with the DNA template (dA=U base pairs have only two hydrogen bonds), and the RNA falls off.
- The RNA polymerase then also falls off the DNA.
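The rho-independent signal described above (a GC-rich inverted repeat of 7-10 bases followed by 4-8 U residues) can be searched for in an RNA string. In this sketch the allowed loop size (3-8 bases) and the GC threshold (60%) are assumptions chosen for illustration, perfect pairing in the stem is required, and the example RNA is made up.

```python
import re

RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna):
    """Reverse complement of an RNA string."""
    return "".join(RNA_PAIR[b] for b in reversed(rna))


def find_terminator(rna, stem_range=(7, 10), loop_range=(3, 8), min_gc=0.6):
    """Return (start, stem, loop, u_run) for the first hit, else None."""
    for start in range(len(rna)):
        # Try the longest stem first so the full hairpin is reported.
        for stem_len in range(stem_range[1], stem_range[0] - 1, -1):
            stem = rna[start:start + stem_len]
            if len(stem) < stem_len:
                continue
            if sum(b in "GC" for b in stem) / stem_len < min_gc:
                continue
            for loop_len in range(loop_range[0], loop_range[1] + 1):
                arm2_start = start + stem_len + loop_len
                arm2 = rna[arm2_start:arm2_start + stem_len]
                if arm2 != revcomp(stem):
                    continue
                tail = re.match(r"U{4,8}", rna[arm2_start + stem_len:])
                if tail:
                    loop = rna[start + stem_len:arm2_start]
                    return start, stem, loop, tail.group()
    return None


rna = "AAAC" + "GCCGCCAG" + "UUCG" + "CUGGCGGC" + "UUUUUU" + "AC"
print(find_terminator(rna))
```

Real terminators tolerate imperfect stems, so an exact-match search like this one will miss some genuine sites.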
Eukaryotic transcription is in general similar to prokaryotic, but there are additional complications.
There are three eukaryotic RNA polymerases.
Structure: each contains more subunits than the prokaryotic.
Distinguishing features:
Type  Location     Products                 alpha-Amanitin sensitivity
I     nucleolus    18S, 5.8S and 28S rRNA   -
II    nucleoplasm  mRNA precursor           +++
III   nucleoplasm  tRNA, 5S rRNA            +
Function: each makes a precursor of the final functional RNA, and the processing of the initial transcript is of major importance.
The promoter for eukaryotic transcription differs from the prokaryotic. It includes
- TATA box, about 25 bp upstream of the start site. [consensus: TATA(AT)A(AT)]
- CAAT box, somewhat further upstream, not always present. [consensus: GC(TC)CAATCT]
- Various factors bind to these and other important sites, and the DNA probably folds so all these proteins can interact with each other and the RNA polymerase they are supposed to control.
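A consensus written with alternatives in parentheses, such as TATA(AT)A(AT), maps directly onto a regular expression in which each parenthesized pair becomes a character class. The DNA string below is made up, and an exact pattern match is stricter than a real promoter, which can deviate from the consensus.

```python
import re

# TATA(AT)A(AT): each (AT) means "A or T" at that position.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(dna):
    """Return (position, sequence) for each non-overlapping exact match."""
    return [(m.start(), m.group()) for m in TATA_BOX.finditer(dna)]


dna = "GGCCTATAAAAGGCTATATATCC"
print(find_tata_boxes(dna))
```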
RNA polymerase I makes the precursor for 28S, 18S and 5.8S rRNA
- S factor is required for initiation of the rRNA precursor synthesis by RNA polymerase I. This is reminiscent of the role of sigma factor in prokaryotes.
- Cleavage of the unused sequences from the primary transcript results in the formation of the final rRNA species.
RNA polymerase II makes the precursor of mRNA, which is subsequently modified in the nucleus before transport to the cytoplasm.
- The 5' cap is added cotranscriptionally, the only known eukaryotic mRNA cotranscriptional modification, by a capping enzyme complex.
Functions of the cap:
- The cap enhances translation efficiency.
- It is necessary for subsequent splicing.
- Polyadenylation is signaled by the consensus sequence AAUAAA, which is usually downstream of the coding region.
An endonuclease cleaves the mRNA precursor about 20 bases downstream of the AAUAAA while transcription is still in progress.
A soluble polymerase then adds the poly(A) tail in a process NOT directed by DNA.
Most eukaryotic mRNA has the poly(A) tail.
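The cleavage and tailing steps can be sketched on a string. The sketch assumes the first AAUAAA found is the functional signal and that cleavage falls exactly 20 bases after it; the text above only says "about 20". The tail length argument and the example transcript are illustrative.

```python
SIGNAL = "AAUAAA"


def polyadenylate(pre_mrna, cleave_offset=20, tail_length=200):
    """Cleave downstream of the first AAUAAA and add a poly(A) tail."""
    pos = pre_mrna.find(SIGNAL)
    if pos == -1:
        return pre_mrna  # no signal found: leave the transcript as is
    cut = min(pos + len(SIGNAL) + cleave_offset, len(pre_mrna))
    # The tail is added without reference to any template sequence.
    return pre_mrna[:cut] + "A" * tail_length


pre = "GGCAUG" + "AAUAAA" + "C" * 30
print(polyadenylate(pre, tail_length=5))
```

The tail is built from a fixed count rather than copied from anything, which reflects the point that this polymerase is not directed by DNA.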
- Splicing -- At this point the RNA sequence that codes for protein is often interrupted by one or more RNA sequences that do not code for protein. These must be removed, and the useful RNA spliced back together.
RNA polymerase III makes the precursor of 5S rRNA and tRNA.
- Cleavage of unnecessary sequences from the primary transcript is as before.
- In maturation of tRNA
- Sequences are removed
from the ends of the final product and
from the interior of the final product
- A CCA sequence is added to the 3' end.
- Bases are modified to produce the unusual bases found in tRNA.
jb
Last modified 10/8/97