LECTURES
Biology 210
Genetics and Molecular Biology
Fall 2008
Tuesday September 9, 2008 (OPTIONAL 4th HOUR REVIEW)
First Day of Class - WELCOME!
I. General Introduction
What is a genome? All of the genetic information that can be found
in an organism is that organism's genome.
In a eukaryote, like us, each body cell nucleus has two complete copies of the
genome.
(Show Slide - CF sequence)(Slide 2 in
Ella Resources)(Also Slide 3 in Ella Resources)
Mutation - ΔF508 - Half of white Americans with Cystic Fibrosis have two copies
of the ΔF508 deletion, 40% have one copy of the deletion and one other mutation,
and about 10% do not have the ΔF508 deletion. Forty-five percent of Italian
CF carriers have this mutation, and it is present in about 30% of African
American and Ashkenazi Jewish CF carriers.
Man and Woman - No CF - Have a child affected by CF. HOW???
GENETICS
The textbook says that the term "Genetics" can be defined as,
"the study of genes".
This course, in keeping with the modern study of genetics, will include two
broad categories of genetic analysis
1) CLASSICAL GENETICS - Study of the transmission of traits from generation
to generation. Also called "TRANSMISSION GENETICS".
2) MOLECULAR GENETICS - Study of genes and their products at the molecular level. This is also called MOLECULAR BIOLOGY.
CLASSICAL GENETICS - Relies on variation.
Experimental Classical Genetics Includes
Isolation of phenotypic variants
Analysis of progeny of controlled matings
We'll be doing these in lab in Project 2
MOLECULAR BIOLOGY
Study of DNA (as well as RNA and protein)
In the lab, we are doing 2 big projects that involve both classical genetics and molecular biology experiments.
These 2 big projects are:
1) PTC Gene Analysis
and
2) Drosophila Mutational Analysis
1) In the PTC Gene Analysis project, which you are starting this week, you are going to do a bit of classical genetic analysis, and a lot of molecular biology. The neat thing about this project is that you are doing a genetic analysis of yourself.
What you are doing is a genetic test. You are determining your Phenotype for a certain trait, and your Genotype for a gene involved in controlling that trait.
The trait is the ability to taste a bitter chemical called Phenylthiocarbamide, or PTC.
In the world population, there are 2 phenotypes:
Tasters can taste PTC
Non-Tasters cannot taste PTC.
The gene that plays a major role in controlling this trait is called the PTC gene. It is also called TAS2R38, but PTC is a lot easier to remember!
Worldwide, the PTC gene comes in 2 versions, called "Alleles".
(Show Slide - PTC GENE sequence)(Slide 4 in Ella Resources, it is placed after Slides 25)
----(Allele - One of two or more alternative forms of a gene)
The Taster allele encodes a protein called a taste receptor protein. These proteins are inserted in the plasma membranes of taste receptor cells in your tongue. When a molecule of PTC touches a cell in a taste bud in your tongue, it binds to the PTC taste receptor protein, which then becomes activated, and binds to another protein, setting off a signal transduction cascade, which leads to a signal being sent to your brain saying "I taste PTC." "Yuck!".
The Non-Taster allele of the PTC gene has a slightly different DNA sequence, so it codes for a protein with a slightly different amino acid sequence. This version of the protein is not activated by PTC, so it does not send any signal to the brain when you put PTC in your mouth.
Each of us has two PTC genes. We inherited one from Mom and one from Dad. So, you can have one of the following Genotypes for this gene
2 Taster alleles = "Homozygous Taster"
2 Non-Taster alleles = "Homozygous Non-Taster"
1 Taster Allele and 1 Non-Taster allele = "Heterozygous"
As for Phenotypes:
People who have the Homozygous Taster and the Heterozygous genotype can taste PTC.
People who have the Homozygous Non-Taster genotype canNOT taste PTC.
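The genotype and phenotype rules above can be written as a small lookup. This is only a sketch: the allele labels "T" (Taster) and "t" (Non-Taster) and the function names are shorthand made up for illustration, not notation from the textbook or the lab manual.

```python
# Allele labels "T" (Taster) and "t" (Non-Taster) are illustrative shorthand.

def ptc_genotype_name(allele_1, allele_2):
    """Name the genotype formed by the two inherited PTC alleles."""
    alleles = {allele_1, allele_2}
    if not alleles <= {"T", "t"}:
        raise ValueError("alleles must be 'T' or 't'")
    if alleles == {"T"}:
        return "Homozygous Taster"
    if alleles == {"t"}:
        return "Homozygous Non-Taster"
    return "Heterozygous"


def ptc_phenotype(allele_1, allele_2):
    """One Taster allele is enough to be able to taste PTC."""
    name = ptc_genotype_name(allele_1, allele_2)
    return "Non-Taster" if name == "Homozygous Non-Taster" else "Taster"


# One allele from Mom, one allele from Dad.
for pair in [("T", "T"), ("T", "t"), ("t", "t")]:
    print(pair, ptc_genotype_name(*pair), "->", ptc_phenotype(*pair))
```

Note that three genotypes collapse into only two phenotypes, which is why the taste test alone cannot tell a Homozygous Taster from a Heterozygote.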
In this research project, each of you will determine your phenotype with respect to ability to taste the chemical, PTC, and also your genotype for the PTC gene.
Today in lab, you'll test your ability to taste PTC by touching your tongue with a piece of paper soaked in the chemical.
You then isolate genomic DNA from yourself.
In subsequent weeks, you'll use molecular biology techniques to determine your genotype for the PTC gene. You will amplify a small piece of the PTC gene using a technique called "Polymerase Chain Reaction", or "PCR".
You will then treat the piece of the PTC gene with an enzyme called a restriction enzyme, which cuts the Taster allele into 2 pieces, but does not cut the Non-Taster allele.
You'll then use a technique called Gel Electrophoresis to see whether your PTC gene pieces were cut or not, and thus deduce your genotype.
In addition, we'll send some of your PTC gene pieces to a biotech company to have their complete DNA sequences determined, and we will do some Bioinformatics analysis.
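The logic of reading the gel can be sketched as follows. The fragment sizes used here are assumptions chosen only for illustration (the actual sizes depend on the primers and enzyme used in lab), and the function name is hypothetical.

```python
# Fragment sizes in base pairs are ASSUMED for illustration only.
UNCUT_BP = 221        # hypothetical full-length PCR product (not cut)
CUT_BP = (177, 44)    # hypothetical pieces produced when the enzyme cuts


def genotype_from_bands(band_sizes):
    """Deduce the PTC genotype from the band sizes seen on the gel."""
    bands = set(band_sizes)
    has_uncut = UNCUT_BP in bands          # Non-Taster allele is not cut
    has_cut = set(CUT_BP) <= bands         # Taster allele is cut into 2 pieces
    if has_cut and has_uncut:
        return "Heterozygous"
    if has_cut:
        return "Homozygous Taster"
    if has_uncut:
        return "Homozygous Non-Taster"
    raise ValueError("bands do not match any expected pattern")


print(genotype_from_bands([221]))
print(genotype_from_bands([177, 44]))
print(genotype_from_bands([221, 177, 44]))
```

The idea is simply that cut pieces reveal a Taster allele, an uncut piece reveals a Non-Taster allele, and seeing both means one of each.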
Finally, we'll do some Population Genetics analysis (a form of classical genetics) with our findings.
2) The Drosophila Mutational Analysis uses only classical genetic analysis. You will search for and find mutant fruit flies (Drosophila melanogaster) that look abnormal. You will then map the gene controlling the phenotype to a position on a chromosome, using Linkage Mapping. You will then try to determine what the gene is that is controlling the phenotype.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4th Hour
Review of the Central Dogma of Molecular Biology
(This Review material will not be covered on quizzes or exams, but you'll need to know it to understand the material that will be covered on quizzes and exams.)
Helpful Textbook Reading:
Chapters 7, 8 and 9
4th Hour Notes Tuesday September 9, 2008
Genes are made of DNA (DeoxyriboNucleic Acid).
Review of the structure of DNA
3 chemical components
Phosphate
Sugar = DEOXYRIBOSE
Nitrogen Bases - There are 4 - Adenine (A) and Guanine (G) (A and G are
purines), Cytosine (C) and Thymine (T) (Cytosine and Thymine are pyrimidines).
The combination of Deoxyribose, Phosphate, and Nitrogenous base is called a
Nucleotide.
A given nucleotide has a 5' Phosphate end, and a 3' Hydroxyl (OH) end.
(SHOW SLIDE, THE NUCLEOTIDES IN DNA)
A DNA strand is made by linking nucleotides
together, bonding 5' Phosphates to 3' Hydroxyls (Phosphodiester bonds).
Abbreviation - 5'-ACTG-3'
(SHOW SLIDE, BASE PAIRS).
A DNA molecule is usually Double-Stranded, and consists of 2 such strands,
aligned in an antiparallel fashion, and held together by specific Hydrogen
bonding interactions between the Nitrogenous bases. It thus has a Sugar-Phosphate
Backbone, and Nitrogenous Bases on the inside.
The N-base pairing interactions are specific
A pairs with T
G pairs with C
This makes a ladder structure, which
is twisted into a spiral, or helix called a DOUBLE-HELIX (SHOW SLIDE, DOUBLE
HELIX).
ABBREVIATION
5'-ACTG-3'
3'-TGAC-5'
Each pairing of A with T or G with C is called a base-pair (bp), and 1000 base-pairs is called 1 kilobase-pair, or 1 kb.
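Because the pairing rules are fixed, one strand determines the other. A minimal Python sketch of this idea (the function names are made up for illustration):

```python
# Base-pairing rules: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand_3_to_5(strand_5_to_3):
    """Partner strand written 3'->5', lined up base-for-base under the input."""
    return "".join(PAIR[base] for base in strand_5_to_3)


def reverse_complement(strand_5_to_3):
    """The same partner strand, rewritten in the usual 5'->3' direction."""
    return partner_strand_3_to_5(strand_5_to_3)[::-1]


top = "ACTG"
print("5'-" + top + "-3'")
print("3'-" + partner_strand_3_to_5(top) + "-5'")
print("partner read 5'->3':", reverse_complement(top))
```

The two printed strands reproduce the abbreviation used in these notes. Reading the partner strand 5' to 3' reverses its order because the strands are antiparallel.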
DNA AS THE MOLECULE OF HEREDITY
1. DNA REPLICATION
In order for DNA to be a hereditary molecule, it has to be able to duplicate.
It does, and this process is called "DNA Replication".
An enzyme called DNA Polymerase catalyzes DNA replication. The double helix
opens up, and each parent strand is replicated, making 2 new strands.
Parent DNA molecule:
5'AAGGCTGA3'
3'TTCCGACT5'
After replication (2 daughter molecules):
5'AAGGCTGA3' Old
3'TTCCGACT5' New
5'AAGGCTGA3' New
3'TTCCGACT5' Old
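The Old/New pattern in this diagram can be sketched in Python. This is a simplified model that only tracks which strand serves as template; it ignores the enzyme itself and the direction of synthesis, and the names are illustrative.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-for-base partner of a strand (written in the opposite direction)."""
    return "".join(PAIR[base] for base in strand)


def replicate(top_5_to_3, bottom_3_to_5):
    """Return 2 daughter duplexes; every strand is tagged Old or New."""
    # Each Old strand is the template for one New partner strand.
    daughter_1 = ((top_5_to_3, "Old"), (complement(top_5_to_3), "New"))
    daughter_2 = ((complement(bottom_3_to_5), "New"), (bottom_3_to_5, "Old"))
    return daughter_1, daughter_2


for daughter in replicate("AAGGCTGA", "TTCCGACT"):
    (top, top_age), (bottom, bottom_age) = daughter
    print("5'" + top + "3'", top_age)
    print("3'" + bottom + "5'", bottom_age)
```

Each daughter molecule ends up with one Old strand and one New strand, and both daughters have the same sequence as the parent.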
GENE EXPRESSION
HOW DOES A GENE EXERT ITS INFLUENCE ON PHENOTYPE???
CENTRAL DOGMA OF MOLECULAR BIOLOGY
DNA <---DNA Replication--DNA---Transcription--> RNA--Translation--> PROTEIN
IMPORTANT!!!!! - Not all genes code for proteins. There are genes that code for RNA molecules as their final products. The genes that code for proteins are called the "Protein Coding Genes", and we will focus on them right now.
RNA
Now, I've told you that genes, the units of genetic information, are made of
DNA.
How does DNA encode the genetic information?
Well, the code is in the sequence of different nucleotides.
Each chromosome is a very long sequence of nucleotide base-pairs.
The goal of the Human Genome Project is to determine the exact base-pair sequence
of each of the 24 different human chromosomes.
In doing so, we will determine the
sequence of each of our genes, of which we have about 25,000.
Genes are spaced out along the chromosomes.
A given protein-coding gene codes for a given protein.
There is a gene that encodes Actin - a protein that is part of the cytoskeleton.
The sequence of nucleotides in a gene encodes the sequence of amino acids in
the protein that that gene codes for.
Remember, very early in the course I told you that structures called ribosomes
are the machines that synthesize proteins???
Well, the ribosomes use the information encoded by a gene to make the appropriate
protein.
Now, there's a problem.
The genes are on the chromosomes, which are in the nucleus.
The ribosomes are out in the cytoplasm.
How does the code, or the message, get from the gene to the ribosome?
Answer: With the help of another type of nucleic acid, called RNA, and the
process of TRANSCRIPTION.
RNA stands for Ribonucleic Acid
We will discuss three different types of RNA, but let's first discuss the general
structure of RNA.
The structure of RNA is much like that of DNA. Each RNA nucleotide has the same 3 components:
Sugar
Phosphate group
Nitrogen base
There are 3 key differences.
First Difference:
The sugar in DNA is deoxyribose
The sugar in RNA is Ribose
Another difference
DNA N-containing bases: A, C, G, T
RNA N-containing bases: A, C, G, U (U = Uracil)
A Third Difference
DNA - usually double stranded, RNA is usually single stranded
The 3 Main Differences Between DNA and RNA
1) The sugar in DNA is deoxyribose, The sugar in RNA is RIBOSE
2) DNA - A, C, G, T, RNA - A, C, G, U
3) DNA - usually double stranded, RNA is usually single stranded
Transcription means: changing information from one form into another.
We'll focus on the transcription of DNA (a gene) into a messenger RNA transcript.
When we speak of gene transcription, we are speaking about this process.
Let's go through transcription of DNA into messenger RNA step by step.
Let's look at an example of one of these genes.
At one end of the gene is a signal to start transcription. This signal is called
the promoter, and it is, itself, a sequence of nucleotides.
An enzyme called RNA polymerase binds to the promoter and starts transcription.
RNA polymerase separates nearby DNA into 2 strands. Then it begins to assemble
an RNA polymer by adding ribonucleotides to a growing chain, in a 5' to 3'
direction, using one of the strands of DNA as a template (the Transcribed Strand
[TS]), and moving along that TS in a 3' to 5' direction. RNA polymerase adds a
ribonucleotide complementary to the nucleotide on the DNA template.
Where the DNA has an A, The RNA will get a U
Where the DNA has a C, The RNA will get a G
Where the DNA has a G, The RNA will get a C
Where the DNA has a T, The RNA will get an A
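These four rules are all that is needed to write out a transcript. Here is a minimal sketch; the template sequence is made up for illustration, and the function name is hypothetical.

```python
# DNA template base -> RNA base added by RNA polymerase.
DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe(template_3_to_5):
    """Build the RNA 5'->3' while reading the Transcribed Strand 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "TACGGCATT"   # hypothetical Transcribed Strand, written 3'->5'
print("DNA template 3'-" + template + "-5'")
print("RNA          5'-" + transcribe(template) + "-3'")
```

Because the template is written 3' to 5' here, the RNA comes out already in the 5' to 3' direction in which it is synthesized.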
As soon as one molecule of RNA polymerase moves off of the promoter, another
can bind to the promoter and start transcription again to make another RNA polymer.
When the RNA polymerase reaches a termination signal on the DNA, it leaves the DNA, and the newly transcribed RNA strand also detaches.
Now, in a prokaryotic cell, like E. coli, the DNA is not sequestered in a nucleus, so the ribosomes can immediately use the newly synthesized messenger
RNA as a code to start making a protein.
This is not the case in eukaryotic cells, in which the DNA is
sequestered in the double membrane-bounded nucleus.
In a Eukaryotic cell the messenger RNA has to pass out of the nucleus and
into the cytoplasm, where the ribosomes can then use it for instructions
for making proteins.
In a eukaryotic cell, the messenger RNA that comes off the RNA polymerase is not mature. It must be modified before it can be transported out to the cytoplasm and used by the ribosomes to make protein.
mRNA Processing
(SHOW SLIDE OF mRNA PROCESSING)
In Eukaryotes:
After the mRNA is made, it is PROCESSED. This takes place in the nucleus before the
mRNA is transported to the cytoplasm:
1) 5' Cap added (7-methyl guanosine residue)
2) Poly-A tail added at 3' end
3) Introns cut out and exons spliced back together.
SMALL NUCLEAR RNAs help to do this, recognizing specific splicing sequences
Then, the mature mRNA moves out of the nucleus and into the cytoplasm.
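The three processing steps can be sketched on a toy transcript. Everything specific here is an assumption for illustration: the sequence, the exon positions, the "cap-" text marker standing in for the 5' cap, and the very short poly-A tail (shortened for readability).

```python
def process_pre_mrna(pre_mrna, exons, poly_a_length=5):
    """Splice the exons together, then add a 5' cap marker and a poly-A tail.

    exons: list of (start, end) positions into pre_mrna, in order,
           using Python slice conventions (end is not included).
    """
    # 3) Introns cut out and exons spliced back together.
    spliced = "".join(pre_mrna[start:end] for start, end in exons)
    # 1) 5' cap (shown as a text marker) and 2) poly-A tail at the 3' end.
    return "cap-" + spliced + "A" * poly_a_length


# Toy primary transcript: exon 1 + intron + exon 2 (all hypothetical).
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "UUUUAA"
pre_mrna = exon_1 + intron + exon_2
exon_positions = [(0, 6), (18, 24)]

print(process_pre_mrna(pre_mrna, exon_positions))
```

The mature mRNA is shorter than the primary transcript because the intron is gone, even though a cap and tail have been added to the ends.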
What is the nature of the genetic information in an mRNA molecule????
The Genetic Code
The messenger RNA contains information
in the form of 3-letter units ("words")
called CODONS.
Each Codon specifies a specific amino acid, out of the 20 possible.
So, the sequence of 3-nucleotide codons in a mRNA codes for the sequence of
amino acids in a protein.
(SHOW SLIDE OF GENETIC CODE)
Discuss Genetic code
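Reading an mRNA as codons simply means reading it three letters at a time. A minimal sketch, with a made-up sequence and function name:

```python
def split_into_codons(mrna):
    """Split an mRNA (read 5'->3') into 3-letter codons.

    Any leftover 1 or 2 bases at the end are ignored.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


print(split_into_codons("AUGCCGUAA"))

# 4 possible bases at each of 3 positions gives 4 * 4 * 4 possible codons.
print(4 ** 3)
```

There are more possible codons than the 20 amino acids, so most amino acids are specified by more than one codon.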
TRANSLATION ( = Protein Synthesis)
We are going to discuss three different types of RNA molecule found in the
cell.
Types of RNA:
Messenger RNA (mRNA)
Contains the code for the order of amino acids in a protein. It carries this
info. from the DNA to the ribosomes, which make the protein.
Transfer RNA (tRNA)
Carries amino acids to the mRNA at the ribosomes and fits them in the proper
order into the growing protein.
Ribosomal RNA (rRNA)
Is a major component of ribosomes
All 3 of these types of RNA are synthesized by adding ribonucleotides to a
growing polymer chain, using a single strand of DNA as a template. This process,
called transcription, is directly analogous to DNA replication.
Let's pick up where we left off in a eukaryotic cell.
A primary transcript has been transcribed.
And it has been modified.
THEN, the transcript leaves the nucleus - probably through one of the nuclear
pores, and enters the cytoplasm.
The transcript is now ready to direct the synthesis of protein by ribosomes.
Some other molecules are going to play important roles in this process.
1) transfer RNA (tRNA) - Remember that the genetic
code consists of a series of three-letter words. Each word is a group of three
nucleotides, on the messenger RNA, called a codon.
Each codon specifies a particular amino acid.
How is a codon related to an amino acid for which it codes??
Codons and amino acids are related with the help of transfer RNA.
tRNA is an adaptor between mRNA and amino acids.
For each of the 20 amino acids, there is at least one tRNA.
A tRNA molecule is small, consisting of only 75 - 80 nucleotides.
At one end is an attachment site for a specific amino acid.
Special enzymes attach the correct amino acid to the correct tRNA molecule.
At the other end is a three nucleotide sequence called an anticodon.
The anticodon is the point of contact with the mRNA.
Each tRNA has a unique anticodon, allowing it to contact only one codon.
tRNAs are also designed to combine specifically with binding sites on ribosomes.
So, tRNA molecules:
1) Carry amino acids
2) associate with mRNA molecules, and:
3) interact with ribosomes.
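The way an anticodon contacts its codon is ordinary base pairing, with A pairing with U in RNA. A minimal sketch, assuming strict pairing at all three positions (the function name is made up for illustration):

```python
# RNA base-pairing rules: A pairs with U, G pairs with C.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5_to_3):
    """Anticodon written 3'->5', lined up base-for-base under the codon."""
    return "".join(RNA_PAIR[base] for base in codon_5_to_3)


codon = "AUG"
print("mRNA codon      5'-" + codon + "-3'")
print("tRNA anticodon  3'-" + anticodon_for(codon) + "-5'")
```

Like the two strands of DNA, the codon and anticodon pair in an antiparallel fashion, which is why the anticodon is written 3' to 5' here.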
The Ribosome.
Ribosomes are responsible for the synthesis of proteins in
the cell.
Now, let's discuss ribosomes in a bit more detail.
Each ribosome consists of two subunits.
1) a large subunit, and:
2) a small subunit
In eukaryotes:
the subunits consist of ribosomal RNA (rRNA), and proteins.
Each ribosome has two tRNA binding sites that participate in translation. The
ribosome also binds to the mRNA that it is translating.
TRANSLATION
We have been working our way through the steps by which the sequence of bases
in a template strand of DNA specifies the sequence of amino acids in a protein.
When not in use the small and large subunits of a ribosome are not associated
- they are apart from each other.
At the beginning of translation a tRNA carrying the first amino acid in the
protein, and the ribosomal subunits, bind to the starting point near the 5' end of the
mRNA.
DRAW THIS
Note that the first amino acid is almost always a methionine, so the starting
point is at an AUG codon in the mRNA.
The ribosome moves along the mRNA codon by codon, with tRNAs bringing the proper
amino acids to the growing chain.
This goes on and on, with the ribosome moving along the mRNA in a 5' - 3' direction.
Eventually, a STOP codon (UAA, UAG, or UGA) enters the A site, a Release factor
binds the A site, and translation stops.
The newly synthesized protein separates from the ribosome.
The orientation of the protein is such that the N, or amino terminus corresponds
to the first codon in the mRNA.
The C terminus is the last AA to join the chain.
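The steps of translation described above (start at AUG, move codon by codon, stop at a STOP codon) can be sketched in Python. The codon table below is deliberately partial, containing only the codons needed for the made-up example mRNA; a real table has all 64 codons. The function and variable names are illustrative.

```python
# PARTIAL codon table: only the codons used in the example below.
CODON_TABLE = {
    "AUG": "Met", "CCG": "Pro", "UUU": "Phe", "AAA": "Lys", "UGG": "Trp",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna_5_to_3):
    """Return the amino acids in order, from N terminus to C terminus."""
    start = mrna_5_to_3.find("AUG")      # starting point: first AUG codon
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna_5_to_3) - 2, 3):
        codon = mrna_5_to_3[i:i + 3]
        if codon in STOP_CODONS:         # translation stops here
            break
        # Raises KeyError for codons missing from the partial table.
        protein.append(CODON_TABLE[codon])
    return protein


# Hypothetical mRNA: 2 bases before the start codon, 2 after the stop codon.
mrna = "GG" + "AUG" + "CCG" + "UUU" + "AAA" + "UGG" + "UAA" + "GC"
print("-".join(translate(mrna)))
```

The first amino acid in the returned list is the N terminus (the Met from the AUG), and the last one is the C terminus, matching the orientation described above.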
Several ribosomes can work simultaneously on a single mRNA molecule to produce
a number of molecules of the same protein at the same time.
As soon as one ribosome moves far enough away from the initiation point, another
ribosome can get on the initiation point and begin translation.
The first ribosome to initiate translation is the first to finish translating and to be released from the mRNA.
END OF OPTIONAL REVIEW SESSION!!!!
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Extra Notes on MITOSIS and MEIOSIS (NOT COVERED IN REVIEW SESSION OR LECTURE!)
(This Review material will not be covered on quizzes or exams, but you'll need to
know it to understand the material that will be covered on quizzes and exams.)
Each gene is a part of a structure called a chromosome.
Each chromosome is made of a single, continuous very long DNA molecule that
is wrapped around proteins to pack it into the small space in the nucleus.
Without this packing, each chromosome would be centimeters long! How could
this fit into a nucleus that is microns (1 micron = 1 millionth of a meter)
in diameter?
Each chromosome has its own identity, its own look, and its own collection
of genes.
If you look at a body cell (not a gamete) of a diploid organism, you’ll
see that there are two of every chromosome.
The members of a pair of chromosomes are called “HOMOLOGOUS CHROMOSOMES”,
or “HOMOLOGS”.
Humans have 23 different chromosomes, and 2 of each.
So in each human nucleus there are 46 chromosomes
Diploid number = 46
In each human nucleus, one of these pairs of chromosomes is the SEX CHROMOSOMES.
In woman’s cells - 2 X chromosomes
In man’s cells – 1 X chromosome and 1 Y chromosome.
In this karyotype, each chromosome has replicated, and now consists of
2 SISTER CHROMATIDS joined at the centromere.
GAMETES (sperm and eggs) are haploid
Haploid number = 23
Woman makes gametes = Eggs.
Each egg has 1 each of chromosomes
1 – 22 (autosomes), and 1 X chromosome.
Man makes gametes = Sperm.
Each sperm can be of one of 2 types.
1 each of 1-22 (autosomes), and either 1 Y, OR 1 X chromosome.
If an X-containing sperm fertilizes an egg, zygote = girl
If a Y-containing sperm fertilizes an egg, zygote = boy.
Gametes are produced by cell division called MEIOSIS
1 diploid cell divides to yield 4 haploid cells.
Single-celled (diploid) Zygote develops into a multicellular human, including
cell division called MITOSIS
1 diploid cell divides to yield 2 diploid cells.
Let's go through mitosis step by step.
Remember, before mitosis, the cell is in Interphase.
Prophase:
During most of the cell cycle, the chromosomes, while packed tightly enough
to fit in the nucleus, have been fairly spread out and diffuse, but during
Prophase, the chromosomes condense very tightly, and look like this:
Each chromosome consists of 2 arms. In the center is a structure called a centromere.
At this time, each chromosome has replicated, and the 2 replicates, or duplicates,
are attached at their centromeres. Remember, the cell is diploid, having two
copies of each chromosome type.
So, after S period, the cell has 4 copies of every chromosome type.
In a human body cell:
there are 23 chromosome types
there are two homologues of each type
there are 2 sister CHROMATIDS of each homologue.
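The counts above multiply together. A short worked calculation in Python, using only the numbers from these notes:

```python
chromosome_types = 23          # different chromosome types in a human
homologues_per_type = 2        # diploid: one from Mom, one from Dad
chromatids_per_homologue = 2   # after replication in S period

# Copies of each chromosome type after replication (2 x 2).
copies_per_type = homologues_per_type * chromatids_per_homologue

# Totals for the whole nucleus.
chromosomes = chromosome_types * homologues_per_type
chromatids = chromosomes * chromatids_per_homologue

print("copies of each chromosome type:", copies_per_type)
print("chromosomes (diploid number):", chromosomes)
print("sister chromatids in total:", chromatids)
```

So a human body cell entering mitosis still has the diploid number of chromosomes, but each one consists of 2 sister chromatids.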
Metaphase:
During metaphase, the nuclear envelope fragments and disappears. Fibrous structures
made of microtubules, called spindle fibers, form and extend toward opposite
ends, or poles, of the cell. The condensed chromosomes are somehow attached
to the spindle apparatus, and are pulled to the equatorial plane of the cell.
The duplicate chromosomes are held at the plane, with one duplicate on each
side of the plane.
Anaphase:
The centromeres are pulled apart, freeing sister chromatids as individual chromosomes.
These chromosomes separate and move to opposite poles of the spindle complex.
2 copies, or homologues, of each chromosome type, move toward each of the 2
poles.
Telophase:
Two new nuclear envelopes develop as the two new nuclei form.
The spindle apparatus breaks up.
Division of the cytoplasm begins.
The nucleoli reappear.
The chromosomes often decondense at this time.
Cytokinesis
Next, the process of cytokinesis occurs.
In animal cells and protists, the plasma membrane constricts between the two
new nuclei, eventually pinching off into 2 new daughter cells.
In plant cells, a structure called the cell plate is laid down across the cytoplasm,
between the spindle poles. Eventually this cell plate cuts through the entire
cell. It becomes fortified with cellulose to become the new cell walls of the
two daughter cells.
Now we are back to interphase, specifically G1.
MEIOSIS
So far, when we have been talking about cells in a multicellular eukaryote,
like a human being, we have been talking about the cells of most of the body
- the arms, the heart, the brain, the bones, etc.
These cells are technically called Somatic Cells.
As I have told you, somatic cells have two copies of each chromosome type,
and are thus called diploid.
If we say that N = the number of different chromosome-types, then diploid
cells are 2N
A multicellular organism starts off life as a single fertilized egg.
The egg cell from the mother contains one copy of each chromosome type, and
is thus called haploid.
The sperm cell from the father contains one copy of each chromosome type,
and so is also haploid.
If we say that N = the number of different chromosome-types, then haploid
cells are 1N
The haploid egg and sperm cells are called gametes.
When the haploid sperm and the haploid egg get together, the resulting cell
is now diploid, and is called a zygote.
The zygote has two copies of each chromosome type - one copy from the mother,
and one copy from the father.
For a particular chromosome-type the copy from the mother, and the copy from
the father are called homologous chromosomes, or homologues.
This zygote will divide by MITOSIS to make two cells, which will divide to
make four cells, etc., etc. until it develops into a multicellular organism,
with many, many, diploid body cells.
Now, in order for this diploid organism to reproduce, it must produce haploid
gametes.
How does it do this?????
It does this by a type of nuclear and cell division called meiosis, in which a single diploid cell (with replicated chromosomes) divides into 4 haploid daughter cells.
Two nuclear divisions occur one after the other
First division - Homologous chromosomes pair, then separate from each other.
Second division - Sister Chromatids separate from each other.
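The two divisions can be sketched with a toy cell. This is a simplified model under stated assumptions: it has only 2 chromosome types, it ignores Crossing Over, and it always sends the maternal homologues to the same daughter cell (real cells orient each pair at random). The labels such as "1m" (chromosome type 1, from the mother) are made up for illustration.

```python
def first_division(cell):
    """Homologous chromosomes separate from each other."""
    cell_a = [pair[0] for pair in cell]
    cell_b = [pair[1] for pair in cell]
    return cell_a, cell_b


def second_division(cell):
    """Sister chromatids separate from each other."""
    daughter_1 = [chromosome[0] for chromosome in cell]
    daughter_2 = [chromosome[1] for chromosome in cell]
    return daughter_1, daughter_2


def meiosis(cell):
    """1 diploid cell with replicated chromosomes -> 4 haploid cells."""
    gametes = []
    for half in first_division(cell):
        gametes.extend(second_division(half))
    return gametes


# Each homologous pair is (maternal, paternal); each replicated chromosome
# is a tuple of its 2 sister chromatids.
diploid_cell = [
    (("1m", "1m"), ("1p", "1p")),
    (("2m", "2m"), ("2p", "2p")),
]

for gamete in meiosis(diploid_cell):
    print(gamete)
```

Each of the 4 resulting cells has exactly one copy of each chromosome type, which is what haploid means.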
Sperm and Egg cells - gametes, are produced via meiosis.
Quick Gametogenesis Overview
Sperm cells are produced in the male's testes
Egg cells or ova (singular = ovum) are produced in the ovaries of the female.
Meiosis is a continuous process, but for the sake of convenience of discussion,
we can divide it up into a number of different phases.
Two separate nuclear divisions occur during meiosis
During Prophase I of meiosis, each chromosome somehow finds its homologue among
all the other chromosomes in the nucleus.
The two homologues of each pair line up next to each other very precisely.
During synapsis, portions of the chromosomes are exchanged between homologous
chromosomes.
This process is called Crossing Over.
At the end of meiosis, note that the result is 4 new haploid nuclei!!!!
END OF REVIEW INFORMATION!!!!
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
THE NATURE OF GENES and Genomes
Chromosomes and Genomes
The sum of an organism's genetic information is called that organism's Genome.
EUKARYOTES vs. PROKARYOTES
Prokaryotes: Bacteria
Example = E. coli.
Bacteria don't really have chromosomes. A bacterium is a single-celled organism
that has a single, circular ds DNA molecule for a genome.
The bacterium, Escherichia coli (E. coli), which we'll be using in lab, has
a genome consisting of a circular ds DNA molecule 4,700 kb in length that contains
4,288 genes.
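These two numbers give a feel for how tightly genes are packed in the E. coli genome. A short worked calculation, using only the figures quoted above:

```python
genome_kb = 4_700     # length of the E. coli genome in kb
gene_count = 4_288    # number of genes in the genome

kb_per_gene = genome_kb / gene_count
print("average genome length per gene (kb):", round(kb_per_gene, 2))
```

On average there is a gene roughly every 1.1 kb. This is an average spacing, not a gene length, since the genome also contains DNA that lies between genes.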
In addition to their large, circular DNA genomes, bacterial cells can also
contain much smaller, circular ds DNA molecules called PLASMIDS.
(SHOW SLIDE - BACTERIAL GENOMIC DNA)(Slide 5 in Ella Resources)
Plasmids are not essential to the cell, but can carry numerous genes, often
for antibiotic resistance.
Plasmids can replicate into high copy number, and are passed on to daughter
cells when a bacterial cell divides.
(SHOW SLIDE - Bacterial Plasmids)(Slide 6 in Ella Resources)
A Eukaryote's genome is made up of CHROMOSOMES
A Gene is a subset of a Chromosome. Each Chromosome in a Eukaryote (everything
that is not a bacterium. Yeast to humans) consists of one long, double-stranded
DNA molecule, complexed with proteins. Each Chromosome has many genes.
In between genes, there are non-coding regions called INTERGENIC REGIONS, which do not code for any product.
(SHOW SLIDE, FIG. 2-7, p. 36)(Slide 7 in Ella Resources)
(SHOW SLIDE, FIG. 2-8, p. 37)(Slide 8 in Ella Resources)
The DNA molecule in a cell is way too long to fit into the nucleus of a cell, so, with the help of proteins, it is packed tightly into a compact structure.
(SHOW SLIDE, FIG. 2-3, p. 34)(Slide 9
in Ella Resources)
These are the chromosomes of the "Indian Muntjac", a small deer. They are undergoing mitosis.
The Proteins that help package the DNA are called HISTONES.
1st order level of packaging are units called "NUCLEOSOMES".
Histone Octamer – H2A, H2B, H3, H4 (2 of each)
DNA wraps around a histone octamer almost twice to form a nucleosome.
(SHOW SLIDE, FIG. 2-4a, p. 35) (Slide 10 in Ella Resources)
(SHOW SLIDE, FIG. 2-4b, p. 35)(Slide 11 in Ella Resources)
Nucleosomes spaced out along DNA like beads on a string.
H1 histones in between nucleosomes.
(SHOW SLIDE, FIG. 2-5, p. 35)(Slide 12
in Ella Resources)
Packed DNA + Proteins = CHROMATIN
A chromosome is made up of chromatin.
(SHOW SLIDE – Figure 2-26, p. 48)
(SHOW SLIDE - CHROMOSOME PACKAGING)(Slide
13 in Ella Resources)
(SHOW SLIDE - Human Karyotype)(Slide 14 in Ella Resources)
The human genome is made up of different types of chromosomes, each carrying
its own set of genes.
1-22 = Autosomes
X
Y
X and Y are the sex chromosomes.
In this karyotype, the chromosomes have a cross structure because they have replicated. They would not look like this if not replicated. Since each chromosome has replicated, each is made up of 2 identical SISTER CHROMATIDS, which are connected at the centromere.
Also would not be so condensed if not in the right part of the cell cycle.
Each cell in a woman has 2 each of 1-22, plus 2 X chromosomes. The members
of a pair of chromosomes of the same type are called "HOMOLOGOUS CHROMOSOMES"
or "HOMOLOGUES" (also spelled "HOMOLOGS").
Each cell in a man has 2 each of 1-22, plus an X and a Y chromosome.
Each human nucleus (excluding eggs and sperm) has 2 of each chromosome, and
is called DIPLOID.
Mature egg cells and sperm cells (the Gametes) have only 1 of each chromosome,
and are called HAPLOID.
Human chromosomes average 10s of thousands of kb in length, and contain thousands
of genes.
MEIOSIS = Cell divisions that produce sex cells. Meiosis results in the production of
Haploid sperm by men, and Haploid Eggs by women.
Each child gets one set of 23 chromosomes from father, and one set of 23 chromosomes
from mother.
Thus we each get 2 copies of every gene.
Exceptions = Men get only one copy of genes contained in the X chromosome.
Women get no genes contained on Y chromosome.
ORGANELLAR GENOMES
In addition to their nuclear genomes, made up of chromosomes, eukaryotic cells
also have genomes in their organelles:
mitochondria and
chloroplasts
These are circular, ds DNA molecules on the order of 10s of kb. Can be present
in high copy number.
Share no genes with nuclear genome. Similar to prokaryotic genomes.
Endosymbiont Theory states that mitochondria and chloroplasts are descended from prokaryotic cells that were engulfed by, and took up residence in larger cells.
(SHOW SLIDE - COMPARATIVE GENOMES, incl. ORGANELLAR GENOMES)(Slide 15 in Ella Resources)
GREEN – Coding Regions
Light Green – Introns (we'll discuss later)
Red – Repetitive DNA
Blue - Repetitive DNA
Polymerase Chain Reaction (PCR) (Note that this has been moved up in the lecture schedule!)
Gene Structure and Function
Control of gene Expression
IN LAB NEXT WEEK!!!:
PCR
Polymerase Chain Reaction (PCR)
See this web site for a good PCR tutorial! (Choose from the menu)
See this web site for molecular biology tutorials! (Choose PCR from the menu at left)
(SHOW SLIDE, FIGURE 2-8, p. 37 - A Specific Human Chromosomal Landscape)(Slide 8 in Ella Resources)
As you look at this slide of a piece of a specific human chromosome, you might ask, how do we get our hands on just one of these genes, so we can study it?
(SHOW MAP OF PTC GENE [TAS2R38] ON WEB - Go to the National Center for Biotechnology Information home page.
In drop-down menu next to "Search", select "Gene".
In text box next to "For", write "TAS2R38".
Click "GO".
Click on #1 TAS2R38.
Scroll down to "Genomic context" to see map.)
In the lab, how are we isolating this PTC gene from our own genomic DNA preferentially over all of the other DNA in the genome?
The answer is that we are using a technique called Polymerase Chain Reaction (PCR).
Polymerase Chain Reaction (PCR) is a method for making many copies of, or
amplifying, a specific gene or other DNA sequence. PCR is performed using a
pair of specific single-stranded primers, which are themselves sequences of
DNA, usually about 20 bases long. A primer is designed to be complementary
in sequence to a specific region of the genome or other template DNA, and a
pair of primers that flanks a specific region of the genome is used to amplify
(make many copies of) that genomic region, which is called the template. The
primers prime replication of the target DNA sequence by an enzyme called Taq
DNA polymerase. Taq DNA polymerase is isolated from a bacterium that lives
in very hot water, and this enzyme is extremely resistant to heat.
A PCR reaction mixture contains
Template DNA (in the lab, this is your own genomic DNA),
Forward (left) primer (many molecules)
Reverse (right) primer (many molecules)
Taq DNA polymerase,
free deoxyribonucleotide triphosphates (dNTPs) (dATP, dCTP,
dTTP, and dGTP), and
buffer (contains MgCl2).
In PCR, template DNA is
- first heated to denature the double stranded DNA molecules, making them
single stranded. This is the DENATURATION STEP.
- The reaction mix is then cooled, allowing primers to anneal to complementary
sequences on opposite
strands of the DNA (by hydrogen bonding between complementary bases: A-T, G-C),
flanking the DNA segment to be
amplified. This is the ANNEALING STEP.
- The reaction is then brought to an intermediate temperature, and, using free
deoxyribonucleotides added to the
reaction mixture, Taq DNA polymerase extends these primers from their 3' ends
toward each other. This replicates the region between the two primers, and
generates two double stranded DNA molecules from the original one. This is
the EXTENSION STEP.
The DENATURATION STEP, ANNEALING STEP, and EXTENSION STEP make one Cycle of PCR.
After this cycle of replication is completed, the reaction mixture is heated to denature the double stranded DNA molecules, beginning another cycle. Then the temperature is lowered to allow the primers to anneal again - this time with double the number of templates - and then brought to the extension temperature to allow replication again, completing another cycle. This process is repeated for a number of cycles (usually 20-30 cycles), resulting in the production of many copies of the template DNA sequence. These copies are called the PCR product(s). Depending on the number of nucleotide base-pairs present between the two primers used in a PCR reaction, different-sized fragments of DNA (products) will be generated in the reaction.
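The doubling described above can be put into numbers. The following is a minimal sketch in Python, assuming an ideal reaction in which every template is copied in every cycle (real reactions fall short of this). The function names and the primer coordinates are hypothetical and are not part of the lab protocol.

```python
def copies_after_cycles(starting_templates, cycles):
    """Ideal PCR: the number of double-stranded target molecules doubles each cycle."""
    return starting_templates * 2 ** cycles


def product_length(forward_start, reverse_end):
    """Length in bp of a product spanning forward_start..reverse_end (1-based, inclusive)."""
    return reverse_end - forward_start + 1


# Upper bound on copies made from a single template molecule.
for n in (1, 2, 10, 20, 30):
    print(n, "cycles ->", copies_after_cycles(1, n), "copies")

# Hypothetical primer positions, chosen only so that the product is 303 bp long.
print(product_length(101, 403), "bp product")
```

The product length depends only on where the two primers anneal, which is why different primer pairs give different-sized products.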
-------------
GENE EXPRESSION - The process by which a gene produces its product from the information that it encodes. In the case of a protein-coding gene, this is by coding for the production of a messenger RNA, which then codes for the production of a specific polypeptide, or protein.
GENE EXPRESSION
The genetic information in a gene is in the sequence of nucleotides, which
codes for a sequence of nucleotides in an RNA (Ribonucleic Acid). A messenger
RNA (mRNA) molecule,
in turn, codes for a sequence of amino acids in a specific protein. (Central
Dogma of Molecular Biology)
Genes are only actively transcribed at certain times, and in certain cell types.
Chimp and Human have all the same protein-coding genes (about 25,000). What makes us different from a chimp?
The precise regulation of the time and place that each gene is expressed!
GENE STRUCTURE
(SHOW SLIDE, EUKARYOTIC GENE STRUCTURE)(Slide 16 in Ella Resources)
EUKARYOTIC GENES
If we look at the structure of a typical eukaryotic protein coding gene, we
will see that it consists of:
1) Regulatory sequences, usually near a promoter
2) The Promoter site to which RNA Polymerase enzyme binds to begin
transcription
3) Coding sequences which code for the production of a messenger RNA
(mRNA).
4) A Transcription Termination Signal
The coding regions of genes in prokaryotes (bacterial cells) are continuous.
The coding regions of genes in Eukaryotes (yeast to humans) are interrupted
by intervening sequences called INTRONS.
EXONS will code for amino acids in the protein product.
Introns will not end up represented in the protein product.
Genes have REGULATORY REGIONS, and CODING REGIONS. The coding region of a gene
actually codes for the nucleotide sequence in the RNA molecule, and thus the
protein.
The regulatory region ensures that the gene gets transcribed, and that it gets transcribed only in the right cells at the right times.
Thus, the coding regions produce their RNA and protein products only when needed.
(SHOW SLIDE, EUKARYOTIC GENE AND RNA STRUCTURE)(Slide 17 in Ella Resources)
Regulatory regions are binding sites for proteins called Transcription Factors.
The Regulatory Regions also include the PROMOTER, which is the binding site for the Transcription Initiation Complex - A group of proteins that form a platform to which RNA polymerase binds to begin transcribing the gene.
Transcription Factors regulate the ability of RNA Polymerase to bind to the promoter and begin transcribing the gene.
To initiate transcription, eukaryotic RNA polymerase requires the assistance
of proteins called transcription factors
General transcription factors are essential for the transcription of all protein-coding
genes
In eukaryotes, high levels of transcription of particular genes depend on control elements interacting with specific transcription factors
Proximal control elements are located close to the promoter
Distal control elements, groups of which are called enhancers, may be far away
from a gene or even in an intron
An activator is a protein that binds to an enhancer and stimulates transcription of a gene (by helping RNA polymerase bind to the promoter and transcribe the gene).
A repressor is a protein that binds to an enhancer and inhibits transcription of a gene (by inhibiting RNA polymerase from binding to the promoter and transcribing the gene).
(The Estrogen Receptor is an example of a specific transcription factor.)
Genes are only actively transcribed at certain times, and in certain cell types.
ex) insulin gene - on in pancreas cells, off in eye cells
photoreceptor protein gene - off in pancreas cells, on in eye cells.
All cells regulate which genes are active at a given time.
Transcriptional Regulation in Prokaryotes:
Different from Eukaryotes!
PROKARYOTIC EXAMPLE (CONTROL OF GENE EXPRESSION)
Prokaryotic (bacterial) genes also have regulatory regions. Ensure a gene is ON
or OFF at right times.
Ex) lacZ gene.
The Lac Operon
Prokaryotes regulate gene expression by controlling transcription, in one of
two ways: either the presence of the substrate induces the transcription of
the gene specifying an enzyme that acts on that substrate, or the presence
of the product of an enzyme represses transcription of the gene specifying
that enzyme. In the former, the cell makes a digestive enzyme only when there
is something for it to digest; in the latter, it makes a synthetic enzyme only
when it needs the product of that enzyme.
The lac operon is an inducible system, meaning that the presence of the substrate lactose induces transcription of the genes specifying the enzymes necessary for its digestion. When lactose is absent, the repressor protein binds to the operator and prevents transcription. When lactose is present, it binds to the repressor protein, changing its shape in a way that prevents it from binding to the operator, and so transcription is permitted.
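The on/off logic of the lac operon described above can be written as a very small model. This is a simplified sketch in Python that includes only the repressor and lactose, exactly as described in the paragraph above; it leaves out any other layers of regulation. The function name is made up for this example.

```python
def lac_operon_transcribed(lactose_present):
    """Return True if the lac operon genes can be transcribed.

    Simplified model: the repressor sits on the operator unless
    lactose is bound to it.
    """
    repressor_bound_to_operator = not lactose_present
    return not repressor_bound_to_operator


for lactose in (False, True):
    state = "ON" if lac_operon_transcribed(lactose) else "OFF"
    print("lactose present:", lactose, "-> transcription", state)
```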
EUKARYOTIC GENE EXPRESSION
(SHOW SLIDE, TS and NTS)(Slide 18 in Ella Resources)
In any gene, ONLY one of the strands is the Transcribed Strand (TS) (also called the Template Strand). The other strand is called the Non-Transcribed Strand (NTS) (also called the Non-Template Strand).
The end of the gene where the promoter is located is called the 5' end of the gene, because the 5' end of the RNA is made first. The other end is called the 3' end of the gene.
(SHOW SLIDE, SEVERAL GENES ON SAME CHROMOSOME, TS and NTS)(Slide 19 in Ella Resources)
A chromosome contains many genes. Some have one strand as the TS, others have the other strand as the TS
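The relationship between the two strands and the mRNA can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up for this example, and the short sequence is just a sample template strand. It assumes the template strand is written 3' to 5', so that the mRNA reads 5' to 3' from left to right.

```python
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Template strand (written 3'->5') -> mRNA (written 5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def non_template_strand(template_3_to_5):
    """Template strand (written 3'->5') -> non-template strand (written 5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3_to_5.upper())


template = "TACTTTGTCGGTCCA"  # sample template strand, 3'->5'
print("mRNA 5'-" + transcribe(template) + "-3'")
print("NTS  5'-" + non_template_strand(template) + "-3'")
```

Note that the mRNA has the same sequence as the non-template strand, with U in place of T.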
(SHOW SLIDE, FIGURE 2-8, p. 37 - A Specific Human Chromosomal Landscape)(Slide 20 in Ella Resources)
mRNA Processing (THIS IS BRIEF REVIEW FROM BIO 200!)
In Eukaryotes:
After the mRNA is made, it is PROCESSED. This takes place in the nucleus before the
mRNA is transported to the cytoplasm:
1) 5' Cap added (7-methyl guanosine residue)
2) Poly-A tail added at 3' end
3) Introns cut out and exons spliced back together.
SMALL NUCLEAR RNAs help to do this, recognizing specific splicing sequences
Introns allow for one gene to code for >1 protein via alternative exon splicing. (Alternative exon splicing is NOT ON QUIZ #1)
(SHOW SLIDE - A TYPICAL EUKARYOTIC mRNA)(Slide
21 in Ella Resources)(Also Slide 17 in Ella Resources)
If we look at a typical mature eukaryotic mRNA, we see that it has:
1) a 5' CAP, which helps the ribosome bind to the 5' end of the mRNA
2) 5' Untranslated Regions (UTR) (or Sequences) which contain regulatory information telling the ribosomes where to bind to begin translation.
---------Includes a consensus translation start sequence
3) a START codon, which is the first codon read by the ribosomes to produce the polypeptide, or protein. The start codon begins the
4) Open Reading Frame (ORF), which is the sequence of ribonucleotide triplets,
called codons, which code for the sequence of amino acids in the protein. The
ORF ends with a
5) STOP codon, which tells the ribosomes to stop translation.
6) 3' Untranslated Regions (or Sequences) which often have important
regulatory functions, such as tethering an mRNA to the appropriate place in
a cell.
(Example: nanos mRNA localized to posterior end of early Drosophila embryo
by interactions between its 3' untranslated sequences and the cytoskeleton.
Causes high levels of NANOS protein in the posterior. There NANOS protein blocks
translation of hunchback mRNA, so the posterior has low HUNCHBACK protein and develops
as a rear end!)
The 3' UTR also contains a polyadenylation signal, which tells special enzymes to add a Poly-A tail to the 3' end of the mRNA.
7) Poly-A tail at 3' end
STOP HERE FOR QUIZ #1!!!!
(SHOW SLIDE OF mRNA PROCESSING [from Web site http://www.emunix.emich.edu/~rwinning/genetics/eureg2.htm])
Introns allow for one gene to code for >1 protein via alternative exon splicing. (SHOW SLIDE, Alternative Splicing, Slide 22 in Ella Resources)(Also Slide 23 in Ella Resources)
Protein Structure
Proteins
(SHOW SLIDE - PROTEIN STRUCTURE - Fig. 9-3, p. 323)(Slide
24 in Ella Resources)
After an mRNA has been processed, it is a mature mRNA that moves into the cytoplasm. There, the ribosomes read the sequence of ribonucleotides in the ORF of the mRNA, and string together the correct sequence of amino acids to make the protein. A ribosome reads the mRNA starting at the 5' end of the ORF and moving to the 3' end until it reaches a stop codon. The ribosome reads the nucleotides in the mRNA in a triplet code (1 triplet = 1 codon). Each codon specifies one of the 20 amino acids. A ribosome starts the synthesis of the protein at the amino end (specified by the 5' end of the ORF), and connects amino acids together, carboxyl end to amino end, by forming peptide bonds. A ribosome finishes by adding the last amino acid to the carboxyl end of the protein.
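The reading of an ORF in triplets can be sketched as a short Python program. This is a minimal sketch, not a model of a real ribosome: it assumes that translation simply starts at the first AUG in the sequence and stops at the first in-frame stop codon, and it reports amino acids in one-letter code. The function name and the second test sequence are made up for this example.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon ('*')."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the ribosome releases the protein
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGAAACAGCCAGGU"))   # Met-Lys-Gln-Pro-Gly in one-letter code
print(translate("GGAUGAAACAGUAACC"))  # short 5' UTR, then AUG AAA CAG UAA
```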
Example - Human beta Globin and Hemoglobin
A protein is formed from one or more polypeptides.
The monomers that make up polypeptides are amino acids.
Amino Acids H2N-CHR-COOH
There are 20 different amino acids with 20 different R groups.
Two amino acids become attached by dehydration synthesis between carboxyl
group of one and the amino group of another to form peptide bonds.
Make a long chain, like beads in a necklace.
Primary Structure
Different polypeptides have different sequences of amino acids. This is the
primary structure of the protein.
Our own cells build about 60,000 different kinds of proteins.
Secondary Structure
A newly formed protein twists and folds into different types of shapes called
its secondary structure.
These shapes can be:
alpha helix, or
beta sheet.
These structures are held by bonding forces (hydrogen bonding, electrostatic
forces, and van der Waals forces) between amino acids.
Tertiary Structure
Most proteins have a tertiary structure. It forms as the secondary structure folds
back upon itself to form a spherical shape, or other 3-dimensional shape.
It is formed by bonds (usually hydrogen or ionic) between R groups.
Quaternary Structure
Forms as bonds form between two or more polypeptides.
Only some proteins have quaternary structure.
Example - mammalian hemoglobin - made of 4 polypeptide subunits. 2 alpha globin and 2 beta globin
The shape of a protein determines its properties and functions, and is dependent upon the primary amino acid sequence, which is dependent on the nucleotide sequence of the gene that encodes it!!!
SHOW:
Structure of Human Hemoglobin from NCBI (Cn3D) (You need to download the Cn3D software from the NCBI web site to view this.)
Protein Data Bank (http://www.rcsb.org/pdb/Welcome.do;jsessionid=PYIcf8vAnVV19MxUNY46gA)
National Center for Biotechnology Information home page
TYPES OF PROTEINS
Structural - example: Keratin, provide structure and strength to cells and tissues
Enzymes - examples: ß-galactosidase and RNA Polymerase, catalyze chemical
reactions.
Messenger - example: G-protein, Transmit signals from outside the cell to inside the cell
Receptor - example: PTC receptor (TAS2R38), With the help of messenger proteins, transmit signals from outside the cell to inside the cell
(SHOW SLIDE - G-Protein coupled Receptor PROTEIN)(Slide 25 (the second one) in Ella Resources)
MUTATION
WHEN SOMETHING GOES WRONG -
Change in DNA base-pair sequence of gene can lead to change in amino acid
sequence of protein.
Can cause different structure of protein.
Missing an important amino acid at an important site (such as the active site) of the protein may affect
function.
Can result in different forms of a gene (called different ALLELES)
Allele - One of two or more alternative forms of a gene
Wild-type allele - the allele found most commonly in nature, and that carries
out the normal function.
Can also have mutant alleles. - have a different bp sequence from the wild-type allele
MORE ON MUTATION
Each chromosome contains many genes.
These genes are basically positions along the chromosome, and consist of specific
sequences of nucleotide base-pairs.
The sequence of different nucleotide base-pairs (A, G, C, or T) in a gene
is what contains the information.
A sequence of nucleotides in DNA specifies the sequence of amino acids in
a protein.
So, for any given gene, having the correct sequence of nucleotides is critical
to the gene's proper function.
A mutation is a change in the sequence of nucleotide base pairs in a gene.
This can affect coding regions, and alter the amino acid sequence of the protein.
Also can affect regulatory regions and alter the regulation of the gene at
the molecular level.
There are several different types of mutations:
A Point mutation is a change in single nucleotide base pair, or in a very
small number of adjacent base pairs.
Insertion - insertion of base-pairs into a chromosomal location.
----An Insertion can lead to the addition of extra DNA into a gene, which
can severely disrupt the gene. An insertion of one base pair can be considered
a point mutation.
Transposable elements can cause insertion mutations. A transposable element
is a short sequence of DNA that can move from one location to another within
a genome (can move from one chromosomal insertion site to a different chromosomal
insertion site).
Deletion - results from the removal of pieces of the DNA molecule, resulting
in gaps in genes. A deletion of one base pair can be considered
a point mutation.
Duplication - the repeating of a segment of DNA.
Inversion - "flipping" of a section of chromosome.
Translocation - movement of a piece of chromosome (other than a transposable
element) from one location to another within a genome.
Point Mutations
DRAW EXAMPLE, SHOW AFTER REPLICATION
There are 2 classes of point mutations that involve base pair substitutions:
Transitions - change from purine to purine, or pyrimidine to pyrimidine
G and A = purines
C and T = pyrimidines
Example: G-C pair to A-T pair,
or C-G pair to T-A pair.
Transversion - change from purine to pyrimidine, or pyrimidine to purine
DRAW EXAMPLE OF TRANSVERSION
Example: G-C pair to T-A pair.
Mutation in a coding sequence of a gene can change what that
gene codes for.
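The two classes of base substitution can be told apart mechanically. The following Python sketch classifies a change of one DNA base to another on a single strand; the function name is made up for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, mutant):
    """Classify a single-base change as a transition or a transversion."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be A, G, C, or T")
    if original == mutant:
        return "no change"
    # Same chemical class (purine->purine or pyrimidine->pyrimidine) = transition.
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


print("G -> A:", classify_substitution("G", "A"))
print("C -> T:", classify_substitution("C", "T"))
print("G -> T:", classify_substitution("G", "T"))
```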
Mutation in Regulatory region can disrupt transcription of the gene.
A lot of sequence in most chromosomes in eukaryotes (INTERGENIC REGIONS) does not contain genes, and mutations that occur in DNA
in these places may have no noticeable effect on the phenotype.
Mutations, such as point mutations, can occur spontaneously.
Spontaneous mutations occur naturally and arise in all cells.
Mutations can also be induced by agents that cause changes in DNA.
Point mutations can arise spontaneously when errors in DNA replication occur.
Point mutations can also arise spontaneously from spontaneous lesions,
which result in changes from one base to a different base.
Point mutations can also be induced by chemical mutagens.
Mutagens:
Chemical Mutagens - chemicals that cause changes in DNA sequence. Often point mutations.
ex) Ethyl methanesulfonate (EMS) - used in lab in Mutational Analysis Project.
EMS causes mostly G-C --> A-T transitions
-----------------------------------------------------------------------------------------------------------
REVERSION - change from a mutant base-pair sequence back to the wild-type sequence.
WT ---- Forward Mutation ----> mutant ---- Reversion ----> WT
G ---------------------------> A ------------------------> G
C ---------------------------> T ------------------------> C
Point mutations can be reverted back to wild-type by changing the altered
base back to the correct wild-type base.
For example, if in a gene, the correct base-pair at a specific position is
A-T, and a mutation changes this to a G-C base-pair,
a change back to an A-T base-pair will revert the gene back to the wild-type
form.
More on Mutations
We discussed the Central Dogma of Molecular Biology.
DNA (genes) code for:
RNA (mRNAs), which, in turn, code for:
Proteins.
In order for a gene to properly encode its protein, the sequence of that gene
must be correct, so that it will direct the correct mRNA sequence of ribonucleotides,
and the right amino acid sequence in the protein.
A stable, heritable change in a gene is called a mutation.
I'd like to discuss some different types of mutations that can occur in genes.
DRAW A DNA SEQUENCE
A Point Mutation is a change in a single nucleotide base. In order for a point
mutation to be stable and heritable, it has to go through a round of DNA replication.
After replication, this will result in a change in a single base-pair.
For example - a change from a C to an A. After replication, this will yield an A:T
base-pair, where there should be a C:G base-pair.
DRAW THIS!!!!!!!!
Or, a change from a T to a C.
Any change from one base to another is a point mutation.
Missense Point Mutation. The mutation results in a codon that encodes the wrong amino acid.
(Slide 26 in Ella Resources)
--DNA 3'-TACTTTGTCGGTCCA-5' ----------------------------->3'-TACTTTGTAGGTCCA-5'
mRNA 5'-AUGAAACAGCCAGGU-3'-------------------------> 5'-AUGAAACAUCCAGGU-3'
Protein---- fMet--Lys-Gln--Pro-Gly---------------------------------> fMet--Lys-His--Pro-Gly
(Slide 31 in Ella Resources - PTC gene sequence)
The change from a C in the Taster allele to a G in the non-taster allele at position 145 in the PTC gene is a Missense Point Mutation, because that changes a codon for Proline in the Taster allele to a codon for Alanine in the non-taster allele.
Nonsense Point Mutation. The mutation results in a premature STOP codon and
a shorter than normal protein.
(Slide 27 in Ella Resources)
--DNA 3'-TACTTTGTCGGTCCA-5'--------------------------------> 3'-TACATTGTCGGTCCA-5'
mRNA 5'-AUGAAACAGCCAGGU-3'----------------------------> 5'-AUGUAACAGCCAGGU-3'
Protein-- fMet--Lys-Gln--Pro-Gly--------------------------------------> fMet- (UAA
is a STOP codon)
Silent point mutation - The mutation results in a codon that encodes the same
amino acid as the wild-type allele. No effect on Protein
(Slide 28 in Ella Resources)
--DNA 3'-TACTTTGTCGGTCCA-5'------------------------------> 3'-TACTTTGTTGGTCCA-5'
mRNA 5'-AUGAAACAGCCAGGU-3'-----------------------------> 5'-AUGAAACAACCAGGU-3'
Protein fMet--Lys-Gln--Pro-Gly-----------------------------------------> fMet--Lys-Gln--Pro-Gly
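The three kinds of point mutation above can be distinguished by comparing the affected codon before and after the change. The Python sketch below does this for a single base substitution in the template strand. It assumes that both sequences are written 3' to 5', begin at the start codon, have a length that is a multiple of 3, and that the affected wild-type codon codes for an amino acid. The function names are made up for this example, and the nonsense test case uses a template that differs from the wild type at one base only (TTT to ATT).

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Template strand (written 3'->5') -> mRNA (written 5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def classify_point_mutation(wild_type, mutant):
    """Return 'silent', 'missense', or 'nonsense' for one base substitution."""
    wild_type, mutant = wild_type.upper(), mutant.upper()
    if len(wild_type) != len(mutant):
        raise ValueError("substitutions only: sequences must be the same length")
    changed = [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]
    if len(changed) != 1:
        raise ValueError("expected exactly one changed base")
    start = (changed[0] // 3) * 3  # first base of the affected codon
    wt_aa = CODON_TABLE[transcribe(wild_type)[start:start + 3]]
    mut_aa = CODON_TABLE[transcribe(mutant)[start:start + 3]]
    if mut_aa == wt_aa:
        return "silent"
    if mut_aa == "*":
        return "nonsense"
    return "missense"


wild_type = "TACTTTGTCGGTCCA"
for mutant in ("TACTTTGTAGGTCCA", "TACATTGTCGGTCCA", "TACTTTGTTGGTCCA"):
    print(mutant, "->", classify_point_mutation(wild_type, mutant))
```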
A Deletion is the removal of a chunk of DNA from a gene. Deletions often destroy
the gene's ability to code for a protein.
An Insertion is the addition of extra DNA into a gene, which can severely
disrupt the gene.
Very small insertions or deletions can cause
frameshift mutations
READING FRAME - Translation of an mRNA must start at the appropriate AUG of
the mRNA and work along the mRNA in a 5' ----> 3' direction, reading the
mRNA in units of 3. It must read in the exactly correct sequence of 3-base
codons.
(Slide 29 in Ella Resources)
Demonstration:
THE FAT CAT ATE THE BIG RAT
A deletion results in the removal of the "F". The consequence of this is that the sentence no longer makes any sense "downstream of the deletion". In molecular terms, the word "downstream" means toward the stop codon.
THE ATC ATA TET HEB IGR AT
Frameshift mutations change the reading frame of the messenger RNA.
(Slide 30 in Ella Resources)
For example: Template strand of DNA
3'-TACAGAGTGCTGCAT-5'
5'-AUGUCUCACGACGUA-3' mRNA
AUG UCU CAC GAC GUA codons
fMet Ser His Asp Val
Through a deletion, the DNA strand loses the fifth base in the sequence above:
3'-TACAAGTGCTGCAT-5'
5'-AUGUUCACGACGUA-3' mRNA
AUG UUC ACG ACG UA
fMet Phe Thr Thr
Frameshift mutations can be caused by deletions or insertions of a number
of base-pairs that is not 3 or a multiple of 3 in the Open Reading Frame (ORF)
region of a gene.
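The sentence demonstration above is easy to reproduce in Python, which also shows why a deletion of 3 letters does not shift the frame. This is only an analogy using letters rather than nucleotides, and the function names are made up for this example.

```python
def read_in_triplets(letters):
    """Group letters into 3-letter 'codons' from the left; a short final group is kept."""
    return [letters[i:i + 3] for i in range(0, len(letters), 3)]


def delete(letters, index, count=1):
    """Remove `count` letters starting at position `index` (0-based)."""
    return letters[:index] + letters[index + count:]


sentence = "THEFATCATATETHEBIGRAT"
print(" ".join(read_in_triplets(sentence)))                # original reading frame
print(" ".join(read_in_triplets(delete(sentence, 3))))     # lose "F": frameshift
print(" ".join(read_in_triplets(delete(sentence, 3, 3))))  # lose "FAT": frame kept
```

Deleting one letter garbles every word downstream, while deleting a whole triplet removes one word and leaves the rest readable.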
Another type of mutation:
Translocation - movement of a piece of chromosome (other than a transposable element) from one location to another within a genome.
Example: The Philadelphia chromosome (or Philadelphia translocation) is a specific chromosomal abnormality that is associated with chronic myelogenous leukemia (CML). It is the result of a reciprocal translocation between chromosome 9 and 22, and is specifically designated t(9;22)(q34;q11). The presence of this translocation is a highly sensitive test for CML, since 95% of people with CML have this abnormality.
SOMATIC VS. GERM-LINE MUTATIONS
Only mutations affecting the GERM-LINE CELLS (gametes [sperm and eggs] or cells
that are precursors of gametes) will be transmitted to the progeny.
Mutations are occurring all the time in cells throughout the human body. These mutations result from mistakes in replication, or mutagens such as UV light.
An elaborate system of cellular DNA Repair processes corrects the vast majority of mutations.
But, germ-line mutations that do not get repaired can be passed on to progeny, and
somatic mutations that are not repaired can lead to problems like cancer.
STOP HERE FOR QUIZ #2!!!!
MOLECULAR BASIS OF DOMINANCE VS. RECESSIVENESS
allele - one of two or more alternative forms (sequences) of a gene.
wild-type allele - the allele found most commonly in nature, and that codes for a product with the proper function
What about mutant alleles?
(SHOW SLIDE - CFTR PROTEIN)(Slide 32 in Ella Resources)
Ex) Cystic Fibrosis
cftr gene - wild-type allele codes for a transmembrane Chloride ion channel.
Gene is on human chromosome 7, an autosome. So most of us have 2 copies of
the wild-type allele.
There are mutant alleles of the gene in our population. Most common one = deletion
of 3 base-pairs in gene, results in deletion of 3 bases in mRNA, results
in missing phenylalanine in the protein.
The most common mutation is referred to as the DF508
deletion. Half of white Americans with Cystic Fibrosis have two copies of the DF508
deletion.
This protein is defective. Does not function. This is called a Loss-of-Function mutant
allele.
If there is no functional Cl- channel, the extracellular environment in the
respiratory tract is too dry. This leads to
thick mucus that cannot be cleared, respiratory infection and death.
This mutation is recessive.
That is, a person having one of these DF508 mutant
alleles, and one wild-type allele (heterozygote) will not suffer from CF.
The one WT allele codes for enough functional Chloride ion channel
protein to do the job.
In this case the wild-type allele is Dominant, and the mutant allele is recessive,
since one copy of the wild-type allele is enough to assure normal cellular
function and thus to yield a wild-type phenotype. This is called "Haplo-Sufficiency"
(PLEASE NOTE: There was a typo here, but I have fixed it!)
C = the CFTR gene wild-type allele
c = the CFTR gene mutant allele (lower-case indicates recessiveness)
Genotype-- Name of Genotype----------- Phenotype
CC---------- Homozygous dominant----- Normal (Healthy)
Cc ----------Heterozygous----------------- Normal (Healthy)
cc -----------Homozygous recessive------ Defective (Suffers from CF)
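The table above also explains how two healthy parents can have a child with CF: both must be heterozygous carriers. The Python sketch below works out the expected offspring genotypes for a cross, assuming that each parent passes on either of its two alleles with equal probability. The function names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return {genotype: probability} for the offspring of two parents at one gene."""
    # Each offspring gets one allele from each parent; sort so "cC" is written "Cc".
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def cf_phenotype(genotype):
    """One wild-type allele (C) is enough for a normal phenotype."""
    return "Normal (Healthy)" if "C" in genotype else "Defective (Suffers from CF)"


for genotype, probability in cross("Cc", "Cc").items():
    print(genotype, probability, cf_phenotype(genotype))
```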
Think about the gene we are studying in lab (PTC). What is the molecular nature
of the dominance relationships between the taster and nontaster alleles?
Note: In the case of mammals, in which males have only one copy of the X chromosome (and are thus called "hemizygous" for all X-linked genes), if a male inherits a loss-of-function mutant allele of a gene (from his mother), he will not be able to make the protein encoded by that gene, since there is no other X chromosome to carry a wild-type allele of the gene. Examples:
The blood clotting factor proteins, coagulation factor VIII, and coagulation factor IX, are encoded by the X-linked genes, F8 and F9, respectively.
If a boy inherits a loss-of-function mutant allele of F8, he has Hemophilia A, which causes failure of blood to clot.
If a boy inherits a loss-of-function mutant allele of F9, he has Hemophilia B, which causes failure of blood to clot.
Go to this web site for a list of X-linked genes involved in human diseases
Dominant Mutations
B = Mutant allele of the B gene (upper-case indicates dominance)
b= Wild-type allele of the B gene
Genotype--------------------- Name of Genotype--------------------- Phenotype
bb----------------------------- Homozygous recessive----------------- Normal
Bb----------------------------- Heterozygous---------------------------- Defective
BB----------------------------- Homozygous dominant ----------------Defective
In some cases involving a loss-of-function mutant allele, one copy of the wild-type allele is not enough to do the job (cannot code for enough gene product to yield normal cellular function), and the heterozygous genotype leads to a Defective phenotype. In this case, the loss-of-function mutant allele is dominant, and the situation is described as
"HAPLO-INSUFFICIENCY"
Human Example:
The Scar/WAVE family of scaffolding proteins organize molecular networks that relay signals from the GTPase Rac to the actin cytoskeleton. The WAVE-1 isoform is a brain-specific protein expressed in variety of areas including the regions of the hippocampus and the Purkinje cells of the cerebellum. Targeted disruption of the WAVE-1 gene generated mice with reduced anxiety, sensorimotor retardation, and deficits in hippocampal-dependent learning and memory. These sensorimotor and cognitive deficits are analogous to the symptoms of patients with 3p-syndrome mental retardation who are haploinsufficient for WRP/MEGAP, a component of the WAVE-1 signaling network. Thus WAVE-1 is required for normal neural functioning.
In the cases of both haplo-sufficiency and haplo-insufficiency, the mutant allele is called a "loss-of-function" allele, and codes for no functional protein.
Another Way a Mutation can be Dominant involves a mutant allele that codes for a protein with a new, toxic function.
Another way that a mutant allele of a gene can be dominant to the wild-type allele is if it functions when it should not, or if it codes for the production of too much protein.
This is Called a DOMINANT GAIN-OF-FUNCTION allele.
Ex) Huntington's disease
Inherited as an autosomal dominant.
People who inherit the dominant mutant allele of the HD gene suffer neural
degeneration, convulsions and premature death.
Wild-type allele codes for a protein called Huntingtin necessary for
normal nervous system function.
Dominant mutant allele codes for a protein with excessive numbers of repeated
Glutamine amino acids. Seems to have an abnormal, disruptive function. Forms
aggregates that are toxic to the cell.
Definition:
The dominant allele is the allele whose associated phenotype is seen in the heterozygote
Manipulating DNA
The ability to manipulate DNA has been made possible by the discovery and development of a number of important "tools" - some of which are discussed below.
DNA will be in form of long molecules - impossible to work with.
In order to work with, and analyze, need to cut it up into small pieces.
Cut with restriction endonucleases (also called restriction enzymes).
Explain restriction enzymes and their recognition sites.
Example = EcoR I recognizes and cuts at
(cuts between G and A on each strand)
5'------GAATTC--------3'
3'------CTTAAG--------5'
to yield:
5'------G-3'
3'------CTTAA-5'
AND
5'-AATTC--------3'
3'-----G--------5'
(Normal role of restriction enzymes in nature - made by bacterial cells to protect against foreign DNA by cutting it up into pieces. Bacterial cells modify their own DNA so that the restriction enzymes will not cut their own bacterial DNA.)
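The EcoR I cut shown above can be sketched in Python. This is a simplified sketch that looks only at the top strand (written 5' to 3') of a linear piece of DNA and cuts between the G and the A of every GAATTC site. The function name and the test sequence are made up for this example.

```python
def ecori_digest(top_strand):
    """Cut a 5'->3' strand after the G of every GAATTC site; return the fragments."""
    site = "GAATTC"
    top_strand = top_strand.upper()
    fragments = []
    start = 0
    position = top_strand.find(site)
    while position != -1:
        cut = position + 1  # EcoR I cuts between G and A
        fragments.append(top_strand[start:cut])
        start = cut
        position = top_strand.find(site, position + 1)
    fragments.append(top_strand[start:])  # whatever is left after the last cut
    return fragments


print(ecori_digest("CCGGAATTCTT"))  # one site, so two fragments
print(ecori_digest("CCGGTTAACTT"))  # no site, so the DNA stays in one piece
```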
In lab, you are using the restriction enzyme, Fnu4H I, which recognizes and cuts at (where "N" = A, C, G, or T):
5'------GCNGC--------3'
3'------CGNCG--------5'
In the PTC gene, which we are analyzing in the lab, there is an Fnu4H I site at position 785 in the Taster (T) allele of the gene. This is within the PCR product we are amplifying. The non-taster (t) allele has a slightly different sequence at position 785, and Fnu4H I will not recognize and cut the non-taster allele at that position.
At Position 785:
Taster allele:
5'------GCTGC--------3'
3'------CGACG--------5'
non-taster allele:
5'------GTTGC--------3'
3'------CAACG--------5'
In lab, you will be digesting your 303 bp PCR product with Fnu4H I. This will cut the Taster allele product into 2 fragments of
238 bp, and 65 bp.
It will NOT cut the non-taster allele, which will remain a 303 bp fragment.
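The logic of this digest can be sketched in Python. The first function checks whether a short sequence contains an Fnu4H I recognition site (GCNGC); the second reads a genotype from the set of band sizes expected on a gel. This is a sketch under the assumption that band sizes are known exactly; on a real gel, sizes are estimated by eye and small bands can be faint. The function names are made up for this example.

```python
import re

# Fnu4H I recognition site, where N = A, C, G, or T.
FNU4HI_SITE = re.compile("GC[ACGT]GC")


def has_fnu4hi_site(sequence):
    """Return True if the sequence contains at least one GCNGC site."""
    return FNU4HI_SITE.search(sequence.upper()) is not None


def genotype_from_bands(band_sizes_bp):
    """Infer the PTC genotype from the fragment sizes (bp) seen after digestion."""
    bands = set(band_sizes_bp)
    uncut_present = 303 in bands        # non-taster (t) allele product
    cut_present = {238, 65} <= bands    # taster (T) allele product, cut in two
    if cut_present and uncut_present:
        return "Tt"
    if cut_present:
        return "TT"
    if uncut_present:
        return "tt"
    return "undetermined"


print("Taster site GCTGC cut?    ", has_fnu4hi_site("GCTGC"))
print("Non-taster site GTTGC cut?", has_fnu4hi_site("GTTGC"))
print(genotype_from_bands([303, 238, 65]))
```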
We can identify the different-sized fragments of DNA resulting from restriction
enzyme digestion
reactions by "electrophoresing" the product(s) of that reaction on
an agarose gel.
Gel electrophoresis
(Slide 34 in Ella Resources - Gel Electrophoresis)
ALSO: (Slide 36 in Ella Resources - Restriction Digest + Gel Electrophoresis)
We can identify the different-sized fragments of DNA resulting from PCR reactions by electrophoresing the product(s) of that reaction on an agarose gel.
After a PCR reaction, run the resulting product on an electrophoretic gel.
Stain for DNA - See a band - That is the product, in many copies!
Can cut out of the gel and analyze.
See this web site for a good Gel Electrophoresis tutorial! (Choose from the menu)
Briefly, the DNA (consisting of a mixture of many copies of each of these three fragments) is loaded into one of the sample wells at one end of an agarose electrophoretic gel. A voltage is set up across the gel such that the sample wells are closest to the negative electrode, and the far end of the gel is closest to the positive electrode. DNA has a net negative charge, and thus migrates (moves) through the gel toward the positive electrode (this process is called electrophoresis). The gel is composed of a porous matrix that acts like a molecular sieve. Smaller DNA fragments move through the gel faster than do larger DNA fragments. After the electrophoresis of the digested DNA has been carried out for a certain period of time, it is stopped, and the DNA in the gel is stained to make it visible. The DNA is often visible as discrete bands. Each band represents a collection of many DNA fragments that are all of the same size and have thus migrated the same distance in the gel.
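The size-sorting idea can be sketched in Python. The ordering function simply restates "smaller fragments run farther". The distance function is a toy model only: it assumes migration falls off linearly with the log of fragment size, which is a common rule of thumb over a limited size range, and the 50 bp and 1000 bp limits are arbitrary assumptions.

```python
import math


def migration_order(fragment_sizes_bp):
    """Return fragment sizes ordered from farthest-migrated to nearest the well."""
    return sorted(fragment_sizes_bp)  # smallest fragments travel farthest


def relative_distance(size_bp, min_bp=50, max_bp=1000):
    """Toy model (an assumption, not a calibration): 1.0 at min_bp, 0.0 at max_bp,
    falling linearly with log10(size) in between."""
    span = math.log10(max_bp) - math.log10(min_bp)
    return (math.log10(max_bp) - math.log10(size_bp)) / span


for size in migration_order([303, 65, 238]):
    print(size, "bp ->", round(relative_distance(size), 2))
```

On a real gel, you would estimate sizes by comparing your bands to a ladder of known fragment sizes run in a neighboring lane.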

See this web site for molecular biology tutorials! (Choose Gel Electrophoresis from the menu at left)
This, combined with PCR, makes it possible to do a type of analysis called Restriction Fragment Length Polymorphism (RFLP) Analysis.
This is what you are doing in the lab. You are determining your genotype for the PTC gene using RFLP Analysis.
RFLP Analysis can also be used in genetic testing to detect alleles of genes that can lead to genetic diseases.
An RFLP is defined as:
a gene (or other chromosomal site, called a "locus") for which at least 2 different alleles exist. For a given restriction enzyme, some alleles of the gene are cut by that enzyme at a specific position, and other alleles are not cut by that enzyme at that position. Therefore, different alleles of the gene can be detected by amplifying the locus by PCR, digesting the PCR product with the restriction enzyme in question, and running the digested DNA on an electrophoretic gel. The banding pattern on the gel will reveal the specific allele.
(SHOW SLIDE - Human ßGlobin Gene)(Slide 90 in Ella Resources)
(SHOW SLIDE - Human ßGlobin Gene Gel with RFLP)(Slide 91 in Ella Resources)
Sickle Cell Anemia is a genetic disease that is caused by a well-characterized
gene alteration.
This disease affects 0.25% of African Americans, and results in the inability
of the red blood cells to carry oxygen properly.
Sickle Cell Anemia results from a point mutation GAG ---> GTG, resulting
in a valine residue substituting for the normal glutamic acid residue at position
6 in the ß globin chain of hemoglobin.
This is a missense point mutation and is also called an SNP (Single Nucleotide Polymorphism).
This point mutation, in addition to affecting hemoglobin's ability to carry
oxygen, eliminates a restriction site for the restriction enzyme, Mst II.
CCTGAGG recognized and cut by Mst II
CCTGTGG not recognized by Mst II.
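Whether an allele still carries a recognition site can be checked with a short Python sketch. "N" is treated as any base, as in the Fnu4H I site (GCNGC) given earlier. Writing the Mst II site as CCTNAGG is an assumption taken from the commonly published recognition sequence; the notes here only show CCTGAGG. The helper names are hypothetical.

```python
import re


def site_to_regex(site: str) -> str:
    """Turn a recognition site into a regular expression; N matches any base."""
    return "".join("[ACGT]" if base == "N" else base for base in site)


def has_site(sequence: str, site: str) -> bool:
    """True if the recognition site occurs anywhere in the sequence."""
    return re.search(site_to_regex(site), sequence.upper()) is not None


MST_II = "CCTNAGG"   # assumed general form of the Mst II site
FNU4H_I = "GCNGC"    # from the PTC lab section

print(has_site("CCTGAGG", MST_II))   # normal beta-globin sequence
print(has_site("CCTGTGG", MST_II))   # sickle-cell (S) sequence
print(has_site("GCTGC", FNU4H_I))    # PTC taster allele
print(has_site("GTTGC", FNU4H_I))    # PTC non-taster allele
```

In both examples a single base change destroys the site, which is exactly what makes the allele detectable as an RFLP.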
To detect the mutant (S) allele:
1) Isolate genomic DNA from the person.
2) Amplify beta-Globin alleles by PCR
3) Digest PCR products with Mst II
4) Run digested PCR products on an electrophoretic gel
The result is an altered pattern of bands upon electrophoresis of Mst II -
digested beta-Globin alleles from sickle-cell anemia patients
This is called a Restriction Fragment Length Polymorphism (RFLP).
This is exactly like what you are doing in lab to detect different alleles of the PTC taste receptor gene!!!!
How do you determine the base-pair sequence of a certain DNA fragment?
**** Explain DNA sequencing:
Most commonly used method of DNA sequencing is called Sanger Dideoxy sequencing,
developed by Fred Sanger. Uses single-stranded DNA template, and DNA synthesis
in the presence of dideoxyribonucleotides, which lack a 3'-OH group, so cannot
form a phosphodiester bond with the next nucleotide.
Dideoxyribonucleotides can be incorporated into a growing chain, but they
terminate synthesis.
Explain Automated Sequencing
Now, almost all sequencing is done in an automated way.
Label the 4 different dideoxyribonucleotides, each with a different colored fluorescent tag.
A reaction tube:
with:
single-stranded DNA template for the sequence of interest
DNA polymerase
primer complementary to known sequence adjacent to DNA to be sequenced
a small amount of each of the 4 different dideoxy nucleotide triphosphates (ddATP, ddCTP, ddGTP, and ddTTP), together with the four normal dNTPs
Each ddNTP is labeled with its own unique fluorescent dye.
Ex)
ddATP - Green
ddCTP - Blue
ddGTP - Purple
ddTTP - Red
The DNA Polymerase enzyme starts at the primer, and synthesizes a new DNA strand that is complementary to the template strand inserted in the vector.
Various chain lengths will be produced, each corresponding to the point at which the respective ddNTP was incorporated and terminated chain growth. Each newly-synthesized DNA strand will thus be labeled with a color depending on the ddNTP that is at its 3' end.
A specific set of different-length DNA strands
will be newly synthesized. The length of each strand will depend on where the respective ddNTP
was added to the newly-synthesized strand and terminated synthesis.
Run products of the reactions in a lane of a special
gel called a polyacrylamide gel.
Will separate DNA molecules that differ in size by one base.
A scanner reads the gel, detecting the color of each band, and thus determining the ddNTP at the end of the fragment.
DRAW THIS!
Count up from the bottom, reading colors to read the sequence of the newly synthesized strand 5' to 3' as you go from bottom to top.
Of course, a laser scanner does this, and sends the data to a computer. You get an image called a "trace" that looks like:
(SLIDE 35 in Ella Resources)
Each peak represents the fluorescent signal for the nucleotide (A, C, G, or T) associated with that color at that position; the taller and cleaner the peak, the more confident the base call.
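The logic of reading a dideoxy sequencing gel can be sketched in Python. This is a simplified model, not how sequencing software actually works: it assumes exactly one terminated product of every possible length, ignores the primer length, and uses the example dye colors listed above. The template and function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
DYE = {"A": "Green", "C": "Blue", "G": "Purple", "T": "Red"}  # example colors
BASE_FOR_DYE = {color: base for base, color in DYE.items()}


def terminated_fragments(template_3_to_5: str):
    """Return (length, dye color) for each chain-terminated product.

    The template is written 3'->5', so the new strand reads 5'->3' left to right.
    The color is that of the ddNTP at the 3' end of each product.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    return [(i + 1, DYE[base]) for i, base in enumerate(new_strand)]


def read_gel(fragments):
    """Read colors from shortest to longest fragment (bottom to top of the gel)."""
    return "".join(BASE_FOR_DYE[color] for _, color in sorted(fragments))


products = terminated_fragments("TACGGT")  # hypothetical template
print(products)
print(read_gel(products))
```

Sorting the products by length plays the role of the gel, and reading the colors in that order gives the new strand 5' to 3'.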
Then, can analyze the data using bioinformatic techniques.
BLAST Searches and other Bioinformatic Analyses
Stated simply, “Bioinformatics” is the use of computers in analyzing
biological data. The data analyzed using a bioinformatic approach are often
sequence data (either nucleic acid or amino acid sequences).
For example, once the nucleotide base-pair sequence of a particular DNA molecule has been determined, you can analyze it using a bioinformatic approach.
To analyze the sequence, you can do a BLAST search. BLAST stands for Basic Local Alignment Search Tool. In doing a BLAST search, you submit a sequence of interest (the “query” sequence, which is in this case a nucleotide sequence) to a special World Wide Web site developed and maintained by the National Center for Biotechnology Information (NCBI).
When you do a BLAST search, a computer algorithm is used to compare your query sequence to sequences in the computer databases (subjects).
These may include all known nucleotide sequences, or a subset that you restrict your
search to. According to the NCBI web site,
“BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
BLAST can be used to infer functional and evolutionary
relationships between sequences as well as help identify members of gene families.”
In the PTC experiment, when the BLAST search is completed, you will receive a list of all sequences that share regions of similarity with your query sequence.
The degree of similarity that each sequence shares with the query will be rated based on several different scores and values (see below), and the sequences will be ranked in descending order of similarity to the query sequence. A BLAST search can tell you what other sequences are related (through evolution) to yours.
There are numerous other ways in which you can analyze sequences using
a bioinformatic approach, and we will be discussing these in class and in lab.
E-value and score for a BLAST search
According to Campbell and Heyer (2007) a BLAST search “returns hits,
sequences that produce significant alignments to the query search.
The significance of the hit is measured by its E-value, or expect value. The E-value of a given hit is the number of alignments you expect to find by chance (i.e., with no evolutionary relationship) with bit scores, measures of similarity between the hit and the query, at least as large as the bit score of this hit.
Biologically significant hits tend to have E-values much less than 1.0. The larger the E-value the greater the chance that the similarity between the hit and the query is mere coincidence.
E-values are calculated based on the following three factors:
1. The bit score (S). Because a larger bit score is less likely to be obtained
by chance than is a smaller bit score, larger bit scores correspond to smaller
E-values.
The bit score accounts for the type of scoring system used by the database,
and is therefore more informative than the raw score, which measures the similarity
between the query and the hit, but does not account for the bit score, the
query length or the database size. The bit score is calculated from the raw score
by normalizing with the statistical variables that define a given scoring system.
Therefore, bit scores from different alignments, even those employing different
scoring matrices can be compared (NCBI site).
2. Length of the query. Because a particular bit score is more easily obtained
by chance with a longer query than a shorter one, longer queries correspond to
larger E-values. (Keep in mind that the query and the hit do not have to have
exact nucleotide matches over the entire length of the sequences.)
3. Size of the database. Because a larger database makes a particular bit score more easily obtained by chance, a larger database results in larger E-values.”
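The three factors above can be tied together with the relationship usually given in BLAST statistics write-ups, E = m x n x 2^(-S'), where S' is the bit score, m is the query length, and n is the total length of the database. The Python sketch below is a simplification under that assumption: the real program uses "effective" lengths, and the numbers here are invented purely to show the trends.

```python
def expect_value(bit_score: float, query_length: int, database_length: int) -> float:
    """Simplified E-value: E = m * n * 2**(-S'), using raw (not effective) lengths."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical numbers, chosen only to show the three trends.
print(expect_value(bit_score=40, query_length=300, database_length=10**9))
print(expect_value(bit_score=60, query_length=300, database_length=10**9))   # higher score, smaller E
print(expect_value(bit_score=40, query_length=600, database_length=10**9))   # longer query, larger E
print(expect_value(bit_score=40, query_length=300, database_length=10**10))  # bigger database, larger E
```

Each extra bit of score halves the E-value, while doubling either the query length or the database size doubles it.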
Show example of a BLAST Search
BLAST this sequence:
Do a nucleotide-nucleotide BLAST (blastn)
AATTGGAAGCAAATGACATCACAGCAGGTCAGAGAAAAAGGGTTGAGCGGCAGGCACCCAGAGTAGTAGG
TCTTTGGCATTAGGAGCTTGAGCCCAGACGGCCCTAGCAGGGACCCCAGCGCCCGAGAGACCATGCAGAG
GTCGCCTCTGGAAAAGGCCAGCGTTGTCTCCAAACTTTTTTTCAGGTGAGAAGGTGGCCAACCGAGCTTC
GGAAAGACACGTGCCCACGAAAGAGGAGGGCGTGTGTATGGGTTGGGTTTGGGGTAAAGGAATAAGCAGT
TTTTAAAAAGATGCGCTATCATTCATTGTTTTGAAAGAAAATGTGGGTATTGTAGAATAAAACAGAAAGC
ATTAAGAAGAGATGGAAGAATGAACTGAAGCTGATTGAATAGAGAGCCACATCTACTTGCAACTGAAAAG
TTAGAATCTCAAGACTCAAGTACGCTACTATGCACTTGTTTTATTTCATTTTTCTAAGAAACTAAAAATA
CTTGTTAATAAGTACCTAAGTATGGTTTATTGGTTTTCCCCCTTCATGCCTTGGACACTTGATTGTCTTC
TTGGCACATACAGGTGCCATGCCTGCATATAGTAAGTGCTCAGAAAACATTTCTTGACTGAATTCAGCCA
ACAAAAATTTTGGGGTAGGTAGAAAATATATGCTTAAAGTATTTATTGTTATGAGACTGGATATATCTAG
TATTTGTCACAGGTAAATGATTCTTCAAAAATTGAAAGCAAATTTGTTGAAATATTTATTTTGAAAAAAG
TTACTTCACAAGCTATAAATTTTAAAAGCCATAGGAATAGATACCGAAGTTATATCCAACTGACATTTAA
TAAATTGTATTCATAGCCTAATGTGATGAGCCACAGAAGCTTGCAAACTTTAATGAGATTTTTTAAAATA
GCATCTAAGTTCGGAATCTTAGGCAAAGTGTTGTTAGATGTAGCACTTCATATTTGAAGTGTTCTTTGGA
TATTGCATCTACTTTGTTCCTGTTATTATACTGGTGTGAATGAATGAATAGGTACTGCTCTCTCTTGGGA
CATTACTTGACACATAATTACCCAATGAATAAGCATACTGAGGTATCAAAAAAGTCAAATATGTTATAAA
TAGCTCATATATGTGTGTAGGGGGGAAGGAATTTAGCTTTCACATCTCTCTTATGTTTAGTTCTCTGCAT
GTGCAGTTAATCCTGGAACTCCGGTGCTAAGGAGAGACTGTTGGCCCTTGAAGGAGAGCTCCTCCCTGTG
GATGAGAGAGAAGGACTTTACTCTTTGGAATTATCTTTTTGTGTTGATGTTATCCACCTTTTGTTACTCC
ACCTATAAAATCGGCTTATCTATTGATCTGTTTTCCTAGTCCTTATAAAGTCAAAATGTTAATTGGCATA
AATTATAGACTTTTTTTAGCAGAGAACTTTGAGGAACCTAAATGCCAACCAGTCTAAAAATGCAGTTTTC
AGAAGAATGAATATTTCATGGATAGTTCTAAATACTAATGAACTTTAAAATAGCTTACTATTGATCTGTC
AAAGTGGGTTTTTATATAATTTTCTTTTTACAAATCACCTGACACATTTAATATAGGTTAAAAAATGCTA
TCAGGCTGGTTTGCAAAGAAAATGTATTACAAAGGCTGCTAAGTGTGTTAAGAGCATACTCATTTCTGTT
CTCCAAAATATTTCATAAGGTGCTTTAAGAATAGGTATGTTTTTAAAAGTTAAGTTCCTACTATTTATAG
GAACTGACAATCACCTAAAATACCAATGATTACAAACTTCCTTCTGGCCTTCTGGACTGCAATTCTAAAA
GTGTAAAAAACATATTTTCTGCATTAAGTTAGGCAGTATTGCTTAGTTTTCAAAGTGGTAGGCTTTGGAG
TCAGATTATTTTGATTCAGATCCTACATCTACTGTTTAGTAGCTCTGTTGCCTGAGGCAGGTCCCTTAAC
ATCTCTGTGTGTGACTTGACCTTTAAAATTTGGAGACTGTCATAGGGGTTAATCCCTTGAGAAAATGAAT
GTGAAAAGTTAGCCTAATGTTAACTGCTATTATTATGGATTACCATATTTTCACATTCATCACAGTACAT
GCACCTTGTTAATATAAGATGCTCAATTCATCTTTGAGTATAATTTTGTGACTCTCAATCTGGATATGCA
ATGAGTGGGCCTGTATGAGAATTTAATTTATGAAAAATTGTGTTTCACATGGCCTTACCAGATATACAGG
AAACACGTCACATGTTTCTATTGTATGTTGTTAAATGCCTTAGAATTTAACTTTCTGAATAGGATCCCTT
CAGTTTGAGAGTCATAAAAGAGTAAAATTATTATGGTATGAGTTATAGATTGTATTGAATATCTCTTTAT
ATGTCTAGGTTTTGTCATTGGAAAACCAAAAAGTTTGGAAAAAAAATCTAAGTTATTTCTTACTTTCTTA
Recombinant DNA Technology
The relatively recent development of recombinant DNA technology has enabled
biological researchers to make great strides in our understanding of the structures
and functions of genes.
Before the development of recombinant DNA technology, the complex genomes
of eukaryotes were extremely difficult to analyze.
Recombinant DNA technology enables researchers to break large genomes into
specific fragments, which can then be inserted into the small genome or into
a DNA molecule from a different species, such as a bacterium, and analyzed
with relative ease.
The small genome or DNA molecule into which the fragments are inserted is
called a vector, and recombinant DNA molecules can be made by inserting DNA
fragments from almost any species into a vector.
Recombinant DNA molecules are commonly introduced into bacterial "host" cells
by the process of transformation.
The vectors used to construct recombinant DNA molecules are usually capable
of replication, so once inside a bacterial cell, the recombinant DNA molecule
will be replicated, resulting in the amplification (replication of many copies)
of a specific DNA fragment.
The insertion of a fragment of DNA into a vector, and the subsequent replication
of the recombinant DNA molecule is often called "cloning".
The ability to produce many copies of a given DNA sequence has been extremely
helpful in the analysis of gene structure and function.
The Human Genome Project and other genome projects would be impossible without
recombinant DNA technology.
In addition, genes encoding medically- and industrially-important polypeptides
can be inserted into vectors, and maintained and amplified in host cells.
Host cells capable of synthesizing the polypeptide products of these recombinant
genes provide a means of producing large quantities of important molecules.
For example, insulin, which is necessary for the treatment of some types of
diabetes, is produced inexpensively and in large quantities by bacterial cells
that express the human insulin gene from a recombinant vector.
Recombinant DNA technology has been made possible by the discovery and development of a number of important "tools" - some of which are discussed below.
Cut DNA with restriction endonucleases (also called restriction enzymes).
Explain restriction enzymes and their recognition sites.
Example = EcoR I recognizes and cuts at
(cuts between G and A on each strand)
5'------G AATTC--------3'
3'------CTTAA G--------5'
to yield:
5'------G-3'
3'------CTTAA-5'
AND
5'-AATTC--------3'
    3'-G--------5'
(SHOW SLIDES - SLIDES 37 and 38 in Ella Resources)
Notice that when many restriction enzymes, such as EcoR I, cut, they leave
5' single-stranded overhangs = STICKY ENDS
This enables us to make recombinant DNA molecules.
Can combine DNA of interest (Fly P-element) with vector cut with same restriction
enzyme.
Then ligate, and you have a Recombinant Plasmid.
Can also cut with two different enzymes - directional cloning!
TYPES OF VECTORS
(SHOW SLIDE 39 in Ella Resources)
Plasmids
Plasmids are small, circular DNA molecules. Plasmids can be introduced into
competent bacterial cells by transformation. Inside the bacterial cell, a
plasmid exists and replicates independently from the much larger bacterial
genome. Plasmids can be (and have been) engineered to carry genes that confer
resistance to specific antibiotics on the cells containing them. Plasmids
can also carry genes encoding certain enzymes that can be used to "mark" bacterial
cells by assaying the cells for the presence of those enzymes. Since plasmids
replicate in bacterial cells, they allow the amplification of the inserted
DNA molecule into many copies. One disadvantage of plasmid vectors is that
they cannot contain large inserts. Most plasmid vectors can hold only inserts
smaller than 10 kilobases (kb) (1 kb = 1,000 base-pairs).
STOP HERE FOR MID-TERM EXAM#1!!!!
-------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------
TYPES OF VECTORS,
Making and Using DNA Libraries
(This material will not be covered on Quiz #5.)
(SHOW SLIDE 39 in Ella Resources)
Plasmids (reminder)
Plasmids are small, circular DNA molecules. Plasmids can be introduced into
competent bacterial cells by transformation. Inside the bacterial cell, a
plasmid exists and replicates independently from the much larger bacterial
genome. Plasmids can be (and have been) engineered to carry genes that confer
resistance to specific antibiotics on the cells containing them. Plasmids
can also carry genes encoding certain enzymes that can be used to "mark" bacterial
cells by assaying the cells for the presence of those enzymes. Since plasmids
replicate in bacterial cells, they allow the amplification of the inserted
DNA molecule into many copies. One disadvantage of plasmid vectors is that
they cannot contain large inserts. Most plasmid vectors can hold only inserts
smaller than 10 kilobases (kb) (1 kb = 1,000 base-pairs).
Making a Genomic Library (in this case using a plasmid vector)
Isolate genomic DNA from organism of interest. In this case, human beings.
Cut genomic DNA with restriction enzyme or enzymes.
Cut the plasmid vector with the same restriction enzyme or enzymes.
Mix together.
Sticky ends find each other, stick together with H-bonds (base pairing)
LIGATION - Forms phosphodiester bonds.
Introduce library of recombinant molecules into bacterial cells by transformation.
Plate cells. Now you have a plated library.
You can use other types of vectors to make a genomic library.
Can also make a genomic library using a phage lambda vector.
Advantage = larger inserts!!! For even larger inserts, use a cosmid vector or a YAC vector.
Using a Genomic Library
Why and how do you use a genomic library? Let's focus, for now, on
a human genomic library.
Use for a Genomic Library
Genomics!!!!
The Human Genome Project
Goal: To determine the exact bp sequence of each of the 24 different human chromosomes (1-22, X and Y).
Use a Genomic library to try to figure out the make up of each chromosome in the genome.
J. Craig Venter, of Celera Genomics in Maryland, didn't like this slow,
step-by-step approach to analyzing the human genome. He decided to take a
whole-genome "Shot Gun" approach to determine the base
pair sequence of each human chromosome.
-Made human genomic libraries. Used several libraries with DNA digested with different restriction enzymes (as a result, each fragment shares some sequence (overlap) with at least one other fragment).
-Then, determined the base-pair sequence of each inserted fragment. Determined the base pair sequence of each DNA fragment using the automated Sanger Dideoxy DNA Sequencing Method.
By comparing each sequenced fragment, and looking for shared sequences between individual fragments (overlap), computers were able to assemble them together into the sequence of each of the 24 different human chromosomes (1-22, X and Y).
The human genome is now sequenced, so we have the base-pair sequence of every chromosome.
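The overlap-and-assemble idea can be illustrated with a small Python sketch. This is a toy greedy assembler, assuming error-free reads and unique overlaps; it is not the method actually used by any genome project, and the reads are invented.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences that share the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


reads = ["ATGGCGTAC", "GTACCTTGA", "TTGAGGCAT"]  # hypothetical overlapping reads
print(greedy_assemble(reads))
```

Real genomes are much harder to assemble because repeated sequences create overlaps between fragments that do not actually sit next to each other.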
(SHOW SLIDES - SLIDES 41/42 and 43 in Ella Resources)
Other Vectors:
Bacteriophage lambda Vectors
BACTERIOPHAGE - A virus that infects bacterial cells.
Derivatives of the bacteriophage lambda can
be used to clone larger fragments of DNA - fragments on the order of 15 to 20
kb can be inserted into a phage lambda vector.
Phage lambda has a linear double-stranded DNA genome.
Phage Genetics (SHOW SLIDE)
Virus - protein coat outside, genetic material inside (DNA or RNA).
BACTERIOPHAGE - A virus that infects bacterial cells.
In research, we often make use of a bacteriophage called lambda (λ),
which infects E. coli cells.
(Draw Phage lambda)
Phage lambda has a linear,
ds DNA molecule of about 48 kb for a genome.
Phage lambda packages its genome into phage particles, each of which consists of a protein head and a protein tail.
A phage lambda infects an E. coli cell, inserts its genome into the E. coli
cell cytoplasm, where the phage genome is replicated, and codes for the production
of new head and tail proteins. Genomes, heads, and tails are assembled into
progeny phages in the E. coli cytoplasm. When enough progeny phages are assembled,
the cell bursts, or "lyses", releasing many progeny phages, all genetically
identical to the original infecting phage. These go on to infect neighboring
E. coli cells. In a confluent "lawn" of E. coli cells grown on a plate,
this will result in a clear circle called a "plaque", where the cells
have lysed. The plaque is full of identical recombinant phage lambda.
The lambda genome
can be made into a cloning vector by removing much of its central portion,
which can then be replaced with foreign DNA fragments, resulting in recombinant
molecules. The recombinant phage are then replicated in host E. coli bacterial
cells, the cells are plated, and pure recombinant phages can be isolated from
a plaque.
Cosmid vectors are hybrids between plasmid and phage vectors. Cosmids can be used to clone insert fragments of up to 45 kb in length. Cosmids can be maintained in bacterial cells in the circular plasmid form, and they can be purified from the cells by packaging into phage particles.
Bacterial Artificial Chromosome (BAC) Vectors
Modified plasmid in E. coli. Takes inserts up to 300 kb.
Yeast Artificial Chromosome (YAC) Vectors
Modified yeast plasmid. Grows in yeast. Behaves like a small yeast chromosome
during mitosis and meiosis. Takes inserts up to 1,000 kb.
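Insert size matters because it determines how many clones a library needs. A standard estimate (often called the Clarke-Carbon formula) is N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the probability that any given sequence is in the library. The Python sketch below assumes a genome of about 3 billion base pairs and uses the upper insert sizes quoted in these notes; treat the output as rough estimates only.

```python
import math

GENOME_BP = 3_000_000_000  # approximate size of the human genome (assumption)

# Upper insert sizes quoted in the notes for each type of vector
INSERT_BP = {
    "plasmid": 10_000,
    "phage lambda": 20_000,
    "cosmid": 45_000,
    "BAC": 300_000,
    "YAC": 1_000_000,
}


def clones_needed(insert_bp: int, genome_bp: int = GENOME_BP, probability: float = 0.99) -> int:
    """Clones needed so a given sequence is in the library with the stated probability."""
    f = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return math.ceil(math.log(1 - probability) / math.log1p(-f))


for vector, size in INSERT_BP.items():
    print(f"{vector}: about {clones_needed(size):,} clones")
```

The larger the insert a vector can carry, the fewer clones you have to make, plate, and screen.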
Genomics, and the Human Genome Project
The goal of the Human Genome Project is to
be able to READ all of the genetic information a human being contains.
The HGP is a type of research called Genomics.
This is a branch of molecular biology called "GENOMICS" -
the cloning and characterization of entire genomes.
Genomics is divided into 3 basic areas:
Bioinformatics - Analyzes the information content of entire genomes. This includes:
-numbers and types of genes and gene products
-binding sites on DNA and RNA that allow functional products to be made in the right cells at the right time.
Comparative Genomics - Considers the genomes of closely and distantly
related species for evolutionary insight and enables conserved sequences to
be used as a guide to analyzing gene function.
Functional Genomics - Uses an expanding variety of methods, including reverse genetics, to understand gene function and to delineate networks of interacting genes and proteins in biological processes.
What are the goals of the HGP?
Go to: Human Genome Project Information (http://www.ornl.gov/TechResources/Human_Genome/home.html)
Completed in 2003, the Human Genome Project (HGP) was a 13-year project coordinated
by the U.S. Department of Energy and the National Institutes of Health. During
the early years of the HGP, the Wellcome Trust (U.K.) became a major partner;
additional contributions came from Japan, France, Germany, China, and others.
Project goals were to
* identify all the approximately 20,000-25,000 (protein coding) genes in human
DNA,
* determine the sequences of the 3 billion chemical base pairs that make up
human DNA,
* store this information in databases,
* improve tools for data analysis,
* transfer related technologies to the private sector, and
* address the ethical, legal, and social issues (ELSI) that may arise from
the project.
(SHOW SLIDES - SLIDES 43A and 44 in Ella Resources)
Recall that:
The Human Genome Project
A whole-genome "Shot Gun" approach was taken to determine the base pair sequence of each human
chromosome.
-Made human genomic libraries. Used several libraries with DNA digested with different restriction enzymes (as a result, each fragment shares some sequence (overlap) with at least one other fragment).
-Then, determined the base-pair sequence of each inserted fragment. Determined the base pair sequence of each DNA fragment using the automated Sanger Dideoxy DNA Sequencing Method.
By comparing each sequenced fragment, and looking for shared sequences between individual fragments (overlap), computers were able to assemble them together into the sequence of each of the 24 different human chromosomes (1-22, X and Y).
The human genome is now sequenced, so we have the base-pair sequence of every chromosome.
How do we identify the genes?
First of all, let's talk about the protein-coding genes (the set of proteins they encode is called the Proteome).
Looking for candidate genes:
Examine sequence:
1) Look for ORFs (open reading frames). Using bioinformatics, can conceptually remove introns and try to identify long sequences of codons.
(SHOW SLIDE 44A in Ella Resources)
2) look for binding sites such as promoter sequences and binding
sites for transcription factors, etc.
-includes binding sites for Spliceosomes. These binding sites are sequences
at the 5' and 3' ends of introns where a group of proteins and RNAs called
the "Spliceosome"
binds to cut introns and splice exons back together.
(SHOW SLIDE 45 in Ella Resources Folder)
3) You'd like to be able to have a collection of all the mRNA transcripts
coded for by all the human genes.
How do you do this??
You cannot clone mRNA directly - you cannot cut it with restriction enzymes, insert
it into vectors, or sequence it.
But you can make DNA copies of mRNA molecules, and restriction-digest, clone,
and sequence them.
How do you do this???
Make cDNA libraries.
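Going back to step 1 in the list above, the search for ORFs can be sketched in Python. This is a bare-bones illustration, assuming an intron-free sequence and scanning only the three forward reading frames; real gene-finding programs do far more. The example sequence and the `min_codons` cutoff are arbitrary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end, sequence) for each ATG...stop ORF in the 3 forward frames.

    Positions are 0-based; end is exclusive. min_codons counts the stop codon too.
    """
    seq = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC"))  # hypothetical sequence
```

A real ORF search would also scan the complementary strand and would require a much longer run of codons before calling something a candidate gene.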
cDNA Library
You'd like to be able to have a library of all the mRNAs in an organism
so you can study and manipulate them.
Cannot clone mRNA into a vector.
But you can make Complementary DNAs (cDNAs) and clone them into vectors.
cDNAs are DNA copies of mRNAs.
Describe construction of cDNA.
1) Isolate mRNA from the organism (Human)
2) Set up a "Reverse Transcription" reaction:
oligo-dT primer (this is for making cDNA from polyadenylated mRNA. If you want to make cDNA from all RNA, you can use Random Oligonucleotide Primers).
dATP, dCTP, dTTP, dGTP
Reverse Transcriptase
This makes a ss DNA copy of the mRNA, with a hairpin loop at the 3' end.
3) Remove RNA by NaOH or RNase treatment.
4) Add DNA Polymerase I - makes a second DNA strand, giving dsDNA.
5) Remove hairpin loop with S1 nuclease.
6) Insert into cloning vector (set up a ligation reaction).
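At the sequence level, steps 2 through 4 amount to taking a reverse complement twice. The Python sketch below shows only that base-pairing logic, using a made-up mRNA with a short poly-A tail; the hairpin, the enzymes, and the cloning steps are not modeled, and the function names are hypothetical.

```python
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna_5_to_3: str) -> str:
    """First-strand cDNA made by reverse transcriptase, written 5'->3'."""
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(mrna_5_to_3.upper()))


def second_strand_cdna(first_strand_5_to_3: str) -> str:
    """Second strand made by DNA polymerase, written 5'->3'."""
    return "".join(DNA_PAIR[base] for base in reversed(first_strand_5_to_3))


mrna = "AUGGCCUAAAAAA"  # hypothetical mRNA ending in a poly-A tail
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)   # starts with the T's laid down from the oligo-dT primer
print(second)  # same sequence as the mRNA, with T in place of U
```

The second strand carries the same information as the original mRNA, which is why a cDNA clone can stand in for the message itself.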
Isolate all possible mRNAs from human cells.
Make a ds cDNA corresponding to each one.
Insert each cDNA into a vector.
Now you have a human cDNA library.
Sequence each cDNA insert in the library by Sanger dideoxy method.
All human cDNAs also sequenced!!!!
Compare sequence of all cDNAs to sequence of genome - find all genes
and their exact chromosomal positions.
What will be different between cDNA sequence and genomic gene sequence???
Genomic gene has introns.
cDNA lacks introns!
Learn where introns and exons are.
cDNAs also tell us what the real genes are. These are the genes that get transcribed to make an RNA.
We also have large data sets of incomplete cDNA sequence reads called Expressed Sequence Tags (ESTs). An EST is sequence data from only the 5' or 3' end of a cDNA.
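Comparing a cDNA to its genomic sequence to locate an intron can be sketched in Python. This toy version assumes there is exactly one intron and that the cDNA matches the genomic segment perfectly from end to end; real comparisons use alignment programs and handle many introns. The sequences are invented.

```python
def find_single_intron(genomic: str, cdna: str):
    """Return (intron_start, intron_end) in genomic coordinates, or None.

    Coordinates are 0-based and intron_end is exclusive.
    """
    intron_len = len(genomic) - len(cdna)
    if intron_len <= 0:
        return None  # genomic copy is not longer, so no intron to find
    for split in range(1, len(cdna)):
        exon1_matches = genomic[:split] == cdna[:split]
        exon2_matches = genomic[split + intron_len:] == cdna[split:]
        if exon1_matches and exon2_matches:
            return split, split + intron_len
    return None


genomic = "ATGGCC" + "GTAAGTTTTCAG" + "TTTGACTAA"  # exon 1 + intron + exon 2 (hypothetical)
cdna = "ATGGCC" + "TTTGACTAA"                      # exons only
start, end = find_single_intron(genomic, cdna)
print(start, end, genomic[start:end])
```

The stretch of genomic DNA that has no counterpart in the cDNA is the intron.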
The Structure of the Human Genome
About 45% of the human genome is repetitive and composed of known transposable
elements or sequences descended from transposable
elements!
And a good piece of the rest is composed of sequences that may be descended
from transposable elements, but have accumulated mutations.
So, the majority of our genome is made up of genetic hitchhikers, which are foreign DNA sequences that have been introduced into human chromosomes!
Less than 3% of the genome codes for exons of mRNAs.
Exons are typically small (about 150 bases), while introns are typically large
(1,000 - 100,000 bases!)
The average transcript is composed of 10 exons. Many have a lot more!
Humans have around 20,000 different protein-coding genes, but the exact number
is not known.
Early in the project, it was thought that there were many more, but we have found more than 19,000 Pseudogenes, which look like ORFs, but are never expressed (due to mutation or other reasons).
About 60% of these genes are alternatively spliced, so each of these genes
codes for multiple proteins.
On average, there are 3 splice variants per protein-coding gene.
We have also come to realize that there are quite a number of genes that
do not have protein as a final product.
These include, of course, the genes that code for:
Ribosomal RNAs (rRNAs) - Structural and catalytic components of ribosomes.
Transfer RNAs (tRNAs) - Carry amino acids to the ribosomes during translation
Small Nuclear RNAs (snRNAs) - Play roles in exon splicing
In addition, we are learning that the genome also codes for RNAs that play
roles in regulating translation of mRNAs.
These are called microRNAs (miRNAs)
These microRNAs are encoded by many regions of the genome that were formerly thought to not be transcribed.
microRNAs function to regulate the translation of messenger RNAs. They often prevent a specific mRNA from being translated. Mechanism not well understood.
Comparative Genomics
The proteins encoded by the Human genome can be grouped into families based
on sequence and function.
The families extend across species.
Because natural selection generally rejects mutations that decrease fitness, genes or other functional DNA sequences are conserved over long periods of evolution.
A DNA sequence that is common to divergent species is likely to perform a necessary function. This can help us determine the function of the sequence. For example, transcription factors are proteins (which are, of course, encoded by genes) that bind to regulatory sequences of target genes and regulate their transcription. There are families of transcription factors that share common amino acid sequences in the part of the protein that binds to DNA.
Genes identified in one organism are likely to be identifiable, on the basis of sequence and genome location, in related species.
In the science of comparative genomics, we try to identify conserved DNA regions,
and we also try to reveal how species diverge. Species evolve and traits
change through changes in DNA sequence.
The first step in comparing the genomes of 2 different species (for example, mouse and human) is the identification of the most closely related genes, called homologs.
Some homologs are the same genetic locus inherited from a common ancestor. These are called orthologs.
Other homologs are genes that are related by gene duplication events within a genome. These are called paralogs.
STOP HERE FOR MID-TERM EXAM#3!!!!
Mouse vs. Human Genome (Lineages diverged about 75 million years ago):
The mouse has been a very useful model organism for doing genetic research, and its genome has been studied extensively.
Mouse and human have about the same number of protein coding genes.
99% of mouse protein coding genes have a homolog in human.
99% of human protein coding genes have a homolog in mouse.
By studying these genes in mouse, we can learn about their functions in humans!
We have also found similarities in the structure of the mouse and human genomes.
Within chromosomes, there are regions of synteny between the 2 species, where the order, on the chromosome, of homologous genes is the same.
There are some differences. Mice have more copies of genes that function in immunity, olfaction, and reproduction. These physiological systems may be evolving in the rodent lineage.
Humans vs. Chimpanzee Genome (Diverged from a common ancestor about 6 million years ago):
There are about 35 million single-nucleotide differences between the human and chimp genomes.
There are also insertions and deletions (about 5 million of them) that cause differences in the genome sequences. Most are NOT in coding regions.
So, the 2 genome sequences differ by about 3 percent.
The proteins encoded by human and chimp are VERY similar.
29% of all orthologous proteins are identical in sequence!
Most proteins that differ do so by only about 2 amino acid replacements.
Duplications of chromosome segments have also contributed to genome divergence.
These may play some role in phenotypic differences, but we are not sure yet.
Since the proteins in humans and chimps are so similar, some hypothesize that it is differences in the regulation of the exact place and time in which each gene is expressed that lead to the phenotypic differences between human and chimp.
Conserved noncoding elements
In the human genome, only about 2% of the total genome sequence encodes protein. How do we learn about the function of the 98% that does not encode protein?
Compare genomes of different species, and look for noncoding sequences that are highly conserved. This conservation indicate