genes to proteins Flashcards
Describe the key steps in the central dogma
The central dogma states that genetic information flows in only one direction: from DNA to RNA to protein, or from RNA directly to protein.
Steps:
Transcription: One strand of the gene’s DNA is copied into RNA. In eukaryotes, the RNA transcript must undergo additional processing steps to become a mature messenger RNA (mRNA). This is the first step in the central dogma.
Translation: The nucleotide sequence of the mRNA is decoded to specify the amino acid sequence of a polypeptide. This process occurs inside a ribosome and requires adapter molecules called tRNAs.
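- Exceptions to the central dogma: tRNA and rRNA, which are transcribed from DNA but never translated into protein; the RNA itself is the functional product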
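The two steps above can be sketched in a few lines of Python. This is only an illustration, not a model of the real machinery: the template strand is a made-up sequence, the function names `transcribe` and `translate` are hypothetical, and processing of the transcript (capping, splicing, poly-A tail) is ignored.

```python
# Standard genetic code in RNA letters, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def transcribe(template_strand):
    """Build the mRNA (5'->3') from a template DNA strand written 3'->5'."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_strand)

def translate(mrna):
    """Read codons from the first AUG until a stop codon ('*' in the table)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

template = "TACAAACCGTGAATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)             # AUGUUUGGCACUUAA
print(translate(mrna))  # MFGT (one-letter amino acid codes)
```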
Describe fundamental nucleic acid structure
- Nitrogenous bases: adenine and guanine (purines); uracil, thymine and cytosine (pyrimidines)
- Nucleic acids are complex macromolecules. Two types: DNA and RNA
- In DNA: Adenine and thymine pair. Guanine and cytosine pair
- In RNA: the same, except adenine pairs with uracil instead of thymine
- A sugar molecule plus a nitrogenous base makes a nucleoside
- Nucleotides have a phosphate group on carbon 5 of the sugar. They form polymers
- Nucleosides have an OH group on carbon 5 of the sugar instead
- DNA is double stranded. RNA is single stranded. Single-stranded nucleic acids are fragile
- Two hydrogen bonds form between A and T and three between G and C. Hydrogen bonds and phosphodiester bonds must be broken to form recombinant DNA
- DNA is not a naked molecule in the cell. It is associated with histones. DNA + histones = chromatin. Histones allow it to compact and wind on itself to form a higher-order structure, the chromosome
- Humans have 23 chromosome pairs: 22 pairs of autosomes and 1 pair of sex chromosomes
- 3.2 BILLION nucleotide pairs in human genome
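The DNA pairing rules and hydrogen-bond counts listed above can be checked with a short sketch. The function names and the example strand are made up for illustration.

```python
# Watson-Crick pairing in DNA and the number of hydrogen bonds per base pair.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def reverse_complement(strand):
    """Return the complementary strand, also written 5'->3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(strand))

def hydrogen_bonds(strand):
    """Count hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand)

strand = "ATGCGC"  # hypothetical 5'->3' strand
print(reverse_complement(strand))  # GCGCAT
print(hydrogen_bonds(strand))      # 16 (2 + 2 + 3 + 3 + 3 + 3)
```

GC-rich stretches carry more hydrogen bonds per base pair than AT-rich stretches of the same length.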
Explain the C and G value paradoxes and articulate how splicing contributes to both
- C value = genome size (the amount of DNA in the haploid genome)
- G value = number of protein-coding genes
- Paradox: the apparent disconnect between the complexity of an organism and both its genome size (C value) and its number of protein-coding genes (G value)
- Ex. Humans have 3.2 billion bp and approx. 20,000 protein-coding genes. An amoeba has approx. 600 billion bp but only just over half as many protein-coding genes as we have
- There is no correlation between genome size and organism complexity (C-value paradox), nor between the number of protein-coding genes and organism complexity (G-value paradox)
- The C-value paradox is largely explained by large amounts of noncoding sequence, including introns that are spliced out of transcripts
Explain the key steps in expression of protein coding genes
- One gene can give rise to many proteins with moderately to vastly different functions
SPLICING - The biochemical process by which introns are removed from a primary transcript (for mRNA: the precursor mRNA, or pre-mRNA) to generate a mature transcript
- Mediated by spliceosomes (large ribonucleoprotein complexes, made up of proteins and non-coding RNAs)
- 5’ and 3’ splice site= at 5’ and 3’ ends of the intron
- branchpoint = 21-34 nt upstream of 3’ splice site
- Distance matters?!
Types of splicing
Constitutive splicing: all introns are spliced out and all exons are kept
Mutually exclusive exons: one exon or the other is spliced out, never both
Exon skipping: common in vertebrates and invertebrates
Alternative 3’ splice site: same 5’ splice site but an alternative 3’ splice site
Alternative 5’ splice site: same 3’ splice site but an alternative 5’ splice site
Intron retention: can be problematic. Introns are usually spliced out; if a retained intron contains a stop codon, leaving it in could have detrimental consequences
Then translation happens to make proteins
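Three of the splicing outcomes above (all introns removed, exon skipping, intron retention) can be sketched on a toy pre-mRNA. Everything here is hypothetical: the segment names, the sequences, and the `splice` helper are invented for illustration, and real splice-site recognition is not modelled.

```python
# Hypothetical pre-mRNA: alternating exons and introns (sequences are made up).
pre_mrna = [
    ("exon1", "AUGGCU"),
    ("intron1", "GUAAGUUAG"),
    ("exon2", "GAUCCA"),
    ("intron2", "GUGAGUCAG"),
    ("exon3", "UUCUAA"),
]

def splice(segments, skip=(), retain=()):
    """Join exons in order; drop exons named in `skip`, keep introns named in `retain`."""
    kept = []
    for name, seq in segments:
        if name.startswith("exon") and name not in skip:
            kept.append(seq)
        elif name.startswith("intron") and name in retain:
            kept.append(seq)
    return "".join(kept)

all_introns_removed = splice(pre_mrna)
exon_skipped = splice(pre_mrna, skip={"exon2"})
intron_retained = splice(pre_mrna, retain={"intron1"})

print(all_introns_removed)  # AUGGCUGAUCCAUUCUAA
print(exon_skipped)         # AUGGCUUUCUAA
print(intron_retained)      # AUGGCUGUAAGUUAGGAUCCAUUCUAA
```

In this made-up example the retained intron places a UAG stop codon in frame with the AUG (positions 12 to 14), which is the kind of premature stop that makes intron retention problematic.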
Distinguish among the different types of spontaneous mutations and the mechanisms that cause them
- Single nucleotide (i.e. point mutation) or a few base pairs: substitution, deletion, or insertion
- Chromosomal rearrangements (several hundred or thousands of base pairs): large deletion, large insertion, inversion, translocation
- Changes in chromosome number (i.e. aneuploidy)
Explain how some spontaneous mutations lead to disease in humans
- Ex. of loss-of-function (LOF) and gain-of-function (GOF) mutations in cancer: TP53
- Tumor protein 53 (p53) is the transcription factor that functions as the gatekeeper of cell cycle progression
- p53 arrests cell cycle progression if there is DNA damage, which prevents cells with damaged DNA from proliferating. Important tumor suppressor
- In cancer, most p53 loss-of-function mutations occur in the DNA-binding domain (DBD)
- Significant reduction in DNA binding capacity
- Abolished gene expression
- May also result in gain of function
- Mutant p53 suppresses the function of WT p53
- Affects apoptosis genes, DNA repair genes, and cell cycle genes
DNA synthesis overview
- DNA lacks the -OH group on carbon 2 of the ribose sugar (hence deoxyribose)
- phosphodiester bonds link carbon 3 of one nucleotide to the phosphate on carbon 5 of the next nucleotide. Formation is catalyzed by DNA polymerase
gene structure
distal promoter: will either stabilize RNA pol II to enhance transcription OR block (assembly of) RNA pol II to silence transcription. 1000s of bp away
proximal promoter: contains specific sequences that are recognized by specific transcription factors. Regulates transcription rates (increase or decrease relative to basal levels). 200-300 bp away
core promoter: the minimal DNA sequence required to initiate transcription. TATA box = a common core promoter element of class II genes. Bound by general transcription factors (GTFs) and RNA pol II -> basal transcription levels. A promoter is a DNA sequence to which transcription factors will bind. 50-100 bp away
coding sequence: DNA sequence that provides the information that encodes proteins or non-coding RNAs. Exons carry the protein-coding information. Introns allow exons to be configured in alternative ways
terminator: sequence that signals the end of transcription. A very complex process. Proximal to the end of the coding sequence. The polyadenylation signal leads to cleavage of the transcript and addition of a poly-A tail
codon and reading frame
codon: the nucleotide triplet that specifies an amino acid or translation termination signal (ie stop codon)
reading frame: since 1 codon = 3 nucleotides, there are 3 possible ways to divide a nucleic acid sequence into codons, depending on which nucleotide reading starts from. Each of these is a reading frame
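A small sketch of the three reading frames, using a made-up sequence. The helper name `reading_frames` is hypothetical; leftover nucleotides that do not fill a whole codon are dropped.

```python
def reading_frames(sequence):
    """Split a sequence into codons starting at offsets 0, 1 and 2."""
    frames = []
    for offset in range(3):
        codons = [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]
        frames.append(codons)
    return frames

for offset, codons in enumerate(reading_frames("AUGGCCUAAGC")):
    print(offset, codons)
# 0 ['AUG', 'GCC', 'UAA']
# 1 ['UGG', 'CCU', 'AAG']
# 2 ['GGC', 'CUA', 'AGC']
```

The same nucleotides give completely different codons in each frame, which is why a one- or two-base insertion or deletion changes everything downstream of it.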
translation
- biochemical process where mature mRNA is read by the ribosome, which catalyzes the formation of peptide bonds between amino acids
- creates polypeptide chain
- transfer RNAs (tRNAs): small non-coding RNAs. Each contains an anticodon that base-pairs with a specific mRNA codon and carries the corresponding amino acid
- ribosome: large protein-rRNA complexes that catalyze peptide bond formation
- transcription happens in the nucleus; RNA does not enter the cytosol until matured (5’ cap, poly-A tail, all introns spliced out). It then associates with ribosomes on the rough ER (where protein synthesis happens)
open reading frame
- begins with a start codon and ends with a stop codon
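One way to sketch an open reading frame search: scan each of the three forward frames for an AUG and read codon by codon until the first in-frame stop codon. The function name `find_orfs` and the example mRNA are made up, and only the given strand is searched.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_orfs(mrna):
    """Return (start, end, sequence) for each AUG...stop ORF in the three forward frames."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(mrna):
            if mrna[i:i + 3] == "AUG":
                # Walk forward one codon at a time looking for an in-frame stop.
                for j in range(i + 3, len(mrna) - 2, 3):
                    if mrna[j:j + 3] in STOP_CODONS:
                        orfs.append((i, j + 3, mrna[i:j + 3]))
                        i = j  # resume scanning after this ORF's stop codon
                        break
            i += 3
    return orfs

print(find_orfs("CCAUGGCUUGAAAUGUAA"))
# [(12, 18, 'AUGUAA'), (2, 11, 'AUGGCUUGA')]
```

Results are listed frame by frame (frame 0 first), so the ORF starting at position 12 appears before the one starting at position 2.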
degeneracy of the genetic code
64 codons and 20 amino acids. This redundancy means most amino acids are specified by more than one codon, so a mutated codon can still code for the same amino acid
- minimizes the effect a mutation has on the encoded protein
- a mutation could still have an effect, e.g. if a premature stop codon is introduced
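The 64-codons-for-20-amino-acids redundancy can be counted directly from the standard genetic code. The sketch below builds the table from the conventional UCAG ordering (amino acids in one-letter codes, `*` for stop); the variable names are arbitrary.

```python
from collections import Counter

# Standard genetic code in RNA letters; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

codons_per_product = Counter(CODON_TABLE.values())
amino_acids = [aa for aa in codons_per_product if aa != "*"]

print(len(CODON_TABLE))         # 64 codons in total
print(len(amino_acids))         # 20 amino acids
print(codons_per_product["*"])  # 3 stop codons
print(codons_per_product["L"])  # 6 codons for leucine
print(codons_per_product["M"])  # 1 codon for methionine (AUG)
```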
what is a mutation
- a change in the DNA sequence
- germ line mutations are heritable.
- somatic mutation: non heritable
- both can occur in protein-coding or non-coding regions
- protein-coding region: can affect the polypeptide sequence
- non-coding regions: can affect gene regulatory regions and non-coding RNAs
why are mutations important for genetic diversity
- can result in the formation of gene variants, known as alleles
- wildtype alleles = the more abundant alleles, present at greater frequencies
- mutant alleles = the rarer alleles
point mutation
- single base-pair substitutions that result in:
A) nonsense mutations = change a codon to a stop codon, which can truncate the protein
B) missense mutations = code for a different amino acid, resulting in a non-functional protein or a protein with a different function
C) silent mutations = code for the same or a different amino acid, but there is no functional change in the protein
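The three categories above can be sketched as a codon-level classifier. Two assumptions are made here: the original codon is a sense codon (not a stop), and "silent" is taken in the narrow sense of coding for the same amino acid, since whether a different amino acid changes protein function cannot be read from the sequence alone. The function name and example codons are chosen for illustration.

```python
# Standard genetic code in RNA letters; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_substitution(original_codon, mutant_codon):
    """Classify a single-codon change as silent, nonsense or missense."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if after == before:
        return "silent"    # same amino acid (assumption: narrow definition)
    if after == "*":
        return "nonsense"  # new stop codon, protein may be truncated
    return "missense"      # different amino acid

print(classify_substitution("GGU", "GGC"))  # silent   (Gly -> Gly)
print(classify_substitution("UAC", "UAA"))  # nonsense (Tyr -> stop)
print(classify_substitution("GAG", "GUG"))  # missense (Glu -> Val)
```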