Week 2: Protein Structure, Chromosomes & Chromatin, DNA & Histone Modification Flashcards
Why do proteins have such a diverse array of specialized functions?
Because they have a diverse set of building blocks - amino acids
Amino acids structure + linkage
- joined by covalent bonds
- each free amino acid has: a central carbon (C(alpha)), an amino group (-NH3+), a carboxylate group (-COO-), and a side chain
~20 diff types of side chains!
How do amino acids exist?
besides glycine
as 2 distinct stereoisomers that differ in arrangement about the alpha carbon
* the L- and D- aa: contain identical types of atoms and chemical bonds, but are mirror images of each other
Peptide bonds are…
+ how they form
…the covalent linkages that form polypeptides (polymer of aa)
* form between carboxylate group of 1 aa and amino group of another
* releases 1 H2O per bond formed (condensation reaction)
Amino acid residues are
the amino acids that have been incorporated into a polypeptide chain
Peptide backbone
and structure
repeating series of atoms from which the aa side chains protrude
* one end has an exposed amino group (N-terminus) while the other has an exposed carboxyl group (C-terminus)
ribosome synthesizes beginning at N-term
Importance of the delocalization of the e- in atoms of the peptide bond unit
- causes these atoms to all lie in a plane (resonance)
- prevents free rotation about the C-N bond of the peptide bond unit, locking the atoms into a planar configuration
- each aa unit has only 2 covalent bonds about which free rotation can occur: the N-alpha carbon bond (phi angle) and the alpha carbon-carbonyl carbon bond (psi angle)
A Ramachandran plot provides a graphical depiction of the allowable combos of these two angles (phi and psi)
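A minimal sketch of how one backbone torsion angle could be computed from four atom positions, which is the quantity a Ramachandran plot puts on each axis. The function name `dihedral` and the toy coordinates are hypothetical and not from the course notes. Phi would use the atoms C(i-1), N(i), C-alpha(i), C(i), and psi would use N(i), C-alpha(i), C(i), N(i+1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (not taken from a real protein structure).
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))       # atoms eclipsed
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))   # quarter turn
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))    # atoms opposite
print(cis, quarter, trans)
```

Running this over every residue of a real structure would give one (phi, psi) point per residue, which is what gets scattered on the plot.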
What is protein folding driven by?
non-covalent interactions between atoms in the polypeptide chain
What are the four levels of organization of protein folding
Primary structure: sequence of amino acids
Secondary structure: short regions of the polypeptide chain can form regular, repeating regions of structure stabilized by hydrogen bonds (alpha helices and beta sheets)
Tertiary structure: when a protein folds and the regular repeating elements come together to form a defined shape
Quaternary structure: complexes formed by the association of several folded polypeptides
Alpha helices formation and structure
What are amphipathic alpha helices?
In an alpha helix, the backbone curves in a right-handed helical pattern
* hydrogen bonds form between the carbonyl oxygen of one residue and the amide N-H of the residue 4 positions further along the chain
* structure is a cylinder with side groups outside, backbone inside
* amphipathic alpha helices: hydrophobic side chains on one face, polar side chains on other
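A rough sketch of a "helical wheel" check for amphipathic helices. It assumes about 100 degrees of rotation per residue (3.6 residues per turn) and a simple yes/no hydrophobic set; both are assumptions for illustration, not values from these notes. The score is close to 1 when the hydrophobic side chains cluster on one face and close to 0 when they are spread around the cylinder.

```python
import math

# Assumption: ~100 degrees of rotation per residue (3.6 residues per turn).
DEGREES_PER_RESIDUE = 100.0
# Assumption: a simple hydrophobic set; real hydrophobicity scales are graded.
HYDROPHOBIC = set("AVLIMFW")

def wheel_angles(seq):
    """Angle (degrees) at which each side chain points when viewed down the helix axis."""
    return [(aa, (i * DEGREES_PER_RESIDUE) % 360.0) for i, aa in enumerate(seq)]

def hydrophobic_face_score(seq):
    """Length of the mean direction vector of the hydrophobic residues (0 to 1)."""
    angles = [math.radians(a) for aa, a in wheel_angles(seq) if aa in HYDROPHOBIC]
    if not angles:
        return 0.0
    x = sum(math.cos(a) for a in angles)
    y = sum(math.sin(a) for a in angles)
    return math.hypot(x, y) / len(angles)

# Made-up peptides: leucines at positions i, i+3, i+4, i+7 vs four in a row.
print(hydrophobic_face_score("LKKLLKKL"))
print(hydrophobic_face_score("LLLLKKKK"))
```

The first peptide scores much higher than the second, because spacing hydrophobic residues 3 to 4 apart puts them on the same side of the helix.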
Beta sheet formation and structure
parallel vs antiparallel beta sheet?
Beta sheets form when 2+ segments of the backbone (beta strands) H-bond through their carbonyl and amide groups
* sheet-like structure that is slightly twisted
* can form between 2+ non-contiguous segments of a polypeptide
* parallel beta sheet: strands oriented in same direction (N to C term)
* antiparallel beta sheet: strands oriented in opposite directions
* (mixed sheets can also occur)
Importance of tertiary structure function
why is it energetically favorable?
In the tertiary structure, the hydrophobic groups in the polypeptide chain interact inside the protein (primarily van der Waals), while polar groups are on the protein's surface (because of the polar, aqueous environment of the cell)
It is energetically more favorable because:
* polar atoms buried in the hydrophobic interior are hydrogen-bonded
* very few empty spaces/gaps in the interior
* the folded form of the polypeptide gives the whole system more entropy: when hydrophobic side chains are outside, H2O molecules have more limited orientation options (less entropy), vs when hydrophobic side chains are inside, the H2O molecules have more options
What is possible/not possible to predict about protein structure based on sequence?
The process of protein folding is so complex that it is impossible to precisely predict tertiary structure from aa sequence, but it is possible to predict whether 2 proteins will have a similar structure based on their aa sequences… if they have a sequence identity of just 25% they will be fairly similar
Can also predict secondary structure elements pretty well
Can use already-known protein aa sequences and structures to help predict the structure of another aa sequence: if any proteins have at least 25% of aa in common, they are likely to adopt the same fold… often, but not always, means that the proteins perform a similar function
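A small sketch of the 25% rule of thumb. It assumes the two sequences have already been aligned to the same length with "-" for gaps, and it counts identity over columns where neither sequence has a gap (other denominators are also in use, so treat this definition as an assumption). The function names and example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped aligned columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def may_share_fold(aligned_a, aligned_b, threshold=25.0):
    """Rule of thumb from the notes: at least ~25% identity suggests the same fold."""
    return percent_identity(aligned_a, aligned_b) >= threshold

print(percent_identity("ACDEFGHI", "ACDQFGHK"))  # 6 of 8 columns match
print(may_share_fold("ACDEFGHI", "ACDQFGHK"))
```

This only flags a likely shared fold; it says nothing certain about function, as the card notes.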
What do chaperones do?
They help the protein folding process go smoothly
(Proteins can be denatured by things that change solution conditions, such as heat, chemicals, etc)
Definition of protein fold
Protein fold = the arrangement of secondary structure elements that characterizes a particular protein
Proteins may not have identical structures, but can still have identical folds
Even though there is lots of diversity in the types of protein folds that are known, the number of protein folds found in nature is limited… why?
Most arbitrary amino acid sequences would fail to fold into a stable structure
What types of changes are tolerated in aa sequence of proteins vs not tolerated?
Changes in the aa sequence that do not alter the overall fold or properties may be tolerated, but changes that are more deleterious are not tolerated
As a protein collects further mutations, may evolve a new function
Divergent evolution
proteins with new characteristics, and eventually separate functions, evolve from a single ancestral protein
Convergent evolution
when nature solves the same problem twice: 2 proteins that carry out a similar function, but have evolved independently
Domain
a compact region of protein structure, usually made up of a contiguous segment of the polypeptide chain, that is capable of folding on its own
Most proteins are built up in a modular fashion from several domains fused together.
A given DNA sequence can be characterized by what?
A distinct array of hydrogen bond donors and acceptors, as well as by methyl groups that are exposed in the grooves
Why is there less variability in the chemical surface exposed in the minor groove?
What does this imply for protein binding?
In the minor groove, the pattern of H-bond donors and acceptors is the same for A-T and T-A base pairs, and the same for C-G and G-C base pairs
This means that most DNA binding proteins that recognize a particular sequence do so primarily through major groove contacts
ex: an H-bond donor on the protein pairs with an H-bond acceptor on the DNA, while a hydrophobic side chain might be in van der Waals contact with the methyl group of thymine
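A sketch of why the minor groove carries less information. It uses the common textbook letter codes for the groups exposed in each groove (A = H-bond acceptor, D = H-bond donor, H = nonpolar hydrogen, M = methyl); these codes and the table layout are an assumption brought in for illustration and are not listed in the notes. Each key is the base on the strand being read followed by its partner.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Textbook-style codes for the chemical groups exposed by each base pair.
MAJOR = {"AT": "ADAM", "TA": "MADA", "GC": "AADH", "CG": "HDAA"}
MINOR = {"AT": "AHA", "TA": "AHA", "GC": "ADA", "CG": "ADA"}
GROOVES = {"major": MAJOR, "minor": MINOR}

def groove_readout(seq, groove):
    """Pattern a protein would 'see' in the chosen groove, one entry per base pair."""
    table = GROOVES[groove]
    return [table[base + COMPLEMENT[base]] for base in seq.upper()]

print(groove_readout("ATGC", "major"))  # four different patterns
print(groove_readout("ATGC", "minor"))  # A-T looks like T-A, G-C looks like C-G
print(len(set(MAJOR.values())), len(set(MINOR.values())))
```

All four base pairs are distinguishable in the major groove, while the minor groove only separates A/T pairs from G/C pairs.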
What can increase chances that side chains on a given protein will approach functional groups on the DNA?
If the shape of the protein is complementary to that of the DNA
What usually binds in the major groove
(protein structure)
In the major groove, usually alpha helices and 2-stranded beta sheets bind because their shape fits
What usually binds in the minor groove?
(protein structure)
The minor groove is usually too narrow to fit either alpha helices or beta sheets; binding there can happen, but with an energetic penalty that comes from distorting the DNA
What two amino acids are generally involved in protein interactions with DNA?
arginine and lysine (positively charged)
Also the hydroxyl groups on serines or tyrosines (partial positive charge) can form favorable electrostatic interactions with phosphate groups
The four levels of protein structure as taught in class
Primary: aa sequence
Secondary: localized structure: alpha helices, beta sheets, loops (H-bonds between backbone hold these structures)
Tertiary: folding of the whole polypeptide
Quaternary: diff polypeptides interacting with each other
Charge of amino acids/side chains general info
- Whether or not a side chain is charged depends on the pH
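- cell pH is about 7.2, so a group is protonated if its pKa is above that and deprotonated if its pKa is below: basic side chains (lys, arg) carry a positive charge, acidic side chains (asp, glu) carry a negative charge (in normal cell conditions)
- (pKa tells us if aa are charged at most conditions)
- histidine (pKa about 6) is sometimes positively charged, sometimes neutral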
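A worked sketch of the pKa reasoning using the Henderson-Hasselbalch relationship. The pKa values are approximate textbook numbers (an assumption; real values shift with the local environment in a protein), and the function names are made up. The pH is passed in explicitly rather than fixed.

```python
# Approximate side-chain pKa values (assumption, not from the notes).
SIDE_CHAIN_PKA = {"D": 3.9, "E": 4.1, "H": 6.0, "K": 10.5, "R": 12.5}
# Basic side chains are +1 when protonated; acidic ones (D, E) are -1 when deprotonated.
BASIC = {"H", "K", "R"}

def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of the group that holds its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def average_charge(residue, ph):
    """Average side-chain charge at the given pH."""
    f = fraction_protonated(SIDE_CHAIN_PKA[residue], ph)
    return f if residue in BASIC else f - 1.0

for aa in "DEHKR":
    print(aa, round(average_charge(aa, 7.0), 3))
```

At a near-neutral pH this gives lysine and arginine a charge near +1, aspartate and glutamate near -1, and histidine only a small positive fraction, which is why histidine can switch between charged and neutral.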
What is DNA wrapped around?
Histones
What are the two types of histones?
basic structure
Core histones and linker histones
all structurally similar + small basic proteins
positively charged - rich in lysine and arginine
What are the different histones called?
H1, H2A, H2B, H3, H4, H5
H1 and H5 are linker histones, found on the linker DNA between nucleosomes
H2A, H2B, H3, H4 are core histones
Results of German’s isolation of histones and separation by gel electrophoresis?
DNA + Histones make what?
nucleosomes
Also, all DNA is wrapped into nucleosomes
How did we know that DNA + histones = nucleosomes?
- electron microscopy shows “beads on a string”
- Nuclease digestion of DNA in nuclei: useful for showing that both expressed and unexpressed genes are found in nucleosomes
* isolate nuclei, add micrococcal nuclease (breaks phosphodiester bonds)
* then stop enzyme, remove all proteins, look at size of DNA
* do as a function of time
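A toy model of what the digestion time course could look like on a gel. It assumes the nuclease only cuts in the linker DNA between nucleosomes and that the nucleosome repeat is about 200 bp; both are assumptions for illustration and are not stated in the notes. The function name and linker numbering are hypothetical.

```python
def fragment_sizes(n_nucleosomes, cut_linkers, repeat_bp=200):
    """Fragment lengths (bp) after cutting a nucleosome array at the given linkers.

    Linker k sits between nucleosome k and nucleosome k + 1 (1-based).
    """
    sizes = []
    run = 0  # nucleosomes collected in the current fragment
    for k in range(1, n_nucleosomes + 1):
        run += 1
        if k in cut_linkers or k == n_nucleosomes:
            sizes.append(run * repeat_bp)
            run = 0
    return sizes

print(fragment_sizes(6, set()))            # no digestion: one long piece
print(fragment_sizes(6, {2, 3}))           # short digestion: mixed multiples
print(fragment_sizes(6, {1, 2, 3, 4, 5}))  # long digestion: all single units
```

Under these assumptions every fragment is a whole-number multiple of the repeat length, and longer digestion shifts the pieces toward the single-nucleosome size.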
Describe the formation of the histones that DNA wraps around
Also, what is the basic structure of histones?
H2A and H2B make a dimer
H3 and H4 make a tetramer
DNA wraps around tetramer first, then dimers associate
histones have core regions and tails (60-70aa)
Describe the nucleosome structure
- H2A:H2B dimers (2 of them) + H3:H4 tetramer
- DNA wrapped around almost twice (about 1.65 turns, ~147 bp)
- positively charged aa on core histones are near DNA helix
- histones have “tails” - positively charged
Describe “bendability” of DNA
“All” DNA is found in nucleosomes, but some DNA sequences wrap better than others
“bendability” is affected by DNA sequence
* sequences that alternate purines and pyrimidines bend much better than runs of one repeated nucleotide (ex: AAAA…)
* the poorly bending runs tend to be in between nucleosomes
What is the role of H1?
H1 stabilizes nucleosomes and helps them associate with one another
Describe the variation in structure of chromatin
- Euchromatin vs heterochromatin
- can also be histone variants in particular nucleosomes
- Histone modifications
Euchromatin vs Heterochromatin
Euchromatin (true-chromatin) - normal chromatin, easier for proteins to interact with DNA sequences
Heterochromatin - much more tightly compacted, difficult for proteins to recognize DNA sequences
Acetylation in euchromatin, methylation in heterochromatin normally
Histone Variants
- H2A, H2B, H3 variants exist
- somehow “identify” particular nucleosomes
- for some we know the role, for some we don't
Histone Modifications
- acetylation of lysine
- methylation of histones (lys or arg)
- phosphorylation of serine, threonine
- ubiquitylation - adds a small protein called ubiquitin at lys
- SUMOylation - adds a diff small protein (SUMO) at lys
- and others
What makes up chromosomes?
Chromosomes are made of DNA packaged with specific proteins that help condense the DNA into a relatively tiny space
What is one method organisms use to help package DNA?
In all organisms, small basic (+) proteins bind to DNA along its length, help counteract negative charge
What are histones?
the basic DNA-binding proteins that package DNA into chromatin
What is the complex of histones found in a nucleosome called?
Histone Octamer - 2 copies of each of the core histones
What is the implication of the fact that DNA winds around histone-octamers in a left-handed manner?
When a histone octamer is stripped away from the DNA, it leaves behind negatively supercoiled DNA (easier to separate double-stranded DNA if negatively wound than positively)
The way histone octamers bind to DNA is relatively insensitive to sequence, but nucleosomes do form preferentially along certain types of DNA sequences…what are these?
- A-T rich sequences alternating with stiffer G-C rich sequences spaced roughly half a helical turn apart
- pyrimidine-purine base steps are more bendy than other base steps
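A quick sketch for counting pyrimidine-purine base steps in a sequence, as a crude proxy for the "bendy" steps mentioned above. Treating the step count as a bendability score is an assumption for illustration only; the function names are made up.

```python
PYRIMIDINES = set("CT")
PURINES = set("AG")

def count_yr_steps(seq):
    """Number of adjacent base steps that go pyrimidine -> purine (CA, CG, TA, TG)."""
    seq = seq.upper()
    return sum(1 for a, b in zip(seq, seq[1:])
               if a in PYRIMIDINES and b in PURINES)

def yr_step_fraction(seq):
    """Fraction of all base steps that are pyrimidine-purine steps."""
    steps = len(seq) - 1
    return count_yr_steps(seq) / steps if steps > 0 else 0.0

print(count_yr_steps("TATATATA"), count_yr_steps("AAAAAAAA"))
```

An alternating sequence like TATATATA has many such steps, while a run like AAAAAAAA has none, matching the idea that repeated-nucleotide runs wrap poorly.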
Describe the three successive levels of chromatin packaging
- 10nm fiber - arises from the way histone octamers associate with double stranded DNA to form nucleosomes
- 30nm fiber - the nucleosome core particles form a regular, alternating arrangement that brings the nucleosomes in contact with one another (In vitro, H1 binds to the linker DNA that connects successive nucleosomes)
- looped structure: the large loops of chromatin in 30nm fibers are further anchored to a central scaffold
Euchromatin
relatively decondensed regions
Heterochromatin
more compacted regions
Do genes that are transcribed typically reside in euchromatin or heterochromatin?
Euchromatin
Where is heterochromatin typically concentrated?
Near the periphery of the nucleus - there is some transcription in heterochromatin, but at the same time translocation of certain genes to heterochromatin can prevent transcription
Chromosome structure influences what cellular processes?
transcription, recombination, and chromosome transmission
ex: if a gene that is active in its normal location in euchromatin is placed next to a telomere, the gene becomes silenced
ex: heterochromatin generally has a reduced recombination rate
Describe what phosphorylation does to histones
Phosphorylation adds a phosphate group to a free OH group, using ATP as the phosphate donor and source of energy
Histone kinases take ATP and add on a phosphate group
Usually done to ser or thr (sometimes tyr), and PO4 (2-) has a negative charge so this changes what the aa can interact with
Changes the OH group from a polar H-bond donor to a negatively charged group that is no longer an H-bond donor
(Kinases add phosphate groups, phosphatases remove them)
Describe what acetylation does to histones…
Acetylation removes the positive charge from lysine… loosens the chromatin (“non-specific”)
Histone acetyl transferases (HATs) use acetyl-CoA to change lysine from positively charged to uncharged, and now with a hydrophobic group
*But some acetylation can have different effects depending on the particular lysine (there are proteins that bind to particular acetylated histones)
acts as a tag for a particular protein
The acetyl group itself is uncharged: it converts the lysine's positively charged amine into a neutral amide, so the positive charge is lost
Describe what methylation does to histones…
Doesn't change the charge, but introduces a hydrophobic group
Histone methyl transferases (HMTs)
Lysine can be up to thrice methylated (me1, me2, me3)
Arginine can be up to twice methylated (me1, me2)
Nomenclature system for modifications
Name which histone, which amino acid, and what the modification is
ex: H3K4ac or H2AR79me2
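A small sketch of a parser for this naming scheme. The histone names and the ac/me1/me2/me3 marks come from the notes; the abbreviations ph, ub, and su for phosphorylation, ubiquitylation, and SUMOylation are assumptions added for illustration, as are the class and function names.

```python
import re
from typing import NamedTuple

class HistoneMark(NamedTuple):
    histone: str       # e.g. "H3"
    residue: str       # one-letter amino acid code, e.g. "K"
    position: int      # residue number, e.g. 4
    modification: str  # e.g. "ac", "me2"

# Assumption: "ph", "ub", "su" are illustrative abbreviations, not from the notes.
_PATTERN = re.compile(r"(H1|H2A|H2B|H3|H4|H5)([A-Z])(\d+)(ac|me[123]|ph|ub|su)")

def parse_mark(name):
    """Split a name like 'H3K4ac' into histone, residue, position, modification."""
    m = _PATTERN.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, modification = m.groups()
    return HistoneMark(histone, residue, int(position), modification)

print(parse_mark("H3K4ac"))
print(parse_mark("H2AR79me2"))
```

Reading the two examples from the card this way: H3K4ac is acetylation of lysine 4 on H3, and H2AR79me2 is dimethylation of arginine 79 on H2A.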
What types of processes do covalent modifications of histones affect?
Any process that involves accessing the DNA
Major histone modifications are always ______
Reversible.
There are specialized enzymes that add modifications, and ones that remove them
How do levels of acetylation vary throughout chromatin?
- actively transcribed regions (euchromatin) have lots, while heterochromatin has low levels
- primary targets of acetylation are H3 and H4 - have many lysines that can be acetylated
- lysines are +, acetyl-lysines are neutral
- acetylation is uniformly associated with active transcription
What are the consequences of different degrees of methylation?
Different degrees of methylation have different consequences by recruiting specific proteins
(up to 3 methyl groups to 1 lysine and up to 2 methyl groups to 1 arginine)
Methylation is associated with activation or repression of transcription: depends which residue is methylated
Histone tails can be phosphorylated at what residues?
At either serine or threonine residues
This plays a variety of roles
ex: phosphorylation of H3 at ser 10 facilitates transcription of genes required for cell growth
What is the largest histone modification and what is its effect?
The covalent attachment of ubiquitin to lysine side chains
- attached in a series of enzymatic steps, removed by a deubiquitinating enzyme (DUB)
- histones are similarly modified by small ubiquitin-like modifier (SUMO)
- plays roles in regulating different steps of transcription and DNA repair
Describe the process of DNA methylation
methyl groups (CH3) are added to DNA itself
- To cytosine in bacteria and eukaryotes, and to adenine in bacteria - BUT methylation makes the cytosine more unstable: methylated cytosine sometimes deaminates into thymine, a normal DNA base (unmethylated cytosine deaminates into uracil, which is recognized as DNA damage)
- carried out by DNA methyltransferases
- They do “base flipping” to access the cytosine from the tightly packed base stack - while methylating, the flipped-out base is temporarily replaced in the stack by an aa side chain
In E. coli, adenine methylation can be used to distinguish the newly replicated strand from the old strand in the repair process… explain
Both strands are normally methylated, but only the parental strand remains methylated immediately after DNA replication
- half-methylated site is “hemi-methylated” -> quickly remethylated by DNA adenine methylase (Dam) after DNA synthesis… before that though, the methylation “status” can be read by DNA replication and repair enzymes: transiently hemi-methylated DNA regulates replication
- Also, DNA repair proteins use this to identify the parental strand and use it as a template to fix errors in new strand
In many bacteria, DNA methylation is used to distinguish their genomic DNA from invading bacteriophage DNA
In many eukaryotes, DNA methylation is used to silence transcription of genes… explain
Takes place most commonly at the sequences CpG and CpXpG (the "p" is the phosphodiester bond joining adjacent nucleotides, so CpG is a C followed directly by a G)
- this methylated - and hence silenced - state can be passed down to daughter cells (when DNA is replicated so is the pattern of methylation along the strand) - Epigenetic Silencing
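A minimal sketch for locating candidate methylation sites in a DNA sequence. It treats X in CpXpG as any nucleotide, following the notation on the card, and reports 0-based positions of the cytosine on the strand given; the function name and example sequence are made up. Finding a site only marks where methylation could occur, not whether it is actually methylated.

```python
import re

def find_methylation_sites(seq):
    """0-based positions of the C in every CpG and CpXpG on the given strand."""
    seq = seq.upper()
    return {
        # Lookaheads let overlapping sites be reported.
        "CpG": [m.start() for m in re.finditer(r"(?=CG)", seq)],
        "CpXpG": [m.start() for m in re.finditer(r"(?=C[ACGT]G)", seq)],
    }

print(find_methylation_sites("ACGTCAGCCGG"))
```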