Exam 2 Lectures 9-16 Flashcards
What are two types of prokaryotes?
Bacteria and archaea
How are prokaryotic genomes different from eukaryotic?
- no chromosomes
- circular or linear
- much smaller (less than 1 Mb to 5 Mb)
- haploid - one copy of each gene
What is the base DNA molecule that prokaryotes have?
Circular DNA molecules, which may be called chromosomes but are not actually similar to eukaryotic chromosomes
Where is prokaryotic DNA located?
The nucleoid
Describe the structure of prokaryotic DNA.
Circular, double stranded DNA can be coiled around itself in a supercoil
What proteins are associated with prokaryotic DNA?
HU protein and H-NS (histone-like nucleoid structuring protein)
What are 3 components of prokaryotic genomes?
“Chromosomes,” chromid, plasmid
What is a prokaryotic chromosome?
Located in the nucleoid, carries essential genes
What is a chromid?
Uses plasmid partitioning system, carries essential genes
What is a plasmid?
Uses plasmid partitioning system, carries non-essential genes
What are three ways a bacterial cell can transfer/exchange DNA?
Transformation, conjugation, and transduction
What is transformation?
Uptake of free (naked) DNA from the environment by a recipient bacterium; the DNA was released by a donor bacterium
What is conjugation?
- A donor cell physically attaches to a recipient cell and transfers DNA to it
- useful for genetic mapping: markers are transferred sequentially, so the time at which a gene arrives indicates the order of genes
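A minimal sketch of time-of-entry mapping: sorting markers by the time they first appear in the recipient gives the gene order. The marker names and times are invented for illustration.

```python
# Hypothetical interrupted-mating data: minutes after mating begins
# at which each donor marker is first detected in recipient cells.
entry_times = {"thr": 8, "lac": 18, "gal": 25, "azi": 9, "ton": 11}

def gene_order(times):
    """Return markers sorted by time of entry (earliest enters first)."""
    return sorted(times, key=times.get)

print(gene_order(entry_times))
```

Markers that enter earlier lie closer to the origin of transfer.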
What is transduction?
- bacteriophage transfers genetic information
- Co-transfer of closely linked markers during transduction: the frequency with which markers A and B transfer together depends on how close together they are on the chromosome
Describe genome organization in E. coli.
- Very little intergenic space, very few introns
- on the circular genome map, genes drawn on the outside are transcribed clockwise and genes drawn on the inside are transcribed counterclockwise
What is a group of genes in prokaryotic genomes called?
Operon - group of genes involved in a single biochemical process
What is the term to describe the lac operon?
It is an inducible operon, because it is usually off (repressed) but can be turned on in the presence of an inducer (allolactose, made from lactose)
Describe how the lac operon works?
- The bacterial cell is not always in the presence of lactose, so it does not always need to synthesize the enzymes that break down lactose
- LacI encodes a repressor; in the absence of lactose the repressor is bound to the operator, which prevents transcription of the lac genes
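The repressor logic can be sketched as a tiny truth function. This is a deliberately simplified model that considers only the LacI repressor and ignores every other layer of regulation; the function name is made up for the example.

```python
def lac_genes_transcribed(lactose_present: bool) -> bool:
    """Simplified model: only the LacI repressor is considered."""
    # Without lactose the repressor sits on the operator;
    # with lactose the inducer releases it.
    repressor_on_operator = not lactose_present
    return not repressor_on_operator

for lactose in (False, True):
    print(lactose, lac_genes_transcribed(lactose))
```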
What is an example of a repressible operon?
Trp operon
Describe how the trp operon works.
- Trp (tryptophan) acts as a corepressor: once it has been synthesized it binds the trp repressor, which then binds the operator and represses transcription
- Negative feedback loop
How do bacterial genome sizes vary?
- genomes vary due to varying lifestyles
- free living bacteria code for more genes than parasitic bacteria
What does it mean that prokaryotes have a pan-genome?
- Core genome - set of genes possessed by all members of a species
- Accessory genome - entire collection of additional genes present in strains and isolates of that species
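In set terms, the core genome is the intersection of the strains' gene sets and the accessory genome is everything else in the union. A small sketch, with strain names and gene lists invented for illustration:

```python
# Hypothetical gene content of three strains of one species.
strains = {
    "strain_A": {"gyrA", "rpoB", "recA", "lacZ"},
    "strain_B": {"gyrA", "rpoB", "recA", "stx1"},
    "strain_C": {"gyrA", "rpoB", "recA", "lacZ", "blaTEM"},
}

gene_sets = list(strains.values())
core = set.intersection(*gene_sets)   # genes present in every strain
pan = set.union(*gene_sets)           # every gene seen in any strain
accessory = pan - core                # genes present in only some strains

print(sorted(core))
print(sorted(accessory))
```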
Do the # of genes vary in Eukaryotes?
In Eukaryotes the # of genes remains relatively similar while it varies more in prokaryotes
How are evolutionary relationships defined?
1) Evolutionary relationships inferred from complete genome sequences
2) Evolutionary relationships inferred from gene x sequences
What are species?
Individuals that can interbreed
What is vertical genetic transmission?
Transmission of genes from parent to offspring
What other type of genetic transmission do prokaryotes possess?
Lateral genetic transmission- transfer of genes between different individuals/species
How does lateral genetic transmission in bacteria impact characterization?
Makes it hard to define species in bacteria
Describe mitochondrial genomes.
- 5-1500 kb
- 3-93 genes
- energy production, transcription, translation
Describe the chloroplast genome.
- 60-525 kb
- 200 genes
- majority of genes are involved in the process of photosynthesis
What defines the lifestyle of viruses?
Obligate parasites
What do viruses consist of?
- Nucleic acid + protein structure
- can carry their own polymerase for transcription but will need the host's translation machinery to make protein
- specific to their host
What are bacteriophages?
Viruses that infect bacteria
What are 3 structures of viruses?
Icosahedral, filamentous, head-and-tail
Describe virus structure assembly.
- Proteins assemble into polypeptide subunits called protomers which then assemble into a protein coat called a capsid
- Role of the protein is to encapsulate the nucleic acid
What can virus genomes be made of?
RNA or DNA (double or single stranded)
Describe the structure of viral genomes.
Viruses have very compact genomes, may have overlapping genes with different reading frames
What are two infection cycles of bacteriophages?
Lytic and lysogenic
Describe the lytic infection cycle.
- Bacteriophage (T4 that infects E. coli) attaches to receptor protein on target cell
- phage DNA is injected into the cell
- Transcription of phage DNA begins
- Replication of phage DNA
- Capsid protein synthesis
- Host cell bursts, new phages released
Describe the lysogenic cycle.
- used by lambda bacteriophage
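- phage DNA integrates into the E. coli chromosome by recombination, becoming a prophage
- Many cell divisions (the prophage is copied along with the host DNA)
- Induction of prophage
- Excision of the phage DNA from the host chromosome
- Phage gene expression, DNA replication, capsid synthesis
- New lambda phages released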
Describe Eukaryotic viruses.
- Can be either icosahedral or filamentous, but not head and tail
- capsid can be surrounded by lipid membrane
- can have a lytic or lysogenic life cycle -> usually less dramatic infection
What is a retrovirus?
- a virus that integrates its genome into the host genome
- RNA viruses that use the enzyme reverse transcriptase
- discovered by Howard Temin & David Baltimore
Describe the genome of retroviruses.
- 7-12 kb
- LTR = long terminal repeats
- Gag = structural proteins of the virus core
- Pol = reverse transcriptase
- Env = viral coat
- Then LTR again on the other side
(This retroviral genome must be integrated into the host genome)
What is HIV?
- A virus that binds to CD4 receptors on the T-cell membrane
Describe the corona virus genome.
- SARS-CoV-2 belongs to the coronavirus family, whose genomes are 26-32 kb
- 4 major structural proteins: spike, membrane, envelope, nucleocapsid
- spike protein binds to human ACE2 receptor in the lungs, kidneys, liver
- single-stranded RNA virus, 29.9 kb in size, 13-15 ORFs -> 12 expressed proteins
What are 2 types of mobile genetic elements?
Retrotransposons and transposons
What are two types of retrotransposons?
Retrotransposons that have LTRs and ones that don’t
Describe the pathway of retrotransposons that have LTRs.
- Retrotransposon in gene
- Transcribed to single stranded RNA
- Reverse transcribed into DS DNA
- Re-integrated into host genome
- Example: retroviruses that infected their host but then became inactivated, leaving behind their mobile genetic information
Describe the pathway of retrotransposons that don’t have LTRs.
- these are non-LTR elements, aka retroposons
- LINEs and SINEs
- Alu is an example of a SINE, about 300 bp, 1.2 million copies in the human genome, 10% of the human genome
What is the substrate of evolutionary change?
Mutation
What are the two different levels of mutation?
A gene/point mutation or a chromosome mutation
What is a gene mutation?
- a point mutation
- an allele or gene changed to a different allele
What is a chromosome mutation?
Segments, whole chromosomes or sets of chromosomes change
What is a reference point?
- wild type allele - allele that is most commonly present in a population (in nature or lab stock)
What are two types of alleles?
Mutant allele and wild type allele
What are three mutations at the DNA level?
Transitions, transversions, and additions/deletions
What is a transition mutation?
- purine replaced by purine (A <-> G)
- OR pyrimidine replaced by pyrimidine (C <-> T)
What is a transversion mutation?
- purine replaced with a pyrimidine
- OR pyrimidine replaced with a purine
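The two definitions reduce to one check: do the old and new base belong to the same chemical class? A minimal sketch (the function name is made up for the example):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Label a single-base DNA substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("need two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_substitution("A", "G"))
print(classify_substitution("A", "C"))
```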
What are 3 impacts of gene mutations on proteins?
- Silent Mutation - changes one codon for an amino acid into another codon for the same amino acid
- Missense Mutation - changes one codon for an amino acid to one that encodes for a different amino acid
- Nonsense Mutation - changes a codon for an amino acid into a stop codon - creates a truncated protein
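These three categories can be checked mechanically with the standard genetic code. The sketch below builds the standard codon table (DNA alphabet, `*` for stop) and assumes the original codon encodes an amino acid; the function name is illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon change; ref_codon is assumed to be a sense codon."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"

print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
print(classify_codon_change("GAA", "GAT"))  # Glu -> Asp
print(classify_codon_change("TAC", "TAA"))  # Tyr -> stop
```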
What are 6 possible mutant types?
morphological mutants, lethal mutations, conditional mutants, biochemical mutations, loss of function, gain of function
What is a morphological mutant?
mutations that affect outwardly visible properties of an organism
What is a lethal mutation?
a mutation in a gene that encodes a protein involved in an essential process, so the mutation results in death
What is a conditional mutation?
- mutation only causes a mutant phenotype in a certain environment
- temperature sensitivity common
What is a biochemical mutation? What are examples?
- mutation that prevents the organism from carrying out a biochemical function
- prototroph - can exist on inorganic salts and energy source
- auxotroph - must be supplied with nutrients to grow
What is a loss of function mutation?
- gene no longer makes a functional protein, usually recessive
What is a gain of function mutation?
- mutation confers a new function to the protein, usually dominant
How does the location of a mutation impact the effect on the organism?
- Somatic cells - if mutation is early in development there is a large effect
- Sex cells - always large effect
What are two ways that mutations can arise?
DNA replication and mutagens
What is a tautomer shift?
- DNA replication mutation
- When a base shifts to a rare tautomer (keto to enol, or amino to imino form)
- Enol T binds with G
- Imino A binds with C
- Enol G binds with T
What mutations can DNA replication create?
- insertions/deletions
- these create frameshift mutations
- 3 nucleotide deletion/insertion is less harmful
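A short sketch of why a 3-nucleotide indel is less harmful: it removes or adds a whole codon but leaves the downstream reading frame intact. The sequence is invented for illustration.

```python
def codons(seq):
    """Split a sequence into complete codons, dropping leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def delete(seq, start, length):
    """Remove `length` bases beginning at index `start`."""
    return seq[:start] + seq[start + length:]

def is_frameshift(indel_length):
    """An indel shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

original = "ATGGCCAAAGGGTAA"
one_deleted = delete(original, 3, 1)    # frameshift: all later codons change
three_deleted = delete(original, 3, 3)  # in frame: one codon lost

print(codons(original))
print(codons(one_deleted))
print(codons(three_deleted))
```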
What is replication slippage?
a type of DNA replication mutation that occurs in segments of DNA with a lot of short tandem repeats (microsatellites)
- adds a tandem repeat to a daughter molecule - creates microsatellites of different lengths
What disease can be produced as a result of replication slippage?
Nucleotide repeat expansion diseases
- expansion of repeats produces a mutated protein, which leads to Huntington's disease
What is the DNA replication mutation rate in E. coli?
1 / 10^7 bp
What is a mutagen?
a chemical or physical agent that causes mutations
What are other environmental agents (other than mutagens)?
- Carcinogen - causes cancer
- Clastogen - causes fragmentation of chromosomes
- Oncogen - induces tumor formation
- Teratogen - results in developmental abnormalities
Mutagens can be:
- base analogs that can be added directly to DNA during replication
- can react with DNA and cause structural damage
- can cause the cell to synthesize chemicals that have a mutagenic effect
What is 5-bromouracil?
- base analog of thymine; it can be incorporated in place of thymine and shifts between the keto and enol forms more easily
What are deaminating agents? What are examples?
- remove amine groups from bases - ex: nitrous acid
- deamination of adenine produces hypoxanthine, which pairs with cytosine instead of thymine
What is ethidium bromide?
- a mutagen that intercalates (inserts) between the bases of DNA
How does UV light function as a mutagen?
- UV light - induces dimerization of two adjacent pyrimidines - leads to deletions in the DNA sequence during replication
What are alkylating agents? Example?
- add methyl or ethyl groups to bases
- EMS (ethyl methanesulfonate) adds an ethyl group to guanine
- G pairs with T instead of C, GC -> AT transition mutation
What is direct repair?
- a nick in the DNA may be repaired with DNA ligase
- the Ada enzyme removes methyl groups from bases
What are the steps in base excision repair?
1) DNA glycosylase removes the damaged nitrogenous base. Creates AP site (baseless site)
2) AP endonuclease removes the ribose sugar
3) DNA polymerase adds the correct nucleotide
4) DNA ligase creates phosphodiester bonds
What does nucleotide excision repair act on?
a segment of damaged DNA
What are the steps of nucleotide excision repair?
1) Helicase enzymes unwind the DNA helix
2) Endonucleases create single-stranded cuts on either side of the damage, excising a segment of 24-32 nucleotides
3) DNA polymerase and DNA ligase fill in the excised gap
How does mismatch repair work?
- It is best understood in E. coli
- After the parental molecule is replicated there is a short window of time before the daughter strand gets methylated (the normal state for E. coli DNA is to be methylated)
1) MutH binds the unmethylated sequence in the daughter strand
2) MutS recognizes the mismatch
3) MutH cuts the phosphodiester bond, DNA helicase II unwinds the strand, and an exonuclease removes it
4) DNA polymerase and DNA ligase fill in the missing sequence
What occurs when mutagens cause double stranded DNA breaks?
Non-homologous end joining
- this process involves the binding of Ku proteins to the ends of the DNA break, then DNA-PKcs, XRCC4, and DNA ligase IV produce repaired DNA by joining the fragmented DNA
What is homologous recombination?
- breakage and reunion of polynucleotides that share extensive sequence homology
- this process is what occurs during crossing over, but it is also used for DNA repair
When can homologous recombination occur?
between different molecules or within a single DNA molecule
What are the two strategies to identify gene function?
- Reverse genetics - start with a gene (remove the gene or make too much of it and evaluate the effect on the phenotype)
- Forward genetics - start with a phenotype and try to find the underlying genetic cause
What is C. elegans?
- 1 mm long worm
- grow on bacteria on agar plates
What are some features of C. elegans?
- produce large numbers of progeny
- easy to maintain
- can freeze
- each worm consists of only 959 cells
- transparent - easy to image on microscope
- embryos develop externally from the parent (in an egg shell)
- the same cells are in the same position in all animals
Describe C. elegans anatomy.
There are 2 neurons, one on either side of the head (ADFL)
What is a genetic screen?
- wild type carrying fluorescent marker
- expose to EMS mutagen
- F1 are heterozygous for many unique mutations
- self-fertilize
- Some F2s are homozygous for a particular mutation
- Look through plates for an obvious phenotype that can be isolated and grown on its own plate
- Mutant with abnormal dendrite morphology
How do you determine if a mutation from a genetic screen is recessive or dominant?
- Self fertilize a mutant to produce a homozygous mutant, then cross with a wild type male
- If F1 looks like mutant = dominant
- If F1 looks like wild type = recessive
How is the phenotype of a mutant linked to a particular gene sequence?
SNP mapping
Outline the process of SNP mapping.
- Cross mutant (N2 strain) with a Hawaiian strain wild type
- Allow the F1 generation to self propagate
- Select for the mutant phenotype in the F2 generation
- Grow to make F3 progeny
- Harvest and pool genomic DNA
- Look for segment without Hawaiian SNPs
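The last step amounts to scanning along a chromosome for the stretch where the Hawaiian allele is essentially absent from the pooled mutant DNA. A minimal sketch: the positions, allele fractions, and the 5% cutoff are all invented for illustration.

```python
def hawaiian_free_interval(snps, max_fraction=0.05):
    """Find the longest run of SNPs with almost no Hawaiian allele.

    snps: list of (position_bp, hawaiian_fraction) sorted by position.
    Returns (start_bp, end_bp) of the longest run, or None if there is none.
    """
    best = None
    run_start = None
    for pos, frac in snps:
        if frac <= max_fraction:
            if run_start is None:
                run_start = pos
            if best is None or pos - run_start > best[1] - best[0]:
                best = (run_start, pos)
        else:
            run_start = None
    return best

# Hypothetical pooled-sequencing result for one chromosome.
pooled = [(1_000_000, 0.48), (2_000_000, 0.30), (3_000_000, 0.02),
          (4_000_000, 0.00), (5_000_000, 0.01), (6_000_000, 0.25),
          (7_000_000, 0.50)]
print(hawaiian_free_interval(pooled))
```

The interval returned is where the mutation, which arose in the N2 background, is most likely to lie.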
Explain the pathway of forward genetics.
- Start with a phenotype and try to find the underlying genotype
- Wild type animals expressing fluorescent markers are mutagenized w/ EMS
- A mutant with a noticeable phenotype is isolated
- Analyze mutant w/ crosses - determine whether the allele is recessive
- Perform SNP mapping to identify an interval in the genome carrying the mutation
- From the interval identify candidate genes
- Go through the candidate genes one by one and see if they have mutations
- Rescue: express the wild type gene in the mutant and check whether the mutant phenotype is corrected
What is a dyf-7 mutation in C. elegans?
- dyf-7 encodes a small extracellular matrix protein secreted by cells; it forms a cap that the dendrite attaches to at the nose tip
- Attachment at the nose tip does not happen properly in dyf-7 mutants
How do sensory neurons in C. elegans develop?
By attaching at the nose tip and stretching: the cell body migrates away from the nose tip (“retrograde extension”)
How does the dyf-7 protein exhibit penetrance in C. elegans worms?
- All have the same genotype, but demonstrate different phenotypes (dendrite length)
What is reverse genetics?
Start with a gene and want to identify its function
What are the first steps to identifying coding genes?
- Predict the existence of a gene in the human genome
- Find mRNA molecules that correspond to the predicted gene
- Use DNA sequence to predict the amino acid sequence of our gene of interest
What is a bioinformatic approach of reverse genetics?
A homology search
What is a homology search?
- locate genes by comparing the DNA sequence of interest to all other DNA sequences in a database
What are homologous genes?
share a common evolutionary ancestor
What are orthologs?
homologous genes present in different species
What are paralogs?
genes present in the same species, often members of a recognized multigene family
How does a homology search work?
- works by alignment
- query sequence (gene x) is compared to every sequence in our database
- number of positions at which you have the same nucleotide or amino acid is translated into a score
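The scoring idea can be sketched with a plain identity count over sequences that are already aligned and of equal length. Real search tools also handle gaps and weight substitutions, which this sketch does not attempt; the sequences and names are invented.

```python
def identity_score(query, subject):
    """Count matching positions between two pre-aligned, equal-length sequences."""
    if len(query) != len(subject):
        raise ValueError("sequences must already be aligned to equal length")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, 100 * matches / len(query)

def best_hit(query, database):
    """Return the database entry name with the highest identity score."""
    return max(database, key=lambda name: identity_score(query, database[name])[0])

# Hypothetical query and database.
query = "ATGCTAGC"
database = {"gene_1": "ATGATAGC", "gene_2": "TTGATCGA", "gene_3": "CCCCCCCC"}
print(identity_score(query, database["gene_1"]))
print(best_hit(query, database))
```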
What is BLAST?
- Basic Local Alignment Search Tool
- Identifies homologous genes that have 40% similarity or more
What is a protein domain?
- a segment of a protein that possesses a characteristic tertiary structure and carries out a particular biochemical function
What is PROSITE?
Identifies where highly conserved domains exist in our sequence
What is the PROSITE strategy/bioinformatics analysis pathway?
- identify a new gene
- translate into protein
- perform BLAST search, look for conserved domains with PROSITE
- What protein family is our protein of interest in?
- Next: Want to inactivate the gene and look for a phenotype
How do you inactivate a gene?
downregulate or completely remove the gene from the genome
How is microRNA used?
- If a microRNA (miRNA) targets a specific mRNA sequence it will bind and repress gene expression
How is RNAi used in C. elegans?
- RNAi = RNA interference
- Double stranded RNA designed complementary to our mRNA sequence of interest
- Worms eat the RNAi bearing bacteria
- Gene knockdown throughout the organism
What are some cons of using RNAi?
- RNAi does not result in complete silencing of the target gene (knockdown, not knockout)
- Off-target effects of RNAi are possible
- Immune response to RNAi in certain cell types
What is a technique to completely inactivate a gene?
Knockout - completely remove the gene sequence from a cell or organism using homologous recombination
What are ES cells and what technique are they critical for?
- critical for creating knockout mice
- totipotent cells - give rise to many cell types - germline/gametes
Why are some chimeric mice produced when attempting to create a knockout mouse?
- you are not sure that a genetically modified egg has been fertilized with genetically modified sperm
What is a strategy to simplify downregulating genes?
- target an early exon rather than deleting a whole gene sequence
What is another way to turn off genes other than creating knockout organisms?
CRISPR- Cas9 Genome Editing
What are two major components of CRISPR Cas9 Genome Editing?
- Guide RNA (20bp) - designed to be complementary to the DNA sequence of interest
- Cas9 endonuclease - will follow the guide RNA and create double stranded break at the site where guide RNA is
- The cut is repaired -> introducing mutation
- this process can be used to mutate the first exon in genome editing
- vector DNA can be added after DS break and introduced into the genomes
What is the phenotype in mice with mutated caspase?
- Mutations in caspase create a protrusion in the brain: caspase is involved in apoptosis, and during development neurons are overproduced but then don’t die, so open spaces (ventricles) are not formed in the brain
What is the phenotype of a BMP7 knockout mouse?
- missing eyes and kidneys
- BMP7 is a signalling molecule
What are regulatory sequences and how can they be used to edit genomes?
- Regulatory sequences include enhancers and promoters and can be used to overexpress our gene of interest
- this type of mutation creates a transgenic organism
- look for phenotype caused by overexpression
What is a technique used to identify when and where gene X is expressed?
- Green Fluorescent Protein
- Comes from jellyfish
- Replace gene of interest with GFP to visualize expression in a model animal
What is immunofluorescence? What is it used for?
- Immunofluorescence is using a fluorescently tagged antibody specific to your protein of interest to locate where it is found in the cell
What is a challenge with immunofluorescence?
need a specific antibody that only binds to your protein
What are 3 general strategies to understanding gene function?
1) Inactivate the gene and look for a phenotype
2) Overexpress the gene product and look for a phenotype
3) Determine when and where the gene is expressed
What is the transcriptome?
collection of RNA molecules present in a cell
What are the RNAs in cells?
mRNA - messenger <5%
rRNA - ribosomal
tRNA - transfer
What are the two branches of non-coding RNA?
- Short non-coding (snc)RNA <200 nucleotides
- Long non-coding (lnc)RNA >200 nucleotides
What are 5 types of short non-coding RNAs?
- snRNA
- snoRNA
- siRNA
- miRNA
- piRNA
What is snRNA?
- small nuclear RNA
- part of small nuclear ribonucleoproteins (snRNPs)
- involved in splicing found in spliceosomes
What is snoRNA?
- small nucleolar RNA
- involved in chemical modifications of RNAs
- found in nucleoli
What is siRNA?
- short interfering RNA
- 20-25 nucleotides in length
- silence target mRNA transcripts
What is miRNA?
- microRNA
- similar to siRNAs, precursor for miRNAs is an RNA molecule with a stem loop
What are piRNAs?
- piwi-interacting RNAs
- 25-30 nucleotides
- repression of gene expression including retrotransposons
- role is unclear
What are long non-coding RNAs?
- in the human genome 50,000 lncRNA transcripts
- found between genes, in introns, may overlap with exons
What is the difference between lincRNA and lncRNA?
- lincRNA is intergenic
- lncRNA is not intergenic
What are some theories about lncRNAs?
- they are pseudogenes that are still transcribed
- they are transcriptional noise
- they regulate other genes
- they are found in cancers
What is an example of lncRNA?
- PANDA lncRNA is an lncRNA that acts as a sponge for transcription factors, removes transcription factors (NF-YA) and reduces transcription of genes downstream
How many genes are expressed in a single tissue?
10,000 -15,000 genes
What are some of the most complex and least complex human tissues?
- cerebellum + testes - most complex (most genes)
- skeletal muscle + liver - least complex (fewest genes)
How many different mRNA sequences are produced and why?
- Differences in alternative synthesis and splicing result in a transcriptome of 100,000 different mRNA sequences
What is the average number of mRNA molecules in a mammalian cell?
200,000 mRNA molecules, which is ~15 mRNA molecules/gene
How many RNA polymerases are in eukaryotes?
- There are 3 Eukaryotic nuclear RNA polymerases
- RNA polymerase II transcribes the protein-coding genes into mRNA
How many RNA polymerases are in prokaryotes?
1
What RNA polymerases do mitochondria and chloroplasts employ?
- Chloroplasts encode for their own RNA polymerase
- Mitochondria use the RNA polymerase that is encoded by the nuclear genome
What is the rate RNA polymerases work at and what is their error rate?
- 2,000 nucleotides/minute
- errors come up 1 per 10^4-10^5 nucleotides
How does RNA polymerase know where to begin transcription?
- Promoter - target sequence located directly upstream of an individual gene
- RNA polymerase recognizes this sequence directly with the help of DNA binding proteins
Describe promoter sequences for transcription in bacteria.
- bipartite promoter consensus sequence
- one located at -10 bp and one at -35 bp upstream of the transcriptional start site
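A hedged sketch of scanning a sequence for a bipartite promoter. It assumes the usual E. coli consensus hexamers (TTGACA at -35, TATAAT at -10) separated by 16-18 bp; the mismatch tolerance, the function names, and the test sequence are choices made for this example.

```python
MINUS_35 = "TTGACA"  # assumed -35 consensus
MINUS_10 = "TATAAT"  # assumed -10 consensus

def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoters(seq, max_mismatch=1, spacers=(16, 17, 18)):
    """Return (index of -35 box, index of -10 box) for each candidate promoter."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits

# Invented sequence: both consensus boxes separated by a 17 bp spacer.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGCC"
print(find_promoters(example))
```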
How is transcription initiated in bacteria?
- two consensus sequences are bound by RNA polymerase
- bacterial RNA polymerase has a strong affinity for this promoter sequence and therefore high rates of basal transcription
Describe the initiation of transcription in Eukaryotes.
- RNA polymerase does not assemble efficiently or bind promoter sequences readily
- Basal rate of transcription is low and you need activating factors to start transcription
How do the promoter sequences for RNA pol II in eukaryotes differ from in bacteria?
- promoter sequences are more variable
- TATA box - 25 nucleotides upstream from the transcription start site
- Initiator sequence - also part of the promoter
What is the first step in initiation of transcription in Eukaryotes?
- General transcription factors bind:
- TATA- binding protein (TBP): makes contact with the minor groove in the region of the TATA box and forms a kind of saddle which other factors bind to
- TBP- associated factors (TAFs): there are about 12 and they help to attach TBP to the TATA box
- Other players include TAF and initiator-dependent cofactors (TICs)
- More factors: TFIIA, TFIIB, TFIIF, TFIIE, TFIIH (which make up the pre-initiation complex)
What is the second step in initiation of transcription in Eukaryotes?
- After the assembly of the pre-initiation complex, phosphate groups are added to the C-terminal domain (CTD) of the largest subunit of RNA polymerase
- This changes the ionic properties of RNA polymerase such that it leaves the preinitiation complex and begins synthesizing RNA
What is the general process of initiation of transcription in Eukaryotes?
Assembly of the pre-initiation complex will recruit RNA polymerase and change its biochemical properties so it can start the process of transcription
What other factors are involved in initiating transcription in Eukaryotes other than the pre-initiation complex?
- other transcription factors
- may bind to proximal binding sites - not too far upstream in the gene
- may bind to enhancers, which can be located far upstream; in this case the DNA forms a loop, which allows TFs to interact with each other
- these transcription factors can be activators or repressors and some may be both depending on the context
What is the purpose of a pulse chase experiment?
to determine the lifetime of an RNA molecule
How do you perform a pulse chase experiment?
- provide the cells with a tagged substrate that will get incorporated into RNA (Ex: Radioactive 4-thiouracil)
- RNA is synthesized with the tag during the “pulse”
- Follow the tagged RNA molecules until they disappear (“chase” is removing the substrate)
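To turn chase measurements into a lifetime, one common simplification is to assume the tagged RNA decays exponentially (first-order decay). Under that assumption, two measurements are enough to estimate a half-life; the signal values here are invented.

```python
import math

def half_life(signal_start, signal_later, elapsed_minutes):
    """Estimate RNA half-life, assuming first-order (exponential) decay.

    signal_start: tagged-RNA signal at the start of the chase
    signal_later: signal after `elapsed_minutes` of chase
    """
    decay_constant = math.log(signal_start / signal_later) / elapsed_minutes
    return math.log(2) / decay_constant

# Hypothetical data: signal falls from 1000 to 250 units in 20 minutes.
print(half_life(1000, 250, 20))
```

A drop to one quarter corresponds to two half-lives, so the estimate here is about 10 minutes.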
How is mRNA degraded in bacteria?
- mRNAs are degraded by the degradosome - a multiprotein structure that removes nucleotides sequentially from the 3’ end
How is mRNA degraded in eukaryotes?
- the exosome degrades mRNA (similar to the degradosome in bacteria)
- si/miRNAs also degrade mRNA - they are incorporated into RNA-induced silencing complexes (RISCs)
- Argonaute, an endonuclease that cleaves RNA, is within the RISC
What are 6 strategies to identify what RNA transcripts are found in a cell at a particular time?
Northern blot, quantitative PCR/Regular PCR, FISH, Microarray analysis, RNA sequencing, and Single-cell RNA seq
How do you perform a northern blot?
- extract RNA from cells or tissue, run it on a gel, transfer to a membrane, probe w/ a DNA sequence
How do you perform quantitative PCR?
- extract RNA, convert it to cDNA, carry out a PCR reaction,
- can make it be quantitative by looking for how much product is produced or you can just look for the presence of particular RNAs
What is a FISH experiment?
- Use a DNA probe that is fluorescently labelled
- Hybridize it to RNA at the level of cells or tissue or an entire embryo
What is a benefit of microarray analysis to analyze the transcriptome?
- You are able to detect thousands of transcripts in one experiment, while in Northern blot, PCR, and FISH experiments you must go one at a time
How is microarray analysis performed?
- RNA is isolated from cultured cells
- It is converted into cDNA
- This is hybridized to a microchip plate
- Camera takes picture after hybridization
- An array image is produced
- Heat map gene analysis is performed
- (cDNA libraries may be labelled with different colors and expressed on the chip)
What is hierarchical clustering?
- performed after a microarray experiment
- compares expression levels of every gene in the transcriptome and groups them based on similarity of expression patterns
- produces a dendrogram - genes with related expression profiles are clustered together
- this may also be expressed in a heat map (green = low expression and red = high expression)
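A small pure-Python sketch of the clustering step. It uses average-linkage agglomerative clustering with Euclidean distance, which is one common choice and not necessarily the one used in the lecture; the gene names and expression values are invented. The nested tuples returned play the role of the dendrogram.

```python
import math
from itertools import combinations

def hierarchical_cluster(profiles):
    """Average-linkage clustering; returns the dendrogram as nested tuples."""
    # Each cluster is (tree, list of member gene names).
    clusters = [(name, [name]) for name in profiles]

    def avg_dist(pair):
        (_, m1), (_, m2) = pair
        total = sum(math.dist(profiles[a], profiles[b]) for a in m1 for b in m2)
        return total / (len(m1) * len(m2))

    while len(clusters) > 1:
        # Merge the two clusters whose members are closest on average.
        c1, c2 = min(combinations(clusters, 2), key=avg_dist)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]

# Hypothetical expression levels across three conditions.
profiles = {
    "geneA": [1.0, 5.0, 9.0],
    "geneB": [1.2, 5.1, 8.8],
    "geneC": [9.0, 4.0, 1.0],
    "geneD": [8.7, 4.2, 1.3],
}
print(hierarchical_cluster(profiles))
```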
How is RNA sequencing performed?
- NGS to determine the transcriptome of a sample
- Make cDNA
- Shatter into fragments
- Sequence fragment ends, map reads onto sequences and determine which genes are expressed
Describe microarray strategies.
- economical
- data are easier to analyze
- good for detecting relative differences in gene expression
- can’t identify new genes
- no detection of isoforms
Describe RNA seq strategies.
- expensive
- the data are harder to analyze - sequencing reads have to be mapped onto the genome
- more sensitive - get more quantitative information
- identify new genes
- detect different isoforms/alternative splicing events
What must be performed to confirm the accuracy of both microarray and RNA-seq experiments?
Need replicate data, repeat the process 3x using separate biological samples
What is single cell RNA-seq and what are two strategies for it?
- scRNA-seq allows for different types of cells in a tissue to be visualized rather than an average
- Two strategies: Drop-seq or 10x Genomics
- scRNA-seq is “highly dimensional” - thousands of cells expressing thousands of genes
- allows you to cluster cells with similar transcriptomes
- allows for identification of how many unique cell types are found in a particular sample
How does the process of scRNA-seq work?
- Tissue
- Isolate and sequence individual cells (oil droplet and microbeads)
- Map reads onto genes
- Read counts in a table
- Compare gene expression profiles of single cells
- cluster cells with similar transcriptomes using a t-SNE or UMAP plot
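The middle steps (read counts in a table, then comparing expression profiles) can be sketched as below. Cosine similarity is used here as a simple stand-in for comparing cells; it is not t-SNE or UMAP, and the cell barcodes and reads are invented.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mapped reads: (cell barcode, gene the read maps to).
reads = [
    ("cell1", "Gfap"), ("cell1", "Gfap"), ("cell1", "Actb"),
    ("cell2", "Gfap"), ("cell2", "Actb"), ("cell2", "Gfap"),
    ("cell3", "Snap25"), ("cell3", "Snap25"), ("cell3", "Actb"),
]

def count_table(reads):
    """Build a cell -> gene -> read count table."""
    table = defaultdict(Counter)
    for cell, gene in reads:
        table[cell][gene] += 1
    return table

def cosine_similarity(c1, c2):
    """Similarity of two expression profiles (1.0 = identical proportions)."""
    genes = set(c1) | set(c2)
    dot = sum(c1[g] * c2[g] for g in genes)
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2)

table = count_table(reads)
print(cosine_similarity(table["cell1"], table["cell2"]))
print(cosine_similarity(table["cell1"], table["cell3"]))
```

Cells with high pairwise similarity would end up in the same cluster.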
What is epigenetics/the epigenome?
- “on top of” or “in addition to” the study of genetics
- stably heritable phenotype resulting from changes in a chromosome without alterations in the DNA sequence
What is the major functional component of the nucleus?
Nucleolus - center for synthesis and processing of rRNA molecules
What is a minor functional component of the nucleus?
Cajal bodies in the nucleus are the site where small nuclear and small nucleolar RNAs are made
When are chromosomes most compact? What happens to chromosome structure after cell division?
Chromosomes are most compact in metaphase - after cell division chromosomes become less compact and cannot be distinguished as individual structures
What are two major components of chromatin?
- Euchromatin and Heterochromatin
Describe euchromatin.
- DNA is relatively open; the conformation is either the 30 nm fiber or beads-on-a-string, and it contains actively transcribed genes
What is constitutive heterochromatin?
- highly compact, transcriptionally silent
- permanent feature of the genome
- contains no genes
- retained in compact organization
- EX: centromeres, telomeres, most of chromosome Y
What is facultative heterochromatin?
- not permanent, seen in areas where genes are inactive
What are the two types of heterochromatin?
constitutive and facultative
What is chromosome painting and what did it allow us to learn?
- Chromosome painting is a variation on FISH that allows us to understand where chromosomes have set territories
- Learned that each chromosome has its own territory in the nucleus and these territories are conserved over time
Describe general chromosome structure. What is a feature of its parts/spacing?
- Different parts of the chromosome are looping and closer to each other than others
- These are called topologically associated domains
- Each domain is a contiguous segment of chromatin folded into loops and coils 10 to 1,000 kb
- In humans or mice the average is 1 Mb
What is a nucleosome?
a structural unit of a eukaryotic chromosome, consisting of a length of DNA coiled around a core of 8 histones
What are nucleosome modifications?
- changes in chromatin packing that affect gene expression
- occurs at the level of the nucleosome
- chemical modifications added to histone tails allow the nucleosome to become more/less compact, which affects how easily certain genes are transcribed
What is histone acetylation? What is the impact of acetylation? Where does it occur? What performs this process?
- attaches an acetyl group to lysine amino acids in the histone tail
- acetylation reduces the affinity of the histones for DNA and possibly reduces the interaction between individual nucleosomes
- Histones in heterochromatin are generally unacetylated while histones in active areas are acetylated
- Acetylation is performed by Histone Acetyl Transferases (HATs)
- Acetylation is reversible
What are HATs?
- Enzymes that add acetyl groups (Histone Acetyl Transferases)
- EX: p300/CBP
- HATs work in a complex, there are 5 different families of HAT proteins and different HATs acetylate different histones
What undoes histone acetylation?
- Histone Deacetylases (HDACs)
- This represses gene expression
- Sin3 is an HDAC complex in mammals
- Mutations in the proteins of this complex can result in cancer because too much gene expression of certain genes can drive uncontrolled cell division
Describe methylation of histones. What amino acids? What is the effect?
- Methylation of lysines/arginines in the histone tail
- More long-term activating or repressing effect
- Performed by histone methyl transferases and histone demethylases
Name the 5 major changes to histone tails.
Acetylation, Phosphorylation, methylation, ubiquitination, citrullination
Where does phosphorylation occur in histone tails?
Serine, threonine, and tyrosine amino acids
Describe ubiquitination of histones.
Ubiquitination of lysine amino acids
- add a small protein called ubiquitin
- or add small ubiquitin-related modifier (SUMO)
Describe citrullination of histones.
Citrullination - conversion of arginine to citrulline (NH -> O) in the N-terminal regions of H3 and H4
How many histone modification possibilities are there?
- over 80 sites can be modified and modifications can interact w/ each other
What is the histone code?
pattern of chemical modifications that specifies which regions of the genome are activated or inactivated
What is nucleosome remodelling and what does it include?
- Nucleosome remodelling is a change in the positioning of nucleosomes so that DNA binding proteins can access their binding sites
- includes remodelling, sliding, and transferring of nucleosomes
What is a DNase I sensitivity assay?
- open areas in chromatin are sensitive to DNase I activity
- when the gene is wound tightly around the nucleosome it is protected from being cut by this enzyme
- DNase I hypersensitive sites found upstream of genes are the areas that must be opened up and are important for activating transcription
Describe DNA methylation. What enzyme? What are the 2 types? What does it lead to?
- DNA methyltransferase
- Maintenance methylation - maintaining methylation from a parent strand on a daughter strand
- De-Novo methylation - new patterns of methylation in daughter strands
- DNA methylation always leads to gene repression
- methylation status of CpG islands reflects the expression patterns of adjacent genes, if an island is methylated the gene isn’t expressed
How do CpG islands cause gene repression?
- CpG island becomes methylated
- HDAC (histone deacetylase) is recruited to the methylated site (via methyl-CpG-binding proteins)
- Chromatin is more packaged -> gene not expressed
What is genomic imprinting?
- presence of methylation that silences a gene depends on which parent it was inherited from
- 200 genes where only the maternal or the paternal copy is active
What is an ATAC-seq assay?
- Assay for Transposase-Accessible Chromatin using sequencing
- used to understand which regions of the genome are open and therefore likely actively transcribed or poised
- this assay takes advantage of a hyperactive Tn5 transposase that inserts sequencing adaptors into open regions of the genome
How does ATAC-seq occur?
- Transposase cleaves and adds adaptor sequences
- Isolate short DNA sequences and perform next generation sequencing
- Map reads onto reference genome (to find open chromatin sites)
- Used to compare changes in chromatin accessibility as detected by ATAC-seq (when certain cells are treated with drugs vs not)
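The mapping step above can be pictured with a toy example. This is only a minimal sketch: the function name call_open_bins, the bin size, and the read threshold are all made up for illustration, and real analyses use dedicated peak callers with statistical models rather than a fixed cutoff.

```python
from collections import Counter

def call_open_bins(read_starts, bin_size=100, min_reads=3):
    """Flag genomic bins with many mapped reads as candidate open chromatin.

    read_starts: mapped start positions of reads on one chromosome.
    bin_size and min_reads are arbitrary illustrative values.
    """
    # Count how many reads fall into each fixed-width bin
    counts = Counter(pos // bin_size for pos in read_starts)
    # Keep bins that reach the read threshold, reported as (start, end)
    return sorted(
        (b * bin_size, (b + 1) * bin_size)
        for b, n in counts.items()
        if n >= min_reads
    )

# Hypothetical read positions: two dense clusters and one stray read
reads = [105, 120, 150, 199, 430, 905, 910, 915]
open_bins = call_open_bins(reads)
print(open_bins)
```

Running the same function on reads from treated and untreated cells and comparing the flagged bins mirrors the drug-treatment comparison in the last bullet.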
What is the proteome?
all of the proteins in a cell
What are two different variations of proteomics? And which is more useful?
- Top down and bottom up proteomics
- Bottom up proteomics is more useful
What is top down proteomics?
- Take protein mixture (with mass less than 50 kDa), separate the proteins, MS analysis of intact protein (<50 kDa)
- Get intact protein mass and protein sequence data
What is bottom up proteomics?
- Take protein mixture (no mass limit), digest into peptides, MS analysis of peptides (~500-3000 Da)
- Get intact peptide masses and peptide sequences
What is trypsin?
- a protease that only cleaves after an arginine or a lysine (R or K)
- every protein is cut in a specific pattern
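The cleavage rule above can be sketched as an in silico digest. This is a simplified illustration that cuts after every K or R; it ignores exceptions such as the commonly cited rule that trypsin cuts poorly before proline, and the example sequence is made up.

```python
def trypsin_digest(protein):
    """Split a protein sequence after every lysine (K) or arginine (R)."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in "KR":
            # Cleavage site: close the current peptide and start a new one
            peptides.append(current)
            current = ""
    if current:
        # C-terminal peptide that does not end in K or R
        peptides.append(current)
    return peptides

# Hypothetical sequence used only for illustration
print(trypsin_digest("MKTAYIAKQRQISFVK"))
# ['MK', 'TAYIAK', 'QR', 'QISFVK']
```

Because the rule is deterministic, the same protein always yields the same set of peptides, which is what makes the pattern specific to each protein.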
What are the benefits of mass spec?
- unbiased - no prior knowledge of targets is necessary
- high-throughput
- sensitive
- versatile
How does mass spec work?
- peptides are sprayed into a tiny orifice in the machine (low pressure) - electrospray
- they are ionized and enter the gas phase
- they enter the C-trap where they can be held
- read in the Orbitrap - extremely accurate mass deduced
- a few milliseconds/peptide
How does the mass spec sequence peptides?
- sequences peptides by fragmentation
- peptides fragment only once per molecule when exposed to N2 gas
- each new fragment forms a different peak on the spectra
- the “b-ion” series sequences left to right
- the mass differences between each peak each represent a mass change for the addition of one amino acid
- the known amino acid mass differences can be used to deduce the amino acid sequence
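The idea of reading a sequence from peak-to-peak mass differences can be sketched as below. This is an idealized illustration, assuming a complete, noise-free, singly charged b-ion ladder; the function names and the tolerance are made up, and isoleucine is left out of the table because it has the same mass as leucine.

```python
# Monoisotopic residue masses in Da (subset; I omitted since it equals L)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M": 131.04049, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
}
PROTON = 1.00728  # mass added by the single positive charge

def b_ion_ladder(peptide):
    """Return idealized singly charged b-ion m/z values b1..bn."""
    ladder, total = [], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(round(total, 4))
    return ladder

def read_sequence(peaks, tol=0.01):
    """Deduce residues from mass differences between consecutive peaks."""
    sequence = []
    previous = PROTON  # treat "b0" as a bare proton so b1 gives residue 1
    for peak in sorted(peaks):
        delta = peak - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        sequence.append(matches[0] if matches else "?")
        previous = peak
    return "".join(sequence)

peaks = b_ion_ladder("PEPTK")   # simulate a spectrum for a made-up peptide
print(read_sequence(peaks))     # reads the sequence left to right
```

Each step between neighbouring peaks matches exactly one residue mass, so the ladder spells out the peptide from the N-terminus, as the b-ion bullet describes.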
What is proteomics good for?
allows you to identify if a protein is phosphorylated, degraded or located in a membrane in a way that transcriptomics cannot
Why should the interactome be mapped?
While it is now easy to map genomes, gaining an understanding of protein function gives a richer more comprehensive understanding of the entire cell
What is IP-MS? What is it used to study? What is the process?
- Immunoprecipitation Mass Spec
- Allows identification of protein interactions
- Start with cells engineered to express a specific protein of interest (“bait protein”) with a molecular “FLAG tag”
- Lyse cells
- Incubate lysate with beads that carry anti-FLAG antibodies
- Perform a trypsin digest and mass spec to identify interacting proteins
What can knowing many protein interactors allow for?
- allows you to define new complexes and pathways
- reveal basis for proteome organization and regulation
What is BioPlex and what is it useful for?
- the human protein-protein interactome
- association of new proteins with known complexes