Proteomics & Transcriptomics Flashcards
Expression profiling
Comparing gene expression levels in 2 or more samples using thousands of genes
Microarrays
Assay expression of thousands of genes
Gene chips
slides or chips with oligonucleotides
either ~25 bp or ~70 bp, corresponding to each gene;
multiple per gene;
attached to chip
How do microarrays show expression levels
cDNAs from 2 samples are fluorescently labelled with different dyes & competitively hybridized to gene chips
cDNA hybridized to chips, chips are scanned & fluorescence detected; data analyzed to determine expression level
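A minimal sketch of how two-dye intensities are commonly turned into an expression measure, assuming each spot yields one background-corrected intensity per dye. The gene names, the intensity values, and the pseudocount are hypothetical and not from the cards.

```python
import math

def log2_ratio(sample_intensity, reference_intensity, pseudocount=1.0):
    """Return log2(sample / reference) for one spot.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for spots with no signal in one channel.
    """
    return math.log2((sample_intensity + pseudocount) /
                     (reference_intensity + pseudocount))

# Hypothetical intensities: gene -> (sample dye, reference dye)
spots = {
    "geneA": (1599.0, 399.0),  # brighter in the sample
    "geneB": (499.0, 499.0),   # equal in both
    "geneC": (99.0, 799.0),    # brighter in the reference
}

for gene, (sample, reference) in spots.items():
    print(gene, round(log2_ratio(sample, reference), 2))
```

A positive value means the gene is more highly expressed in the sample, a negative value means it is more highly expressed in the reference, and 0 means no difference.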
Microarray – RNA-seq comparisons for expression profiling:
Microarray advantages
Cheaper and less computationally intensive than RNA-seq
Microarray – RNA-seq comparisons for expression profiling:
Microarray Disadvantages
- Need a reference genome or transcriptome to construct arrays
- Need to verify results with qRT-PCR
- Cross-hybridization to genes with highly similar sequences
Microarray – RNA-seq comparisons for expression profiling:
RNA-seq, advantages
Can detect:
- rare transcripts
- allele-specific expression
- exon-intron boundaries
- alternative splicing
Don’t need a reference genome or transcriptome
Don’t need to construct an array
- Works with non-model organisms
- No cross-hybridization, but reads can multi-map
Microarray – RNA-seq comparisons for expression profiling:
RNA-seq, disadvantages
- Very expensive
- Very large sequence files
- Need scripting knowledge (Python, R)
transcriptome profiling
Profiling: comparing expression levels in different samples
Transcriptome profiling platforms
Illumina is currently best
Transcriptome sequencing to get a reference transcriptome
Reference transcriptome sequencing:
Sequence all the RNAs present in the sample; multiple tissues/organs are used, and their RNA can be pooled or kept separate
RNA-seq data analysis
1) Align reads to the genome or reference transcriptome
2) count reads
3) quantify expression for profiling using analysis programs
For read alignment: Bowtie (older), GSNAP, STAR
Cufflinks (older) — used to estimate transcript abundance & test for differential expression
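A toy sketch of step 2 (counting reads), assuming alignment has already produced a chromosome and start position for each read. The gene coordinates and read positions are hypothetical; real pipelines use dedicated counting tools rather than a loop like this.

```python
# Hypothetical annotation: gene -> (chromosome, start, end), 1-based inclusive
GENES = {
    "geneA": ("chr1", 100, 499),
    "geneB": ("chr1", 800, 1299),
}

def count_reads(alignments, genes=GENES):
    """Count reads whose aligned start position falls inside each gene."""
    counts = {gene: 0 for gene in genes}
    for chrom, pos in alignments:
        for gene, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[gene] += 1
    return counts

# Hypothetical aligned reads: (chromosome, start position)
reads = [("chr1", 150), ("chr1", 450), ("chr1", 900),
         ("chr1", 600), ("chr2", 150)]
print(count_reads(reads))
```

Reads that land between genes or on another chromosome are simply not counted; the resulting per-gene counts are the input to step 3.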
Measuring gene expression levels with Illumina RNA-seq
RNA-seq reads aligned to a genome: read coverage in exons is often not uniform
Normalize the number of reads to the length of the gene
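Two widely used length normalizations are RPKM (reads per kilobase per million mapped reads) and TPM (transcripts per million). The cards do not say which one is meant, so the sketch below shows both; the gene names, counts, and lengths are hypothetical.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] / (lengths_bp[gene] / 1e3) / (total_reads / 1e6)
        for gene in counts
    }

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    scale = sum(rate.values())
    return {gene: rate[gene] / scale * 1e6 for gene in counts}

# Hypothetical example: the long gene has 4x the reads only because it is 4x longer
counts = {"short_gene": 100, "long_gene": 400}
lengths_bp = {"short_gene": 1000, "long_gene": 4000}

print(rpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

After normalization the two genes get the same value, which is the point: a longer gene collects more reads at the same expression level.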
Ribosome profiling: Ribo-Seq
- Capture ribosome-mRNA complexes in the act of translating mRNAs by cycloheximide treatment; the 28-30 nt of each transcript protected by the ribosome is captured
- Allows for high-throughput analysis of translational activity
- Measures the rate of protein production but not protein abundance, in contrast to MS proteomics
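A small sketch of one consequence of the 28-30 nt footprint size: sequenced fragments outside that range are unlikely to be ribosome-protected and can be filtered out. The function name and the example reads are hypothetical.

```python
def select_footprints(reads, min_len=28, max_len=30):
    """Keep only reads whose length matches a ribosome-protected fragment."""
    return [read for read in reads if min_len <= len(read) <= max_len]

# Hypothetical reads of length 27, 28, 30 and 31 nt
reads = ["A" * 27, "C" * 28, "G" * 30, "T" * 31]
kept = select_footprints(reads)
print([len(read) for read in kept])
```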
Hi-C
Method to study the three-dimensional architecture of genomes by sequencing long range interactions of DNA regions
How to study the three-dimensional architecture of genomes?
Steps of Hi-C?
- Cells are fixed with formaldehyde, which causes interacting loci to be bound to one another by covalent DNA-protein cross-links
- DNA is digested with a restriction enzyme; the cross-linked loci remain joined
- The cut ends are filled in with biotin-labelled nucleotides, which gives blunt ends
- Blunt-end ligation is performed to ligate cross linked DNA fragments
- Results in genome-wide library of ligation products, corresponding to pairs of fragments that were originally in close proximity to each other in the nucleus
- Library is sheared, junctions are pulled down with streptavidin beads (bind to biotin)
- Purified junctions are sequenced
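Once the junctions are sequenced, each read pair gives two genomic positions that were close together in the nucleus. A common way to summarize them is a binned contact matrix; the sketch below assumes a single chromosome and 0-based positions, and the pair coordinates are hypothetical.

```python
def contact_matrix(pairs, chrom_length, bin_size):
    """Count ligation junctions between fixed-size bins of one chromosome."""
    n_bins = (chrom_length + bin_size - 1) // bin_size  # round up
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix

# Hypothetical read pairs: (position of end 1, position of end 2)
pairs = [(100, 2500), (150, 2600), (1200, 1300), (900, 100)]
for row in contact_matrix(pairs, chrom_length=3000, bin_size=1000):
    print(row)
```

High counts far from the diagonal indicate long-range interactions between regions that are distant in sequence but close in three-dimensional space.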
Transcriptome sequencing platforms
454
Pacific Biosciences (PacBio)
Ion Torrent
Oxford Nanopore’s MinION – longer reads
Short reads: Illumina
Genomic studies of cis-regulatory elements
DNase1-seq: identifies protein binding sites that are potential regulatory elements
Chromatin immunoprecipitation followed by tiling array (ChIP-chip) or Illumina sequencing (ChIP-seq): analyzes binding sites of an individual TF
DAP-seq: DNA affinity purification sequencing
DNase1-seq
Used to identify protein binding sites that are potential regulatory elements
Chromatin immunoprecipitation followed by tiling array or Illumina sequencing
ChIP-chip or ChIP-seq
To analyze binding sites of an individual TF
To characterize histone modifications
- Can reveal histone marks in gene coding vs. regulatory sequences
DAP-seq
DNA affinity purification sequencing
- TF binding site discovery assay
- Couples affinity-purified TFs with Illumina sequencing of a gDNA library
- Identifies genome-wide binding locations for each TF assayed
How DNase1-seq works
- DNase 1 cuts at open chromatin sites where no protein is bound; locations of bound proteins are protected
- Sequencing the DNA reveals the sites of bound proteins
- Assumes bound proteins are regulatory factors
- Does not reveal identity of the proteins
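The protection idea can be sketched as a search for "footprints": short runs with almost no DNase 1 cuts, flanked on both sides by cut sites. This is a simplified illustration, not a published footprinting algorithm; the thresholds and the per-base cut counts are hypothetical.

```python
def find_footprints(cuts, max_cuts=1, min_len=6):
    """Return (start, end) half-open intervals of protected runs.

    A run is reported only if it is flanked by cut sites on both sides,
    so runs touching either end of the profile are ignored.
    """
    footprints = []
    start = None
    for i, count in enumerate(cuts):
        if count <= max_cuts:
            if start is None:
                start = i  # a protected run begins
        else:
            if start is not None and start > 0 and i - start >= min_len:
                footprints.append((start, i))
            start = None
    return footprints

# Hypothetical per-base cut counts across an open chromatin region
cuts = [9, 8, 7, 0, 0, 1, 0, 0, 0, 8, 9, 0, 0, 7]
print(find_footprints(cuts))
```

The short two-base gap near the end is too small to count as a footprint, and the result says only where something is bound, not which protein it is.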
What DNase1-seq reveals
- Reveals protein binding sites on a genome-wide scale at single base resolution
- Can reveal new cis regulatory elements in genes
How to ChIP-chip or ChIP-seq
- Cross-link complex with formaldehyde
- Purify nuclei
- Fragment chromatin (proteins remain bound to DNA)
- Immuno-enrich protein-DNA complexes with an antibody to a specific TF
- Reverse cross-links (65 degrees C)
- Hybridize to an array (ChIP-chip) or sequence with Illumina (ChIP-seq)
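After sequencing, binding sites show up as regions where reads pile up. A rough sketch, assuming read counts per genomic bin for the immuno-enriched sample and for a non-enriched control (the control is an assumption of this sketch, not something the cards mention); all counts are hypothetical.

```python
def fold_enrichment(ip_counts, control_counts, pseudocount=1.0):
    """Per-bin ratio of library-size-normalized ChIP reads to control reads."""
    ip_total = sum(ip_counts)
    control_total = sum(control_counts)
    ratios = []
    for ip, control in zip(ip_counts, control_counts):
        ip_norm = (ip + pseudocount) / ip_total
        control_norm = (control + pseudocount) / control_total
        ratios.append(ip_norm / control_norm)
    return ratios

# Hypothetical read counts in five consecutive bins
ip_counts = [5, 5, 80, 5, 5]
control_counts = [20, 20, 20, 20, 20]

ratios = fold_enrichment(ip_counts, control_counts)
best_bin = max(range(len(ratios)), key=lambda i: ratios[i])
print(best_bin, round(ratios[best_bin], 2))
```

The bin with the highest ratio is the candidate binding site for the TF that the antibody recognizes.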
DAP-seq vs ChIP-seq
DAP-Seq:
- Faster, less expensive, more easily scaled up than ChIP-seq
- DNA library is constructed from native gDNA from any source, which preserves cell- and tissue-specific chemical modifications that can affect TF binding, such as DNA methylation
How to DAP-Seq
- DNA library is incubated with affinity-tagged in vitro expressed TF
- TF-DNA complexes are purified using magnetic separation of the affinity tag
- Bound genomic DNA is eluted from TF and sequenced
- Identifies genome-wide binding locations for each TF assayed
__________ and ________ ID TFs that bind regulatory elements, but _________ does not
ChIP-seq and DAP-seq ID TFs that bind regulatory elements, but DNase1-seq does not
_______ identifies only ___ TF at a time
ChIP-seq identifies only 1 TF at a time
DNase1-seq identifies all __________________ in a genome but not _________________.
DNase1-seq identifies all TF binding sites in a genome but not TFs that bind them
Which of the following is not true about microarrays?
A) Transcripts hybridize with the probes on the array
B) Multiple oligonucleotide probes are designed for each gene
C) Transcripts are labeled with fluorescent dyes
D) All are true
D) All are true
Which of the following is not true about RNA deep sequencing (RNA-seq)
A) It can detect currently unrecognized expressed genes
B) It depends on genome annotation for mapping reads
C) It can detect allele-specific expression
D) All are true
B) It depends on genome annotation for mapping reads
Which of the following are drawbacks of RNA-seq?
A) Large sequence file sizes
B) High cost
C) Uneven sequencing depth along the length of a transcript
D) All of these
D) All of these
Which of the following are disadvantages of expressed sequence tag (EST) sequencing?
A) It generally cannot identify mRNAs that are expressed at low levels
B) It is relatively slow
C) It is expensive for the amount of data acquired
D) All of these
D) All of these
Which of the following is not a drawback of tiling microarrays (which are different from regular microarrays)?
A) They may not distinguish between highly similar duplicated genes
B) They may not identify lowly expressed genes
C) The organism must have a sequenced genome
D) They can only identify previously known expressed genes
D) They can only identify previously known expressed genes
Which of the following cannot be done with RNA-seq?
A) Alternatively spliced introns can be identified
B) Exon-intron boundaries can be identified
C) Histone modifications can be identified
D) All of these can be done
C) Histone modifications can be identified
What can be done with RNA-seq?
A) Alternatively spliced introns can be identified
B) Exon-intron boundaries can be identified
What are disadvantages of expressed sequence tag (EST) sequencing?
A) EST sequencing generally cannot identify mRNAs that are expressed at low levels
B) It is relatively slow
C) It is expensive for the amount of data acquired
What are drawbacks of RNA-seq?
A) Large sequence file sizes
B) High cost
C) Uneven sequencing depth along the length of a transcript
liquid chromatography
Separate proteins by 2D gels or liquid chromatography, cut out spots, elute proteins, digest into peptides with trypsin protease
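The trypsin step can be sketched in silico. Trypsin cleaves on the C-terminal side of lysine (K) and arginine (R); the sketch below also applies the commonly used rule that it does not cut when the next residue is proline (P). The protein sequence is hypothetical and missed cleavages are not modeled.

```python
def trypsin_digest(protein):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides

# Hypothetical sequence: the R at position 7 is followed by P, so it is not cut
print(trypsin_digest("MKWVTFRPLLKAG"))
```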