L32 Functional Human Genetics 2 Flashcards
GWAS hits
see onenote
- Most GWAS hits aren’t causal themselves but are in linkage disequilibrium (LD) with the causal variant
- majority of GWAS hits are located outside protein coding regions
- How do we prioritise each variant?
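A minimal sketch of how LD between a GWAS hit and a candidate causal variant could be quantified, using the standard r² statistic on phased haplotypes. The function name `ld_r2` and the toy haplotypes are made up for illustration and are not from the lecture.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, where a and b are 0/1 alleles
    carried at SNP A and SNP B on the same chromosome copy.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Toy data (hypothetical): the tag SNP always travels with the causal allele
perfect_ld = [(1, 1)] * 4 + [(0, 0)] * 6
# Toy data (hypothetical): alleles at the two SNPs are independent
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r2(perfect_ld), ld_r2(no_ld))
```

An r² near 1 means the GWAS hit is a good proxy for the other variant, which is why association alone cannot tell us which of the two is causal.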
What does rs8050136 do?
see onenote
- strong association and large effect on BMI, weight, T2D
- located in intron of FTO gene
How do we know the genome works?
see onenote
- Protein code - triplets
- Triplets are translated into a string of amino acids (a polypeptide)
- But most of the genome is not protein-coding: only about 1.5% codes for proteins, yet the rest isn’t all junk DNA, much of it has a function
From genotype to function
see onenote slides
- functional genomics, deciphering how the genome actually works
transcriptomics
- Transcriptomics, also referred to as expression profiling, examines the expression levels of RNAs in a given cell population
epigenomics
- Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome
Why do we measure RNA?
- Easy to measure at high throughput
- Close to DNA, which we understand fairly well
- But RNA may be far removed from the ultimate phenotype - us
Two main technologies today:
- Microarray
- RNA seq
Expression microarrays
see onenote
- Start with sample of RNA
- if an RNA molecule matches a known DNA probe, it will bind (hybridise) to it
- 25bp DNA probes, we know the sequence of these probes
- The more RNA that binds, the brighter the signal
- Microarray is ultimately a fluorescent based technology
- But the signal saturates: a spot can only get so bright, so differences between very highly expressed genes lose definition
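A toy illustration of the brightness limit described above. The linear signal model, the function name `measured_intensity` and the ceiling of 65535 (a 16-bit detector) are assumptions for illustration only; real arrays behave less simply.

```python
def measured_intensity(true_abundance, gain=1.0, ceiling=65535):
    """Fluorescence read-out that grows with bound RNA but cannot exceed the ceiling.

    The linear gain and the 16-bit ceiling are illustrative assumptions.
    """
    return min(true_abundance * gain, ceiling)


low = measured_intensity(1_000)
high = measured_intensity(80_000)       # already above the ceiling
very_high = measured_intensity(160_000) # twice as abundant as `high`

# `high` and `very_high` give the same read-out, so the two-fold
# difference in abundance can no longer be seen
print(low, high, very_high)
```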
RNA-seq
see onenote
- Sequence every single RNA molecule in that sample
- Map reads back to the genome
- How many reads map to the 1st gene, 2nd gene etc.
- Doesn’t depend on having the right probe, depends on the gene being in your sample
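A rough sketch of the "map reads, then count per gene" idea, assuming the reads have already been aligned and each read is reduced to a single start coordinate on one chromosome. The gene names, coordinates and `count_reads` helper are hypothetical; real pipelines use dedicated aligners and counting tools.

```python
def count_reads(read_positions, gene_intervals):
    """Count how many aligned reads start inside each gene.

    read_positions: list of read start coordinates (already mapped).
    gene_intervals: dict of gene name -> (start, end), end exclusive.
    Reads that fall outside every gene are ignored.
    """
    counts = {name: 0 for name in gene_intervals}
    for pos in read_positions:
        for name, (start, end) in gene_intervals.items():
            if start <= pos < end:
                counts[name] += 1
                break  # assume genes do not overlap in this toy example
    return counts


# Hypothetical annotation and read positions
genes = {"geneA": (100, 200), "geneB": (300, 450)}
read_starts = [105, 150, 199, 310, 320, 500]

counts = count_reads(read_starts, genes)
print(counts)
```

More reads mapping to a gene suggests higher expression, but raw counts also depend on gene length and sequencing depth, so they are normally normalised before comparison.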
Microarrays vs RNA-seq
- Microarray
○ Shallow, fast, cheap overviews of expression across many individuals
○ Straightforward
○ Relies on known DNA probes, can miss genes that don’t have probes present
- RNA-seq
○ Unbiased, can be used to uncover new biology
○ Quantitative, more precision
What can we learn from gene expression levels?
see onenote
- Cell-type specific, foundation of tissue specificity
- Look at same tissue between healthy people vs patients
- Tissues within people
- Tissues within people across time
The human transcriptome
- Fewer protein coding genes than we expected
- but transcriptionally complex: this complexity must somehow be encoded in our DNA, but how?
Large-scale quantification of human transcription: GTEx
see onenote
Genotype-tissue expression project
- RNA-seq on 40+ human tissues across 400+ (deceased, healthy) humans
- capture both intra- and inter-individual variation
- Main lesson: human tissues are transcriptionally complex and fairly distinct
- Differences between tissues don’t depend on a lot of genes
- Specificity is encoded by a small number of genes, in comparison to housekeeping genes, which are required in most cells and tissues
- transcription can vary in association with age, sex, ethnicity etc.
- Good reference source for QTL identification and GWAS annotation
Quantitative trait loci can impact expression levels
QTL - genetic region associated with a particular quantitative phenotype
- by this definition, all GWAS hits are QTLs
- using RNA-seq we can measure the impact of individual SNPs on nearby gene expression levels; such loci are known as expression QTLs (eQTLs)
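One common way to measure a SNP's effect on a nearby gene is a simple linear regression of expression on genotype dosage (0, 1 or 2 copies of the alternate allele). The sketch below assumes that approach; the helper `eqtl_slope` and all values are invented for illustration, and a real analysis would also include covariates and a significance test.

```python
def eqtl_slope(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    dosages: alternate-allele counts (0/1/2), one per individual.
    expression: expression level of one gene in the same individuals,
    in the same order.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    var = sum((g - mean_g) ** 2 for g in dosages)
    if var == 0:
        raise ValueError("SNP must vary between individuals")
    slope = cov / var
    intercept = mean_e - slope * mean_g
    return slope, intercept


# Hypothetical data: six individuals, genotype and expression measured in each
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

slope, intercept = eqtl_slope(dosages, expression)
print(slope, intercept)
```

The slope is the estimated change in expression per extra copy of the allele; a slope that is reliably different from zero is what marks the SNP as an eQTL for that gene.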
How does eQTL impact expression levels?
see onenote
biological process of going from DNA to RNA to protein can be regulated at many different levels
- eQTL testing requires both genotype and expression measurements in the same individual
- Gene specific
- When, where, how long, how many transcripts are made, how quickly are they made
Post-transcriptional regulation
- affects stability and localisation of transcripts
Identifying eQTLs brings us one step closer to understanding the cellular mechanisms that underlie differences between individuals
Gene regulation simultaneously occurs at multiple levels
see onenote
- DNA methylation
- chromatin modifications
- DNase I hypersensitive sites
- TF binding sites
- long range regulatory elements
- promoter architecture
- protein-coding and non-coding transcripts
Chromatin accessibility
see onenote
- DNA coiling around histones => nucleosomes
- euchromatin, open and accessible
- heterochromatin, inaccessible
- accessibility impacts whether a gene can be transcribed