Lecture 9 - Coding Vs Non Coding Variation Flashcards
What is the most common human genetic variation?
Human polymorphism
SNP = single nucleotide polymorphism; single nucleotide substitutions are the most common variation
- Human genome has millions of SNPs
What type are most SNPs? What is their effect?
Challenge?
Most SNPs are NEUTRAL = no phenotype
Most SNPs are in NON-CODING REGIONS
—> Non-coding SNPs are likely more IMPORTANT IN COMPLEX DISEASES AND PHENOTYPES
The CHALLENGE: Which of the variants alter phenotype?
Types of Functional Variants: 2
- CODING variation
- Non-coding/regulatory variation
Types of Functional Variation: CODING - 2
• Amino acid variation
• Splice/reading frame variation
Types of Functional Variation: Non-coding - 4
• Transcriptional
• Post-transcriptional (mRNA processing)
• Non-coding RNA (miRNA, lncRNA)
• Epigenetic
Every locus is unique …
Every locus unique - requires a different approach
Overview of Functional annotation
A.
- GWAS
- GENOME
- EXOME
- Linkage and sequence analysis
B. Sequencing on Chromosome
C. Variants of various functional classes
D. Comparative genomics
E. Biochemistry/ Structure
F. Experimental function
Overview of Functional Enrichment
Where are the Regulatory elements?
Most functional variation is NON-CODING AND REGULATORY
Genome
- coding 0.5%
- UTR 0.8%
- promoter 2.2%
- DHS 15.7%
- enhancer 3.2%
- intergenic DNA 52.0%
- Introns 28.8%
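As a quick arithmetic check on the breakdown above, the listed percentages can be tallied in a few lines of Python. This is only a sketch: the name `genome_fractions` is made up for illustration, and the values are copied from the card as transcribed. They add up to about 103.2 rather than 100, so they are best read as approximate.

```python
# Percentages copied from the card above (as transcribed; approximate).
genome_fractions = {
    "coding": 0.5,
    "UTR": 0.8,
    "promoter": 2.2,
    "DHS": 15.7,
    "enhancer": 3.2,
    "intergenic DNA": 52.0,
    "introns": 28.8,
}

# Sum of everything listed, and the part that is not protein-coding.
total = sum(genome_fractions.values())
non_coding = total - genome_fractions["coding"]
non_coding_share = non_coding / total

print(f"listed total: {total:.1f}%")
print(f"non-coding share of listed total: {non_coding_share:.1%}")
```

Whatever the exact figures, the coding fraction is a very small slice of the genome, which is the point of the card.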
Workflow for Variant Functional Identification - 5
1. Fine mapping
2. In silico annotation
3. SNP function
4. Target gene(s) identification
5. Target gene function
Fine mapping involves:
• Locus genotyping (sequencing)
• Statistical tests for independent associations
• LD mapping
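LD mapping relies on measuring how strongly the alleles at two SNPs are correlated. A minimal sketch of the standard pairwise statistics, D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)), is given below. The function name `ld_r_squared` and the haplotype encoding are hypothetical, chosen for illustration. The cards do not name a particular tool or formula, and real fine mapping would use dedicated software on genotype data.

```python
from collections import Counter


def ld_r_squared(haplotypes):
    """Return (D, r2) for two biallelic SNPs from phased haplotypes.

    haplotypes: list of 2-character strings such as "AB", "Ab", "aB", "ab".
    Upper/lower case marks the two alleles at each SNP (hypothetical encoding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)

    # Allele frequencies at SNP 1 (allele "A") and SNP 2 (allele "B").
    p_a = sum(c for h, c in counts.items() if h[0] == "A") / n
    p_b = sum(c for h, c in counts.items() if h[1] == "B") / n
    # Frequency of the haplotype carrying both "A" and "B".
    p_ab = counts["AB"] / n

    # D measures the departure from independent assortment of the alleles.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom


# Two SNPs that always travel together versus two that assort independently.
print(ld_r_squared(["AB"] * 5 + ["ab"] * 5))
print(ld_r_squared(["AB", "Ab", "aB", "ab"]))
```

SNPs in high r^2 with the lead association signal are hard to tell apart statistically, which is why fine mapping combines LD information with tests for independent associations.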
In silico annotation:
- Integrative tools (ENCODE)
• Exome-translation
• Intron-intergenic
SNP function —> target gene/s identification
• Coding: GERP, PolyPhen
• Non-coding: GERP, RegulomeDB (ENCODE)
➢ Transcription factor element: EMSA, reporter gene, ChIP-seq
➢ miRNA: TargetScan, miRanda
➢ lncRNAs: REMSA
➢ Epigenetic variants: meQTL
Target gene/s function
• Cell culture models, human tissues
• Isogenic models (CRISPR/Cas)
• Animal models: mouse KOs, zebrafish, Drosophila
Identification of Coding-Region Variants: 3 types
- Frame-shift variation (indels etc.): protein truncations, fusions
- Base-substitution: 2 types
- Splice-site variation
- (may change AA sequence; may have non-coding effects, e.g. regulation of RNA processing)
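To illustrate how a base substitution or an indel in a coding region can alter the protein, the sketch below classifies a single-codon change using the standard genetic code and flags indels whose length is not a multiple of three as frame-shifting. The labels synonymous, missense, nonsense and stop-loss are standard terminology rather than taken from the cards, and the function names are hypothetical. Splice-site effects are not modelled.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change (DNA alphabet, coding strand)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # amino acid unchanged
    if alt_aa == "*":
        return "nonsense"     # new stop codon, truncated protein
    if ref_aa == "*":
        return "stop-loss"    # stop codon replaced by an amino acid
    return "missense"         # amino acid changed


def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
print(is_frameshift(1), is_frameshift(3))
```

A change that leaves the amino acid untouched can still matter if it affects RNA processing, which this codon-level view cannot detect.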