genome variation Flashcards
What are the 2 levels of variation?
- macro level: major differences generally associated with disease (aneuploidy, translocations, etc.)
- micro/molecular level (the point mutation in sickle cell anemia, the 3 bp deletion in CFTR)
What % of DNA is the same between 2 people?
- 99.7% of DNA is the same between 2 people (~9 million bases are different)
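The two numbers on this card are linked by simple arithmetic. A minimal sketch, assuming a genome of roughly 3 billion bases (that size is an assumption, it is not stated on the card); `differing_bases` is a hypothetical helper name:

```python
# Assumption: ~3 billion bases per genome (not given on the card).
GENOME_SIZE_BP = 3_000_000_000


def differing_bases(genome_size_bp: int, percent_identical: float) -> int:
    """Number of bases expected to differ between two people."""
    fraction_different = (100 - percent_identical) / 100
    return round(genome_size_bp * fraction_different)


print(differing_bases(GENOME_SIZE_BP, 99.7))
```

With 99.7% identity, the remaining 0.3% of 3 billion bases gives the ~9 million differences quoted above.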
Define single nucleotide polymorphism (SNP/SNV).
- DNA sequence variation that occurs when a single nucleotide (ATCG) in a genome sequence is altered.
What is polymorphism?
- variation in a DNA sequence: a gene exists in several allelic forms.
What are microsatellites (short tandem repeats)?
- repeat units of 2 to 5 bases (di-, tri-, tetra-, pentanucleotide)
- the sequence doesn’t usually vary; the length varies.
- often heterozygous because the number of tandem repeats is so variable
- multi-allelic
What causes short tandem repeats?
- polymerase slippage on the strand being replicated
=> the polymerase pauses during elongation; when the strands reanneal, the repeats can pair with the wrong (out-of-register) repeat unit, so repeat units are gained or lost.
What are 3 types of common genetic variants?
common = we see a lot of these types of variants throughout the genome
- single nucleotide polymorphisms (SNPs): ~17 million identified, ~3 million per genome
- microsatellites/STRs: ~3% of the genome
- copy number variants (CNVs): > 2000 identified, ~100 per genome
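Which has the higher frequency at a biallelic site, the minor or the major allele?
- the major allele, by definition; the minor allele is the less common one, though for a common variant its frequency can still be relatively high
=> allele frequency = population frequency
=> proportion of chromosomes in the population that carry each allele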
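Allele frequency as "proportion of chromosomes that carry each allele" can be sketched as a count over genotypes. This is only an illustration: the genotype strings and the function names `allele_frequencies` and `major_and_minor` are made up for the example, and each individual is assumed to contribute exactly 2 chromosomes.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """genotypes: one 2-letter string per individual, e.g. "AC" (hypothetical input format)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_chromosomes = sum(counts.values())
    return {allele: n / total_chromosomes for allele, n in counts.items()}


def major_and_minor(frequencies):
    """Return (major allele, minor allele) ranked by frequency."""
    ranked = sorted(frequencies, key=frequencies.get, reverse=True)
    return ranked[0], ranked[-1]


# 4 individuals = 8 chromosomes; only one chromosome carries C.
freqs = allele_frequencies(["AA", "AC", "AA", "AA"])
print(freqs, major_and_minor(freqs))
```

Here A is carried by 7 of 8 chromosomes (major allele) and C by 1 of 8 (minor allele).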
How do we know what is normal and what is variant?
- the reference is built from 4 anonymous individuals, averaged out
- say 1 individual has C at position 2 and the other 3 have A
- the reference allele is A, so anyone who doesn’t have A has the minor/alternative allele
- anyone who has A has the major allele.
What is a heterozygous position?
- each chromosome is present in 2 copies
- usually the base at a given position is the same on both copies (homozygous)
- however, sometimes the bases differ (A on one copy, C on the other); this position is then heterozygous.
What are the characteristics of single nucleotide variants (SNVs/SNPs)?
- high frequency: 1 in every 300 nucleotides in the reference genome
- in one individual: 1 in every 1000 bases
- millions of SNVs identified in human genomes
- majority exomes
- generated when mismatch repair fails to correct an error made during DNA replication.
Where do SNVs happen?
- gene: can lead to:
=> no amino acid change (synonymous)
=> amino acid change (non-synonymous/missense)
=> stop codon (nonsense)
=> splice site change
=> UTR change (gene expression)
- promoter:
=> protein expression
- non-coding region
- without a deleterious effect or population annihilation, SNVs do not disappear.
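The three coding outcomes listed for an SNV inside a gene (synonymous, missense, nonsense) can be checked by translating the codon before and after the change. A minimal sketch, assuming the standard genetic code and DNA codons written on the coding strand; `classify_coding_snv` is a hypothetical helper, and splice-site, UTR and promoter effects are outside its scope.

```python
# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snv(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change within one codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("expected two codons that differ at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Any other amino acid change is reported as missense in this sketch.
    return "missense"


print(classify_coding_snv("GAG", "GTG"))  # the sickle cell codon change
print(classify_coding_snv("GAA", "GAG"))
print(classify_coding_snv("TAC", "TAA"))
```

The first call uses the GAG => GTG change from the sickle cell card (glutamic acid => valine), which this sketch reports as missense.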
Give an example of a mutation that is both deleterious and beneficial.
- sickle cell anemia
- deleterious = sickle cell anemia
- benefit = heterozygote advantage against malaria; this is why the SCA allele is more common in African countries, where malaria is a bigger issue (1 in 20 chromosomes), than in European countries (1 in 10,000 chromosomes).
What is the genetic basis of sickle cell anemia?
- single base change /point mutation
- codon GAG => GTG
- glutamic acid => valine
What minor allele frequency (MAF) defines a rare versus a common polymorphism/variant?
- rare polymorphism = MAF 1-5%
- common polymorphism = MAF > 5%
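The two cut-offs can be written as a small lookup. This is a sketch only: `classify_by_maf` is a hypothetical name, MAF is given as a fraction (0.05 = 5%), exactly 5% is treated as rare, and variants below 1% get a placeholder label because the card does not name them.

```python
def classify_by_maf(maf: float) -> str:
    """Label a variant from its minor allele frequency, given as a fraction."""
    if not 0 <= maf <= 0.5:
        # Assumption: a biallelic site, so the minor allele cannot exceed 50%.
        raise ValueError("minor allele frequency must be between 0 and 0.5")
    if maf > 0.05:
        return "common polymorphism"
    if maf >= 0.01:
        return "rare polymorphism"
    return "below 1% (not covered by this card)"


print(classify_by_maf(0.125))
print(classify_by_maf(0.03))
```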
What evolutionary forces create single nucleotide variation (SNV)?
- mutation = new alleles
- gene flow = migration introduces new variants into a population
- genetic drift = random change in variant allele frequency between generations
- selection = non-random change in variant allele frequency between generations, driven by a beneficial or deleterious effect.
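Genetic drift as a "random change in variant allele frequency between generations" can be illustrated with a toy simulation. The sketch below assumes a simple Wright-Fisher-style resampling model, which is not named on the card: each generation, every chromosome draws its allele at random from the previous generation, with no mutation, gene flow or selection. The function name and parameters are hypothetical.

```python
import random


def simulate_drift(start_frequency, n_chromosomes, generations, seed=0):
    """Return the variant allele frequency in each generation (toy model)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    frequency = start_frequency
    trajectory = [frequency]
    for _ in range(generations):
        # Each chromosome carries the variant with probability = current frequency.
        carriers = sum(rng.random() < frequency for _ in range(n_chromosomes))
        frequency = carriers / n_chromosomes
        trajectory.append(frequency)
    return trajectory


print(simulate_drift(start_frequency=0.5, n_chromosomes=20, generations=10))
```

In small populations the frequency wanders more from one generation to the next, and once it reaches 0 or 1 in this model it stays there.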
Where do microsatellites/STRs occur?
- in the 98% of the genome that does not code for protein
=> intronic or UTR: may affect gene expression
=> intergenic
- exons
=> extra amino acids in proteins
What is an example of an STR expansion disorder?
=> Huntington's disease
- monogenic disorder; the tandem repeat is within an exon, so expansion alters the size of that region of the genome.
- more than 35 repeats = Huntington's disease
- autosomal dominant: if one parent has it, each child has a 50% chance of having it.
How do tandem repeats vary?
- the bases are the same/ sequence is the same but the number of repeats varies.
How are microsatellites named by repeat-unit length?
- Dinucleotide: (CA)(CA)…
- Pentanucleotide: (AGAAA)(AGAAA)…
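Since the repeat unit stays the same and only the number of copies varies, an STR allele can be described by counting consecutive copies of the unit. A minimal sketch using a regular expression; `longest_tandem_run` and the example sequences are made up for illustration.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = pattern.findall(sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Two hypothetical alleles of the same dinucleotide STR: 4 and 6 CA repeats.
print(longest_tandem_run("GGCACACACATT", "CA"))
print(longest_tandem_run("GGCACACACACACATT", "CA"))
```

Two alleles with different counts at the same locus, as in this example, are what makes an individual heterozygous for the STR.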
Why does slippage tend to occur in tandem repeats?
- when the strands come apart and the bases reattach to complementary bases, they can reattach in the wrong position because the sequence is repetitive.
Define locus.
unique position in the genome.
Define allele.
one particular form of the sequence at a given position (locus) in the genome.
Define single nucleotide variant (SNV).
a single-base variant arising when mismatch repair fails to correct a replication error.
Define microsatellite (short tandem repeat).
a variant arising due to polymerase slippage.
Define copy number variant (CNV).
- a variant arising due to non-allelic homologous recombination.
- the simplest type of copy number variation is the presence or absence of a gene (diploid copy number 2, 1 or 0)
- duplication of a genomic segment could result in diploid copy numbers of 2, 3 or 4.
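The copy numbers on this card follow from adding up the copies on the two chromosomes of a pair. A minimal sketch, assuming each chromosome independently carries one of the listed per-chromosome counts; `diploid_copy_numbers` is a hypothetical helper.

```python
from itertools import product


def diploid_copy_numbers(per_chromosome_counts):
    """All possible totals when two chromosomes each carry one of the given counts."""
    return sorted({a + b for a, b in product(per_chromosome_counts, repeat=2)})


# Presence/absence of a gene: each chromosome carries 0 or 1 copy.
print(diploid_copy_numbers([0, 1]))
# Duplication: each chromosome carries 1 copy or 2 copies.
print(diploid_copy_numbers([1, 2]))
```

The first call gives the 0, 1, 2 case and the second the 2, 3, 4 case described above.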
Define polymorphism.
AKA variant
one or more variants of a particular DNA sequence.
What are genetic variants used for?
- can be used as gene markers to help find disease causing genes/mutations
=> linkage analysis (microsatellites, SNPs)
=> association analysis (SNPs, CNVs)
What is the book analogy?
- whole book = genome
- chapter = chromosome
- paragraph = CNV
- sentence = STR
- letter = SNP