module 1 Flashcards
Mutation (M):
Inheritable change in organism / cell’s DNA sequence
Mutagen:
Agent that causes genetic mutation
Deletion (M):
Removal (-) of 1 or more bases
Insertion (M):
Addition (+) of 1 or more bases
Substitution (M):
Replace one base with another
Inversion (M):
Segment of DNA flipped & reinserted in opposite direction
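The four mutation types above can be sketched as simple string operations. This is only an illustration: the function names and the example sequence are made up, and the inversion is modelled as a reverse complement of the segment (an assumption about how a flipped segment reads on the same strand).

```python
# Illustrative sketch: function names and the example sequence are hypothetical.

def delete(seq, start, length=1):
    """Deletion: remove `length` bases beginning at index `start`."""
    return seq[:start] + seq[start + length:]

def insert(seq, pos, bases):
    """Insertion: add `bases` before index `pos`."""
    return seq[:pos] + bases + seq[pos:]

def substitute(seq, pos, base):
    """Substitution: replace the base at index `pos` with `base`."""
    return seq[:pos] + base + seq[pos + 1:]

def invert(seq, start, end):
    """Inversion: flip seq[start:end] and reinsert it in the opposite direction.
    Assumption: the flipped segment is read as its reverse complement."""
    complement = str.maketrans("ACGT", "TGCA")
    flipped = seq[start:end].translate(complement)[::-1]
    return seq[:start] + flipped + seq[end:]

wild_type = "ATGCCGTA"
print(delete(wild_type, 2))           # ATCCGTA
print(insert(wild_type, 2, "TT"))     # ATTTGCCGTA
print(substitute(wild_type, 0, "G"))  # GTGCCGTA
print(invert(wild_type, 2, 5))        # ATGGCGTA
```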
Mutation causes:
Spontaneous – error in DNA replication or repair
Induced – caused by mutagens e.g. radiation, chemicals, viruses
Mutation importance:
Introduce genetic variation into populations
Some beneficial, too many can be harmful
Balance is key = some help species survive, some cause disease
and genetic disorders
Mutations can cause:
Emergence of new viruses / viral strains (e.g. influenza, COVID-19
strains)
Cancer development (M in oncogenes & tumour suppressor
genes)
Resistance to treatments (e.g. antibiotic resistance in bacteria,
chemotherapy resistance)
Mutation impacts:
Some M = advantageous in specific environments
Stressful conditions = increase M rate, driving evolution
Too many M lead to diseases (e.g. cancer, genetic disorders)
Genetic variation (GV):
Drives evolution & adaptation over time
Genetic Variation – Adaptation:
E.g. peppered moth + industrial revolution
Pre-revolution = light-coloured -> genetic mutation (GM) = dark
(melanic) form
Revolution = soot darkens surfaces -> dark moths have survival
advantage = more common (natural selection)
Post-revolution = light moths increase -> adaptation to
environmental change
Antibiotic resistance:
Experiment -> cells plated on agar + antibiotic
Outcome -> resistant mutants survive & grow
Phage Resistance Mutants:
Experiment -> cells plated on agar coated with bacteriophage
Outcome -> phage-resistant mutants survive
Sugar non-utilisation mutants:
Experiment -> colour-based assay detects sugar metabolism
E.g. Lactose fermentation on MacConkey agar (Lac+ = pink, Lac-
= colourless)
Spontaneous Mutations (occur naturally):
M rate varies between genes
Causes -> DNA polymerase errors (looping out / skipping bases),
chemical changes (depurination, deamination)
No external mutagens required
Induced Mutations (caused by external factors)
Mutagens e.g. radiation (UV, X-rays) / chemicals (dietary,
environmental, lifestyle factors)
Missense Mutations (MM):
Single amino acid (AA) replaced
E.g. sickle cell anaemia (Glutamic acid (Glu) -> Valine (Val) in
haemoglobin)
One letter (AA) substituted = change meaning
Nonsense Mutations:
AA codon -> stop codon
= premature termination of translation -> truncated protein
E.g. Duchenne muscular dystrophy
AA codon substituted by STOP codon = truncating the sentence
(protein)
Frameshift Mutations:
= insertion/deletion of nucleotides shift reading frame
= nonfunctional protein / early termination
E.g. cystic fibrosis (CFTR gene deletion)
Letter (AA) inserted/deleted. Shifting reading frame (all AA move to
right) = make sentence nonsensical
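A short sketch of how the three mutation classes change the translated protein. The 15-base coding sequence is invented for illustration (it is not a real gene), and the codon table is the standard genetic code written in compact form; the Glu -> Val change mirrors the sickle cell example only at the codon level (GAG -> GTG).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate codon by codon; stop at the first stop codon ('*').
    Trailing bases that do not fill a codon are ignored."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical toy sequence: ATG GAG AAA TGG TAA
wild_type = "ATGGAGAAATGGTAA"
missense = "ATGGTGAAATGGTAA"               # GAG -> GTG (Glu -> Val)
nonsense = "ATGGAGTAATGGTAA"               # AAA -> TAA (Lys -> stop)
frameshift = wild_type[:3] + wild_type[4:]  # delete one base after ATG

print(translate(wild_type))   # MEKW
print(translate(missense))    # MVKW  (one AA replaced)
print(translate(nonsense))    # ME    (truncated protein)
print(translate(frameshift))  # MRNG  (every downstream AA changed)
```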
Reversion/suppression:
Reversion of mutation suppresses og M + restore wild-type
function
When 2nd M compensates for / directly reverses 1st
True reversion:
reverts exactly to og wild-type sequence
Second-site reversion (suppressor Mutation):
= different site but restore og gene function
Intragenic reversion:
M in same gene counteract og M
Mutagen-induced M
o Mutagen increase M frequency in population
o Higher dose = more M
Intergenic Reversion:
M different gene suppress effect of og M
Mutagens = Mutations:
Mutagens = agents that increase mutation frequency
Cell survival & DNA damage
o Mutagens = DNA damage = lethal to cells
o Mutagen increase = % of surviving cells decrease
SCREENING FOR MUTAGENS:
Constantly exposed to new chemicals in enviro
Crucial to identify mutagens, assess their cancer-causing potential
(carcinogenicity)
Why should we screen?
Many chemicals that are mutagenic in bacteria also cause cancer in animals
Mutagenicity tests screen for carcinogens before human exposure
Some non-mutagenic substances become mutagenic after
metabolic activation -> occur frequently in liver
o E.g. Benzopyrene (cigarette smoke) converted to DNA-damaging
agent in liver = mutation
Ames Test:
1st widely used test to screen chemicals for cancer-causing potential
Determine if chemical is mutagenic by testing ability to induce M in bacteria
Concept behind test:
o Cancer & M induction share a fundamental process
o Uses a His- (histidine-requiring) mutant strain of Salmonella typhimurium
o If chemical = mutagen -> bacteria revert to wild-type (His+) & grow without histidine
If substance causes more reversions = likely mutagen & possibly carcinogen
Process
1. His- mutant of Salmonella typhimurium
2. Treat with test substance
3. Plate cells on minimal agar without histidine
4. Colonies / no colonies
Low colonies/ no colonies = not a mutagen /
spontaneous background level of mutants
More colonies = substances are mutagenic.
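The colony comparison above can be sketched as a fold increase of revertant colonies on treated plates over the spontaneous background. The function names, the plate counts and the 2-fold cut-off are illustrative assumptions, not values from these notes.

```python
# Illustrative sketch: names, counts and the 2-fold threshold are assumptions.

def fold_increase(treated_counts, background_counts):
    """Mean revertant colonies on treated plates divided by the mean on
    untreated (spontaneous background) plates."""
    mean_treated = sum(treated_counts) / len(treated_counts)
    mean_background = sum(background_counts) / len(background_counts)
    if mean_background == 0:
        raise ValueError("background plates show no colonies; ratio undefined")
    return mean_treated / mean_background

def looks_mutagenic(treated_counts, background_counts, threshold=2.0):
    """Flag the substance if revertants exceed background by `threshold`-fold
    (hypothetical cut-off, used here only to make the comparison concrete)."""
    return fold_increase(treated_counts, background_counts) >= threshold

background = [20, 22, 18]   # His+ revertants without test substance
treated = [60, 66, 54]      # His+ revertants with test substance
print(fold_increase(treated, background))    # 3.0
print(looks_mutagenic(treated, background))  # True
```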
Metabolic activation of mutagens:
Non mutagenic in original form -> mutagenic after metabolism
WHY DOES THIS MATTER
o Bacteria & humans have diff metabolic enzymes,
(mutagenicity tests in bacteria may not always reflect human risk)
o Aromatic & nitroaromatic compounds = common examples that require metabolic activation
o Ames test often performed with liver enzymes (S9 fraction) to simulate human metabolism
- His- mutant of Salmonella typhimurium
- S9 (liver enzyme) extract added in
- Treat with test substance
- Plate cells on minimal agar without histidine
- Colonies / no colonies
DNA repair mechanisms:
Genomic stability = key to organism's survival & fecundity (biological ability to reproduce)
Errors in DNA during synthesis / post replication
Direct reversal:
Repair damage without cutting DNA strand (e.g. photoreactivation fixes UV-induced thymine dimers)
Base Excision Repair (BER):
Remove damaged single bases replace with correct nucleotide
Nucleotide Excision Repair (NER):
Remove bulky lesions (e.g. UV-induced thymine dimers) by cutting out short DNA segment
Mismatch Repair (MMR):
Fix errors that escape DNA pol proofreading, ensure accurate replication
Error Prone Polymerases (EPP):
DNA pol -> copy damaged DNA when high-fidelity pol fail
Alone or with accessory proteins (to bypass lesions)
High error rates – 10^-1 to 10^-3 (10^-10 for normal pol)
Mispairing tendency = form mis-pairs
Can replicate unpaired DNA = can continue even if terminal base is unpaired
Lesion-specific activity – diff EPP specialize in replicating diff DNA lesions
Biological role: Translesion DNA synthesis (TLS)
o Allow replication continue past DNA damage, prevent stalled forks
o Last-resort survival mechanism in stressed cells
o Trade-off: +M rates = GV / disease
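To make the trade-off concrete, the error rates quoted above can be turned into an expected number of misincorporated bases. The 1,000,000-base stretch is a hypothetical length chosen for the arithmetic, and treating every base as an independent event with a fixed error rate is a simplifying assumption.

```python
# Arithmetic sketch: the stretch length is hypothetical; rates are from the notes.

def expected_errors(bases_copied, error_rate):
    """Expected misincorporations = bases copied x per-base error rate."""
    return bases_copied * error_rate

bases = 1_000_000
rates = {
    "normal pol (10^-10)": 1e-10,
    "EPP, low end (10^-3)": 1e-3,
    "EPP, high end (10^-1)": 1e-1,
}
for label, rate in rates.items():
    print(f"{label}: ~{expected_errors(bases, rate):g} errors per {bases:,} bases")
```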
SOS system – induction of EPP by DNA damage:
DNA damage response in BACTERIA activates EPP (bypass
lesions) -> induced when DNA replication is blocked by DNA damage.
DNA damage RecA activation:
Lesions stall replication = activate RecA
Bind to ssDNA, trigger SOS response
LexA repression lifted:
LexA represses SOS genes (e.g. umuC, umuD)
Activated RecA cleaves LexA = SOS gene expressed
Induction of EPP
umuC -> encodes Pol V (EPP)
umuD -> accessory protein for Pol V -> help bypass lesions
= allow replication to continue = high mutation rates
Genetic variation:
Arise from mutations at different levels
o Alter gene activity, protein function, traits, evolution
Patterns of inheritance:
Hereditary variants occur in germ cells
Some manifest later in life (e.g. Huntington's disease)
Others confer some benefit (e.g. sickle cell trait)
Mitochondrial DNA
Biological diversity & evolution:
Driven by diff in DNA sequences among individuals in populations
(below species level)
Sources of genetic variation:
- Mutations = random changes in DNA (e.g. point mutations, insertions, deletions)
- Genetic recombination = crossing over during meiosis -> new allele combinations
- Gene flow = gene movement between populations
Importance of Genetic variation:
Influences ->traits + disease susceptibility + drug responses
Essential for -> natural selection + adaptation to environments
Used in -> personalized medicine + forensic science + ancestry
studies
Fidelity:
important property of DNA -> accuracy of replication + repair
Single nucleotide polymorphism (SNP):
~90% of human genetic variation
Most no impact on cell function, some affect disease risk & drug response
Usually bi-allelic = one of two possible nucleotides at a given position
o E.g. A/G SNP = nucleotide can be either A or G
Polymorphisms:
variants of DNA sequences appearing in >1% of population
1. Single nucleotide polymorphism (SNP) = single base change (most
common type, major source of heterogeneity)
2. Short tandem repeat (STR) = 2+ DNA bases repeated numerous
times, head-to-tail
Allele:
Alternative version of a gene
Allele terminology:
Risk allele = associated with risk for disease e.g. APOE ε4
o ε4 heterozygotes = 5% AD risk, ε4 homozygotes = 20% AD risk
Protective allele = protective against disease e.g. APOE ε2
Alternative allele = can be more than one e.g. APOE ε3 (neutral)
Major allele = more common
Minor allele = less common
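Major and minor alleles, and the >1% polymorphism cut-off, can be worked out from allele counts. A minimal sketch for a bi-allelic site; the function name and the sample of 100 chromosomes are made up for illustration.

```python
from collections import Counter

def allele_summary(alleles, polymorphism_cutoff=0.01):
    """Return major allele, minor allele, minor allele frequency (MAF) and
    whether the site counts as a polymorphism (MAF above the cut-off)."""
    counts = Counter(alleles)
    ranked = counts.most_common()      # most frequent allele first
    major = ranked[0][0]
    if len(ranked) == 1:               # only one allele observed
        return {"major": major, "minor": None, "maf": 0.0, "polymorphism": False}
    minor, minor_count = ranked[-1]
    maf = minor_count / len(alleles)
    return {"major": major, "minor": minor, "maf": maf,
            "polymorphism": maf > polymorphism_cutoff}

# Hypothetical A/G SNP observed on 100 chromosomes
sample = ["A"] * 90 + ["G"] * 10
print(allele_summary(sample))
```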
Structural variants:
Large-scale DNA variations between individuals
Range 50 base pairs to over 1 million
Can involve insertions, deletions, duplications, inversions & translocation
May influence gene function, disease risk & genetic diversity
Copy number variants:
Where sections of DNA are deleted (loss) or duplicated (gained)
Most common type of structural variant
Typically >1000 base pairs but not detectable on Karyotype (too small)
Most CNVs occur in non-coding region (~97% of genome) -> may affect regulation rather than direct protein function
Coding CNVs (in protein-coding genes) often have stronger effects
on gene function & are easier to interpret
Genetic variation summary:
Structural variant (SV) = broad term for DNA alterations >1kb in size
Neutral descriptor = SVs are defined without implying frequency,
disease association or phenotypic impact
Short structural variants = smaller in size but still classified as SVs
Common & challenging to detect with standard sequencing methods
Individual’s variants:
Variant frequency in population -> most variants rare
Variant frequency within individuals -> most variants are common
Differences from the reference genome = each individual has 4-5
million variants -> >99.9% are SNPs or short indels
Haplotype:
Genetically linked SNPs inherited together on same chromosome
Binary string:
Can represent Haplotype = each SNP has two possible alleles (e.g. 0 or 1)
Help in tracing ancestry, understanding genetic associations & studying recombination patterns
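A minimal sketch of the binary-string idea: each SNP position is written as 0 or 1. The convention used here (0 = reference allele, 1 = alternative allele) and the example alleles are assumptions for illustration.

```python
def encode_haplotype(observed, reference):
    """Encode a haplotype as a binary string over its SNP positions.
    Assumed convention: '0' = matches reference allele, '1' = alternative."""
    if len(observed) != len(reference):
        raise ValueError("observed and reference must cover the same SNPs")
    return "".join("0" if o == r else "1" for o, r in zip(observed, reference))

reference_alleles = "AGCC"   # hypothetical reference allele at 4 SNPs
haplotype_1 = "AGTC"         # differs at the 3rd SNP
haplotype_2 = "GGTC"         # differs at the 1st and 3rd SNPs

print(encode_haplotype(haplotype_1, reference_alleles))  # 0010
print(encode_haplotype(haplotype_2, reference_alleles))  # 1010
```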
Haplotype blocks:
Segments of DNA with little/ no recombination
SNPs within a block are highly linked + inherited together
Tag SNPs:
Efficiently represent most SNPs in Haplotype block
Instead of genotyping every SNP identify tag SNPs
reduce genotyping effort
used in genome-wide association studies
different populations = different patterns of linkage
Linkage Disequilibrium (LD):
Occurs when different SNPs are inherited together non-randomly
SNPs physically close on chromosome = decreased likelihood of recombination between them
LD allows identification of groups of highly correlated SNPs (measure correlation between SNPs = detect association)
Only one SNP needs to be genotyped from each set = can predict the others (i.e. tag SNPs)
Important measure in population genetics:
o Identify regions under natural selection -> when natural selection favours a particular allele, nearby SNPs “hitchhike”
with it due to low recombination
o Reconstruct ancestry & population history -> people with shared ancestry have similar haplotype blocks due to limited
recombination over generations
o Improve genetic association studies (GWAS) by reducing the number of SNPs that need to be analysed
Occurs when 2 points on a chromosome remain linked
Disequilibrium eventually moves to linkage equilibrium (no correlation between 2 SNPs) over time
o Recombination eventually occurs between every possible point
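The correlation between two SNPs is commonly summarised with D and r^2, computed from haplotype frequencies. A small sketch using binary-string haplotypes; the function name and the two toy samples are assumptions for illustration.

```python
def ld_stats(haplotypes, i, j):
    """Return (D, r^2) between SNP positions i and j.
    D   = P(1 at both SNPs) - P(1 at SNP i) * P(1 at SNP j)
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB))"""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n
    p_b = sum(h[j] == "1" for h in haplotypes) / n
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared

linked = ["11", "11", "00", "00"]        # alleles always travel together
independent = ["11", "10", "01", "00"]   # every combination equally common

print(ld_stats(linked, 0, 1))       # (0.25, 1.0) -> complete LD
print(ld_stats(independent, 0, 1))  # (0.0, 0.0)  -> linkage equilibrium
```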
Linkage Disequilibrium (LD):
The Friends Who Always Stick Together
Beginning of party (Og chromosome) -> 2 close friends (SNPs),
Maria + Lishan, always arrive + stay together
If you see Maria at party, predict Lishan nearby -> never separate
Presence is not random – they are linked
Recombination:
The social mixer that splits groups
Night goes on -> party host (recombination) encourages people to
meet new friends (SNPs) + move around
Maria + Lishan end up in different group because Ryan randomly separates them -> strong connection weakens over time
More parties attended over time = more chance they mix with different people
In genetics = recombination breaks linkage over generations
Linkage Equilibrium (LE):
Friends who mix randomly
After many parties (generations of recombination), Maria + Lishan no longer always arrive or leave together
If you see Maria at a party, you can’t predict whether Lishan is
there – she could be with a different group or not even invited
Their presence at a party is now independent of each other – their
relationship has equilibrated
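The party analogy corresponds to the standard decay of LD under recombination: D_t = D_0 * (1 - r)^t, where r is the recombination fraction between the two SNPs per generation. A short sketch; the starting D and the r values are hypothetical.

```python
def ld_after(d0, recombination_fraction, generations):
    """LD remaining after t generations: D_t = D_0 * (1 - r)^t."""
    return d0 * (1 - recombination_fraction) ** generations

d0 = 0.25  # hypothetical starting LD (SNPs always inherited together)
for r in (0.001, 0.01, 0.5):            # tightly linked ... unlinked
    remaining = [round(ld_after(d0, r, t), 4) for t in (0, 10, 100)]
    print(f"r = {r}: D at generations 0, 10, 100 = {remaining}")
```

Closely spaced SNPs (small r) keep their LD for many generations, while distant SNPs approach linkage equilibrium quickly.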
Wright-Fisher Model:
Describe how allele frequencies change over generations due to
genetic drift in small populations
Assumes finite population size, random mating, & no selection,
mutation, or migration.
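A minimal simulation sketch of the model: each generation, the 2N gene copies are drawn at random from the previous generation's allele frequency, with no selection, mutation or migration. The function name, parameter names and the seed are assumptions for illustration.

```python
import random

def wright_fisher(n_individuals, p0, generations, seed=0):
    """Track the frequency of one allele in a diploid population of fixed size.
    Each generation resamples 2N gene copies (binomial sampling = drift)."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    n_copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        copies_with_allele = sum(rng.random() < p for _ in range(n_copies))
        p = copies_with_allele / n_copies
        trajectory.append(p)
    return trajectory

# Small populations drift faster than large ones (e.g. N = 20 vs N = 2000)
small = wright_fisher(n_individuals=20, p0=0.5, generations=50)
large = wright_fisher(n_individuals=2000, p0=0.5, generations=50)
print("N = 20:  ", [round(f, 2) for f in small[::10]])
print("N = 2000:", [round(f, 2) for f in large[::10]])
```

Once the frequency reaches 0 (allele lost) or 1 (allele fixed) it stays there, because no mutation or migration reintroduces the other allele.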
Genetic drift:
Variation in the relative frequency of different genotypes in a small
population, owing to the chance disappearance of particular genes
as individuals die or do not reproduce.
Genetic drift scenario in an isolated human population:
Scenario:
20 shipwreck survivors (10 M/ 10 F) stranded on a remote island
Key effects:
o Random allele changes – some alleles become common others disappear
o Loss of genetic diversity – reduced ability to adapt to environmental changes
o Inbreeding – higher risk of genetic disorders from recessive alleles
o Founder effect – future generations reflect only the survivors’ genetic traits
o Higher extinction risk – a disease or environmental shift could wipe out the population
Genetic drift:
Reduces diversity by causing haplotypes to fluctuate in frequency
o = allele loss + increased similarity in population
o = create linkage disequilibrium (LD)
Mutations:
~60 per diploid genome per generation
Most lost due to genetic drift
Recombination:
Help break down LD by reshuffling alleles
Occur in highly non-uniform manner, concentrated in recombination hotspots
Natural selection:
Can introduce strong genetic differences between populations, shaping adaptation
Genotyping arrays:
Use SNP microarrays for high-throughput screening
Cost-effective to analyze common variants across large populations
Next-Generation Sequencing (NGS):
Whole Genome Sequencing (WGS) = detect all SNPs across genome
Targeted sequencing = examine specific regions of interest
Polymerase chain reaction (PCR):
qPCR & TaqMan assays = used for targeted SNP detection (e.g. forensics)
Sanger sequencing:
Gold standard for validating SNPs in small sample sizes
Limited by cost & scalability for large datasets
Used for the human genome project
Microarray chip:
Contains thousands of DNA probes, each targeting a specific mutation or SNP
Microarray (Mass screening) process:
- Patient’s DNA fragmented using restriction enzymes
- DNA is denatured (ssDNA) + labeled with fluorescent dye
- Hybridization = DNA sample washed over the chip
- Detection = if patients’ DNA binds to probe, fluorescence emitted
- Fluorescence intensity measured = determine which SNPs present
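The last step, going from fluorescence intensity to a genotype, can be sketched as follows. This assumes a design with one probe per allele (A and B) at each SNP; the function name, the minimum-signal value and the cut-offs are hypothetical and not taken from any real array platform.

```python
# Illustrative sketch: two-probe design and all thresholds are assumptions.

def call_genotype(intensity_a, intensity_b, min_signal=100.0, low=0.2, high=0.8):
    """Call a genotype from the fluorescence of the allele-A and allele-B probes.
    Mostly A signal -> AA, mostly B signal -> BB, mixed -> AB (heterozygote)."""
    total = intensity_a + intensity_b
    if total < min_signal:
        return "no call"               # too little DNA hybridized to decide
    fraction_a = intensity_a / total
    if fraction_a >= high:
        return "AA"
    if fraction_a <= low:
        return "BB"
    return "AB"

print(call_genotype(950, 50))    # AA
print(call_genotype(500, 480))   # AB
print(call_genotype(30, 900))    # BB
print(call_genotype(20, 30))     # no call
```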
Advantages
o High throughput – analyses 1000s of SNPs in single experiment
o Automated & scalable – suitable for clinical & research applications
o Cost-effective – cheaper than whole-genome sequencing for targeted variant detection
Limitations:
o Cannot detect novel mutations – limited to known genetic variants
o Lower sensitivity for rare variants – less effective for detecting low-frequency mutations