Chapter 9 - Intro to Genomics Flashcards
What is a Polymerase Chain Reaction
A method of copying DNA in a test tube (in vitro) by mimicking DNA synthesis
Instead of helicase, PCR uses...
Heat to separate the DNA strands, then DNA polymerase to copy them
What is the process of copying DNA by PCR called
Amplification
What is the copied chunk of DNA called in PCR
Amplicon
What does PCR require
The same components that DNA synthesis requires:
- dNTPs: DNA nucleotides with three phosphates
- A DNA template, denatured by heat
- Primers specific to the DNA being amplified
What is Exponential amplification
Each new copy serves as a template in later cycles, so the number of copies doubles with every cycle
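The arithmetic behind this card can be sketched in a few lines. This is an idealized model that assumes every molecule is copied exactly once per cycle; the function name `copies_after` and the starting copy number are hypothetical, chosen only for illustration.

```python
def copies_after(cycles, starting_copies=1):
    """Ideal copy count when every molecule is duplicated once per cycle."""
    return starting_copies * 2 ** cycles


if __name__ == "__main__":
    for n in (1, 10, 30):
        print(n, "cycles:", copies_after(n), "copies")
```

Real reactions fall short of perfect doubling, so treat the numbers as an upper bound.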
Temperatures of denaturation, annealing, and elongation
1. Denaturation: 94-96 degrees Celsius
2. Annealing: about 68 degrees Celsius
3. Elongation: 72 degrees Celsius
Elongation
Addition of nucleotides to a new DNA strand
DNA sequencing is a PCR reaction, but...
It uses a mix of dNTPs and ddNTPs
ddNTPs = dideoxynucleoside triphosphates
Each ddNTP carries a fluorescent label
A ddNTP is a chain terminator because it has no 3' OH group for the next nucleotide to bond to
Electrophoresis
Separating the fragments by size
The order of the fluorescent colors signifies
the sequence of nucleotides of the DNA template
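A toy sketch of how chain termination plus size separation yields a sequence. It assumes the template is written 3' to 5', so the new strand reads 5' to 3' from left to right, and that one terminated fragment exists for every length. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template):
    """Return one (length, label) pair per chain-terminated product.

    The new strand is complementary to the template; a fragment of length k
    ends in the labeled ddNTP at position k of the new strand.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [(k, new_strand[k - 1]) for k in range(1, len(new_strand) + 1)]


def read_sequence(fragments):
    """Sort fragments by size (the electrophoresis step) and read the labels."""
    return "".join(label for _, label in sorted(fragments))


if __name__ == "__main__":
    fragments = terminated_fragments("TACGGA")
    fragments.reverse()  # fragments come out of the tube in no useful order
    print(read_sequence(fragments))
```

The sequence read this way is that of the newly synthesized strand, which is the complement of the template.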
Genomes contain __ or __ of base pairs
Millions or billions
Sequencing reactions generate
500-800 bp of sequence per reaction
Hopes for future sequencing methods
No chain termination step
No separation step
Still PCR-based, but products are fully synthesized
Everything is done in-solution
Millions of reads in a single reaction
In order to build an entire chromosome
Sequences must overlap
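A minimal sketch of why overlap matters: two reads can be joined only where the end of one matches the start of the other. The names `overlap_length` and `merge_reads` and the `min_overlap` threshold are hypothetical, and this is a toy join, not how any particular assembler works.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    print(merge_reads("ATGCGTAC", "GTACCTGA"))
    print(merge_reads("AAAA", "CCCC"))
```

Reads with no shared sequence return None, which is the situation that leaves a gap in the assembled chromosome.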
Entire genomes can be sequenced in
A couple of weeks
Genome Annotation
Process of identifying and mapping genes and functional regions in a genome.
cDNA Sequencing
cDNA is reverse-transcribed DNA from mRNA (only coding regions, no introns), used to identify protein-coding genes.
Challenges of cDNA Sequencing
- Not all genes are expressed in all cell types.
- Lowly expressed genes may go undetected.
- Does not reveal RNA genes (non-protein-coding).
Computational Tools
Used to detect genes and analyze natural selection signals in the genome.
Natural Selection & Genome
Natural selection conserves important genomic regions, causing them to be more similar across species, while nonfunctional regions diverge faster.
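One simple way to put a number on "more similar across species" is percent identity between two aligned sequences. The sketch below assumes the sequences are already aligned to the same length; the function name and the example sequences are made up for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which the two sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical aligned sequences from two species.
    print(percent_identity("ATGCATGC", "ATGCATGA"))
```

Under the idea on this card, a conserved region would be expected to score higher than a nonfunctional region compared between the same two species.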
Protein-Coding Gene Identification (Finding Protein-Coding Genes)
Tools analyze DNA sequence patterns to identify regions that may encode proteins (Open Reading Frames, ORFs).
Noncoding DNA Distribution (Finding Protein-Coding Genes)
In noncoding DNA the nucleotides are distributed roughly at random, so by chance an ATG should appear about every 200 bp and a STOP codon about every 63 bp.
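The spacing figures on this card can be checked with a short calculation. It assumes all four bases are equally frequent and counts codons within a single reading frame; the function name is hypothetical.

```python
from fractions import Fraction

CODONS = 4 ** 3  # 64 possible triplets
STOP_CODONS = ("TAA", "TAG", "TGA")


def expected_spacing_bp(n_matching_codons):
    """Mean distance in bp between in-frame hits for random, equal-frequency bases."""
    probability = Fraction(n_matching_codons, CODONS)
    return 3 / probability  # codons between hits, times 3 bp per codon


if __name__ == "__main__":
    print("ATG:", float(expected_spacing_bp(1)), "bp")
    print("STOP:", float(expected_spacing_bp(len(STOP_CODONS))), "bp")
```

Under these assumptions the exact values are 192 bp for ATG and 64 bp for a stop codon, close to the rounded figures of about 200 bp and about 63 bp on the card.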
Characteristics of ORFs (Finding Protein-Coding Genes)
- Underrepresentation of ATG and STOP codons.
- An ATG followed by more than 63 bp without a STOP codon.
- Codon usage bias (certain codons are used more frequently).
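The second characteristic above can be sketched as a scan of one reading frame for an ATG that runs unusually long before a stop codon. The function name `find_orfs`, its parameters, and the example sequence are hypothetical; real gene finders combine this with the other signals in this deck.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame=0, min_length=63):
    """Return (start, end) spans running from an ATG to the first in-frame stop.

    Only spans longer than `min_length` bp are kept. `end` is exclusive and
    includes the stop codon.
    """
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            if i + 3 - start > min_length:
                orfs.append((start, i + 3))
            start = None
    return orfs


if __name__ == "__main__":
    candidate = "ATG" + "GCT" * 25 + "TAA"  # made-up 81 bp sequence
    print(find_orfs(candidate))
```

The 63 bp default mirrors the chance stop-codon spacing quoted in this deck; it is a threshold for illustration, not a recommended setting.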
Six Reading Frames (Finding Protein-Coding Genes)
Each stretch of DNA has six possible reading frames (three on each strand, + and -).
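A short sketch of the six frames: three offsets on the given strand and three on its reverse complement. The frame labels (+1 to +3, -1 to -3) and function names are illustrative choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Map frame labels (+1..+3, -1..-3) to the list of codons in that frame."""
    frames = {}
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames[f"{sign}{offset + 1}"] = codons
    return frames


if __name__ == "__main__":
    for label, codons in six_frames("ATGCCCTAA").items():
        print(label, codons)
```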
Gene Orientation (Finding Protein-Coding Genes)
Genes can be on either strand of the DNA (+ or - strand).
Splice Sites (Finding Protein-Coding Genes)
Splice donor and acceptor sites tend to be near each other more often than expected.
Regulatory Elements (Finding Protein-Coding Genes)
- CAAT box and TATA box often appear near each other.
- Poly-A signal (AATAAA) is also near other regulatory elements.
Regulatory elements are binding sites for transcription factors, which are involved in gene regulation
Computational Tools (Finding Protein-Coding Genes)
Used to detect patterns such as codon usage, splice sites, and regulatory sequences, and analyze their proximity to each other.
Mutations in noncoding DNA are generally assumed to be
Neutral
Within coding regions, the ratio of synonymous to nonsynonymous mutations is
Higher than expected
Non-synonymous mutations
Usually harmful and eliminated by natural selection
Synonymous mutations
Silent and persist
Computational tools
Detect the type of mutation (nonsynonymous or synonymous)
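A minimal sketch of how a tool might tell the two mutation types apart: translate the codon before and after the change and compare the amino acids. It uses the standard genetic code; the function name and the example codons are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_mutation(ref_codon, alt_codon):
    """Label a codon change by whether it alters the encoded amino acid."""
    if ref_codon == alt_codon:
        return "no change"
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    print(classify_mutation("CTT", "CTC"))  # leucine to leucine
    print(classify_mutation("GAA", "GTA"))  # glutamate to valine
```

Counting both labels across a coding region gives the synonymous-to-nonsynonymous ratio described in the cards above.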