Genomics Flashcards
PCR
Polymerase chain reaction (PCR) is a method widely used to rapidly make millions to billions of copies of a specific DNA sample, allowing scientists to take a very small sample of DNA and amplify it to a large enough amount to study in detail.
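As a rough illustration of why a few dozen cycles are enough to reach "millions to billions" of copies, the sketch below assumes an idealized reaction in which every template is copied once per cycle. The function name `pcr_copies` and the `efficiency` parameter are illustrative assumptions, not part of any standard library.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected number of copies after a given number of PCR cycles.

    Assumption: each cycle multiplies the template count by (1 + efficiency),
    so efficiency=1.0 is perfect doubling. Real reactions plateau, which this
    idealized model ignores.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a single template molecule under perfect doubling.
for n in (10, 20, 30):
    print(n, "cycles:", pcr_copies(1, n), "copies")
```

Under this idealization, 20 cycles give about a million copies and 30 cycles about a billion.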
Microsatellites
A microsatellite is a tract of repetitive DNA in which certain DNA motifs (ranging in length from one to six or more base pairs) are repeated, typically 5–50 times. Microsatellites occur at thousands of locations within an organism’s genome. https://www.youtube.com/watch?v=lQSi84xFsrY
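A minimal sketch of how such tracts could be located in a sequence, assuming a plain string of A/C/G/T, motifs of 1 to 6 bp, and at least 5 tandem copies. The function name `find_microsatellites` and the threshold `min_repeats` are hypothetical choices made for this example.

```python
import re


def find_microsatellites(seq, min_repeats=5):
    """Return (start, motif, copies) for each tandem repeat found.

    The motif group is lazy, so the shortest repeating unit is reported
    (e.g. "CA" rather than "CACA").
    """
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_microsatellites("GG" + "CA" * 6 + "TT"))
```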
RFLP and AFLP
RFLP: In molecular biology, restriction fragment length polymorphism (RFLP) is a technique that exploits variations in homologous DNA sequences, known as polymorphisms, in order to distinguish individuals, populations, or species or to pinpoint the locations of genes within a sequence.
AFLP: Amplified fragment length polymorphism (AFLP) is a PCR-based technique that uses selective amplification of a subset of digested DNA fragments to generate and compare unique fingerprints for genomes of interest.
Sanger Sequencing
Sanger sequencing, also known as the “chain termination method”, is a method for determining the nucleotide sequence of DNA.
Single Nucleotide Polymorphisms
Single nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single DNA building block, called a nucleotide.
Genome
The total DNA of an organism
(human genome: about 3.2 billion base pairs)
(about 20,000 genes; the gene is the unit of heredity)
Allele
Different versions of the same gene (differing in nucleotide sequence), which may code for slightly different versions of a protein
Homozygous allele
Two copies of the same allele at a locus (e.g., TT or tt)
Heterozygous allele
Two different alleles at a locus (e.g., Tt)
Phenotype
Physical manifestation of a genotype
Genotype
The two alleles a person has for a given gene
Genetic Markers
Polymorphic regions of the genome
Allozymes
Enzymes coded by different alleles of the same gene
Locus
In genetics, a locus is a specific, fixed position on a chromosome where a particular gene or genetic marker is located.
Recombination
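Exchange of DNA between homologous chromosomes during meiosis, which shuffles the alleles inherited from the two parents into new combinations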
Mutation
Random changes in DNA or “copying errors”
Microevolution
Change below the level of species
Macroevolution
Change above the level of species
Intron
Sequences that are present in the primary transcript synthesized by RNA polymerase but will not be part of the mature RNA. They are removed before the mature mRNA leaves the nucleus.
Exon
The remaining regions of the transcript, which include the protein-coding regions. They are spliced together to produce the mature mRNA.
Synonymous mutation
The mutated codon codes for the same amino acid, so the protein sequence is unchanged
Missense mutation
The mutated codon codes for a different amino acid, which may change the protein’s structure or function and therefore the phenotype
Nonsense mutation
The mutated codon becomes a termination (stop) codon, producing a shortened final protein.
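The three mutation classes above can be told apart by translating the codon before and after the change. The sketch below assumes the standard genetic code and DNA codons written with T; the names `CODON_TABLE` and `classify_mutation` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" is a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_mutation(ref_codon, alt_codon):
    """Classify a codon change as synonymous, nonsense or missense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Simplification: a stop codon changing to an amino acid also lands here.
    return "missense"


print(classify_mutation("GAA", "GAG"))  # Glu -> Glu
print(classify_mutation("GAA", "GTA"))  # Glu -> Val
print(classify_mutation("TGG", "TGA"))  # Trp -> stop
```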
Evolution
Change of the genetic composition of populations (change of allele frequencies over time)
Transitions
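Substitutions within the same base class (purine to purine or pyrimidine to pyrimidine): C ↔ T or G ↔ A
Transversions
Substitutions between a purine and a pyrimidine: C ↔ G, C ↔ A, T ↔ G or T ↔ A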
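The two classes follow from whether both bases belong to the same chemical family (purines A and G, pyrimidines C and T). A minimal sketch, with the hypothetical helper name `substitution_type`:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Return "transition" or "transversion" for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_family = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_family else "transversion"


print(substitution_type("C", "T"))
print(substitution_type("C", "G"))
```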
Fixation
A mutation is fixed when it reaches a frequency of 100% in a population, substituting for (replacing) the former allele
Homoplasy
When a trait is gained or lost independently in separate lineages (parallel, convergent, reversal evolution)
Saturation
Genetic saturation is the result of multiple substitutions at the same site in a sequence, or identical substitutions in different sequences, such that the apparent sequence divergence is lower than the actual divergence that has occurred.
Degenerate positions
A position of a codon is said to be a twofold degenerate site if only two of four possible nucleotides at this position specify the same amino acid. For example, the third position of the glutamic acid codons (GAA, GAG) is a twofold degenerate site.
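One way to check the degeneracy of a codon position is to try all four nucleotides there and count how many keep the amino acid unchanged. The sketch assumes the standard genetic code; `degeneracy` is an illustrative name, and a count of 1 means the site is nondegenerate.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" is a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def degeneracy(codon, position):
    """Count nucleotides at `position` (0, 1 or 2) that encode the same amino acid."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    count = 0
    for base in BASES:
        variant = codon[:position] + base + codon[position + 1:]
        if CODON_TABLE[variant] == amino_acid:
            count += 1
    return count


print(degeneracy("GAA", 2))  # third position of a glutamic acid codon
print(degeneracy("GGA", 2))  # third position of a glycine codon
```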
Substitution rate
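r = K/(2T), where K is the number of substitutions per site between two homologous sequences and T is the time since they diverged; the factor 2 accounts for substitutions accumulating along both lineages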
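A small worked example of the formula, assuming the usual reading in which K is substitutions per site between two sequences and T is their divergence time in years. The function name and the input values are made up for illustration.

```python
def substitution_rate(k, t_years):
    """Substitutions per site per year: r = K / (2T)."""
    if t_years <= 0:
        raise ValueError("divergence time must be positive")
    return k / (2.0 * t_years)


# Hypothetical inputs: 0.1 substitutions per site, 5 million years of divergence.
print(substitution_rate(0.1, 5e6))
```

With these invented inputs the rate is about 1e-8 substitutions per site per year.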
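Jukes-Cantor
Simplest nucleotide substitution model: all four bases are equally frequent and every substitution occurs at the same rate. Corrected distance: d = -(3/4) ln(1 - (4/3)p), where p is the observed proportion of differing sites.
Kimura 2-parameter (K2P)
Like Jukes-Cantor, but transitions and transversions occur at different rates. Corrected distance: d = -(1/2) ln[(1 - 2P - Q) √(1 - 2Q)], where P and Q are the observed proportions of transitional and transversional differences.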
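Both models are commonly used to correct an observed proportion of differences for saturation (multiple substitutions at the same site). The sketch below uses the standard textbook distance formulas; the function names are illustrative, and the inputs are observed proportions rather than counts.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance from the observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(p_transitions, q_transversions):
    """Kimura two-parameter distance from observed proportions P and Q."""
    a = 1 - 2 * p_transitions - q_transversions
    b = 1 - 2 * q_transversions
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(a * math.sqrt(b))


# Hypothetical alignment: 10% of sites differ, 6% transitions and 4% transversions.
print(jukes_cantor(0.10))
print(kimura_2p(0.06, 0.04))
```

Both corrected distances come out slightly larger than the raw 10%, which is the point of the correction.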
Pseudogenes
Genes that have lost function, and are useful for studying substitution rates
Phylogenetics
Study of the evolutionary relationships among organisms or genes
Nodes
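Taxonomic units on a phylogenetic tree (terminal nodes are the sampled taxa; internal nodes are inferred ancestors)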
Indel
An insertion or deletion of nucleotides, which appears as a gap in a sequence alignment
Homology
Homology means that two sequences, or even two characters within sequences, descend from a common ancestor. Homology is all or nothing; two genes can’t be 50% homologous.
Ortholog
Homologous genes that have diverged from each other after speciation events (e.g., human β-globin and chimp β-globin).
Paralog
Homologous genes that have diverged from each other after gene duplication events (e.g., β-globin and γ-globin).
Xenolog
Homologous genes that have diverged from each other after lateral gene transfer events (e.g., antibiotic resistance genes in bacteria).
Homolog
Genes that are descended from a common ancestor (e.g., all globins).
Evolutionary forces (selection is non-neutral; migration, mutation and drift are neutral)
Selection
Migration (gene flow)
Mutation
Genetic drift
Hardy-Weinberg equilibrium
A population is in Hardy-Weinberg equilibrium when its allele and genotype frequencies do not change from generation to generation.
Criteria
- Population must be large
- No mutations
- No migrations
- Mating must be random
- No natural selection can occur
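When the criteria above hold, the expected genotype frequencies at a locus with two alleles follow from the allele frequencies as p², 2pq and q². A minimal sketch, with the illustrative name `hardy_weinberg` and allele labels A and a chosen for the example:

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for allele A at frequency p (biallelic locus)."""
    if not 0 <= p <= 1:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Hypothetical allele frequency p = 0.7.
for genotype, freq in hardy_weinberg(0.7).items():
    print(genotype, round(freq, 2))
```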
Population
Monospecific group of individuals that reproduces randomly and shares the same space during at least one part of their life cycle
F_is = 0
Population at Hardy-Weinberg Equilibrium (HWE) at a given locus
F_is > 0
Deficiency in heterozygotes in comparison to HWE (fewer heterozygotes than expected according to HWE)
F_is < 0
Excess in heterozygotes in comparison to HWE (more heterozygotes than expected)
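The three cases above can be reproduced numerically using the common definition F_is = 1 - Ho/He, where Ho is the observed and He the HWE-expected heterozygosity. That formula is not stated in these cards and is assumed here; the sketch handles one biallelic locus and the genotype counts are invented.

```python
def f_is(n_AA, n_Aa, n_aa):
    """Inbreeding coefficient F_is from genotype counts at one biallelic locus."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no individuals sampled")
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    expected_het = 2 * p * (1 - p)    # He under Hardy-Weinberg
    observed_het = n_Aa / n           # Ho
    if expected_het == 0:
        raise ValueError("locus is monomorphic; F_is is undefined")
    return 1 - observed_het / expected_het


print(f_is(25, 50, 25))   # matches HWE
print(f_is(40, 20, 40))   # heterozygote deficiency
print(f_is(0, 100, 0))    # heterozygote excess
```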
Genetic drift
Random evolution of allele frequencies from generation to generation
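Drift is often illustrated with a Wright-Fisher style simulation, in which each generation's gene copies are drawn at random from the previous generation's allele frequency. This is a sketch under that assumed model (diploid, constant size, no selection, mutation or migration); the function name and parameters are illustrative.

```python
import random


def wright_fisher(p0, pop_size, generations, seed=None):
    """Simulate the frequency of one allele over time under pure drift."""
    rng = random.Random(seed)
    gene_copies = 2 * pop_size  # diploid individuals
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each gene copy independently inherits the allele with probability p.
        count = sum(1 for _copy in range(gene_copies) if rng.random() < p)
        p = count / gene_copies
        trajectory.append(p)
    return trajectory


# Small populations drift faster than large ones.
print(wright_fisher(0.5, pop_size=10, generations=20, seed=1))
```

Running it with a larger `pop_size` should show much smaller fluctuations per generation.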
Bottleneck effect
Reduction in genetic diversity due to strong genetic drift induced by a drastic decrease in the size of a population (ex: disease, natural disaster, overexploitation)
Founder effect
Reduced genetic diversity in a new population established by a few “founder” organisms whose alleles did not represent the original population they came from
Metapopulation
Group of spatially separated sub-populations, with independent dynamics, and interconnected by possible migration.
Fitness
Fitness = Viability x Fertility
Viability: Probability of reaching a reproductive age
Fertility: Number of offspring reaching reproductive age
Directional selection
Selection of the extreme value of a phenotypic trait (positive selection)
Stabilizing selection
Selection against the extreme values of a phenotypic trait (purifying or negative selection)
Disruptive selection
Selection against the intermediate values of a phenotypic trait (diversifying selection)
Balancing selection
Selection pressure resulting in active maintenance of multiple alleles in the population.
Wright’s F-statistics
The correlation between genes drawn at different levels of a (hierarchically) subdivided population. The measures FIS, FST, and FIT are related to the amounts of heterozygosity at various levels of population structure. Together, they are called F-statistics, and are derived from F, the inbreeding coefficient.
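The link to heterozygosity can be made concrete with the usual definitions, assumed here rather than taken from the card: with H_I the mean observed heterozygosity of individuals, H_S the mean expected heterozygosity within subpopulations, and H_T the expected heterozygosity of the total population, F_IS = (H_S - H_I)/H_S, F_ST = (H_T - H_S)/H_T and F_IT = (H_T - H_I)/H_T. The input values below are invented.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from heterozygosities at three levels."""
    if h_s <= 0 or h_t <= 0:
        raise ValueError("expected heterozygosities must be positive")
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it


f_is, f_st, f_it = f_statistics(h_i=0.3, h_s=0.4, h_t=0.5)
print(f_is, f_st, f_it)
# The three measures are tied together by (1 - F_IT) = (1 - F_IS)(1 - F_ST).
print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)
```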
Gene flow
Exchange of genetic material between populations
Migration
Exchange of individuals between populations
Population genetic structure depends on:
- Effective population size
- Population divergence rate
- Migration rate
- Natural selection (for some loci)
Panmixia
Random mating within a breeding population
Polymorphic
A gene is said to be polymorphic if more than one allele occupies that gene’s locus within a population. In addition to having more than one allele at a specific locus, each allele must also occur in the population at a rate of at least 1% to generally be considered polymorphic.
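The 1% rule can be checked directly from allele counts. The sketch below reads the rule as "at least two alleles each reach the threshold frequency", which is an interpretation; the function name and the example counts are invented.

```python
def is_polymorphic(allele_counts, min_freq=0.01):
    """True if at least two alleles occur at a frequency of min_freq or more."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles sampled")
    common = [a for a, count in allele_counts.items() if count / total >= min_freq]
    return len(common) >= 2


print(is_polymorphic({"A": 900, "a": 100}))  # second allele at 10%
print(is_polymorphic({"A": 995, "a": 5}))    # second allele at 0.5%
```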