Population and Comparative Genomics Flashcards
what is population genomics?
gives a comprehensive picture of genetic variation within species by looking at whole genomes
what features can we characterize using population genetics?
- demography
- natural selection (purifying, adaptive, balancing)
what is the first stage of gathering population genetics data and what does it entail?
- hypothesis/query
- need to know what you want to find out
what is the second stage of gathering population genetics data and what does it entail?
- sample collection and DNA extraction
- choose 100s/1000s of individuals to sample
- choose geographic/habitat of interest
- extract genomic DNA
what is the third stage of gathering population genetics data and what does it entail?
- genome sequencing
- sequence the DNA; each read comes from a section of the genome
- want lots of reads
- obtain sequence data at 5-40x coverage
- sequence the genome using ‘short’ read technology
- main issue here is cost
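A minimal sketch of the coverage arithmetic behind the "5-40x" figure: mean coverage is the total number of sequenced bases divided by the genome size. The function names and the example numbers (150-base reads, a 3.1 billion base genome) are illustrative assumptions, not values from these notes.

```python
import math

def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return n_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

# Hypothetical example: one million 100-base reads over a 10 Mb genome.
print(mean_coverage(1_000_000, 100, 10_000_000))

# Hypothetical example: 30x coverage of a 3.1 Gb genome with 150-base reads.
print(reads_needed(30, 150, 3_100_000_000))
```

Because the number of reads scales with both coverage and the number of individuals, this is where the cost issue comes from.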
what is the fourth stage of gathering population genetics data and what does it entail?
- read mapping and ‘variant calling’
- locate genetic variants (sites of the genome that differ)
- find where each read matches to the genome
- looking for polymorphisms
- use SNPs and indels
- can map sequence reads to a reference genome and identify sites that differ
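The idea of comparing mapped reads to a reference can be sketched as a toy "pileup". This assumes the reads are already mapped (each comes with a 0-based start position), handles SNPs only, and uses a made-up `min_support` threshold; real variant callers also handle indels, base qualities, and diploid genotypes.

```python
from collections import Counter, defaultdict

def call_snps(reference, mapped_reads, min_support=2):
    """Return (position, ref_base, alt_base, read_count) for sites that differ.

    mapped_reads is a list of (start, sequence) pairs with 0-based starts.
    """
    pileup = defaultdict(Counter)
    for start, seq in mapped_reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1  # count each base seen per site

    variants = []
    for pos in sorted(pileup):
        ref_base = reference[pos]
        for base, count in pileup[pos].items():
            # Require several reads to agree, so single read errors are ignored.
            if base != ref_base and count >= min_support:
                variants.append((pos, ref_base, base, count))
    return variants

# Hypothetical data: three reads carry a T where the reference has an A.
reference = "ACGTACGTAC"
reads = [(0, "ACGTT"), (2, "GTTCG"), (3, "TTCGT"), (5, "CGTAC")]
print(call_snps(reference, reads))
```

Requiring support from more than one read is one reason higher coverage gives more reliable variant calls.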
what is the fifth stage of gathering population genetics data and what does it entail?
- segregating genetic variants
- as a result of read mapping you want a list of positions that vary
- alleles/polymorphisms/variants
what is the sixth stage of gathering population genetics data and what does it entail?
- analysis
- analyse variant sites together with trait data to determine which alleles have an effect on a particular trait
- describing demography
- detecting selection
- quantitative genetics like GWAS
what is Sanger sequencing?
- small scale (not high throughput)
- technology of choice for low-medium output sequencing
- can use it for one gene
what is Illumina?
- produces vast numbers of reads
- much quicker, short lengths of sequences
- technology of choice for genome re-sequencing
what is PacBio?
- Pacific Biosciences
- produces larger reads
- fairly accurate
- one technology of choice for genome assemblies
what is Oxford Nanopore?
- produces very long reads (up to 40,000 nucleotides long)
- advancing fast but more expensive
- has the worst error rate
what is meant by demography?
- estimates of population size (can also estimate population size backwards through time)
- population structure (which individuals are more or less closely related)
- migration and ‘gene flow’ between populations
- inbreeding/outbreeding rates
what is selection in population genetics?
identifying which regions of the genome are subject to selection, e.g. strong purifying selection (which removes bad mutations)
what is an example of quantitative genetics?
GWAS: which alleles contribute to traits
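A toy sketch of the GWAS idea: for each variant site, compare the average trait value of individuals with and without the allele. The data and the function name `allele_effects` are hypothetical; a real GWAS fits a statistical model per site and corrects for multiple testing and population structure.

```python
from statistics import mean

def allele_effects(genotypes, trait):
    """Difference in mean trait between carriers and non-carriers, per site.

    genotypes: one row per individual, one column per site (0 or 1).
    trait: one trait value per individual, in the same order.
    """
    effects = []
    for site in zip(*genotypes):  # iterate over columns (sites)
        carriers = [t for g, t in zip(site, trait) if g == 1]
        others = [t for g, t in zip(site, trait) if g == 0]
        if not carriers or not others:
            effects.append(None)  # site does not vary in this sample
        else:
            effects.append(mean(carriers) - mean(others))
    return effects

# Hypothetical data: four individuals, three sites, one trait (height in cm).
genotypes = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
heights = [180, 182, 170, 168]
print(allele_effects(genotypes, heights))
```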
how are demography, selection and quantitative genetics interrelated?
- expanding and shrinking population sizes affect selection
what is the concept of genetic diversity in population genetics?
different regions of a genome carry different amounts of diversity
what are polymorphisms/alleles/variants?
- sites in the genome that differ between individuals of a species
what are SNPs?
- single nucleotide polymorphisms
- these are the most common
what are indels?
- small insertions or deletions
what is the human genome mostly composed of?
transposons
what are examples of structural variants?
duplications, rearrangements, large insertions/deletions
what is the initial origin of variation?
a mutation in one individual
- all polymorphisms start with a single mutation in the population
how can polymorphisms move?
through space and time within a population
- their frequency will change
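A minimal sketch of how a variant's frequency can change through time by chance alone. It assumes a simple model that is not in these notes: a diploid population of constant size where each generation's allele copies are drawn at random from the previous generation, starting from a single new mutation and with no selection.

```python
import random

def simulate_drift(pop_size, generations, seed=0):
    """Track the frequency of one new mutation under random sampling only."""
    rng = random.Random(seed)
    n_copies = 2 * pop_size      # assumption: diploid, so 2N allele copies
    freq = 1 / n_copies          # a single new mutation in one individual
    trajectory = [freq]
    for _ in range(generations):
        # Each copy in the next generation carries the allele with prob. freq.
        count = sum(rng.random() < freq for _ in range(n_copies))
        freq = count / n_copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):   # allele lost or fixed, nothing more changes
            break
    return trajectory

trajectory = simulate_drift(pop_size=50, generations=200, seed=1)
print(len(trajectory), trajectory[-1])
```

Running this with different seeds shows trajectories that wander up and down; many end at a frequency of 0, meaning the new variant was lost.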
how will a polymorphism occur in a population?
- start with two separated populations
- an allele gets across from one population to the other
- the mutation is now shared
- over time its frequency can increase
what are most mutations?
- neutral
- deleterious and therefore lost
what happens if variants are physically linked on the chromosome?
they tend to travel together but can become unlinked through recombination
what is the concept of GWAS?
- genome-wide association study
- lots of data is in a matrix (0s and 1s)
- want to use summary statistics - summarising information in one number
- average pairwise similarity
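A sketch of one such summary statistic on a 0/1 matrix, assuming rows are sampled sequences and columns are variant sites. It counts the average number of pairwise differences, which is the complement of pairwise similarity; the function name and data are illustrative.

```python
from itertools import combinations

def average_pairwise_differences(haplotypes):
    """Mean number of sites that differ between two sampled sequences."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(
        sum(a != b for a, b in zip(h1, h2))  # differences within one pair
        for h1, h2 in pairs
    )
    return total / len(pairs)

# Hypothetical data: 4 sequences (rows) by 4 sites (columns).
haplotypes = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
print(average_pairwise_differences(haplotypes))
```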
what does S stand for?
the number of segregating sites
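A minimal sketch of counting S, assuming the data are a 0/1 matrix with one row per sampled sequence and one column per site. A site is segregating if more than one allele is present in its column.

```python
def count_segregating_sites(haplotypes):
    """Number of columns (sites) where the sample carries more than one allele."""
    return sum(1 for site in zip(*haplotypes) if len(set(site)) > 1)

# Hypothetical data: the last site is identical in every sequence.
haplotypes = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
print(count_segregating_sites(haplotypes))
```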
what is MAF?
- minor allele frequency
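A short sketch of MAF for a single site with two alleles, coded as 0 and 1 (an assumed encoding). Whichever allele is rarer in the sample is the minor allele, so the result can never exceed 0.5.

```python
def minor_allele_frequency(site):
    """Frequency of the rarer of the two alleles at one site (coded 0/1)."""
    freq_of_one = sum(site) / len(site)
    return min(freq_of_one, 1 - freq_of_one)

# Hypothetical site: three copies of allele 1 and one copy of allele 0.
print(minor_allele_frequency([1, 1, 0, 1]))
```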
what is DAF?
derived allele frequency (frequency of the new allele in the population)
- need to know the ancestral genome
- derived alleles are usually rare as they tend to get lost, so a high DAF suggests adaptation
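A sketch of why DAF needs the ancestral state. The ancestral allele is passed in as a parameter here; how it is obtained (for example from a related species) is an assumption outside these notes.

```python
def derived_allele_frequency(alleles, ancestral):
    """Fraction of sampled alleles at one site that differ from the ancestral allele."""
    derived = sum(1 for allele in alleles if allele != ancestral)
    return derived / len(alleles)

# Hypothetical site sampled in four sequences.
site = ["A", "A", "G", "A"]

# If A is ancestral, G is the new (derived) allele and it is rare.
print(derived_allele_frequency(site, ancestral="A"))

# If G is ancestral, A is derived and it is common.
print(derived_allele_frequency(site, ancestral="G"))
```

The minor allele frequency of this site is the same in both cases; only the DAF depends on which allele is ancestral.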
what is the concept of Tajima's D?
describes whether you have more or less rare alleles than expected
what happens if you have a negative Tajima's D?
- have more rare alleles than you expect
- happens when there's a selective sweep (a new mutation spreads throughout the population)
- or expanding population
what happens if you have a positive Tajima's D?
- too few rare alleles
- signal of balancing selection
- shrinking population
- population structure
what happens if Tajima's D = 0?
neutrally evolving, stable population
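A sketch of how Tajima's D can be computed from a 0/1 matrix (rows are sampled sequences, columns are sites, an assumed layout). It compares the average number of pairwise differences with the number of segregating sites S scaled by a sample-size constant, then divides by the standard deviation expected under neutrality, using the constants from Tajima's original definition.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(haplotypes):
    """Tajima's D for a sample of 0/1 sequences (needs n >= 4 and S > 0)."""
    n = len(haplotypes)
    s = sum(1 for site in zip(*haplotypes) if len(set(site)) > 1)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")

    # Average number of pairwise differences between sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimates, scaled by its expected SD.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical sample where every variant is carried by a single sequence.
rare = [[0, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0]]
# Hypothetical sample where every variant is at intermediate frequency.
common = [[0, 0], [0, 0], [1, 1], [1, 1]]
print(tajimas_d(rare), tajimas_d(common))
```

The first sample has an excess of rare alleles and gives a negative D; the second has too few rare alleles and gives a positive D.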
what is the concept of population structure?
- when some individuals are more likely to breed with each other than with another set of individuals
- can see this through genomes