genome variation Flashcards
what does human genetic variation show?
we are a young species
human evolution is characterised by constriction and frequent bottlenecks
why are there fixed mutations in the human genome?
certain selective pressures
what is the result of genetic bottlenecks?
reduced diversity - by chance, a diverse population shrinks to a small, less diverse one, and when it re-expands the new population carries only that reduced diversity
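A rough way to see why a bottleneck reduces diversity is the standard drift approximation H_t = H_0 * (1 - 1/(2N))^t, where H is heterozygosity, N is the population size and t is the number of generations. This formula and all the numbers below are illustrative assumptions, not values from these cards.

```python
def expected_heterozygosity(h0, population_size, generations):
    """Expected heterozygosity after drift at a constant population size.

    Uses the textbook approximation H_t = H_0 * (1 - 1/(2N))^t.
    All inputs are hypothetical illustration values.
    """
    return h0 * (1 - 1 / (2 * population_size)) ** generations


# Same starting diversity, ten generations, two population sizes.
h_large = expected_heterozygosity(0.5, 10_000, 10)  # large population
h_small = expected_heterozygosity(0.5, 10, 10)      # bottlenecked population

print(f"large population: {h_large:.4f}")
print(f"bottleneck:       {h_small:.4f}")
```

The smaller population loses heterozygosity much faster, which is the reduced diversity described in the answer.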
what causes genetic bottlenecks?
speciation, migration, environment and disease
what is the result of mutation?
it promotes diversity
how can we work out how fast individuals mutate?
sequence the offspring and the parents - there are 50-100 new mutations in the offspring that are not present in either parent
what happens as paternal age increases?
the mutation rate increases, so there is a higher chance of passing on mutated sperm - some dominant genetic diseases arise from new mutations during spermatogenesis, and the chance of this occurring depends on the age of the father
how commonly do mutations occur in each generation?
a chance of about 10^-8 per position per haploid genome per generation - this is the chance that a given position mutates each time an individual has offspring - around 70 new mutations in each diploid genome
where are mutations higher?
mutations are higher in the testes and ovaries than the mutation rates for inherited disease
how many base pairs are in the haploid genome?
3x10^9
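The per-position rate and the haploid genome size on these cards can be multiplied to check the quoted number of new mutations. This is a back-of-the-envelope sketch: the function name is made up for illustration, and the result (about 60) is only expected to land in the same range as the 50-100 and "around 70" figures above.

```python
MUTATION_RATE = 1e-8      # per position per haploid genome per generation (from the cards)
HAPLOID_GENOME_BP = 3e9   # base pairs in the haploid genome (from the cards)


def expected_new_mutations(rate, haploid_bp, ploidy=2):
    """Expected number of new mutations per genome per generation."""
    return rate * haploid_bp * ploidy


per_haploid = expected_new_mutations(MUTATION_RATE, HAPLOID_GENOME_BP, ploidy=1)
per_diploid = expected_new_mutations(MUTATION_RATE, HAPLOID_GENOME_BP)

print(f"per haploid genome: about {per_haploid:.0f}")
print(f"per diploid genome: about {per_diploid:.0f}")
```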
what are the origins of mutation?
mutation is due to failure to correct the errors that usually occur during replication but can also be down to exogenous factors
what are uncorrected errors caused by?
exogenous and endogenous factors (segregation, recombination, DNA replication and inadequate DNA repair mechanisms)
where are more mutations accumulated?
where there is poorly conserved DNA - 90% of our DNA is poorly conserved
what are the two classes of mutation?
variation that does not change the DNA content - the number of nucleotides is unchanged - single nucleotide replacement or a balanced translocation or inversion
variation that results in a net loss or gain of DNA sequence - large (chromosome) or small (single nucleotide)
what is neutral variation?
most DNA changes are small scale and have no obvious effect on the phenotype
what types of variation are there and how common are these?
there are single nucleotide variations - rare variants have a frequency below 1% and single nucleotide polymorphisms have a frequency above 1%
what are restriction fragment length polymorphisms?
they arise when a polymorphism or indel falls within the target site of a restriction nuclease, changing the lengths of the fragments produced by digestion
what are indels?
they are insertions or deletions of one or more nucleotides
what are CNVs?
they are copy number variants - technically large indels
what are the most common variations?
SNVs (including SNPs) - 90%
small indels (1-10 nucleotides) - 9%
large indels (10-100 nucleotides) - 0.9%
CNVs (>100 nucleotides) - 0.1%
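The size classes above can be written as a small classifier. This is a minimal sketch: `classify_variant` is a hypothetical helper, and because the cards leave the boundaries ambiguous, it assumes that a change of exactly 10 nucleotides counts as a small indel and exactly 100 as a large indel.

```python
def classify_variant(ref, alt):
    """Classify a variant by the size classes on the cards.

    ref and alt are the reference and alternate allele sequences.
    Boundary handling (10 -> small, 100 -> large) is an assumption.
    """
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNV"
    size = abs(len(ref) - len(alt))  # net nucleotides gained or lost
    if size == 0:
        return "no net change in length"
    if size <= 10:
        return "small indel"
    if size <= 100:
        return "large indel"
    return "CNV"


print(classify_variant("C", "T"))             # single nucleotide replacement
print(classify_variant("A", "ATG"))           # 2 nucleotides inserted
print(classify_variant("A", "A" + "T" * 50))  # 50 nucleotides inserted
print(classify_variant("A" * 500, "A"))       # 499 nucleotides deleted
```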
what is a polymorphism?
it is any variant that is present at a frequency of over 1%
what is the effect of rare alleles that cause Mendelian diseases?
they have a large impact on gene function and are therefore rare, because these alleles are selected against in evolution
what is the characteristic of a common genetic variation?
they have low or no impact on gene function
what is the frequency of variants within a population?
it is determined by the functional impact - silent is common and detrimental is rare
what are the usual two alleles for SNPs?
C and T
how many possible genotypes are there for a diallelic SNP?
3
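The count of 3 comes from choosing two alleles with repetition allowed and order ignored. A short sketch, using the C and T alleles from the earlier card (the function name is illustrative):

```python
from itertools import combinations_with_replacement


def genotypes(alleles):
    """List the unordered genotypes that can be formed from the given alleles."""
    return ["".join(pair) for pair in combinations_with_replacement(alleles, 2)]


snp_genotypes = genotypes("CT")
print(snp_genotypes)       # two homozygotes and one heterozygote
print(len(snp_genotypes))
```

In general, n alleles give n(n+1)/2 genotypes, which is why a locus with many alleles has many more possible genotypes than a diallelic SNP.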
what is an SNP?
it occurs when a single nucleotide in the genome differs between individuals of the same species resulting in a slight change to DNA sequence
what is the minor allele frequency?
it is the frequency of the less common variant in a population
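As a worked example, the minor allele frequency can be computed from genotype counts and compared with the 1% threshold used on these cards. The genotype counts below are invented for illustration and the function names are hypothetical.

```python
def minor_allele_frequency(n_cc, n_ct, n_tt):
    """MAF for a diallelic C/T SNP from genotype counts (each person carries 2 alleles)."""
    total_alleles = 2 * (n_cc + n_ct + n_tt)
    freq_c = (2 * n_cc + n_ct) / total_alleles
    return min(freq_c, 1 - freq_c)


def is_polymorphism(maf, threshold=0.01):
    """True if the variant is more common than the 1% threshold."""
    return maf > threshold


# Hypothetical sample of 100 people: 81 CC, 18 CT, 1 TT.
maf = minor_allele_frequency(81, 18, 1)
print(f"MAF = {maf:.3f}, polymorphism: {is_polymorphism(maf)}")
```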
how often do SNPs with an MAF of >1% occur?
roughly every 300 bases
how can base pair substitutions be classified?
missense, nonsense or silent
how can we characterise these base pair substitutions based on their location?
intronic/intergenic variations - they are between genes or exons - do not directly affect the resulting protein but could affect the splicing or transcription regulation
coding exonic variants - missense (AA change), nonsense (stop-gain - truncation), silent (same AA coded for)
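The three classes of coding substitution can be checked against the standard genetic code. This is a minimal sketch: the table is built from the usual compact TCAG ordering, `classify_substitution` is a hypothetical helper, and cases such as a stop codon changing to an amino acid are not treated separately.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG; "*" is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change as silent, nonsense or missense."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"      # same amino acid coded for
    if alt_aa == "*":
        return "nonsense"    # stop-gain, truncated protein
    return "missense"        # amino acid change


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
```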
what are simple repeats?
they are part of normal human variation - there are lots of tandem repeats of dinucleotide or trinucleotide units
what are the complications of repeats?
they are unstable and they are prone to replication slippage resulting in mutations that are variable
how can STRs be used in fingerprinting?
there are a large number of possible sequence lengths and lots of different alleles - someone is likely to be heterozygous for two alleles of different lengths - it is very unlikely that two individuals will have the same combination of STR lengths across the regions tested
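To get a feel for why STR profiles are so distinctive, the number of possible genotype combinations can be counted. This is a sketch with made-up numbers: it assumes 10 alleles at each of 13 loci, and it only counts combinations rather than estimating how likely any one profile is, since real alleles are not equally common.

```python
from math import prod


def genotypes_per_locus(n_alleles):
    """Unordered pairs of alleles at one locus: n(n+1)/2."""
    return n_alleles * (n_alleles + 1) // 2


def profile_combinations(alleles_per_locus):
    """Number of distinct multi-locus genotype combinations."""
    return prod(genotypes_per_locus(n) for n in alleles_per_locus)


# Hypothetical panel: 13 STR loci with 10 length alleles each.
panel = [10] * 13
print(genotypes_per_locus(10))
print(f"{profile_combinations(panel):.2e} possible profiles")
```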
what are microsatellites comprised of?
simple sequence repeats, variable number tandem repeats and short tandem repeats
how would you type a simple repeat?
separation by size on a gel
what are the characteristics of repeats?
they are refractory to sequencing due to poor call rates, they can cause disease and they are subject to anticipation
what does the copy number variation map of the human genome show?
it documents the extent and the characteristics of CNVs among healthy people
where is there a particularly high rate of CNV variation?
subtelomeric regions and pericentromeric regions of the chromosome - they are distributed unevenly across the genome
what are CNVs?
they are large regions of the genome where the number of copies of that region varies between chromosomes in healthy individuals
what is the mechanism for losing and gaining in CNV?
recombination
give an example of a gene family where CNVs are enriched?
in gene families involved in certain functions - immune response, T cell receptor genes, olfactory receptors and drug metabolism are examples
where is there very little variation in copy number?
functional groups of proteins that are integral to cellular function - signalling groups
which genes have the most CNVs and which are least affected by CNVs?
most - paralogous genes
least - those genes involved in disease
how are T cell receptor genes and Ig genes adapted for their role?
they are encoded by a large family of very similar duplicated genes allowing them to promote diversity and increase defence against pathogens and toxins
what tends to be in CNS?
protein degradation, phosphorylation, signal transduction, transcriptional machinery and regulatory genes
what is negative selection?
it is when a strongly deleterious mutation is removed by natural selection therefore it is very rare
why may later onset disease be more common than neonatal?
the causative mutation may not be eliminated by purifying selection because its effect is not immediate - carriers can reproduce before the disease appears, so there is little neonatal or early developmental disease but more later onset disease
what is the result of selection and give an example?
reduced diversity - inverse correlation between skin pigmentation and latitude - lighter skin allows better vitamin D synthesis further from the equator
reduced diversity is apparent in regions of genome that have selective pressure on them
what is positive selection?
mutations that confer an advantage are selected for and increase in frequency
what is FOXP2 for?
the neural control of orofacial regions and vocalisation - highly conserved
what is an example of positive selection?
the salivary amylase gene (AMY1), responsible for synthesis of alpha amylase for starch digestion in saliva - populations with a high starch diet have a higher number of copies of the gene in this CNV region of the genome
where is most variation found?
within populations - only around 10% extra found between them
what variants are common to all populations?
those variants at a 10% frequency across combined samples
where will rare variants often be found?
single population