Genetics Pt 2 Flashcards
True or False: All diseases arise from a mutation to a single gene
False
True or false: HTS can detect genetic variants associated with disease using multiple samples
True
How much DNA does Forensic Casework typically use to detect loci
<1 nanogram
True or false: In forensic profiling cases, all markers are sequenced simultaneously
True
What is a genomic library
contains fragmentary inserts of DNA generated from a variety of processes
Includes elements required for sequencing
What is a gene
Unit of heredity which is transferred from a parent to offspring and is held to determine a characteristic of the offspring
What is an allele
One of two or more alternative forms of a gene that arise by mutation and are found at the same place in a chromosome
What is a genotype
The genetic constitution of an individual organism
What is a haplotype
a set of DNA variations on a chromosome that are inherited together because they are located close together
What is forensic DNA phenotyping
Prediction of the human appearance
What trait is a highly polymorphic phenotype in people of european descent
Eye colour
What was Irisplex developed to do and when was it developed
2009 - developed to predict eye colour from genetics
What were the methods in irisplex
Used the 6 most eye-colour-informative SNPs, which had previously shown prediction accuracies over 90% for blue and brown eye colour in Dutch Europeans
What was the major determining factor whether eye colour will be brown vs non-brown in the irisplex case
The rs12913832 (HERC2) with its AA/TT versus GG/CC homozygote genotypes
Can irisplex handle mixtures and complex dna
Yes
What is HIrisplex and when was it developed
2013 - Includes a single multiplex genotyping assay for 24 eye and hair colour predicting SNPs
How many prediction models does HIrisplex have
2 - one for hair colour and one for eye colour
How many SNPS does HIrisplex include
23 SNPS and 1 insertion/deletion polymorphism
Prediction accuracies for hair colour in HIrisplex model
69.5% for blond, 78.5% for brown, 80% for red and 87.5% for black
What is the first fully validated sequencing system designed for forensic genomics applications
Illumina MiSeq FGx
Explain steps of illumina
PCR cycles link the tags to copies of each target to form DNA templates consisting of the regions of interest flanked by universal tag primer sequences. (200 primer pairs)
The tags are used to attach indexed adapters (barcoded), which are then amplified using PCR, purified, pooled into a single tube, and then sequenced
The index sequences allow the sequencing system to separate and isolate the data generated from each sample (sample multiplexing)
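The sample-multiplexing idea in the last step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index sequences, the sample names, and the `demultiplex` helper are hypothetical, and the index is assumed to sit in the first 6 bases of each read, which is a simplification of how a real instrument reports indexes.

```python
from collections import defaultdict

# Hypothetical index-to-sample table (illustrative values only).
INDEX_TO_SAMPLE = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}


def demultiplex(reads, index_len=6):
    """Group pooled reads by sample using the index at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        # Reads whose index is not in the table are kept apart, not discarded.
        sample = INDEX_TO_SAMPLE.get(index, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTGGG", "NNNNNNACGT"]
result = demultiplex(pooled)
```

Here `result` maps each sample name to the reads that carried its index, so data from samples pooled in one tube can be separated again after sequencing.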
What do iiSNPs do
Identity-informative single nucleotide polymorphisms inform source (ideal for degraded samples)
What do piSNPs do
Phenotypic-informative SNPs estimate eye colour and hair colour
what do aiSNPs do
ancestry informative SNPs estimate biogeographical ancestry
HTS amplicon workflow four steps:
Library preparation (PCR)
Cluster generation
Sequencing
Data analysis
Amplicon Library preparation steps
Sequence-specific/universal-tagged primer PCR for each forensically relevant target sequence in the DNA sample.
Indexes and adapters are incorporated into the amplicons.
Amplicons are then purified, pooled, and linearized.
Cluster generation steps
fragments are bound to surface oligos complementary to the library adapters on the flow cell.
Each fragment is then amplified in distinct clonal clusters through bridge amplification.
Sequencing steps
method that detects single bases as they are incorporated into DNA template strands
When was the first effort at generating facial composites from DNA
2014 originally based off 24 SNPs
Challenges of Facial models
Extract information out of an evidentiary DNA sample
Convert this information to values
Create shape transformations from these values
Combine multiple shape transformations into a single facial composite
Three future avenues of research in DNA-based facial composites
Expanding knowledge on the genetic architecture of facial morphology
Improving the predictive modeling of facial morphology
Perceptual interpretation of the results
Legal and ethical issues of DNA facial composites
Not permitted in court as science not well established
Concerns over racial profiling - It seems possible that instead of making suspect searches more exact, the vagueness of FDP descriptions might make them more vulnerable to stereotyping (Edmonton Police Services)
Creates a “suspect population” and onus on individuals to provide their DNA to prove innocence
What are some limitations of DNA phenotyping studies
Incomplete genetic knowledge - some not understood
Prediction accuracy - accurate for simple traits
Population Bias - only european
Mixtures - hard to interpret
Data quality - degraded DNA
What assays are used in phenotyping and how do they work?
Irisplex
HIrisplex
Illumina
What is comparative genomics
Comparison of intra and interspecific genomic variation used to increase our understanding of evolution, genomic structure, and function of genes and proteins
What three topics does comparative genomics give insight to
Evolution – better understand the evolutionary and adaptive histories of organisms
Disease/Medicine/Health – better understand genes involved or associated with disease resistance, tolerance, or prevention; identify model organisms to test and better understand potential gene therapies
Conservation Biology, Biotechnology, Agriculture, Biomolecular Structure & Function (less focus here today)
Timeline of comparative genomics
1980s – genome sequences of viruses and organelles available (comparatively small genomes!)
1992 – first chromosomes of baker’s yeast and large bacterial genome fragments
1995 – complete genome of bacteria Haemophilus influenzae and Mycoplasma genitalium
1996 – archaeon genomes and first complete eukaryote (Saccharomyces cerevisiae)
1999 – the first genome of a multicellular eukaryote (Caenorhabditis elegans; nematode)
2001 – Human Genome Project First Draft
What is a phylogeny
representation of the evolutionary history and relationships between organisms
What is monophyletic
grouping of all organisms sharing a common ancestor
what is paraphyletic
a group of some, but not all, organisms sharing a common ancestor
What is polyphyletic
a group of organisms derived from more than one common ancestor
What is a homolog
genes with a common ancestry
what is a paralog
divergence of homologous genes due to duplication
What is an ortholog
divergence of homologous genes due to speciation
What is synteny
two genetic loci have been assigned to the same chromosome
What is collinearity
a particular type of synteny that preserves the same order of genes on a chromosome from a shared ancestor (recent genomic usage)
What is conserved synteny
the collection of orthologs within the same genomic region, regardless of order (recent genomic usage)
What does orthofinder do
infer the orthogroups for your species
infer a complete set of rooted gene trees
infer a rooted species tree
infer all orthology relationships between the genes using the gene trees
infer gene duplication events and cross references them to the corresponding nodes on the gene and species trees
provide comparative genomics statistics for your species
What is phylogenetics
Estimating relatedness between species in relation to observed sequence variation
What are conserved regions how can we use them
areas of high sequence similarity that have been preserved across distantly related organisms
putatively linked to important biological functions
We can use these regions to paint a picture of relatedness between organisms
Negative impacts of gene editing
Genotoxicity (while being inserted)
Gene silencing (after insertion)
Expression disruptions (after insertion)
Dysregulated cellular proliferation (after insertion; i.e. cancer)
What are genomic safe harbour sites
Regions of the genome where segments can be introduced without impacting typical cell functions
Applications of comparative genomics
Conservation biology
Agriculture
Biotechnology & Biomolecules
True or false: All tissues hold the same amount of RNA
False
What makes up the largest RNA family and how much of total RNA is it
rRNA: 28S, 18S and 5.8S (80-85%)
Challenges with working with RNA
working in a RNase-free environment
In contrast to DNases, RNases do not need any cofactors (like Mg2+), are extremely stable, and are highly reactive
RNases are everywhere and are produced by all organisms
Common sources of RNases
Body fluids (e.g. perspiration)
Dead cells (e.g. skin), or ‘finger-ases’
Tips and tubes
Contaminated solutions/buffers
Laboratory surfaces & equipment (glassware, centrifuges - especially those used in DNA extractions with solutions containing RNase, fridges, etc.)
Endogenous RNases
Intensity Ratio of RNA
28S rRNA band : 18S rRNA band ~ 2:1 intensity
If 18S is more intense than 28S what does this indicate
degradation
What is RT-PCR used for
Reverse transcription polymerase chain reaction (RT-PCR) is used to compare gene expression between samples
What do microarrays do
Monitors the level of each gene on the array
Microarray is a rectangular grid of spots printed on a glass microscope slide, where each spot contains DNA for a different gene
What do cDNA microarrays do
Isolate mRNA
Make cDNA by reverse transcription, using fluorescently labelled nucleotides
Apply the cDNA mixture to a microarray, a different gene in each spot. The cDNA hybridizes with any complementary DNA on the microarray
Rinse off excess cDNA; scan microarray for fluorescence. Each fluorescent spot represents a gene expressed in the tissue sample
Assume the DNA on the array is in excess of the hybridized sample, so the kinetics are linear and the spot intensity reflects the amount of hybridized sample
Limitations of microarrays
Reliance upon existing knowledge about the genome sequence
Designed to target protein-coding regions of DNA
Background noise is high (non-specific hybridizations)
Limited dynamic detecting range (highly detected transcripts versus lowly detected transcripts)
Require complicated normalization methods (to get rare transcripts)
What molecular features can only be seen at an RNA level
Alternative splicing isoforms, fusion transcripts.
Predicting transcript sequence from genome sequence is difficult
RNA sequencing workflow
Prepare for sequencing
Sequencing the range
Only difference is the beginning - extra considerations for sample prep
Modifications to library prep
Sequencing primers
What does a map of count do
Assess how much or the quantity of expression of certain genes
By mapping we can get a better idea of the difference in expression of genes
What percent of human genes show alternative splicing
35-60%
How many biological replicates do you typically want to aim for in RNA-seq experimental design
4 replicates for simple designs
Studies show it is better to add independent biological replicates (5 samples each at 20 million reads) than to add sequencing depth (2 samples each at 50 million reads)
RNA seq detection of expression vs microarrays
In low expression - RNA better at detecting expression
In high expression - microarrays better at detecting expression
Medium - strong correlation of both
Pearson correlation coefficient
1 is strong correlation
above .5 is moderate
below is weak
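For reference, a minimal sketch of how a Pearson correlation coefficient is computed, for example between RNA-seq and microarray measurements of the same genes. The function name `pearson_r` and the expression values are hypothetical, chosen only to illustrate the formula.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values for four genes on two platforms.
rnaseq = [1.0, 2.0, 3.0, 4.0]
microarray = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(rnaseq, microarray)
```

A value of r near 1 means the two platforms rank and scale the genes almost identically, and a value near -1 means they disagree systematically.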
What can RNA seq be used for
Post-mortem interval (PMI) estimate
personalized medicine
Prediction and prevention
Prediction concerns of RNA use
Penetrance is the proportion of people with a particular genetic change who exhibit signs and symptoms of a disorder
Variable expressivity is the range of severity of symptoms among different people with the same condition
Time lag (from test to clinic), pleiotropy (other phenotypes)
Data often limited and statistical issues remain (previous lectures)
ACCE model
Analytic validity (technical accuracy and reliability)
Clinical validity (ability to detect or predict an outcome, disorder, or phenotype)
Clinical utility (whether test ultimately leads to improved patient outcomes)
Ethical, legal, and social implications
What is warfarin
prescribed oral anticoagulant - blood thinner
what does OncoType DX do
analyzes by qPCR, mRNA expression of a panel of genes within a tumor to determine a recurrence score
What does allomap heart do
qPCR-based expression profile of 11 genes to assist physicians in managing heart transplant patients for potential organ rejection
What is the link between RNA and forensic applications? PMI specifically?
Lots of confounding factors (age, sex, gender)
Mouse studies have shown links to PMI
Developed a model with predictive value for PMI estimation (confidence interval of +/- 51 minutes at 95%) that can become an important complementary tool for traditional methods
Impact of expression on mouse colour
What is epigenetics
Epigenetics is the study of how your behaviours and environment can cause changes that affect the way your genes function
True or false: are epigenetic changes reversible
Yes
True or False: Do epigenetic changes change your DNA sequence
NO
True or false: epigenetic changes can change how you read a DNA sequence
True
What does non-coding RNA do
Non-coding RNA helps control gene expression by attaching to coding RNA, along with proteins, to break down the coding RNA
What is histone modification
Histone interactions can influence expression
unwrapped = open = expressed
The binding of epigenetic factors to histone tails alters the extent to which DNA is wrapped around histones
What is DNA methylation
Methyl group can tag DNA and activate or repress genes
What is a histone
histones are proteins around which DNA can wind for compaction and gene regulation
What base is the methyl group GENERALLY added to in DNA methylation
Cytosine
what is methyltransferase
enzymes that catalyze the addition of a methyl group to DNA
True or false: Can nutrition affect DNA methylation
TRUE
What percent of the genome is methylated in mammals
1-3%
Methylated locus
C–>C
non methylated locus
C–>U–>T
True or False: DNA polymerase reads uracil as thymine (pairing it with adenine)
True
Challenges of quantifying methylated DNA
can be a very complicated library preparation
Need to compare to a reference
What HTS technique is used for DNA methylation
Illumina
What is the most common liver disease and how much of the population does it affect
Metabolic-associated Fatty Liver Disease
25-30%
What enzymes are involved in histone-DNA wrapping
Histone Deacetylases (HDAC)
Agouti Mouse diet
Healthy mice: agouti gene kept in the off position by the epigenome
Yellow mice are obese mice - same genes are not methylated thus genes are expressed
Agouti protein binds to melanocortin receptor
Melanocortin receptors in an area of the mouse brain are involved in feeding behaviour
Can differences in methylation be detected by standard genome sequencing
No
True or false: Methylation turns on/off a gene and influences a phenotype
True
true or false: Methylation is not a response to rapid change
False
What is gene editing
genetic approach in which DNA is inserted, removed or replaced at a precise location within the genome
Who was CRISPR-Cas9 made by
Emmanuelle Charpentier and Jennifer Doudna
How does CRISPR-Cas9 work
the guide RNA directs the cas9 protein to a target site
creating guide RNA is as simple as ordering it from a company but you must know the target sequences
What is PAM
Protospacer adjacent motif
How long is the PAM typically
3-5bp
True or false: the PAM is not required for targeting in gene editing
False
What are the genomic locations that can be targeted for editing by crispr limited by?
The presence and locations of nuclease-specific PAM sequence
What does the most commonly used Cas9 system recognize the PAM sequence as
5’-NGG-3’ where N is any nucleotide base
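As a rough sketch of what "limited by PAM locations" means in practice, the snippet below scans one strand of a sequence for 5'-NGG-3' motifs and reports the bases immediately upstream of each. The helper name `find_ngg_sites`, the example sequence, and the 20-base protospacer length are assumptions made for illustration and are not taken from these cards; the reverse strand is not searched.

```python
import re


def find_ngg_sites(seq, protospacer_len=20):
    """Return (protospacer, pam, pam_start) for each NGG PAM on the given strand."""
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. within GGGG) are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        # Skip PAMs that do not have a full-length protospacer upstream.
        if pam_start >= protospacer_len:
            protospacer = seq[pam_start - protospacer_len:pam_start]
            sites.append((protospacer, match.group(1), pam_start))
    return sites


# Hypothetical target: 20 bases followed by a TGG PAM.
target = "ATGCATGCATGCATGCATGCTGGAAA"
sites = find_ngg_sites(target)
```

Each tuple in `sites` is a candidate target whose upstream bases could serve as the guide RNA's matching sequence. A region with no NGG nearby yields an empty list, which is the limitation the cards describe.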
What do indel mutations do
Disrupt the target protein's open reading frame
What do indel mutations give rise to
inactivating frameshift mutations, resulting in complete loss of function
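A small sketch of why an indel can shift the reading frame: codons are read three bases at a time, so any insertion or deletion whose length is not a multiple of 3 changes every downstream codon. The function names and the example sequence are hypothetical.

```python
def causes_frameshift(indel_length):
    """True when an insertion or deletion of this many bases shifts the reading frame."""
    if indel_length <= 0:
        raise ValueError("indel_length must be a positive number of bases")
    # Codons are 3 bases long, so only multiples of 3 keep the frame intact.
    return indel_length % 3 != 0


def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "ATGGCCTTTAAA"
with_deletion = original[:4] + original[5:]  # hypothetical 1-bp deletion

before = codons(original)
after = codons(with_deletion)
```

In this example only the first codon survives the 1-bp deletion. Every codon after the deletion point is different, which is how a small indel can inactivate the whole protein.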
In Zinc finger gene editing what is the specificity of the base pairs
> 24 bp each molecule
What are some outcomes of gene editing
Non-homologous end joining (NHEJ)
If a donor template with homology to the targeted locus is supplied, the DSB may be repaired by the homology-directed repair (HDR) pathway allowing for precise replacement mutations to be made.
What are two ways that gene editing could be helpful
Considered for conservation
Bringing back extinct species
Three ethics points to consider on gene editing
Genetic tests identify SNPs associated with risk
meaning the medical condition has not affected the individual yet
Genetic information is not just about you – it contains information relevant to your family. How does this influence consent / confidentiality?
A large commercial entity
HD case study info
On chromosome 4 - there is a reliable genetic test
Damaging of nerve cells
Laura Purdy's argument: if you are a carrier some have argued it is immoral to try and have children
Risks of being tested for HD
What if positive for a disease-linked mutation:
-psychological burden
-more tests are not like HD
-how reliable
How do Crispr or TALEN facilitate gene editing? What are the key components that allow each to function?
What is genetic genealogy
Combines the use of DNA analyses with traditional genealogy
ex. track the inheritance of DNA variants
What are used to identify genetic variants
All use standard chip +custom SNPs
What are 4 applications of genetic genealogy
Expand family histories
Identify familial connections
Understanding family ancestry
Forensic context
What are the bermuda principles and why are they important to genetics?
Ensure that all human genomic sequence data generated by publicly funded projects would be made freely available within 24 hours of generation.
This encouraged open science and accelerated research progress in genomics.
Four causes of genetic variation
Substitutions – single base changes
Insertions – addition of one or more nucleotides
Deletions – removal of one or more nucleotides
Translocations – segments of DNA moved between chromosomes
C value paradox
Genome size does not correlate with organismal complexity.
G value paradox
The number of protein-coding genes does not reflect organism complexity.
What does N50 score indicate
The N50 score is the length at which 50% of the total assembly length is contained in contigs or scaffolds of that size or larger. It measures assembly continuity.
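The N50 definition above can be checked with a short calculation. This is a minimal sketch: the function name `n50` and the contig lengths are hypothetical.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (total = 54, so the halfway point is 27).
contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
value = n50(contigs)
```

Here 10 + 9 + 8 = 27 reaches half of the assembly, so the N50 is 8. An assembly with fewer, longer contigs would give a larger value.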
What is Q score
A Q score represents the quality of a base call, calculated as Q = -10 log₁₀(P) where P is the probability of an incorrect base call.
What does phred score mean
Phred score is a quality score for nucleotide base calls. For example, a Phred score of 30 means there is a 1 in 1000 chance of an error in the base call (99.9% accuracy).
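The relationship Q = -10 log10(P) can be worked through in both directions. The function names below are hypothetical.

```python
import math


def phred_from_error(p):
    """Quality score Q for an error probability p, using Q = -10 * log10(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be a probability in (0, 1]")
    return -10 * math.log10(p)


def error_from_phred(q):
    """Error probability for a quality score Q (the inverse of the formula above)."""
    return 10 ** (-q / 10)


q30 = phred_from_error(0.001)   # 1 in 1000 chance of error
p20 = error_from_phred(20)      # chance of error at Q20
```

Each step of 10 in Q corresponds to a tenfold drop in error probability: Q20 is 1 in 100 and Q30 is 1 in 1000.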
What does a BUSCO score evaluate
BUSCO (Benchmarking Universal Single-Copy Orthologs) scores assess genome completeness by checking for the presence of expected single-copy orthologous genes in the assembly.
Chargaff's rules
amount of A=T and C=G, base composition varies between species
How is genetic variation changed
Mutation (point mutations, insertions, deletions, translocations)
Recombination and crossing over during meiosis
Independent assortment of chromosomes
Random fertilization
What is the order of events for SARS COV2
1 Spike protein on the virion binds to ACE2 a cell surface protein
2 The Virion releases its RNA
3 Some RNA is translated into proteins by the cell’s machinery
4 Some of these proteins form a replication complex to make more RNA
5 Proteins and RNA are assembled into a new virion in the GOLGI
6 Released
Problems using microsatellites
Time and labour consuming to develop primers for non-model species
Often do not transfer well between species - i.e. may amplify or may not be variable
Null alleles or PCR-induced mutations can cause problems
Difficulties in modeling the mutation process poses problems for population genetics
Why are SNPs popular as genetic markers
they are abundant
They can be genotyped in a high-throughput manner
the mutation mechanism is well established
Difference between sanger and HTS
Sequencing volume
Sanger only sequences a single DNA fragment at a time
HTS is massively parallel, sequencing millions of fragments simultaneously
What DNA was used in chilean sea bass case
mtDNA
Why does soil serve as good trace evidence
It is highly individualistic and has a high transfer and retention rate
What is Sørensen's index
a statistical measure used to quantify the similarity between two samples
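A minimal sketch of the calculation, assuming the common presence/absence form of the index, 2 x (shared items) / (items in A + items in B). The function name and the taxon labels are hypothetical.

```python
def sorensen_index(sample_a, sample_b):
    """Sørensen similarity between two samples given as collections of items."""
    a, b = set(sample_a), set(sample_b)
    if not a and not b:
        raise ValueError("index is undefined when both samples are empty")
    shared = len(a & b)
    return 2 * shared / (len(a) + len(b))


# Hypothetical taxa (or gel bands) detected in two soil samples.
soil_1 = {"taxon1", "taxon2", "taxon3"}
soil_2 = {"taxon2", "taxon3", "taxon4"}
similarity = sorensen_index(soil_1, soil_2)
```

The result runs from 0 (nothing shared) to 1 (identical composition). Here two of three items are shared in each sample, giving 2 x 2 / 6, about 0.67.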
Limitations of DGGE
There is a strong bias for dominant populations
Biases generated by differential DNA extraction and PCR amplification and bands can migrate to the same gel positions
Limitations of eDNA
Assay development & bioinformatics not straightforward
no information can be collected on life stages, demography, fecundity or health of the target species – all critical to management
eDNA is not homogeneously distributed throughout a water body
What affects eDNA's persistence in the environment
Environmental conditions
pH levels
UV radiation
Habitat
What species do the CITES Appendices cover
What is required for identifying transmission events
What were the main human genome project sequencing strategies
Public (HGP): Hierarchical shotgun sequencing
Celera (Private): Whole genome shotgun sequencing
Public had mapped clones; Celera used computational assembly
limitations of DNA phenotyping
Not all traits are strongly heritable or genetically mapped
Environmental influences are not captured
Complex traits involve many loci (polygenic)
Poor knowledge
What is bisulfite conversion
Converts unmethylated cytosines to uracil
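The conversion can be simulated in silico to show why methylated and unmethylated cytosines end up looking different after sequencing. This is a simplified sketch of one strand only. The function names, the sequence, and the methylated position are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Convert unmethylated C to U; methylated C (by 0-based position) is protected."""
    methylated = set(methylated_positions)
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def after_pcr(converted):
    """During amplification uracil is copied as thymine, so U reads out as T."""
    return converted.replace("U", "T")


# Hypothetical locus with one methylated cytosine at position 1.
locus = "ACGTCC"
converted = bisulfite_convert(locus, methylated_positions=[1])
sequenced = after_pcr(converted)
```

Comparing `sequenced` with the untreated reference shows which cytosines were methylated: they still read as C, while unmethylated ones now read as T.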
FastQ format
line 1: identifier
line 2: sequence
Line 3: + (separator)
Line 4: ASCII-encoded quality scores
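A minimal sketch of reading one four-line record and decoding its quality string. It assumes the Phred+33 encoding (quality = ASCII code minus 33), which is not stated in these cards. The function name and the example read are hypothetical.

```python
def parse_fastq_record(lines, offset=33):
    """Parse one 4-line FASTQ record into its id, sequence, and numeric qualities."""
    if len(lines) != 4:
        raise ValueError("a FASTQ record has exactly 4 lines")
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@"):
        raise ValueError("line 1 must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("line 3 must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lines must have equal length")
    # Each quality character encodes one score: ASCII code minus the offset.
    qualities = [ord(char) - offset for char in quality]
    return {"id": header[1:], "sequence": sequence, "qualities": qualities}


# Hypothetical record: 'I' decodes to 40, '5' to 20, '!' to 0 under Phred+33.
record = parse_fastq_record(["@read1", "ACGT", "+", "II5!"])
```

Each base in line 2 pairs with the character at the same position in line 4, so `record["qualities"]` gives one quality score per base call.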