General biology Flashcards
Why sequence DNA?
- To understand DNA
- Diagnose Disease
- Understand the biological mechanisms
- Evolutionary studies
- Predict disease
Structural Genomics
physical mapping of the genome and gene products
Functional Genomics
Determination of sequence function
Comparative Genomics
applying structural and functional information to the comparison of genomes across species and individuals
Why do comparative genomics?
Evolutionary relationships (if something remains conserved, that indicates it is evolutionarily important), finding novel genes, and studying genomic structure, organization and architecture. Can also be used to build evolutionary trees.
What are we looking for in comparative genomics?
Genomic characteristics such as size and coding and non-coding regions (found in Ensembl). Synteny: the arrangement of genes relative to other genes (found in Ensembl). Sequence similarities within genes and proteins (found with BLAST). Genetic constraint: how resistant a gene is to change (found in gnomAD).
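Sequence similarity can be illustrated with a percent-identity calculation. This is a minimal sketch that assumes the two sequences are already aligned to the same length (gaps written as "-"); it is not how BLAST works internally, and the function name and example sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        return 0.0
    # A gap paired with a gap is not counted as a match.
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

# Two made-up aligned sequences that differ at one of eight positions.
print(percent_identity("ATGCATGC", "ATGAATGC"))  # 87.5
```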
exons
Coding regions that are transcribed, retained in the mature mRNA and eventually translated into protein.
Introns
Non-coding regions that can contain regulatory elements; they are removed from the pre-mRNA and are not included in the mature mRNA.
regulatory elements
- promoters
- enhancers
- silencers
- insulators
Promoters
Located upstream of the coding region; this is where RNA polymerase binds to start transcription.
Enhancers
Can be far away from the gene they regulate: upstream, downstream or in an intron. These regulatory elements have binding sites for transcription factors or other regulatory proteins that increase the efficiency of transcription.
Silencers
Can be located in many places (for example upstream, downstream or in an intron). They suppress gene expression.
Insulators
Boundaries between enhancers and promoters that prevent enhancers from activating the wrong gene and prevent a gene from being silenced.
(alternative) splicing
Splicing is the process in which intronic regions are removed from the pre-mRNA, producing mature mRNA. Alternative splicing lets a single gene produce multiple gene products, for example through exon skipping or mutually exclusive exons.
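Splicing and exon skipping can be sketched in a few lines. This is a toy illustration only: the sequence, the exon coordinates and the function name are invented, and real splicing depends on splice-site signals that are not modeled here.

```python
def splice(pre_mrna, exons, skip=()):
    """Join exon segments of a pre-mRNA, dropping introns and any skipped exons.

    exons: list of (start, end) half-open intervals, in transcript order.
    skip:  indices of exons to leave out (exon skipping).
    """
    return "".join(
        pre_mrna[start:end]
        for i, (start, end) in enumerate(exons)
        if i not in skip
    )

# Made-up pre-mRNA: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuag" + "GAUUAC" + "guccccacag" + "UGGUAA"
exons = [(0, 6), (16, 22), (32, 38)]

full_isoform = splice(pre_mrna, exons)               # all three exons
skipped_isoform = splice(pre_mrna, exons, skip={1})  # middle exon skipped

print(full_isoform)     # AUGGCCGAUUACUGGUAA
print(skipped_isoform)  # AUGGCCUGGUAA
```

The same gene (one pre-mRNA) yields two different mature mRNAs, which is the point of alternative splicing.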
rNTP
Ribonucleoside triphosphate: the building blocks used to make and repair RNA.
Also used for cellular energy transfer (e.g. ATP) and as substrates of various cell signaling pathway enzymes (e.g. ATP and GTP).
dNTP
Deoxynucleoside triphosphate: the building blocks used to make DNA
ddNTP
Dideoxynucleoside triphosphates. There are four types: ddATP, ddTTP, ddCTP and ddGTP. ddNTPs are used in Sanger sequencing, also known as chain-termination sequencing.
They terminate DNA synthesis because they lack the 3'-OH group: once DNA polymerase has incorporated a ddNTP, no further nucleotide can be added to the strand.
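The chain-termination idea can be sketched as a small simulation. It is a simplified model under stated assumptions: primers are ignored, every position is assumed to be terminated at least once, and all function names are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesized_strand(template):
    """Strand the polymerase builds, written 5'->3' (reverse complement of the template)."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def termination_fragments(template):
    """Map each fragment length to the ddNTP base that terminated that fragment."""
    new_strand = synthesized_strand(template)
    return {length: base for length, base in enumerate(new_strand, start=1)}

def read_sequence(fragments):
    """Sort fragments by length (as electrophoresis does) and read the terminal bases."""
    return "".join(fragments[length] for length in sorted(fragments))

fragments = termination_fragments("ATGC")
print(fragments)                 # {1: 'G', 2: 'C', 3: 'A', 4: 'T'}
print(read_sequence(fragments))  # GCAT
```

Reading the terminal base of each fragment from shortest to longest recovers the newly synthesized strand.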
Building blocks
A, T, C, G
(Adenine, Thymine, Cytosine, Guanine)
Model organisms
An organism that serves as a representative example of a larger group or concept. Examples: some plants, fruit flies, mice, zebrafish, bacteria, yeast and more.
what makes a good model organism?
- practical considerations such as: size, cost & husbandry requirements
- many offspring (many samples)
- short lifespan (many samples)
- rapid development (makes it easier to study development)
- relatively accessible to other researchers (so the results can be replicated)
Disease modeling: How can we model diseases?
1) In vivo
2) ex vivo
3) in vitro
4) in silico
In vivo (+/-)
inside living organisms
+: physiologically relevant, allow for long term studies, relatively good translation to human health
-: can be ethically complex, relatively high cost, limited control over conditions and confounding variables
Ex vivo (+/-)
Sample taken from an animal, while trying to preserve its natural state
+: more controlled conditions, simplified model systems, easier to replicate studies, high versatility and accessibility
-: less in vivo context, shorter viability, limited tissue, artifacts from tissue extraction
In vitro (+/-)
cell culture (in glass)
+: more controlled conditions, simplified model systems, easier to replicate studies, high versatility and accessibility
-: simplified, limited physiological relevance, variability because of adaptation, dependency on cell culture conditions
In silico (+/-)
In a computer
+: useful for prediction, good hypothesis forming and testing, large data sets
-: based on a lot of assumptions, simplified, requires accurate validation, dependent on quality of data, complex analysis
How do we make disease simulations?
- Genetic manipulation using CRISPR-Cas
- Chemical induction
- Computational modeling of large data sets through programming and AI
What makes a good disease model?
- able to replicate elements of the disease
- it is practical to use
- it is ethical to use
- the model is appropriate for the question (do not use worms or plants for eye disorders)
Animal research elements (the 3 R’s)
Replace: get rid of animal models if possible
Reduce: use them as little as possible
Refine: make the procedures as painless as possible
Disease models can be used to…
1) confirm observations
2) predict phenotypes
3) identify mechanisms of disease
4) determine pathogenesis (how does the disease start)
5) Develop treatment
DNA vectors
A carrier of genetic material that can transfer that material into host cells (for example, plasmids).
Genomic Library
collection of DNA fragments in DNA vectors representing the total genomic data of an organism. For larger libraries you need larger vectors.
Method for making genomic libraries…
1) extract and isolate DNA
2) Digest (cut in pieces) DNA and the vector with restriction enzymes
3) Ligate the DNA fragments into the digested vector and transform the bacteria
4) amplify and sequence the clones
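The digestion step can be sketched as cutting a sequence at every recognition site. The example uses EcoRI, which recognizes GAATTC and cuts after the first G; only one strand is modeled, and the function name and input sequence are made up for illustration.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site, cut_offset bases into the site.

    The defaults model EcoRI (G^AATTC) on a single strand.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # remainder after the last cut
    return fragments

print(digest("AAGAATTCTTGAATTCGG"))  # ['AAG', 'AATTCTTG', 'AATTCGG']
```

Cutting the vector with the same enzyme produces matching ends, which is what allows the fragments to be ligated into it.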
De novo genome construction
A strategy for assembling a novel genome from scratch, without the aid of reference genomic data. De novo genome assemblies assume no prior knowledge of the sequence length, layout or composition of the source DNA.
De novo genome construction method:
1) DNA extraction and sequencing
2) Preprocessing: removing low quality sequences
3) Assembly and scaffolding (joining overlapping reads into contigs, then ordering contigs into scaffolds)
4) Gap filling and polishing
5) Validation
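The core of assembling reads without a reference can be illustrated with a greedy overlap merge. This is a toy sketch, not a production assembler (real tools use overlap or de Bruijn graphs and handle sequencing errors); the reads, function names and the `min_len` parameter are assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]

reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

Because the merge relies only on overlaps, repeated or highly similar sequences give ambiguous overlaps and can be joined incorrectly.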
Limitations for de novo genome construction
- where to begin?
- difficulty in repetitive sequences
- difficulty in highly similar sequences
- takes a long time
- very high or very low GC content can be difficult
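GC content is simply the fraction of bases that are G or C. A minimal sketch (the function name is made up for illustration):

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

print(gc_content("ATGC"))  # 0.5
print(gc_content("GGCC"))  # 1.0
```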
Multiplexing
Sequencing multiple DNA samples at the same time, with each sample tagged by its own barcode (index) so the reads can be separated afterwards. Can be done with next-generation sequencing (NGS).
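Sorting pooled reads back into their samples (demultiplexing) can be sketched as follows. The sketch assumes each read starts with its sample barcode and that barcodes match exactly; in practice barcodes are often read separately and mismatches are tolerated. All names, barcodes and reads are invented for the example.

```python
from collections import defaultdict

def demultiplex(reads, barcodes):
    """Assign each read to a sample by its leading barcode and trim the barcode off."""
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            by_sample["unassigned"].append(read)
    return dict(by_sample)

barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGAAA", "TGCACCCTTT", "ACGTTTTCCC", "GGGGAAAATT"]
print(demultiplex(reads, barcodes))
# {'sample_1': ['GGGAAA', 'TTTCCC'], 'sample_2': ['CCCTTT'], 'unassigned': ['GGGGAAAATT']}
```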
cell free DNA (cf-DNA)
DNA found outside the cell. It can be released through:
1) apoptosis (controlled cell death)
2) necrosis (uncontrolled cell death)
3) DNA secretion
characteristics of cf-DNA
1) highly fragmented, with fragments of about 167 base pairs, because DNA wrapped around histones is protected from degradation
2) low concentration
3) elevated levels in cancer patients (cell death), pregnancy (fetal DNA) and other disease states
4) difficult to isolate
5) can be very relevant because it degrades quickly and therefore gives a precise snapshot of the current state
Where can we find cf-DNA?
- Blood (most commonly used)
- urine
- cerebrospinal fluid
- saliva
- stool
- semen
Clinical applications of cf-DNA
- can be used for non-invasive prenatal testing (NIPT)
- cancer detection
- assessing treatment response or resistance
- transplant rejection monitoring (because cells of the transplant might die)
- infectious disease diagnosis (DNA of the pathogen can be detected)
- neurodegenerative disease (some specific mutations can be found)
Identifying cf-DNA (how?)
1) blood collection
2) plasma separation
3) cf-DNA extraction
4) quantification of cf-DNA
5) size distribution analysis
6) biomarker analysis: genetic mutations, epigenetic modifications
What makes a good biomarker?
1) safe, easy and quick to measure
2) specific for particular disease
3) robust
4) reproducible
5) cost effective
6) predictive value
7) sensitive to disease
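Sensitivity, specificity and predictive value can be quantified from a confusion table of test results against true disease status. This is a minimal sketch using the standard definitions; the function name and the counts in the example are made up and do not come from any study.

```python
def biomarker_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from counts of true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased people correctly flagged
        "specificity": tn / (tn + fp),  # healthy people correctly cleared
        "ppv": tp / (tp + fp),          # chance that a positive result is a true case
    }

# Hypothetical counts: 100 diseased and 100 healthy people tested.
metrics = biomarker_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these hypothetical counts the marker detects 90% of the cases (sensitivity) and correctly clears 80% of the healthy people (specificity).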
Fusion genes
Two separate genes joined together, often because of chromosomal rearrangements. This can lead to novel gene products.
Neutral theory (Kimura, 1968)
Most evolutionary changes at the molecular level are due to genetic drift (random fluctuations in allele frequency) rather than natural selection. This is because most mutations are neutral and do not lead to a significant advantage or disadvantage.
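Genetic drift can be illustrated with a simple Wright-Fisher simulation of a neutral allele. This is a sketch under stated assumptions (constant population size, diploid individuals, non-overlapping generations, no selection or mutation); the function name and parameter values are invented for the example.

```python
import random

def wright_fisher(pop_size, start_freq, generations, seed=0):
    """Allele frequency over time when each generation is a random sample of the last."""
    rng = random.Random(seed)
    n_alleles = 2 * pop_size  # diploid: two allele copies per individual
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each allele copy in the next generation is drawn at random from the current pool.
        count = sum(rng.random() < freq for _ in range(n_alleles))
        freq = count / n_alleles
        trajectory.append(freq)
    return trajectory

trajectory = wright_fisher(pop_size=10, start_freq=0.5, generations=50)
print([round(f, 2) for f in trajectory[:10]])
```

The frequency changes from generation to generation purely by chance, and once it reaches 0 or 1 the allele is lost or fixed. Smaller populations drift faster.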
Nearly neutral theory (Ohta, 1973)
Random genetic drift occurs at a constant rate and is the predominant cause of molecular evolution.
This theory emphasizes the importance of slightly deleterious mutations by recognizing their ability to segregate and eventually become fixed through genetic drift, in spite of purifying selection.
Linked selection (Maynard Smith and Haigh, 1974)
Positive selection on one locus also raises the frequency of neutral mutations at genetically linked loci (genetic hitchhiking).
Homology
Genes/proteins that share a common evolutionary ancestor
Orthologs (homology)
arise from speciation events
Paralogs (homology)
arise from duplication events
Xenologs (homology)
arise through horizontal gene transfer (occurs between different species)
Analogy
non-homologous genes/proteins with similar function that arise through convergent evolution