Big picture concepts Flashcards
In a sentence, describe the central dogma of molecular biology
• DNA is transcribed into mRNA, which is translated into proteins that, after protein modification, are functional
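A minimal sketch of the DNA to mRNA to protein flow, for illustration only. It assumes the input is the coding strand (so transcription is just T to U), and the codon table is a deliberately partial subset of the real 64-codon genetic code. The function names are hypothetical.

```python
# Partial codon table for illustration only; the real genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCC": "A",  # alanine
    "UUU": "F",  # phenylalanine
    "GGA": "G",  # glycine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Codons missing from the partial table are reported as "X".
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTTTTAA")
protein = translate(mrna)
print(mrna, protein)
```

Protein modification, the last step on the card, is not modelled here.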
What -omic is used to analyse genes and how?
o Genes are analysed through genomics, involving DNA sequencing
What -omic is used to analyse mRNA and how?
o mRNA is analysed through transcriptomics, involving microarrays and next-gen sequencing
What -omic is used to analyse proteins and how?
o Proteins are analysed through proteomics, involving electrophoresis, chromatography and mass spectrometry
What -omic is used to analyse function and how?
o Function is analysed through metabolomics and lipidomics, involving mass spectrometry
What does transcriptional regulation involve?
o Transcriptional regulation involves alternative splicing, cell type specific expression…
What does translational regulation involve?
o Translational regulation involves masking, mRNA stability…
What does post-translational regulation involve?
o Post-translational regulation involves modification by O-GlcNAc, phosphate, ubiquitin…
What percentage of the human genome codes for protein coding regions?
• Around 1% of the human genome is protein-coding
Describe the first estimate of the number of genes in the human genome and how it compared to reality
• The first draft (2001) of the human genome estimated 30,000-40,000 genes, but by 2007 it was found that there were about 20,500 genes in the human genome
How are we similar to E. coli and yeast?
o Metabolically, we are similar to E. coli and yeast
Are the genes in the human genome unique?
o Most of our genes are shared with close and some with distant relatives
How many more genes do we have than unicellular organisms?
We have 4-5x more genes than unicellular organisms
How many genes do dogs have? Do they have more or fewer genes than humans?
o We have more genes than dogs (19,000)
How many genes does the worm have? Does it have more or fewer genes than humans?
We have fewer genes than the worm (25,000)
How many genes does Arabidopsis have? Does it have more or fewer genes than humans?
We have fewer genes than Arabidopsis (28,000)
How many genes does rice have? Does it have more or fewer genes than humans?
We have fewer genes than rice (75,000)
What percentage of our genome is identical to that of chimps?
o We are 96% identical to chimps
Do humans know the function of all their genes?
• Almost half the genes have an unknown function
Which is more complex, the genome or the proteome? Why?
• Complexity resides in the proteome
o Whilst the genome is static, the proteome can exhibit temporal and spatial differences
• The proteome is constantly changing as cells respond to environmental conditions
o DNA is chemically homogeneous whilst proteins are heterogeneous
• The proteome may be as complex as a whole organism, a tissue or a single cell type
o Proteins are cellular effectors
What is the proteome?
• Proteome- the proteins expressed by the genome at any one time
What is the functional proteome?
o Functional proteome- the part of the proteome that is expressed at a given point in time
What is the theoretical proteome?
o Theoretical proteome- the genetic basis of the proteome
What is proteomics?
• Proteomics is the study of the proteome
What are metabolites?
• Metabolites- small molecules that are chemically transformed during metabolism and that, as such, provide a functional readout of cellular states.
Why are metabolites easier to correlate with phenotype compared to genes and proteins?
o Unlike genes and proteins, the functions of which are subject to epigenetic regulation and post-translational modifications, respectively, metabolites serve as direct signatures of biochemical activity and are therefore easier to correlate with phenotype
What is metabolic targeting?
• Metabolic targeting- quantification of a specific metabolite
What is profiling?
• Profiling- quantification of a group of related compounds or those found in a single biochemical pathway
What are the definitions of systems biology and why are there so many?
o Systems biology- study of living systems/ecosystems (e.g. gut microflora)
o Systems biology- using a global systematic approach studying a living system
• Systems biology is defined by Leroy Hood as:
o Hypothesis-driven
o Requires global/big data acquisition
o Need to integrate different types of data
o Need to delineate biological network dynamics
Network has spatial and temporal aspects that need to be understood
o Know how every single element in the network influences all other elements-allows for deeper understanding of the system
o Formulate models that are predictive and actionable- hypothesis generating
• But there is no concise definition of systems biology that all system biologists agree upon
What are the two main philosophies towards systems biology?
• The reductionist approach towards systems biology
• The expansionist approach towards systems biology
What is the reductionist approach towards systems biology?
• The reductionist approach towards systems biology
o Systems biology is molecular biology, which is a continuation of mechanistic Darwinism, at a larger scale
o Reductionism-the practice of analysing and describing a complex phenomenon in terms of its simple or fundamental constituents, especially when this is said to provide a sufficient explanation.
What is the expansionist approach towards systems biology?
• The expansionist approach towards systems biology
o Emergence- complex systems have emergent properties which can’t be deduced from a reductionist approach
Individual components in a living system interact with each other
o If components have to interact with each other, there cannot be an understanding of the living system by only looking at individual parts
What are Koch’s postulates?
o Koch’s postulates
The microorganism must be found in abundance in all organisms suffering from the disease, but should not be found in healthy organisms
The microorganism must be isolated from a diseased organism and grown in pure culture
The cultured microorganism should cause disease when introduced into a healthy organism
The microorganism must be reisolated from the inoculated, diseased experimental host and identified as being identical to the original specific causative agent
Describe Falkow 1988’s Koch’s molecular postulates
- The phenotype (sign or symptom of disease) should be associated only with pathogenic strains of a species
- Inactivation of the suspected gene(s) associated with pathogenicity should result in a measurable loss of pathogenicity
- Reversion of the inactive gene should restore the disease phenotype
Are Koch’s molecular postulates reductionist or expansionist?
Inherently reductionist-relies on a single gene being responsible for a complete phenotype
Does not (until recently) consider all the off-target effects of knocking out the single gene
Many genes have multiple protein functions-no elucidation of the specific protein function affecting the disease
What -omics are primarily used in systems biology?
- Genomics
- Transcriptomics
- Proteomics
- Metabolomics
What is the difference between genomics and genetics?
o Organism-scale rather than single-gene (genomics vs genetics)
o Genetics and molecular biology are reductionist
o Genomics is expansionist (how all parts work together)
What are genome wide association studies, their procedure and their purpose?
o Large-scale SNP and mutation analysis (e.g. GWAS) provides associations
Genome-wide association studies
• Aims to identify genetic component of multifactorial diseases
• Hypothesis-free or unbiased testing of the genome for association with disease or observable traits
• Using DNA samples from many people
o Disease cases vs matched controls
Matched controls- people of the same ethnic background
• Rapid scanning of genetic markers (SNPs)
o Across DNA subsets or whole genomes
o DNA microarrays
o Next-generation DNA sequencing
• Searching for variation associated with disease
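The case-versus-control comparison at a single SNP can be sketched as an allelic chi-square test. This is only one common way to test association, chosen here for illustration; the allele counts and SNP names are made up, all four counts are assumed to be non-zero, and the 5e-8 cut-off is the conventional genome-wide significance threshold used to account for testing very many markers.

```python
import math

# Conventional genome-wide significance threshold (corrects for many tested SNPs).
GENOME_WIDE_THRESHOLD = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts."""
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
snp_counts = {"snp_1": (60, 40, 40, 60), "snp_2": (50, 50, 50, 50)}

for snp_id, counts in snp_counts.items():
    chi2, p_value, odds_ratio = allelic_association(*counts)
    significant = p_value < GENOME_WIDE_THRESHOLD
    print(snp_id, round(chi2, 2), p_value, odds_ratio, significant)
```

A real study would run this kind of test across hundreds of thousands of markers and far larger cohorts, which is why such a strict threshold is applied.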
What is genomics enabled by?
o Enabled by high-throughput sequencing technology
What was the 1000 Genomes Project, what did it aim to do, what did it achieve and what did it cost?
Launched 2008, published in 2012
Spent about $30-50 million for 1092 genomes (about $27,000-$46,000 per genome)
Aimed to identify >98% of genetic variants which have a frequency of >1%
Achieved by light (low-coverage) sequencing of the whole genome and heavy (high-coverage) sequencing of the exome
Aims:
• To characterise the geographic and functional spectrum and to understand genetic contributions to disease by comparing these 1000s of genomes to each other
• Can tell us about evolution and sequence diversity
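The per-genome cost follows from dividing the total spend by the number of genomes sequenced. A quick check of that arithmetic, using only the $30-50 million and 1092 figures from this card (the variable names are illustrative):

```python
# Figures taken from the flashcard above.
total_spend_low = 30_000_000   # dollars, lower estimate
total_spend_high = 50_000_000  # dollars, upper estimate
genomes_sequenced = 1092

cost_per_genome_low = total_spend_low / genomes_sequenced
cost_per_genome_high = total_spend_high / genomes_sequenced

print(f"${cost_per_genome_low:,.0f} to ${cost_per_genome_high:,.0f} per genome")
```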
What was the 10,000 genome project, what did it do and what did it find?
Over 10,000 genomes with 30x-40x exome coverage
Presented the distribution of over 150 million single-nucleotide variants in the coding and noncoding genome
Each new sequenced genome contributed an average of 8579 novel variants
Found that single nucleotide variants (SNVs) are generally rare in transcription factors (due to their essentiality) and occur more frequently in non-protein coding regions and outside of transmembrane receptors
What is Miller’s syndrome?
Miller’s syndrome is a rare inheritable disease that causes facial and limb abnormalities
How was genomics essential for elucidating genetic associations in Miller’s syndrome? Give an example
Genome sequencing has been essential for elucidating genetic associations in Miller’s syndrome
Roach et al. 2010
• Sequenced 4 genomes (both parents and two affected offspring) at 99.999% accuracy
o Sequencing the whole family removes noise, as nucleotide variants can be checked against the familial relationships
• 3.6M single nucleotide polymorphisms within the group
• Clustered the single nucleotide polymorphisms to identify 4 candidate genes that may be responsible for Miller’s syndrome
o Those genes code for proteins
o The major gene associated with Miller’s syndrome is:
Dihydroorotate dehydrogenase (DHODH)
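How family relationships help narrow millions of variants down to a few candidates can be sketched as a simple filter. This is not the method of Roach et al.; it is a simplified illustration that assumes a plain recessive model (both parents carriers, both affected children homozygous), ignores de novo mutations and compound heterozygosity, and uses made-up variants and gene names. Genotypes are coded as the number of alternative alleles (0, 1 or 2).

```python
def alleles_passed_on(genotype):
    """Alleles (0 = reference, 1 = alternative) a parent with this genotype can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(mother, father, child):
    """True if the child's genotype can arise from one allele from each parent."""
    return any(
        m + f == child
        for m in alleles_passed_on(mother)
        for f in alleles_passed_on(father)
    )


def fits_recessive_model(mother, father, children):
    """Assumed model: carrier parents, all affected children homozygous."""
    return mother == 1 and father == 1 and all(c == 2 for c in children)


# Hypothetical variants: (id, gene, mother, father, (child_1, child_2)).
family_variants = [
    ("variant_1", "GENE_A", 1, 0, (1, 0)),
    ("variant_2", "GENE_B", 1, 1, (2, 2)),
    ("variant_3", "GENE_C", 0, 0, (1, 0)),  # impossible inheritance, likely an error
    ("variant_4", "GENE_D", 1, 1, (2, 1)),
]

candidates = []
for variant_id, gene, mother, father, children in family_variants:
    # Drop calls that violate Mendelian inheritance (treated here as noise).
    if not all(is_mendelian_consistent(mother, father, c) for c in children):
        continue
    if fits_recessive_model(mother, father, children):
        candidates.append(gene)

print(candidates)
```

Only the variant shared in the expected pattern by both affected siblings survives, which is the intuition behind sequencing both parents and both affected offspring.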