Genomic Analysis Flashcards
Define genomics
The study of all the nucleotide sequences including structural genes, regulatory sequences and non-coding DNA segments in the chromosomes of an organism
What part of the protein cycle does genomics look at
Just DNA
What part of the protein cycle does functional genomics look at
The characterization of protein-DNA interactions on the genome of an organism
Looks at - DNA, RNA, Proteins
Define structural genomics
the dissection of the architectural features of genes and chromosomes - how they’re packaged up and their location
Define comparative genomics
the evolutionary relationships between the genes and proteins of different species
What does epigenomics/epigenetics look at
DNA methylation patterns, imprinting and DNA packaging
Define pharmacogenomics
new biological targets and new ways to design drugs and vaccines using genes.
E.g. viral knock-in
What is a genome
The single nucleotide sequence of an organism’s hereditary information (DNA in humans).
How many base pairs are in the human genome
3.2 x 10^9 - 3.2 billion
What was the first RNA genome sequenced
Bacteriophage MS2 - 1976
What was the first DNA genome to be sequenced
Phage Phi-X174 - 1977
Small with only 11 genes - this is to get in and out infected cells ASAP (smash and grab approach).
What is the trend between genome size and the number of genes
There is no real trend
What 3 things did the Human Genome Project discover
Large centromeres made of repetitive DNA that remain unsequenced
21,700 genes
Only 1.5% of the genome actually codes for proteins, the lowest % of all organisms
What makes up the 98.5% of the genome that doesn’t code for proteins
Introns
Regulatory sequences (promoters etc)
Unique non-coding DNA
Repetitive DNA
How do mutations help calculate the age of an organism and its divergence from the evolutionary tree
DNA accumulates mutations at a roughly constant rate - 10^-5 to 10^-6 mutations per base pair per generation.
This can act as a molecular clock - more mutations means more divergence from a common ancestor, and therefore a “newer” species (e.g. humans have accumulated more mutations than dinosaurs).
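Worked sketch (not part of the original card): if two lineages pick up mutations at the same constant rate, the number of generations since they split can be estimated as the per-site difference divided by twice the rate, because both lineages have been mutating. The function name and the example numbers are made up for illustration; the rate is simply taken from the range quoted on this card.

```python
def divergence_generations(differences: int, sites: int, rate: float) -> float:
    """Estimate generations since two lineages split, assuming a constant clock.

    differences: number of aligned sites that differ between the two species
    sites: total number of aligned sites compared
    rate: mutations per site per generation (assumed equal in both lineages)
    """
    if sites <= 0 or rate <= 0:
        raise ValueError("sites and rate must be positive")
    per_site_difference = differences / sites
    # Divide by 2 * rate because mutations accumulate along both branches.
    return per_site_difference / (2 * rate)


if __name__ == "__main__":
    # Hypothetical example: 20 differences in 1000 aligned sites, rate 1e-6.
    print(round(divergence_generations(20, 1000, 1e-6)))  # 10000
```

This ignores repeat mutations at the same site, so it is only a rough guide for closely related sequences.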
How do you compare genomes
Use a genome browser - can compare many species’ DNA sequences with each other
Computer algorithms align the sequences and provide a visual output of how alike they are.
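Worked sketch (not part of the original card): once two sequences have been aligned, "how alike they are" can be summarised as percent identity. This assumes the alignment has already been done and that gaps are written as "-"; the function name and example sequences are hypothetical, and real genome browsers use far more elaborate scoring.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns where both sequences have the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the bases agree and it is not a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical aligned fragments from two species.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # 66.7
```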
What can genome browsers help show
Areas of the genome that are conserved over time across species - regions that are highly similar are likely to be important to survival (evolutionarily conserved regions - ECRs)
These similar sites are assessed to look for transcription factor binding sites.
Why is functional genomics, the study of DNA-protein interaction, important
Mis-regulation of transcription factors will change gene activity (up/down/on or off) and ultimately lead to changed protein levels = disease
How is a gene turned on (gene regulation)
Depends on surface receptors. A signalling molecule binds the receptor, which triggers intracellular pathways; these lead to proteins binding the enhancer and recruiting RNA polymerase, which starts transcription. The transcript is then translated to make the protein from the desired gene.
What is the role of an enhancer
Fine-tunes the expression of a gene by turning its transcription activity up and down, dependent on the proteins that bind to the enhancer
What is the role of a promoter
Turns gene on
How does DNA footprinting work (used for finding transcription factor binding sites)
Label a short DNA sequence that contains the suspected transcription factor binding site with a radioactive (32P) or fluorescent label.
Mix these sequences with transcription factors
Cleave the mixed sample with a nuclease - it cuts everywhere except at the binding site, because the bound transcription factor blocks access.
Control - cut the same DNA without the transcription factor; it is now cut randomly, including through the binding site.
Carry out gel electrophoresis - the fragments form a ladder based on size. The gap in the ladder (the footprint) marks where the transcription factor was bound, showing that it does bind your sequence of interest.
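Worked sketch (not part of the original card): reading a footprinting gel amounts to asking which fragment sizes appear in the control lane but are missing from the lane with transcription factor. The function name and the fragment lengths below are invented purely to illustrate that comparison.

```python
def footprint(control_lengths, bound_lengths):
    """Fragment lengths seen in the control lane but absent with protein bound."""
    return sorted(set(control_lengths) - set(bound_lengths))


if __name__ == "__main__":
    # Hypothetical lanes: the control is cut at every position from 1 to 20,
    # while the protected lane has no cuts between positions 8 and 13.
    control = list(range(1, 21))
    bound = [n for n in control if not 8 <= n <= 13]
    print(footprint(control, bound))  # [8, 9, 10, 11, 12, 13]
```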
How does ChIP-seq work (chromatin immunoprecipitation sequencing) - used for finding transcription factor binding sites
Find candidate binding sites using a genome browser (usually enhancers and promoters).
Treat with formaldehyde to fix tissue (covalent link between everything in the cell).
Ultrasonic waves to break open cells and smash the DNA into small fragments (500-1000 bp)
Add antibody to bind to protein of interest
Magnetic beads coated with a protein that binds the antibody are added and then fished out with a magnet.
Wash beads (with antibody, transcription factor and DNA) to remove everything else
DNA is purified by chloroform extraction.
The pure DNA sample that used to be bound to the TF is then sequenced.
Gives you billions of 50 base pair sequences
Computers take these short sequences and align them to the human genome
Because antibodies were used to pull down the DNA in the first place, sequences from the bound regions will be much more common than those from the unbound parts of that gene.
The places where these sequences pile up show where TFs bind in the gene.
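Worked sketch (not part of the original card): the "much more common" idea can be illustrated by counting how many aligned reads cover each position and reporting stretches above a chosen depth. The function names, the toy 300 bp reference, the read positions and the depth cutoff are all hypothetical; real peak callers also model background signal.

```python
def coverage(read_starts, genome_length, read_length=50):
    """Count how many reads cover each position of the reference."""
    depth = [0] * genome_length
    for start in read_starts:
        for pos in range(start, min(start + read_length, genome_length)):
            depth[pos] += 1
    return depth


def call_peaks(depth, min_depth):
    """Return (start, end) half-open intervals where depth >= min_depth."""
    peaks = []
    start = None
    for pos, d in enumerate(depth):
        if d >= min_depth and start is None:
            start = pos                      # a peak opens here
        elif d < min_depth and start is not None:
            peaks.append((start, pos))       # the peak closes here
            start = None
    if start is not None:
        peaks.append((start, len(depth)))    # peak runs to the end
    return peaks


if __name__ == "__main__":
    # Hypothetical read start positions on a 300 bp toy reference.
    starts = [10, 100, 105, 110, 115, 250]
    print(call_peaks(coverage(starts, 300), min_depth=3))  # [(110, 155)]
```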
What two factors are vital for ChIP seq
Must have a completed genome as a reference to start with and reliable antibodies to bind to desired protein to fish it out.
Why is ChIP-seq good
Can look in whole genome
Can see the transcription factor working and therefore helps to determine if it is affected by the gene’s activation process
Define epigenetics
The study of heritable changes in gene function that can occur without the DNA sequence changing
Give 3 examples of epigenetics
DNA methylation – can activate or repress regions of genome
Chromatin remodeling – euchromatin (open) and heterochromatin (closed) caused by histone changes
Gene silencing – through the above mechanisms
How can methylation be inherited
A methyltransferase enzyme only adds a methyl group to a C when it is in a CG sequence that pairs with an already methylated CG on the opposite strand, so the existing pattern is copied onto the new strand after replication.
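Worked sketch (not part of the original card): the copying rule can be modelled by treating each CG site as a position and letting the new strand gain a methyl group only where the old strand already has one. The function names and the example sequence are hypothetical, and sites are indexed by the position of the C on the top strand for simplicity.

```python
def cg_sites(seq: str):
    """Positions of the C in every CG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintain_methylation(seq: str, old_strand_methylated):
    """CG sites methylated on the new strand after maintenance methylation.

    Only CG sites that are already methylated on the old strand are copied;
    unmethylated CG sites and non-CG positions are left alone.
    """
    sites = set(cg_sites(seq))
    return {i for i in old_strand_methylated if i in sites}


if __name__ == "__main__":
    seq = "ACGTTCGACG"                      # hypothetical sequence
    print(cg_sites(seq))                     # [1, 5, 8]
    # Sites 1 and 8 are methylated on the old strand; site 5 is not.
    print(sorted(maintain_methylation(seq, {1, 8})))  # [1, 8]
```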
How can changes to chromatin be inherited?
The heterochromatin proteins are bound to the histones. It unbinds to allow chromosome duplication and in the two new daughter cells it rebinds in the same area.
How can epigenetics be linked to disease
Accidental packaging of DNA into heterochromatin, so it isn’t transcribed and translated in that specific cell type and its daughter cells, meaning the protein isn’t produced.
Methylation of a cytosine residue when it shouldn’t can silence or activate genes
How does epigenetics differ from genetic mutations that cause disease
Epigenetics - usually individuals
Genetics - usually population (CF)
What is the aim of pharmacogenomics and why
Tailored drugs to an individual to give more effective treatments
Every patient will metabolise drugs differently, so understanding genetic differences may help to overcome problems with patients being unresponsive to certain treatments
Give an example of how asthma can be treated with pharmacogenetics
Multifactorial, so maintenance treatment is hard.
Can help to treat based on targeting their specific genomes e.g. ADRB2 gene
What can DNA microarrays do for pharmacogenetics
Can give genome wide expression profiles to see what genes are over expressed and under expressed in specific diseases
How do microarrays work
Take an mRNA sample from a healthy person and one from a diseased person, carry out RT-PCR (reverse transcription PCR) to make cDNA and then carry out PCR to amplify it.
Label the amplified DNA with fluorescent dyes (green - normal and red - diseased).
Mix them in equal amounts and hybridise to the array - spots where both bind equally appear yellow.
Scan -
- more red - over-expression of diseased gene
- more green - under-expression of diseased gene (sliding scale so can give a quantity of imbalance).
This is done for all the genes in the human genome - allowing us to look at over/under-expression of any gene for a person.
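Worked sketch (not part of the original card): the red/green sliding scale is commonly summarised as a log2 ratio of the two intensities, so that equal signal gives 0, more red gives a positive value and more green a negative one. The function name, the two-fold cutoff and the intensity values are assumptions chosen for illustration.

```python
import math


def expression_call(red: float, green: float, fold: float = 2.0):
    """Return (log2 ratio, label) for one spot on the array.

    red: intensity from the diseased sample
    green: intensity from the normal sample
    fold: hypothetical cutoff; changes smaller than this count as similar
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    ratio = math.log2(red / green)
    cutoff = math.log2(fold)
    if ratio >= cutoff:
        return ratio, "over-expressed in disease"
    if ratio <= -cutoff:
        return ratio, "under-expressed in disease"
    return ratio, "similar (yellow)"


if __name__ == "__main__":
    print(expression_call(800, 200))  # (2.0, 'over-expressed in disease')
    print(expression_call(100, 400))  # (-2.0, 'under-expressed in disease')
    print(expression_call(300, 300))  # (0.0, 'similar (yellow)')
```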
How is genome-wide association study (GWAS) used to identify diseases
Look for single nucleotide polymorphisms (SNPs) within the genes that code for the proteins.
Can be used to tailor medicine to treat disease.
SNPs are just different alleles not necessarily mutations
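Worked sketch (not part of the original card): assuming the usual case/control design, which the card itself does not spell out, one simple way to ask whether a SNP allele goes with a disease is the odds ratio of allele counts in affected versus unaffected people. The function name and the counts are hypothetical.

```python
def allele_odds_ratio(case_alt: int, case_ref: int,
                      control_alt: int, control_ref: int) -> float:
    """Odds ratio for carrying the alternative allele in cases vs controls.

    A value near 1 suggests no association; values well above 1 suggest the
    alternative allele is more common in people with the disease.
    """
    if case_ref == 0 or control_alt == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    return (case_alt * control_ref) / (case_ref * control_alt)


if __name__ == "__main__":
    # Hypothetical allele counts at one SNP.
    print(allele_odds_ratio(case_alt=60, case_ref=40,
                            control_alt=30, control_ref=70))  # 3.5
```

A real study would repeat this across many SNPs and test each result for statistical significance.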
What is the future of genomics
Wider reading - Gibson 2010
A chemically synthesised M. mycoides genome (bacteria) was assembled in yeast and then transplanted into a recipient bacterial cell (M. capricolum), replacing that cell’s own genome.
This process, if successful, would allow easy manipulation of its genome to do various things - produce oil, clean up oil, produce drugs
How are these bacterial cells with the chemically synthesized genome made
1078 bp cassettes, built from overlapping synthetic oligonucleotides, were joined in sets of 10 to produce 10 kbp assemblies.
These 10 kbp assemblies were then combined in sets of 10 (each with another 9) to produce 100 kbp assemblies.
The 100 kbp assemblies were combined into a plasmid carried in yeast, ready for transplantation.
What did comparative genomics discover regarding motif regions
WIDER READING - Xie 2005
Motif regions - short (6-10 bp), recurring patterns in DNA that are presumed to have a biological function. Often they indicate sequence-specific binding sites for proteins such as nucleases and transcription factors.
They carried out comparative analysis of the human, mouse, rat and dog genomes and found 105 new motif sites likely to be involved in post-transcriptional regulation - half are linked to microRNA (non-coding RNA that binds to the 3' end of many mRNAs and regulates gene expression by degrading them). This means that microRNA may be more abundant than previously believed.
What role does epigenetics have in cancer
WIDER READING - Morales-Ruiz
Cancer cells show abnormal DNA methylation patterns. Cancer is marked by global hypomethylation, which destabilizes the genome and activates oncogenes. Paradoxically, it is also marked by hypermethylation at specific sites. This turns off tumor suppressor genes. One goal of epigenetics, therefore, is to restore normal DNA methylation patterns.
How did they reverse methylation in colorectal cancer cells
WIDER READING - Morales-Ruiz
Expression of plant 5mC DNA glycosylase induces genome-wide changes in the methylome of CRC cells and important alterations of their phenotype.
Can fuse these plant domains to a DNA-binding domain so they work in human cells
A plant enzyme has to be used as there are no human alternatives
ChIP seq in identifying heart enhancers
WIDER READING - Blow et al.
Few enhancers had been identified in heart tissue compared to other tissue types.
Used mice and ChIP-seq to identify 3000 possible enhancers bound by p300 (a transcriptional co-activator); the 130 of these that were tested were active. Comparison with midbrain enhancers showed that heart enhancers were, overall, poorly evolutionarily conserved.
How was comparative genomics used in identifying pathogen-resistant rapeseed
WIDER READING
Found that nucleotide-binding leucine-rich repeat (LRR) receptors (NLRs) control resistance against intracellular (cell-penetrating) pathogens at various loci
Resistance to pathogens -
NLR genes in resistance against the intracellular pathogen P. brassicae and a putative NLR gene in Rlm9-mediated resistance against the extracellular pathogen L. maculans.
What did pharmacogenetics discover regarding abacavir hypersensitivity (HIV patients)
WIDER READING - Mallal 2002
The 57.1 ancestral haplotype is linked to abacavir hypersensitivity in those with the HLA-B*5701 allele.
Withholding abacavir from patients who carry it should reduce hypersensitivity from 9% to 2.5% without wrongly denying abacavir to any patient.
What did large scale genomic analysis discover regarding CVD
WIDER READING - Schunkert et al.
13 new susceptibility loci for CVD -
ABO and ADAMTS7 - Atherosclerosis
CNNM2 - hypertension