Part 3 Flashcards
How do you identify a gene?
- Sequencing DNA
- Assembling and annotating the human genome
- Methods of gene production
- Identifying genes by microarray experiments
Analysing gene function
- Reverse genetics
- Forward genetics: genetic screens; complementation analysis assigns mutations to individual genes; linkage analysis allows genes to be positionally cloned
Cost per genome
£7,800 for your genome
*used to be $100m
“Data mining” genomic sequence: how do we find genes in the nucleotide sequence?
- Use gene prediction software to scan the sequence for promoters, start and stop codons, and intron splice sites
- Use a computer to translate the DNA in all 6 reading frames, then search for known similar proteins (BLAST)
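A minimal sketch of translating DNA in all 6 reading frames, assuming the standard genetic code. The function names and the example sequence are illustrative only, and real gene prediction tools do far more than this.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse strand read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; trailing bases are ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Translate 3 frames on the forward strand and 3 on the reverse strand."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical 9-base sequence, for illustration only.
for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the six translated sequences could then be used as a query in a protein similarity search.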
BLAST protein alignment
- Input the amino acid sequence of the proposed protein; the BLAST program searches databases for proteins with similar sequences
Shows alignment of an uncharacterised protein (query) to a protein called Zen (subject)
What is the PA translation from gene Zen?
CG1046: Length = 353, Score = 150, Identities = 31/57, Positives = 39/57, Query = 57, Subject = 147
Similarity found between the protein sequences suggests that the proteins evolved from a common ancestor and have similar functions
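As a quick worked example, the identity and positive counts in the alignment above can be turned into percentages. This is plain arithmetic on the numbers quoted in the card; the helper name is illustrative.

```python
def percent(matches, aligned_length):
    """Return matches as a percentage of the aligned length."""
    return 100 * matches / aligned_length


ALIGNED_LENGTH = 57  # aligned positions reported in the card

identities = percent(31, ALIGNED_LENGTH)  # identical residues
positives = percent(39, ALIGNED_LENGTH)   # identical or similar residues

print(f"Identities: {identities:.1f}%")
print(f"Positives:  {positives:.1f}%")
```

So roughly half of the aligned positions are identical, and about two thirds are identical or chemically similar.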
Looking at the genome online
EST (expressed sequence tag): short sequences from the ends of cDNAs
- predicted homology
- predicted genes
- genomic assembly
- reverse strands
Microarrays
Allow us to compare the transcriptomes of different tissues to each other, e.g. normal liver tissue to cancerous liver tissue
High throughput = small scale, fast and automated
How do microarrays work?
1. A very precise robot manufactures the array
each position on the grid contains one cDNA (as single strand)
One spot for every gene in the genome
2. Purify mRNA from liver tissue and tag it with fluorescent dye
3. Put the mRNA onto the array (hybridise), then rinse off the excess
Housekeeping genes and what genes are we looking for?
Most of the genes in the normal and tumour samples are the same
Look for:
- genes that are lost in tumour tissue (potential tumour suppressor genes)
- genes that are activated in tumour (potential activators)
Using grid coordinates we can look up the identity of these genes
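A rough sketch of how spots might be compared between the two samples. The gene names, signal values, and the 2-fold threshold are all made-up assumptions for illustration; real microarray analysis also involves normalisation and statistics.

```python
import math


def classify_spots(normal, tumour, threshold=1.0, floor=1.0):
    """Label each gene by its log2(tumour / normal) signal ratio.

    threshold=1.0 means a 2-fold change (an assumed cut-off);
    floor avoids dividing by zero for very weak spots.
    """
    calls = {}
    for gene in normal:
        ratio = math.log2(max(tumour[gene], floor) / max(normal[gene], floor))
        if ratio <= -threshold:
            calls[gene] = "lost in tumour"
        elif ratio >= threshold:
            calls[gene] = "activated in tumour"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical fluorescence intensities, keyed by gene.
normal = {"geneA": 800, "geneB": 50, "geneC": 400}
tumour = {"geneA": 40, "geneB": 900, "geneC": 420}

for gene, call in classify_spots(normal, tumour).items():
    print(gene, call)
```

Here geneC behaves like the majority of genes (the same in both samples), while geneA and geneB are the kind of spots worth looking up.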
3 different ways to identify genes?
- by making a library of cDNA clones from mRNA
- by making a library of genomic clones then make predictions based upon genomic sequences
- Identify sets of interesting genes using microarrays
All of these use a gene cloned into a plasmid or sequence
Genetic engineering in mice
Powerful but time-consuming and expensive
With endogenous genes you can either:
A) Gene replacement: replace the gene (perhaps with a mutant gene that has been found) and test whether the mutation causes the disease symptoms. ONLY the mutant gene is active
B) Gene knockout: completely remove the gene to determine its function (must have a genomic clone of the gene). No active gene is present
How can we knockout a single gene in mice?
- Acquire a genomic clone of the gene and insert NEO into an exon (destroying the activity of the gene); TK is placed to one side of it
- Sequence from the target = homologous arms (homology to the mouse gene)
- Introduce the construct into mouse ES cells using cell culture techniques
- Homologous recombination occurs only sometimes (giving the KO); when it does occur, the TK gene is lost
Double selection identifies KO
- Selective markers (NEO and TK) are used to identify colonies that are the result of homologous recombination
- Cells that have integrated NEO can now grow on neomycin-containing media
- Cells containing the TK gene along with NEO will die when grown on ganciclovir (GANC) media
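The double selection logic can be sketched as a small truth table. This is only a model of the reasoning in the cards above; the function and variable names are illustrative.

```python
def survives_double_selection(has_neo, has_tk):
    """Model of positive (NEO) plus negative (TK) selection.

    - Positive selection: only cells carrying NEO grow on neomycin media.
    - Negative selection: cells carrying TK die on the TK counter-selection media.
    """
    survives_positive = has_neo
    survives_negative = not has_tk
    return survives_positive and survives_negative


# (has_neo, has_tk) for the three outcomes described in the cards.
outcomes = {
    "no integration": (False, False),
    "random integration (whole construct)": (True, True),
    "homologous recombination (TK lost)": (True, False),
}

for outcome, (neo, tk) in outcomes.items():
    status = "survives" if survives_double_selection(neo, tk) else "dies"
    print(f"{outcome}: {status}")
```

Only the homologous recombinants pass both selections, which is why the surviving colonies are the knockouts.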
How can we target a single gene in mice?
The selected cell line is re-introduced into mouse embryos
- First generation = mosaic (mixture of stem cell-derived and mother-derived cells, including in the gonads)
- Mosaics are bred to generate non-mosaic carriers of the transgene (2nd generation)
- Carriers are then interbred to create homozygous mutants for analysis (3rd generation)
Transgene
With one copy of the target gene replaced by an altered gene in the germ line
Genetic material or a gene transferred, naturally or by any of the genetic engineering methods, from one organism to another
Injecting it into mice
Wait 3 days
Inject the ES cells into an early embryo; the embryo is then partly formed from ES cells
Introduce into a pseudopregnant mouse
Birth
Somatic cells of offspring tested for presence of altered gene
Selected mice are bred to pass the altered gene down the germline
Forward genetics
- randomly mutate the genome
- look for interesting phenotypes in the offspring
- Identify the gene that causes the defect
Because random mutagenesis affects the whole genome, many mutagenised animals have to be analysed to find a defect; it is easier to use zebrafish, C. elegans and Drosophila
Problems for genetic screens
Loss of certain cells or tissues
Disease like phenotype
Biochemical abnormalities
loss of hearing, vision, behaviour and drug addiction
Forward vs reverse
Forward = we find a mutant; start with only a phenotype, don't know what the gene encodes (function to gene)
Reverse = know the gene and want to find its function (KO) (gene to phenotype)
Forward genetic screen of flies
Mutagenise the male = each sperm has a different set of mutations (EMS is a chemical mutagen)
+/+ x +/+ = P0; offspring are heterozygous for mutations
+/+ x +/m = F1
+/m x +/m = F2 incross to see homozygous embryos (what they look like)
=m/m =F3
- 3 generations to make homozygous
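A small sketch of the expected offspring from a cross, to show why carriers have to be crossed to each other before homozygous (m/m) embryos appear. It assumes simple Mendelian segregation of a single mutation; the function name is illustrative.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return expected offspring genotype fractions for a one-gene cross.

    Each parent is a pair of alleles, e.g. ("+", "m").
    """
    offspring = Counter()
    for a, b in product(parent1, parent2):
        genotype = "/".join(sorted((a, b)))  # "+" sorts before "m"
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


print(cross(("+", "+"), ("+", "m")))  # outcross: no m/m offspring
print(cross(("+", "m"), ("+", "m")))  # incross: a quarter are m/m
```

An outcross to wild type only produces more carriers, whereas an incross between two carriers gives one quarter homozygous mutants.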
If different mutations have the same phenotype, are they different alleles of the same genes?
m1/+ x m2/+ = m1/m2
If the mutants fail to complement (m1/m2 offspring show the phenotype), they are alleles of the same gene
m1/+ x m3/+ = m1/+ ; m3/+
If the mutants complement (no offspring with the phenotype), the mutations are in different genes
Complementation analysis allows mutations to be sorted into distinct groups
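Sorting mutations into complementation groups can be sketched as grouping together every pair that fails to complement. The mutation names and test results below are hypothetical, and the sketch assumes the pairwise results are consistent with each other.

```python
def complementation_groups(mutations, failed_pairs):
    """Group mutations: pairs that fail to complement share a gene."""
    parent = {m: m for m in mutations}

    def find(m):
        # Follow links up to the representative of m's group.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in failed_pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for m in mutations:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical results: m1 and m2 fail to complement; m3 complements both.
print(complementation_groups(["m1", "m2", "m3"], [("m1", "m2")]))
```

Each resulting group corresponds to one gene, so the number of groups estimates how many genes the screen has hit.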
When we find a mutant, we start with only a phenotype and don't know what the gene encodes; linkage analysis is used to identify the gene
By analysing recombination between our alleles (a and b) on the same chromosome, we determine whether the gene and marker are linked
Greater distance between genes = more frequent crossing over in meiosis
Meiosis and genetic recombination
Maternal (AB) and paternal (ab) -> recombination in meiosis -> Ab and aB haploids (egg and sperm)
Calculating recombination frequency
(R / T) x 100 = recombination frequency in centimorgans (cM)
cM = measurement of genetic distance
R = number of recombinant gametes (counted)
T = total number of gametes (counted)
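A worked example of the formula above, assuming T is the total number of gametes counted. The counts are made up for illustration.

```python
def genetic_distance_cm(recombinants, total):
    """Recombination frequency as a percentage, read as centimorgans."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * recombinants / total


# Hypothetical counts: 12 recombinant gametes out of 200 scored.
distance = genetic_distance_cm(12, 200)
print(f"{distance} cM")
```

With these counts the gene and marker would be 6 cM apart; a smaller value means tighter linkage.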
Calculating the genetic distance between a human disease gene and a marker
SNP- single nucleotide polymorphisms
Markers that are easy to analyse and vary between individuals
Over 1 million SNPs have been placed on the human genome
DNA samples are taken and tested for SNPs; if a SNP is present in affected children but not in unaffected ones, we know it is linked to the disease gene
Once you have found closely linked SNPs you can look for candidate genes
- All of the SNPs have been placed on human genome
- If 2 SNPs are the closest to your disease gene, then a gene lying between them is a good candidate to sequence for mutations
How do mutations affect gene function?
- Changes in regulatory sequence: changes in DNA that affect transcription
- Changes in non-coding sequence
- Changes in coding sequence: may alter an important amino acid or the fold of the protein; a premature stop codon creates a truncated protein
- missense = amino acid substitution
- nonsense= early stop codon
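A minimal sketch of telling missense from nonsense changes at the level of a single codon, assuming the standard genetic code. The "silent" label (same amino acid) is not in the cards and is added here as an assumption so that every change gets a label; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_change(wild_type, mutant):
    """Classify a codon change as silent, nonsense or missense.

    Changes that remove an existing stop codon are not treated
    separately here and fall under "missense".
    """
    before = CODON_TABLE[wild_type.upper()]
    after = CODON_TABLE[mutant.upper()]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


print(classify_codon_change("GAG", "GTG"))  # Glu -> Val
print(classify_codon_change("TGG", "TGA"))  # Trp -> stop
print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
```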
Examples where mutations in the coding sequence often affect primary function
- amorphic/ non-functioning
- hypomorphic/ weakened
- anti-morphic/ dominant
- hypermorphic/ overactive
Amorphic/ non- functioning
Missense mutation that completely inactivates the DNA binding domain
+/- = normally there is enough gene product from one wt copy (haplosufficient)
-/- = strong phenotype due to no transcriptional activation; recessive
Hypomorphic/ weakened
Missense mutation that weakens DNA binding domain
+/- = normally enough wt; mutant may dimerize
-/- = mild phenotype due to poor transcriptional activation; complex forms on DNA
Antimorphic/ dominant
Missense mutation that destroys the dimerisation domain
+/- = mutant binds DNA but doesn't dimerize with wt
-/- = completely inactive
Hypermorphic/ overactive
Missense mutation - overactivation that is independent of dimerization
+/- = mutant binds DNA all the time
-/- = the same
Types of phenotypes produced by mutations
- Loss of function
- amorphic: complete loss of function, typically protein nulls or deletion of the entire gene (early nonsense)
- hypomorphic: reduction of wt function (enhancer and missense)
- antimorphic: competitive inhibitors; mutations that affect one domain of the protein; heterozygous form still partially active; mutant proteins interact with and poison the wt protein (dominant negative)
- Gain of function
- hypermorphic: overexpression of the transcription unit, dominant
Control of gene expression- RNA
- Isoforms
- Subcellular localisation can be used to target translation to the part of the cell where it is needed
- Translation can be directly regulated by sequences in UTRs (untranslated region) or globally by regulation of eIFs (Eukaryotic initiation factor)
- Some mRNAs have a 2nd ORF (Open reading frame) that can be regulated independently
- RNA degradation
Isoform gene
A variety of different proteins made from a single gene
Where does regulation of gene expression occur
At almost every level
Transcription - splicing (final mRNA) - translation (RNA)
Alternative splicing creates isoforms
40% of Drosophila and 75% of human genes are alternatively spliced
Splice donor and acceptor are only 2 bases so very frequent
Other sequences and secondary structure in RNA affect choice of splice
Choices of splice site
- Optional exon
- Optional intron
- Mutually exclusive exons
- Internal splice site
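As an illustration of the simplest case, an optional exon, the sketch below lists the mRNA isoforms that one gene could give. The exon names are hypothetical, and the other splice choices are not modelled.

```python
from itertools import product


def isoforms(exons):
    """List exon combinations for a gene with optional exons.

    exons is a list of (name, optional) pairs in gene order.
    An optional exon can be either included or skipped.
    """
    choices = [(name, None) if optional else (name,) for name, optional in exons]
    return ["-".join(e for e in combo if e) for combo in product(*choices)]


# Hypothetical gene: exon 2 is optional.
print(isoforms([("E1", False), ("E2", True), ("E3", False)]))
```

Each additional optional exon doubles the number of possible isoforms, which is one reason a single gene can give a variety of proteins.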
Regulation of alternative splicing
Sex determination in Drosophila
3 genes control male and female differentiation: Sxl, tra and dsx (Sex-lethal, transformer, doublesex)
Difference in splicing in males and females
Males (1 X chromosome): Sxl and tra transcripts are spliced to give rise to inactive proteins; dsx transcripts give rise to male-specific repressor proteins, which repress transcription of genes for female development
Females (2 X chromosomes): a small amount of Sxl protein is made (alternative promoter), which represses splicing by blocking binding of U2AF; it feeds back on itself to make more Sxl and binds to tra, so female-specific dsx is produced
The site of polyadenylation on an mRNA can be regulated
B lymphocytes produce 2 antibody isoforms
- This antibody has 2 possible positions for cleavage
Transcription with different size transcripts
Long RNA transcript
- first stop codon is spliced out
- transmembrane domain translation
- membrane bound antibody
- Terminal hydrophobic peptide
Short RNA transcript
- Splice acceptor lost and the first stop codon is not lost (Intron not spliced out)
- Antibody is secreted
- Hydrophilic peptide
*Alternative endings allow different isoforms