L13&14: Genomics and Genome Projects Flashcards
How is an organism's genomic sequence obtained?
- Obtain the organism's genomic DNA
- Break the DNA into small fragments and obtain the DNA sequence from all fragments
- Search for overlapping regions of identical sequence between different fragments to reconstruct the genome sequence
- Fill in any gaps in the sequence
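The overlap step above can be sketched in code. This is a minimal illustration only, assuming error-free fragments from a single strand and a simple greedy merge; the function names `overlap` and `greedy_assemble` and the `min_len` cutoff are hypothetical, and real assemblers are far more sophisticated.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments sharing the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            # No overlaps left: more than one contig means gaps remain
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


fragments = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(fragments))
```

If the function returns more than one contig, the fragments do not overlap enough to join, which corresponds to the gaps that have to be filled in by further sequencing.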
Why are model organisms used?
Their genome is usually small or easy to manipulate, providing info on fundamental biological processes
Name the main model organisms used
- Yeast (S. cerevisiae)
- Fruit fly (Drosophila melanogaster)
- Nematode worm (C. elegans)
- Western clawed frog (Xenopus tropicalis)
- House mouse (M. musculus)
- Zebrafish (D. rerio)
- Bacterium (B. subtilis)
How big is a valid open reading frame?
This is subjective. A computer cannot distinguish between a random run of triplets and an actual gene. There can be up to 50 triplets without a stop codon that still do not form a gene. The longer an open reading frame is (the more triplets before a stop codon is reached), the more likely it is to be part of a gene coding for a protein
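A minimal sketch of how a length cutoff is applied when scanning for open reading frames. It assumes forward-strand reading frames only and an ATG start codon; `find_orfs` and the `min_codons` parameter are hypothetical names, and the cutoff value is an arbitrary choice, which is exactly why the answer is subjective.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100) -> list[tuple[int, int]]:
    """Return (start, end) positions of ATG...stop ORFs on the forward strand.

    An ORF is kept only if it has at least min_codons codons,
    counting the ATG but not the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return orfs


# A 61-codon ORF passes a 50-codon cutoff but fails a 100-codon cutoff
seq = "ATG" + "GCT" * 60 + "TAA"
print(find_orfs(seq, min_codons=50))
print(find_orfs(seq, min_codons=100))
```

The same sequence is reported or ignored depending on the cutoff chosen, so a stricter cutoff gives fewer false positives but misses short genes.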
Explain an issue with identifying genes within genomes
The identification of RNA splice sites
- RNA analysis can help but depends on the expression of the gene
What can computer analysis of protein sequence be used for?
- prediction of functions (roles for model organisms)
- prediction of protein localisation
- prediction of protein domains/modification
Explain the purpose of computer analysis of protein sequences in predicting function
We can use info gained from model organisms to understand the function of newly identified human genes
- useful in finding fundamental insights about protein functions and links to disease
- allows functional characterisation of mutant proteins
Explain how computer analysis of sequences can be used to predict protein localisation
A protein's localisation can be related to its function, using both computer-based and lab-based techniques
Explain how computer analysis of sequences can be used to predict protein domains/modification
We can use various programmes, including BLAST, to identify conserved domains, or use the NetPhos programme to see if a protein is likely to be modified.
Give an example of how computer analysis can be used to predict protein modification
The NetPhos programme can be used to search for potential serine/threonine/tyrosine phosphorylation sites
- in the lab, this protein phosphorylation can be investigated in vivo
- can test genetically & biochemically the potential role of the programme-predicted kinase
- mutate threonine to a glutamic/aspartic acid residue (mimics phosphorylation) or an alanine residue (cannot be phosphorylated) to investigate the role of phosphorylation
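A rough sketch of the mutation step, for illustration only. This is not NetPhos and does no scoring: it simply lists every serine/threonine/tyrosine in a sequence as a candidate and builds point mutants. The function names and the short peptide are hypothetical.

```python
PHOSPHO_RESIDUES = {"S", "T", "Y"}  # serine, threonine, tyrosine


def candidate_sites(protein: str) -> list[tuple[int, str]]:
    """1-based positions of every Ser/Thr/Tyr residue (unscored candidates)."""
    return [(i + 1, aa) for i, aa in enumerate(protein.upper())
            if aa in PHOSPHO_RESIDUES]


def point_mutant(protein: str, position: int, new_residue: str) -> str:
    """Return the protein with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(protein):
        raise ValueError("position outside the protein sequence")
    return protein[:position - 1] + new_residue + protein[position:]


peptide = "MKTAYSL"  # made-up example sequence
print(candidate_sites(peptide))
print(point_mutant(peptide, 3, "E"))  # Thr3 -> Glu
print(point_mutant(peptide, 3, "A"))  # Thr3 -> Ala
```

Comparing the behaviour of the two mutants with the unmutated protein in the lab is what tests whether phosphorylation at that site matters.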
What are the 2 main uses of genome sequence within an organism?
- Identification of regulatory sequences
- Characterisation of protein families
How is genome sequence used to identify regulatory sequences?
Identify all promoters containing a transcription factor binding site
- however, computer analysis alone isn't enough, so lab experimentation is needed to prove predictions
- earlier analysis limits identification of slightly different sequences
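A minimal sketch of a promoter search, showing why an exact match can miss slightly different sequences. The promoter sequences, gene names, and motif are made up for illustration, `promoters_with_site` is a hypothetical helper, and only the forward strand is searched.

```python
import re


def promoters_with_site(promoters: dict[str, str], pattern: str) -> list[str]:
    """Names of promoters whose sequence contains the binding-site pattern."""
    regex = re.compile(pattern)
    return [name for name, seq in promoters.items() if regex.search(seq.upper())]


# Hypothetical promoter sequences
promoters = {
    "geneA": "TTGACGTCAAT",  # exact copy of the site
    "geneB": "TTGACGTAAAT",  # one base different
    "geneC": "CCCCCCCC",     # no site
}

print(promoters_with_site(promoters, "TGACGTCA"))      # exact site only
print(promoters_with_site(promoters, "TGACGT[AC]A"))   # allows A or C at one position
```

The exact search finds only geneA, while the looser pattern also finds geneB. Either way the hits are only predictions until they are confirmed in the lab.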
How is genome sequence used to characterise protein families?
Kinases have well-characterised homology within their catalytic domains, so a computer can readily identify them
- genome analysis allows inference of the function of uncharacterised kinases by family studies
- genome analysis allows identification of conserved and organism-specific families of protein kinases (comparisons can provide info)