Comparative Protozoan Genomics Flashcards
Give some reasons why you’d do comparative protozoan genomics
Study host-parasite interactions
Identify parasite-specific drug targets and diagnostics
Understand the evolutionary biology of eukaryotes in general
How much coding capacity does Cryptosporidium have compared to Trypanosoma (around 9000 genes)?
Around 4000 genes, reduced because it is an intracellular parasite
What are reads combined into contigs and scaffolds for?
To reconstruct and map each chromosome of the organism
Which highly repetitive genome is difficult to assemble and is left in too many contigs?
Trichomonas (biggest genome)
Annotation has many different levels. Genome annotation includes gene prediction as well as advanced functional characterisation, localisation and evolutionary origin. What does gene prediction mean?
Gene prediction means finding where each gene is located in the sequence, along with features such as promoters, introns and exons
What does the protein functional characterisation category of annotation cover?
Cell localisation, expression during the life cycle, domains, structural similarities
Is the protein always annotated based on function
Not if there are no other similar proteins in databases
How can protein evolution be studied using annotation, and what can taxonomic distribution tell you?
Taxonomic and phylogenetic distribution
Are they orthologues, paralogues or xenologues?
If they’re found only in particular organisms, this potentially gives an idea of function
What are paralogues, orthologues and xenologues?
Paralogue: genes derived from the same ancestral gene by a duplication event. Can be in the same organism
Orthologue: two genes in different species derived from a common ancestral gene by speciation, usually with the same function
Xenologue: a homologue acquired through lateral gene transfer (LGT), which can look like an orthologue
Why is annotation dynamic when gene sequences are static over time?
Dynamic because of new functional data/experimentation (for example transcriptomics data) and better bioinformatics tools, with the aim of making it as detailed as possible for generating hypotheses
Why is gene prediction an inexact science?
Could have a partial genome assembly (usually not full scaffolds)
Don’t know the splice variants, if there are any
Difficult to identify intron/exon boundaries, which can lead to false predictions
Difficult to know where an ORF starts and ends
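A minimal sketch of why ORF boundaries are ambiguous from sequence alone: a naive scan that reports every ATG-to-stop stretch on the forward strand. The function name `find_orfs`, the `min_codons` cutoff and the toy sequence are illustrative assumptions, not a real gene predictor, and the scan ignores introns and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs for every ATG..stop stretch on the forward strand.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the same frame until the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs

# Hypothetical toy sequence: two in-frame ATGs share one stop codon,
# so the sequence alone cannot say which start codon is the real one.
toy = "CCATGAAAATGCCCTAGGG"
print(find_orfs(toy))
```

Both candidates end at the same stop, which is the kind of ambiguity that transcript or protein evidence is needed to resolve.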
Why would transcriptomics help with this?
Transcripts are already spliced, so introns are removed and only transcribed sequence is seen
Also gives information on alternative transcripts/splicing
In which ways can transcriptomics be done, and how can it work together with proteomics/mass spec?
Microarray analysis
Or RNA-seq: RNA is also converted to cDNA, but sequenced with high-throughput methods (NGS)
RT-PCR can also be used to quantify transcripts
Could find that the different transcripts coincide with specific proteins detected by mass spec
How is in silico protein annotation by homology done, and how is taxonomic analysis done?
BLAST search, which requires big databases full of annotated proteins (this is for homology)
Compare with other proteins in the database from different taxa
For info on paralogues, orthologues etc. you’d do sequence alignment and phylogenomic analysis
What info would this give you
Evolutionary info eg if paralogs, xenologs etc
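A common shortcut for proposing orthologue candidates from BLAST-style results is the reciprocal best hit; this is a swapped-in simplification, not the phylogenomic analysis itself. The sketch below assumes hypothetical score tables keyed by (query, subject), and the gene names and scores are invented for illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score,
    for example a BLAST bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical scores for two species, A and B.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b1"): 120.0}
b_vs_a = {("b1", "a1"): 245.0, ("b1", "a3"): 118.0, ("b2", "a2"): 175.0}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a3` hits `b1` but is not its best hit in return, so it is left out; it could be a paralogue, which only a tree would confirm.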
What is the usefulness of this comparative genomics analysis?
Can help confirm an uncertain ORF: if other species have a matching gene, it is likely real
Determine orthologues, paralogues etc. (evolutionary origin of proteins)
Identify taxa-specific genes, e.g. those involved in virulence
What 2 types of protein databases are there for annotation purposes?
Whole-protein databases searched with BLAST, e.g. NCBI, to see known proteins with similar sequences
Domain/motif databases, e.g. Pfam, InterPro
What are the domain databases for?
Grouping proteins by shared domains to infer function, e.g. Pfam PF13402 was found to be M60-like peptidase
Give examples of how the cell localisation of proteins can be dynamic
During the cell cycle, SAV, environmental conditions
Why is it important
To determine accessibility for drugs if they were to target them
Localisation indicates function especially during specific contexts eg transporters
What do the bioinformatics tools SignalP and TMHMM do? Give an example for each
SignalP predicts signal peptides on proteins from their amino acid sequence
E.g. the VSG N-terminus
TMHMM predicts transmembrane domains (TMDs), e.g. on VSPs
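To give a feel for sequence-based prediction, here is a minimal sketch of a Kyte-Doolittle hydropathy scan that flags stretches hydrophobic enough to span a membrane. It is not the TMHMM algorithm (TMHMM uses a hidden Markov model); the 19-residue window and 1.6 cutoff are the conventional Kyte-Doolittle settings used here as assumptions, and the function names and toy sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the protein sequence."""
    values = [KD[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def has_candidate_tmd(seq, window=19, threshold=1.6):
    """True if any window is hydrophobic enough to suggest a TM helix."""
    return any(mean >= threshold for mean in hydropathy_profile(seq, window))

# Hypothetical toy proteins: a leucine-rich core flanked by charged
# residues, and a fully charged sequence with no hydrophobic stretch.
membrane_like = "KKDDEE" + "L" * 19 + "KKDDEE"
soluble_like = "KDE" * 10

print(has_candidate_tmd(membrane_like), has_candidate_tmd(soluble_like))
```

A hydrophobic N-terminal signal peptide can trip this kind of scan too, which is one reason dedicated tools are run side by side.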
Which proteins usually have a signal peptide detected by SignalP?
Secretory, membrane or surface proteins
What is PredGPI for?
Predicting GPI anchors, e.g. on VSG
How can experimentation be used for annotating proteins
Info on function eg if they have enzyme activity
Host binding info
Cellular location (tagging)
When it is expressed (RNA-seq) or translated (tagging) during the life cycle
Which parasite has introns in about 74% of its genes, meaning transcriptomics is needed?
Toxoplasma
Identifying and characterising VSPs and VSGs helped with what?
Determining how monoallelic expression is achieved, via experimentation, e.g. deleting Dicer or identifying histone interactions with VSG
Which database covers eukaryotic parasites and their hosts, is free to access and can be customised for analysis?
VEuPathDB
How was comparative genomics essential for finding genomic differences between drug-resistant and drug-sensitive L. donovani strains (17 different strains from patients)? (Downing et al. 2011)
The comparison found 9 genetic loci that differed, which could potentially be manipulated in future
For example, CNVs varied between the strains, including amplification of the MAPK locus
Suggestive of the selective pressures the strains are put under by drug policies
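As a rough illustration of how a copy number difference can show up when strains are compared, the sketch below normalises per-locus read depth by each strain's median and flags loci whose depth ratio is high. This is not the method of the study above; the locus names, depths and the 1.5 ratio cutoff are hypothetical, and depths are assumed to be positive.

```python
from statistics import median

def normalised_depth(depths):
    """Divide each locus depth by the strain's median depth.

    This corrects for one strain simply being sequenced more deeply.
    """
    mid = median(depths.values())
    return {locus: depth / mid for locus, depth in depths.items()}

def amplified_loci(strain, reference, min_ratio=1.5):
    """Loci whose normalised depth is at least min_ratio times the reference."""
    s = normalised_depth(strain)
    r = normalised_depth(reference)
    return sorted(
        locus for locus in s
        if locus in r and s[locus] / r[locus] >= min_ratio
    )

# Hypothetical mean read depths per locus for two strains.
reference = {"locus_A": 30, "locus_B": 32, "locus_C": 28, "locus_D": 30}
resistant = {"locus_A": 60, "locus_B": 62, "locus_C": 58, "locus_D": 180}

print(amplified_loci(resistant, reference))
```

The resistant strain has roughly double the depth everywhere, but only `locus_D` stands out after normalisation, which is the pattern expected for a locus amplification.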
What was used before NGS to look at the diversity of parasites?
Microsatellites and SNP sequencing