Genomic sequencing Flashcards
What are the two types of gene (protein-coding sequence) prediction?
Based on homology: characterises new genes by comparing them with the sequence of a related gene that is already known
Ab initio (from the beginning): algorithms that identify sequences in the DNA that signal a gene (e.g. a promoter)
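As a rough illustration of the ab initio idea, the sketch below scans a DNA string for one simple gene signal: an open reading frame (a start codon ATG followed in the same frame by a stop codon). The function name `find_orfs` and the `min_codons` threshold are assumptions made for this example; real gene finders also use promoters, splice sites and statistical models, and scan both strands.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs; end is exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons including the start and stop codons
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Toy sequence (made up for illustration)
print(find_orfs("CCATGAAATTTTAGGG"))
```

On this toy sequence the only ORF is ATG AAA TTT TAG, found in the third reading frame.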
DNA sequence polymorphism
- Mostly single nucleotide polymorphisms
- intergenic: between genes, outside the coding region
- intragenic: within a gene, inside the coding region > usually bigger impact
Genome-wide association study example
In cattle
- bulls that have curly hair are more susceptible to ticks and insects
- SNP in sequence that codes for keratin, which changes how the hair falls
Pan genome
describes all genes and genetic variation within a species
- Core genome: common to all individuals
- Dispensable genome: shared by only a subset of individuals
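The core/dispensable split can be sketched with plain set operations, assuming each individual is represented by the set of genes it carries. The function name `split_pan_genome` and the gene and individual names are hypothetical, chosen only for this example. Here "dispensable" is taken to mean every gene that is not in the core, which also includes genes found in a single individual.

```python
def split_pan_genome(gene_sets):
    """gene_sets: dict mapping individual -> set of gene names."""
    sets = list(gene_sets.values())
    pan = set().union(*sets)                           # every gene seen in any individual
    core = set.intersection(*sets) if sets else set()  # genes present in all individuals
    dispensable = pan - core                           # genes missing from at least one individual
    return pan, core, dispensable

# Hypothetical presence/absence data
individuals = {
    "ind1": {"geneA", "geneB"},
    "ind2": {"geneA", "geneC"},
    "ind3": {"geneA", "geneB", "geneD"},
}
pan, core, dispensable = split_pan_genome(individuals)
print(sorted(core), sorted(dispensable))
```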
Comparative genomics
Compares the complete genome sequences of different species
Tool for studying evolution, variation in gene content, and gene function
Genome-wide association study
- Identify risk locus > identify which SNPs are more likely to be associated with the trait, based on case-control studies
- Validate risk locus with fine mapping and define the lead association signal
- predict causal SNPs
- Identify disease mechanism > candidate gene/protein function
- Link to phenotype > examine effect of altered gene function on cellular phenotype
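The first step above (finding SNPs associated with a trait in a case-control design) can be sketched as a chi-square test on allele counts. This is a minimal illustration, not a full GWAS pipeline: the function names, SNP identifiers and counts are hypothetical, every row and column total is assumed to be non-zero, and real studies also correct for multiple testing and population structure.

```python
import math

def allele_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row = [sum(r) for r in table]
    col = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total  # count expected if there is no association
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def rank_snps(counts):
    """counts: dict snp_id -> (case_alt, case_ref, ctrl_alt, ctrl_ref); lowest p-value first."""
    results = {snp: allele_chi_square(*c) for snp, c in counts.items()}
    return sorted(results.items(), key=lambda kv: kv[1][1])

# Hypothetical allele counts for two SNPs
for snp, (chi2, p) in rank_snps({"snp1": (50, 50, 50, 50), "snp2": (60, 40, 40, 60)}):
    print(snp, round(chi2, 2), p)
```

SNPs at the top of the ranking would be the candidates carried forward to fine mapping.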
Transcriptome
- sequences RNA instead of DNA
- when, why and where are genes expressed
- how treatments affect expression
Single cell RNA-Seq
Identify which genes differ most in expression between single cells
For cancer cells: which genes are driving the cancer
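One simple way to picture "genes that differ most between cells" is to rank genes by the variance of their expression across cells. The sketch below assumes expression values are already normalised; the function name `most_variable_genes` and the gene names and values are hypothetical. Real single-cell workflows use dedicated normalisation and statistical tests instead of raw variance.

```python
from statistics import pvariance

def most_variable_genes(expression, top_n=2):
    """expression: dict gene -> list of expression values, one per cell."""
    variances = {gene: pvariance(values) for gene, values in expression.items()}
    # Highest variance first: these genes differ most from cell to cell
    return sorted(variances, key=variances.get, reverse=True)[:top_n]

# Hypothetical expression values for four cells
cells = {
    "geneA": [5, 5, 5, 5],
    "geneB": [0, 10, 0, 10],
    "geneC": [4, 6, 4, 6],
}
print(most_variable_genes(cells))
```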
Proteomics
Study of the entire set of proteins (the proteome) produced by a cell type - can be inferred from RNA-seq, measured directly by mass spectrometry
- used to study how proteins are modified, how they interact in metabolic processes
Protein sequence similarity searches
Local alignment: finds the most similar regions of two DNA or protein sequences, without aiming to align their full length
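Local alignment can be illustrated with the Smith-Waterman dynamic programming algorithm, a standard method for this task (database search tools use faster heuristics). The scoring values below (match 2, mismatch -1, gap -2) and the example sequences are assumptions chosen for this sketch, not values from the flashcards.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor of 0 lets an alignment restart anywhere: this is what makes it local
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to 0
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Only the shared ACGT core aligns; the flanking bases are ignored
print(smith_waterman("TTTACGTAAA", "GGGACGTCCC"))
```

The same function works on protein sequences, although protein searches normally replace the fixed match/mismatch scores with a substitution matrix.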