Gene finding Flashcards
Define open reading frame (ORF)
A stretch of DNA whose length is a multiple of 3, that begins with the start codon (ATG), ends with one of the 3 stop codons (TAA, TAG, TGA), and contains no other in-frame stop codon
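The definition above can be turned into a small scanner. This is a minimal sketch, not a full gene finder: it assumes an uppercase-able DNA string, looks only at the three forward-strand frames, and reports the first ATG before each stop. The names `find_orfs` and `min_codons` are illustrative choices, not part of the flashcards.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=1):
    """Return (start, end) index pairs of ORFs on the forward strand.

    start is the index of the A in ATG; end is one past the stop codon,
    so dna[start:end] is the ORF including its start and stop codons.
    min_codons counts the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG in this frame, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return orfs

print(find_orfs("CCATGAAATAGCC"))
```

For the example string, the only ORF is ATGAAATAG in the third frame. An ATG with no downstream in-frame stop is not reported, since it does not meet the definition.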
Define frameshift mutation
A mutation that inserts or deletes a number of nucleotides that is not a multiple of 3, shifting the reading frame of every codon downstream
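A short illustration of why the size of the insertion or deletion matters. The sequence and the mutation positions below are made up for the example, and `codons` is a hypothetical helper.

```python
def codons(dna):
    """Split dna into consecutive triplets, dropping any incomplete tail."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]

original = "ATGGCCAAATGA"

# Delete one base (index 3): every downstream codon changes.
frameshift = original[:3] + original[4:]

# Delete three bases (indices 3-5): one codon is lost, the frame is kept.
in_frame = original[:3] + original[6:]

print(codons(original))    # ['ATG', 'GCC', 'AAA', 'TGA']
print(codons(frameshift))  # ['ATG', 'CCA', 'AAT']
print(codons(in_frame))    # ['ATG', 'AAA', 'TGA']
```

After the single-base deletion the original stop codon TGA is no longer read in frame, whereas the three-base deletion leaves the remaining codons intact.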
Explain evidence-based gene finding
Identify expressed RNA or protein sequences and align them back to the genome to locate the genes that produced them
Explain ab initio gene prediction
- Find open reading frames (ORFs)
- Estimate the probability that each ORF arose by chance, using statistical approaches, or match it against a database of known motifs
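One very simple statistical approach to the second step, sketched under a strong assumption: nucleotides are independent and equally frequent, so each codon is a stop codon with probability 3/64. Real genomes violate this assumption, so treat the function as an illustration only; `prob_no_stop` and `p_stop` are names chosen for the example.

```python
def prob_no_stop(n_codons, p_stop=3 / 64):
    """Probability that n_codons consecutive random codons contain no stop.

    Assumes independent, uniformly distributed nucleotides, so 3 of the
    64 possible codons are stops. Pass a different p_stop to model a
    genome with skewed base composition.
    """
    return (1 - p_stop) ** n_codons

for n in (10, 50, 100, 300):
    print(n, prob_no_stop(n))
```

Under this null model a stop-free run of 100 codons has a probability below 1%, which is why long ORFs are treated as candidate genes rather than chance events.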
Explain gene finding by comparative genomics approach
Detect conserved DNA regions by comparing a large set of related genomes; regions conserved across species are likely to be functional
Define homolog
A gene related to a second gene by descent from a common ancestral DNA sequence.
Define ortholog
Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologs retain the same function in the course of evolution.
Define paralog
Paralogs are genes related by duplication within a genome. Paralogs can have different functions.
Explain randomization test
Simulate random data sets according to the null model.
The p-value is the fraction of simulated data sets whose test statistic is at least as extreme as the observed one.
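A minimal sketch of a randomization test applied to gene finding. The assumptions are all illustrative: the null model is random DNA with equally frequent bases, the test statistic is the longest run of stop-free codons in frame 0, and a simulation counts as a hit when its statistic is at least as large as the observed value. The function names and parameters are hypothetical.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_stop_free_run(dna):
    """Test statistic: longest run of consecutive non-stop codons in frame 0."""
    best = run = 0
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best

def randomization_p_value(observed, length, n_sim=1000, seed=0):
    """Fraction of simulated sequences whose statistic is >= observed.

    Null model (an assumption): each base drawn independently and
    uniformly from A, C, G, T. The seed makes the result reproducible.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        dna = "".join(rng.choice("ACGT") for _ in range(length))
        if longest_stop_free_run(dna) >= observed:
            hits += 1
    return hits / n_sim

# Example: how surprising is a 60-codon stop-free run in 300 bases?
print(randomization_p_value(observed=60, length=300))
```

The resolution of the p-value is limited by `n_sim`: with 1000 simulations the smallest nonzero value that can be reported is 0.001.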