L2 Gene annotation and transcript analysis Flashcards
Gene annotation
Genome annotation involves mapping features such as protein coding genes and their multiple mRNAs, pseudogenes, transposons, repeats, non-coding RNAs, SNPs as well as regions of similarity to other genomes onto the genomic scaffolds.
Exons
Exons are sections of an RNA transcript, or the DNA encoding it, that are kept in the mature RNA after splicing. In protein-coding genes they carry the sequence that is translated into protein.
Introns
Introns are noncoding sections of an RNA transcript, or the DNA encoding it, that are spliced out before the RNA molecule is translated into a protein.
Open reading frames
Open reading frames (ORFs) are parts of a reading frame that contain no stop codons. A reading frame is a sequence of nucleotide triplets that are read as codons specifying amino acids; a single strand of DNA sequence has three possible reading frames.
More strictly, an ORF is a sequence of successive nucleotide triplets that are read as codons specifying amino acids, beginning with an initiator (start) codon and ending with a stop (terminator) codon.
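The reading-frame idea above can be sketched in a few lines of Python. This is a minimal illustration, not a real gene finder: it assumes the standard start codon ATG and stop codons TAA, TAG and TGA, scans only the given strand, and the function name `find_orfs` and its `min_codons` parameter are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard genetic code
START_CODON = "ATG"


def find_orfs(seq, min_codons=1):
    """Return (frame, start, end) for each ATG...stop ORF on one strand.

    start is the 0-based index of the A in ATG; end is exclusive and
    points just past the stop codon. min_codons counts the codons
    before the stop, including the start codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # a single strand has three reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open an ORF at the first ATG in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return orfs


example = "CCATGAAATAGG"
orfs = find_orfs(example)
```

Running the same function on the reverse complement of the sequence would cover the other three of the six reading frames of double-stranded DNA.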
Sense strand
One strand of the DNA is the coding strand (or sense strand).
The sense strand is the strand of DNA that has the same sequence as the mRNA, with T in place of U.
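Antisense strand
Also called the template strand.
The antisense strand is the DNA strand that is read during transcription. It is complementary to the mRNA, so it serves as the template from which the protein code is copied.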
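The relationship between the two strands can be checked with a small Python sketch. It assumes both strands are written 5' to 3' and contain only A, C, G and T; the helper names `reverse_complement` and `transcribe_from_template` are illustrative, not from any library.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe_from_template(template):
    """Build the mRNA that is complementary to the template strand."""
    return reverse_complement(template).replace("T", "U")


sense = "ATGGCCTAA"                      # coding (sense) strand
template = reverse_complement(sense)     # antisense (template) strand
mrna = transcribe_from_template(template)
```

Here `mrna` comes out identical to the sense strand apart from U replacing T, which is why the sense strand is also called the coding strand.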
BLAST
Basic Local Alignment Search Tool
BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein query sequences to sequence databases and calculates the statistical significance of the matches.
blastn
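nucleotide query => nucleotide database
blastp
protein query => protein database
blastx
translated nucleotide query => protein database
tblastn
protein query => translated nucleotide database
tblastx
translated nucleotide query => translated nucleotide database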
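The five program names above can be summarised as a lookup on query type and database type. The Python sketch below is only a memory aid: the function `choose_blast_program` and its `compare_as_protein` flag are hypothetical and are not part of BLAST itself.

```python
def choose_blast_program(query_type, db_type, compare_as_protein=False):
    """Pick a BLAST program name from the query and database types.

    query_type and db_type are "nt" (nucleotide) or "aa" (protein).
    compare_as_protein only matters for nt vs nt: if True, both sides
    are translated and compared as protein (tblastx).
    """
    key = (query_type, db_type)
    if key == ("nt", "nt"):
        return "tblastx" if compare_as_protein else "blastn"
    table = {
        ("aa", "aa"): "blastp",   # protein vs protein
        ("nt", "aa"): "blastx",   # translated query vs protein
        ("aa", "nt"): "tblastn",  # protein vs translated database
    }
    return table[key]


program = choose_blast_program("nt", "aa")
```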
E value
The expected number of hits with a score at least this high that would occur by chance in a database of this size.
- Want a low E value: the lower it is, the less likely the match is due to chance.
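As a rough illustration of why the E value depends on database size, the Python sketch below uses the standard relation between bit score and E value, E = m * n * 2^(-S'), where m is the query length, n is the database length and S' is the bit score. The function name `expected_hits` is made up for this example, and real BLAST uses effective (edge-corrected) lengths, so its reported values will differ somewhat.

```python
def expected_hits(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


small_db = expected_hits(bit_score=10, query_len=32, db_len=32)
large_db = expected_hits(bit_score=10, query_len=32, db_len=64)
```

The same bit score gives a larger E value against the larger database, so a hit has to score higher to be equally convincing.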
ENCODE
Encyclopedia of DNA Elements
Aims to identify all functional elements in the human and mouse genomes.
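DNase I hypersensitivity
- DNase I won’t chop up DNA that is wrapped around histones, so the sites it does cut (hypersensitive sites) mark open, accessible chromatin.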