L2 Gene annotation and transcript analysis Flashcards
Gene annotation
Genome annotation involves mapping features such as protein-coding genes and their multiple mRNAs, pseudogenes, transposons, repeats, non-coding RNAs, and SNPs, as well as regions of similarity to other genomes, onto the genomic scaffolds.
Exons
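Exons are sections of an RNA transcript, or the DNA encoding it, that are retained in the mature RNA after splicing. In protein-coding genes they contain the sequence that is translated into protein, together with the untranslated regions (UTRs) at either end.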
Introns
Introns are noncoding sections of an RNA transcript, or the DNA encoding it, that are spliced out before the RNA molecule is translated into a protein.
Open reading frames
Open reading frames (ORFs) are parts of a reading frame that contain no stop codons. A reading frame is a sequence of nucleotide triplets that are read as codons specifying amino acids; a single strand of DNA sequence has three possible reading frames.
More strictly, an ORF is a sequence of successive nucleotide triplets that are read as codons specifying amino acids, beginning with an initiator (start) codon and ending with a terminator (stop) codon.
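The stricter definition (start codon through stop codon) can be sketched as a scan over the three reading frames of one strand. This is a minimal illustration, assuming the standard genetic code (ATG start; TAA, TAG, TGA stops); the function name `find_orfs` and the `min_codons` parameter are hypothetical, and real annotation tools apply further criteria.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODON = "ATG"


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end, orf) tuples for one DNA strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    `start` is 0-based, `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # three reading frames on a single strand
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open an ORF at the first start codon seen
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3, seq[start:i + 3]))
                start = None  # close the ORF and keep scanning
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

To cover all six frames of double-stranded DNA, the same scan would be repeated on the reverse complement.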
Sense strand
The sense strand (also called the coding strand) is the strand of DNA that has the same sequence as the mRNA, with T in place of U.
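Antisense strand
Also called the template strand.
The antisense strand is the DNA strand that RNA polymerase reads during transcription. The resulting mRNA is complementary to it, and so carries the same sequence as the sense strand.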
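The relationship between the two strands can be checked with a small sketch. The helper names below are hypothetical, and both strands are written 5' to 3'; the point is only that transcribing the template (antisense) strand gives an mRNA matching the sense strand with U in place of T.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe(template_strand):
    """Return the mRNA (5' to 3') made from a template strand (5' to 3')."""
    return reverse_complement(template_strand).replace("T", "U")


sense = "ATGGCCTAA"                   # coding strand
template = reverse_complement(sense)  # antisense strand: TTAGGCCAT
mrna = transcribe(template)           # AUGGCCUAA

print(template, mrna)
```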
BLAST
Basic Local Alignment Search Tool
BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
blastn
nt => nt
blastp
aa => aa
blastx
translated nt => aa
tblastn
aa => translated nt
tblastx
translated nt => translated nt
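The five program variants can be summarised as a small lookup table. This is only a study aid, not part of BLAST itself; the `BlastMode` type and `candidate_programs` helper are hypothetical names, with "nt" for nucleotide and "aa" for amino acid (protein) sequences.

```python
from typing import NamedTuple


class BlastMode(NamedTuple):
    query: str                # type of the query sequence: "nt" or "aa"
    database: str             # type of the database sequences
    translate_query: bool     # query is translated before comparison
    translate_database: bool  # database is translated before comparison


BLAST_MODES = {
    "blastn": BlastMode("nt", "nt", False, False),
    "blastp": BlastMode("aa", "aa", False, False),
    "blastx": BlastMode("nt", "aa", True, False),
    "tblastn": BlastMode("aa", "nt", False, True),
    "tblastx": BlastMode("nt", "nt", True, True),
}


def candidate_programs(query, database):
    """List the programs that accept this query/database combination."""
    return [name for name, mode in BLAST_MODES.items()
            if mode.query == query and mode.database == database]


print(candidate_programs("nt", "aa"))
print(candidate_programs("nt", "nt"))
```

A nucleotide query against a nucleotide database matches two programs; tblastx compares the translations of both, while blastn compares the nucleotides directly.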
E value
The expected number of alignments scoring at least this high that would occur by chance in a database of this size.
- want a low E value: the lower the value, the less likely the match is due to chance
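How the E value depends on score and database size can be sketched with the standard bit-score relation E = m × n × 2^(−S'), where m is the query length, n the database length and S' the bit score. The function name is hypothetical, and BLAST itself uses corrected "effective" lengths, so reported values will differ somewhat from this simplified calculation.

```python
def expected_hits(bit_score, query_length, database_length):
    """Simplified E value: E = m * n * 2^(-S') for a bit score S'.

    Assumption: raw lengths are used instead of BLAST's effective lengths.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical numbers: a 100-residue query against a 1,000,000-residue database.
e_small_db = expected_hits(40, 100, 1_000_000)
e_large_db = expected_hits(40, 100, 2_000_000)   # doubling the database doubles E
e_better = expected_hits(41, 100, 1_000_000)     # one more bit halves E

print(e_small_db, e_large_db, e_better)
```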
ENCODE
Encyclopedia of DNA Elements
Aims to identify all functional elements in the human and mouse genomes.
DNase I hypersensitivity
- DNase I won't chop up DNA wrapped around histones, so the sites it does cut (hypersensitive sites) mark open, accessible chromatin
Polymorphism
A DNA polymorphism is any difference in the nucleotide sequence between individuals. These differences can be single base pair changes, deletions, insertions, or even changes in the number of copies of a given DNA sequence. SNPs (single nucleotide polymorphisms) are the most common type of DNA polymorphism in humans.
Northern blots
A northern blot is a laboratory method used to detect specific RNA molecules among a mixture of RNA. Northern blotting can be used to analyze a sample of RNA from a particular tissue or cell type in order to measure the RNA expression of particular genes.
Southern blot
A procedure for identifying specific sequences of DNA, in which fragments separated on a gel are transferred to a second medium (a membrane) and then detected by hybridization with a labeled probe.
Western blot
Western Blot (WB) is a common method to detect and analyze proteins. It is built on a technique that involves transferring, also known as blotting, proteins separated by electrophoresis from the gel to a membrane where they can be visualized specifically.
qPCR
- quantitative PCR (also called real-time PCR): a method to detect and quantify nucleic acid sequences
In conventional PCR, the amplified DNA product, or amplicon, is detected in an end-point analysis. In real-time PCR, the accumulation of amplification product is measured as the reaction progresses, in real time, with product quantification after each cycle.
Primers
A primer is a short single strand of RNA or DNA (generally about 18-22 bases) that serves as a starting point for DNA synthesis. It is required for DNA replication because the enzymes that catalyze this process, DNA polymerases, can only add new nucleotides to an existing strand of DNA.
Isoform
Any of two or more functionally similar proteins that have a similar but not identical amino acid sequence.
Isogenic chromosomes
Isogenic chromosome: in a diploid organism, a chromosome whose two copies carry identical alleles at every locus.
Threshold cycle (Ct)
Crossing point (Cp)
Quantification cycle (Cq)
The threshold line is the level of detection, the point at which a reaction reaches a fluorescent intensity above background levels. Before conducting PCR, you (or the software in your cycler) set a threshold level. This is literally a line in your graph that represents a level above background fluorescence and intersects your reaction curve somewhere near the beginning of its exponential phase (Figure 1).
The Cq value, or quantification cycle value, is the PCR cycle number at which your sample's reaction curve intersects the threshold line. This value tells you how many cycles it took to detect a real signal from your sample. Real-time PCR runs have a reaction curve for each sample, and therefore many Cq values. Your cycler's software calculates and charts the Cq value for each of your samples.
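Reading a Cq off a reaction curve can be sketched as finding where the fluorescence first crosses the threshold. The version below uses linear interpolation between the two neighbouring cycles, which is an assumption made for illustration; cycler software may use other fitting methods. The function name and the fluorescence readings are hypothetical.

```python
def quantification_cycle(fluorescence, threshold):
    """Return the fractional cycle at which the curve first crosses the threshold.

    fluorescence[i] is the background-subtracted signal after cycle i + 1.
    Returns None if the curve never rises through the threshold
    (including the case where the first reading is already above it).
    """
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev belongs to cycle i and curr to cycle i + 1 (1-based cycles)
            fraction = (threshold - prev) / (curr - prev)
            return i + fraction
    return None


readings = [0.0, 0.0, 1.0, 3.0, 7.0]   # made-up signal for cycles 1 to 5
print(quantification_cycle(readings, threshold=2.0))
```

A sample with more starting template crosses the threshold earlier and therefore has a lower Cq.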