Bioinformatics Flashcards
chromosomal microarray
looks for deletions and duplications
transition
change from one purine to another purine (A>G) or the change from one pyrimidine to another (C>T)
transversion
change from a purine to a pyrimidine (A>T) or from a pyrimidine to a purine (C>G)
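A minimal Python sketch of the two definitions above. The function name classify_substitution is hypothetical, and the sketch assumes plain DNA bases (A, C, G, T) with no ambiguity codes.

```python
# Purines and pyrimidines for DNA (assumption: no ambiguity codes such as N).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("expected DNA bases A, C, G, or T")
    if ref == alt:
        raise ValueError("ref and alt are the same base, so nothing changed")
    # Same chemical class (purine>purine or pyrimidine>pyrimidine) = transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("A", "T"))  # purine to pyrimidine
```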
define identity
the extent to which two sequences (nts or AAs) have the same residues at the same positions in an alignment; expressed as a percentage
define similarity
extent to which nucleotide and protein sequences are related
- similarity between two sequences can be expressed as percent sequence identity and/or percent positive substitutions
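A rough Python sketch of percent identity for two sequences that are already aligned. The function name percent_identity is hypothetical, "-" is assumed to mark a gap, and dividing by the full alignment length is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if the residues match and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Assumption: the denominator is the alignment length, gaps included.
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGT", "ACGA"))
```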
homologs
nucleotide or protein sequences whose similarity can be attributed to a common ancestor
orthologs
type of homolog present in different species that arose from the same ancestral gene (separated by speciation); similar sequences with the same origin; may or may not have the same function
paralogs
type of homolog that results from gene duplication in an early ancestor; e.g. very early on the globin gene duplicated and diverged to become alpha and beta globin
alignment types (2)
global = align the complete sequences
local = identify only the most similar segments or sequence patterns (motifs)
dynamic programming
finds the best (optimal) alignment but takes a while
heuristic = quick, good answer (not guaranteed to be the best)
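A small Python sketch of dynamic programming for alignment scores, covering both the global and local cases. The function name alignment_score and the scores (match = 1, mismatch = -1, gap = -2) are illustrative assumptions, not values from these cards. It returns only the best score, not the alignment itself.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best alignment score by dynamic programming.

    local=False: global alignment (Needleman-Wunsch style), whole sequences.
    local=True:  local alignment (Smith-Waterman style), best-matching segment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    if not local:
        # Global: leading gaps are penalised, so fill the borders.
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = table[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = table[i - 1][j] + gap      # gap in b
            left = table[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)         # local: a segment may restart anywhere
                best = max(best, cell)
            table[i][j] = cell
    return best if local else table[-1][-1]


print(alignment_score("TTACGTT", "GGACGCC"))              # global score
print(alignment_score("TTACGTT", "GGACGCC", local=True))  # local score
```

The table has (len(a) + 1) x (len(b) + 1) cells, which is why the exact method gets slow on long sequences and database searches fall back on heuristics.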
T or F. Transversions happen more than transitions
F! Transitions happen more frequently than transversions
= so it is often desirable to score these two kinds of substitution differently
simple scoring matrix
match = 5
gap = 1
transition = 3
transversion = 2
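A short Python sketch that applies the simple scoring matrix above to two already-aligned DNA sequences. The names SCORES, score_column, and score_alignment are hypothetical. The values are taken from the card as written (including gap = 1); many scoring schemes use a negative gap penalty instead, so the scores are passed in as a parameter.

```python
PURINES = {"A", "G"}

# Values copied from the card; swap in your own scheme if needed.
SCORES = {"match": 5, "transition": 3, "transversion": 2, "gap": 1}


def score_column(x: str, y: str, scores: dict = SCORES) -> int:
    """Score one alignment column. Assumes characters are A, C, G, T, or '-'."""
    if x == "-" or y == "-":
        return scores["gap"]
    if x == y:
        return scores["match"]
    # Both purines or both pyrimidines = transition, otherwise transversion.
    same_class = (x in PURINES) == (y in PURINES)
    return scores["transition"] if same_class else scores["transversion"]


def score_alignment(aligned_a: str, aligned_b: str, scores: dict = SCORES) -> int:
    """Sum the column scores over the whole alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        score_column(x, y, scores)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    )


print(score_alignment("AC-T", "GCAA"))
```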
BLAST
Basic Local Alignment Search Tool
- finds regions of similarity between biological sequences
- program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches
Blast E-value
the number of hits of similar quality (score) expected to be found just by chance
E-value of 10 = up to 10 hits can be expected to be found just by chance in a random database of the same size
the smaller the E-value, the better the match!
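A rough Python sketch of how an E-value relates to a bit score and the size of the search, using the standard relation E = m * n * 2^(-S'), where S' is the bit score. The function name expected_hits is hypothetical, and using raw query and database lengths for m and n is a simplifying assumption (BLAST itself uses adjusted "effective" lengths).

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value: hits of at least this bit score expected by chance."""
    # Search space = query length x database length (assumption: raw lengths).
    search_space = query_len * db_len
    # Each extra bit of score halves the number of chance hits.
    return search_space * 2.0 ** (-bit_score)


print(expected_hits(bit_score=10, query_len=32, db_len=32))
```

A higher score lowers the E-value, and a larger database raises it for the same score, which is why the card ties the E-value to database size.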
megablast
default
for sequence ID, intra-species comparison