Bioinformatics Flashcards
chromosomal microarray
detects chromosomal deletions and duplications (copy number changes)
transition
change from one purine to another purine (A>G) or the change from one pyrimidine to another (C>T)
transversion
change from a purine to a pyrimidine (A>T) or from a pyrimidine to a purine (C>G)
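A minimal Python sketch of the transition/transversion rule in the two cards above. The function name `classify_substitution` is hypothetical and only DNA bases (A, C, G, T) are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base change as 'match', 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        return "match"
    # Same chemical class on both sides (purine>purine or
    # pyrimidine>pyrimidine) is a transition; crossing classes is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "T"))  # transversion
```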
define identity
the extent to which two sequences (nts or AAs) have the same residues at the same positions in an alignment, expressed as a percentage
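A small Python sketch of percent identity for two already-aligned sequences. The helper name `percent_identity` is hypothetical, "-" is assumed to mark a gap, and the denominator is assumed to be the full alignment length (other conventions exist, such as dividing by the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts only if the residues are equal and it is not a gap column.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGT", "ACGA"))  # 75.0
```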
define similarity
extent to which nucleotide and protein sequences are related
- similarity between two sequences can be expressed as percent sequence identity and/or percent positive substitutions
homologs
nucleotide or protein sequences whose similarity can be attributed to a common ancestor
orthologs
homologs present in different species that arose from the same ancestral gene (through speciation); similar sequences with the same origin; they usually keep the same function, but may or may not
paralogs
homologs that result from gene duplication in an early ancestor; e.g., very early on the globin gene duplicated and diverged to become alpha and beta globin
alignment types (2)
global = align the complete sequences
local = identify only the most similar segments or sequence patterns (motifs)
dynamic programming
finds the best (optimal) alignment but takes a while
heuristic = quick, good (but not guaranteed optimal) answer
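To make the "best but slow" trade-off concrete, here is a minimal sketch of a dynamic-programming global alignment score in the style of Needleman-Wunsch. The function name and the scores (match +1, mismatch -1, gap -1) are assumptions for illustration, not values from these cards. Every cell of the table is filled, which is why the result is optimal and why the cost grows with the product of the two sequence lengths.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Best global alignment score of a and b under a simple linear gap cost."""
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = dp[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            # Take the best of: align the two residues, gap in b, gap in a.
            dp[i][j] = max(diagonal, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[-1][-1]


print(global_alignment_score("ACGT", "AGT"))  # 2 (three matches, one gap)
```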
T or F. Transversions happen more than transitions
F! Transitions happen more frequently than transversions
= often desirable to score these substitutions differently
simple scoring matrix
match = 5
gap = 1
transition = 3
transversion = 2
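A hedged Python sketch that applies the four values on this card to a pair of already-aligned sequences. The card does not say whether the gap value is added or subtracted; this sketch assumes every value is added exactly as listed, and many real scoring schemes subtract a gap penalty instead. The function name `score_alignment` is hypothetical.

```python
PURINES = {"A", "G"}


def score_alignment(a: str, b: str,
                    match: int = 5,
                    transition: int = 3,
                    transversion: int = 2,
                    gap: int = 1) -> int:
    """Sum per-column scores for two aligned DNA strings ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            total += gap  # assumption: added as listed on the card
        elif x == y:
            total += match
        elif (x in PURINES) == (y in PURINES):
            total += transition  # purine>purine or pyrimidine>pyrimidine
        else:
            total += transversion
    return total


print(score_alignment("A-GT", "GCAT"))  # 3 + 1 + 3 + 5 = 12
```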
BLAST
Basic Local Alignment Search Tool
- finds regions of similarity between biological sequences
- program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches
Blast E-value
the number of hits of similar quality (score) that could be expected to be found just by chance
E-value of 10 = up to 10 hits can be expected to be found just by chance in a random database of the same size
the smaller the E-value, the better the match!
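A rough Python sketch of why the E-value depends on database size. It assumes the commonly cited relationship E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total database length; that formula is not stated on these cards, and real BLAST uses corrected (effective) lengths, so treat the numbers as illustrative only. The function name `expected_hits` is hypothetical.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value for a hit with the given bit score (assumed formula)."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Same score, larger database: more hits of that quality are expected by chance.
print(expected_hits(10, 32, 32))  # 1.0
print(expected_hits(10, 32, 64))  # 2.0
# Higher score, same search space: smaller E-value, so a better match.
print(expected_hits(12, 32, 32))  # 0.25
```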
megablast
default
for sequence ID, intra-species comparison
blastn
for searching with shorter queries, cross-species comparison
discontiguous megablast
for cross species comparison
T or F. BLAST is a heuristic alignment algorithm
T!
T or F. The lower the E-value, the more likely that two sequences are related
T!
caret
- a site between 2 adjoining nucleotides, such as a restriction enzyme cut site, is indicated by listing the two positions separated by a caret (^)
- n^n+1 = 55^56 (between 55 & 56, sequence cut there)
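A minimal Python sketch of reading the n^n+1 notation and cutting a sequence at that site. Positions are assumed to be 1-based, as in the 55^56 example; the helper name `cut_at` is hypothetical.

```python
def cut_at(sequence: str, site: str) -> tuple[str, str]:
    """Split sequence at a site written as 'n^n+1' (1-based positions)."""
    left, right = (int(part) for part in site.split("^"))
    if right != left + 1:
        raise ValueError("site must join two adjoining positions, e.g. 55^56")
    if not 1 <= left < len(sequence):
        raise ValueError("site lies outside the sequence")
    # Positions 1..left go to the first fragment, the rest to the second.
    return sequence[:left], sequence[left:]


print(cut_at("GAATTC", "1^2"))  # ('G', 'AATTC')
```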