Struggle Flashcards
What is bioinformatics?
It is the analysis and conceptualisation of complex biological information.
What is BLOSUM62?
BLOSUM62 is a substitution matrix used for sequence alignment of proteins. BLOSUM matrices score alignments between evolutionarily divergent protein sequences and are derived from ungapped local alignments (blocks) of conserved regions. The 62 indicates that sequences sharing 62% or more identity were clustered together when the matrix was built.
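As a rough illustration of how a substitution matrix scores an alignment, the sketch below sums per-column scores for two equal-length peptides. Only a small excerpt of BLOSUM62 (residues A, I, L, W) is included, and the names `BLOSUM62_EXCERPT`, `pair_score` and `ungapped_score` are hypothetical helpers, not part of any library.

```python
# Small excerpt of BLOSUM62 (A, I, L, W only); the full matrix covers 20 residues.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4, ("W", "W"): 11,
    ("A", "I"): -1, ("A", "L"): -1, ("A", "W"): -3,
    ("I", "L"): 2, ("I", "W"): -3, ("L", "W"): -2,
}


def pair_score(x, y):
    """Look up the score for aligning residue x with residue y (symmetric)."""
    if (x, y) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(x, y)]
    return BLOSUM62_EXCERPT[(y, x)]


def ungapped_score(seq_a, seq_b):
    """Sum the substitution scores column by column (no gaps allowed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(pair_score(x, y) for x, y in zip(seq_a, seq_b))


score = ungapped_score("ALWI", "AIWL")  # A/A + L/I + W/W + I/L
```

Identical rare residues such as W/W score highly, while conservative swaps such as I/L still score positively, which is what lets the matrix recognise divergent but related proteins.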
Explain Affine Gap Penalties
They penalise insertions and deletions using two costs: a penalty for opening a gap and a penalty for each extension, so the total depends on the length of the gap. Gap openings have a higher cost than extensions.
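A minimal sketch of the idea, assuming the convention that the first gap position pays the opening cost and every further position pays the extension cost (other conventions exist). The values 10 and 1 are illustrative assumptions, not figures from these notes.

```python
def affine_gap_penalty(length, gap_open=10, gap_extend=1):
    """Cost of one gap of the given length under an affine scheme.

    Assumed convention: the first gapped position costs gap_open and each
    additional position costs gap_extend.
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


one_long_gap = affine_gap_penalty(3)         # a single gap of length 3
three_short_gaps = 3 * affine_gap_penalty(1)  # three separate gaps of length 1
```

Because opening is more expensive than extending, one long gap is cheaper than several short ones, which matches the biological expectation that a single insertion or deletion event can span several residues.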
What is in silico?
An analysis or experiment performed on a computer, for example ligand analysis.
Explain BLAST
BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotide sequences of DNA and/or RNA. It uses a heuristic to speed up computation.
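The heuristic can be pictured as a seeding step: rather than aligning every position against every other, short exact word matches are located first and only those are examined further. The sketch below is a heavily simplified illustration of that seeding idea, not BLAST itself; `find_seeds` and the word length `k=3` are assumptions chosen for the example.

```python
from collections import defaultdict


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a word of length k matches exactly."""
    # Index every word of length k in the subject by its start position.
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)

    # Look up each query word in the index instead of scanning the subject again.
    seeds = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            seeds.append((i, j))
    return seeds


seeds = find_seeds("ACGTT", "TTACGA", k=3)
```

Each seed would then be extended in both directions to see whether it grows into a high-scoring local alignment; that extension step is omitted here.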
What is dynamic programming?
Dynamic programming solves a complex problem by breaking it into smaller subproblems (states) and reusing their solutions. In sequence alignment it scores every cell of a matrix to find the optimal alignment. It is slow compared with heuristic methods. The steps are: 1. initialisation 2. scoring the matrix 3. traceback.
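The three steps can be sketched as a small global alignment in the style of Needleman-Wunsch. This is a minimal sketch assuming a simple match/mismatch score and a linear gap penalty (not affine); the function name and the scoring values are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # 1. Initialisation: first row and column hold cumulative gap penalties.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # 2. Scoring the matrix: each cell takes the best of three moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # 3. Traceback: walk from the bottom-right cell back to the origin.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, aligned_a, aligned_b = needleman_wunsch("ACGT", "AGT")
```

The matrix has (n + 1) x (m + 1) cells, so time and memory grow with the product of the two sequence lengths, which is why this exact approach is slow for database-scale searches.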
Protein vs DNA
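Protein has an alphabet of 20 characters rather than 4. The genetic code is degenerate, so different codons can encode the same amino acid. Protein comparison therefore offers a longer look back in time.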
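A small sketch of what degeneracy means in practice: two DNA sequences that differ at the nucleotide level can translate to the same protein. Only four codons of the standard genetic code are included, and `CODON_TABLE_EXCERPT`, `translate` and `hamming` are hypothetical helper names.

```python
# Excerpt of the standard genetic code: two codons each for Leu (L) and Gly (G).
CODON_TABLE_EXCERPT = {"CTT": "L", "CTC": "L", "GGA": "G", "GGC": "G"}


def translate(dna):
    """Translate DNA codon by codon using the excerpt table above."""
    return "".join(CODON_TABLE_EXCERPT[dna[i:i + 3]] for i in range(0, len(dna), 3))


def hamming(x, y):
    """Count positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))


dna_1 = "CTTGGA"
dna_2 = "CTCGGC"
dna_differences = hamming(dna_1, dna_2)
protein_1 = translate(dna_1)
protein_2 = translate(dna_2)
```

The two DNA sequences differ at two of six positions, yet both proteins are identical, so the protein comparison still shows the relationship after the DNA signal has started to erode.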
Paralogs
Homologous genes that arose from a duplication event.
Why is DNA used?
To identify cDNA and non-coding regions of DNA, and to identify DNA polymorphisms.
Types of Algorithms?
- Uninformative
- Ungapped
- Gapped
Describe a hierarchical approach?
- Different groups are given a chromosome to sequence
- The groups generate bacterial artificial chromosomes (BACs)
- Each BAC is divided and shotgun sequenced
- High-fidelity maps identify motifs and allow detection of overlapping sequences.
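The overlap detection in the last step can be illustrated with two reads: find the longest suffix of one read that matches a prefix of the other, then merge them. This is a toy sketch with exact matching only; real assemblers tolerate sequencing errors, and `overlap_length`, `merge_reads` and `min_len` are hypothetical names.

```python
def overlap_length(left, right, min_len=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    longest = min(len(left), len(right))
    for size in range(longest, min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left, right, min_len=3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_len)
    if size == 0:
        return None
    return left + right[size:]


shared = overlap_length("ATGCGT", "CGTACC")
contig = merge_reads("ATGCGT", "CGTACC")
```

Requiring a minimum overlap length guards against joining reads that share only a few bases by chance.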
How many genes were found?
51k
How many genes are protein-coding?
20k
How many genes are non-coding?
20k
What are pseudogenes?
Genes that appear to be protein-coding but that mutations have rendered non-coding. 18k found.