protein feature analysis Flashcards
1
Q
membrane proteins
A
- single spanning segments or entire protein domains
- membrane region is in a hydrophobic environment
- side-chains tend to be hydrophobic
- main chains can’t have NH/CO groups that don’t form hydrogen bonds
- single spanning must be alpha helical
- most are alpha helical
2
Q
transmembrane helices
A
- hydrophobic section ~35-50 Å thick
- often positive charges in the section abutting the membrane
- each residue in an alpha helix advances the structure by 1.5 Å
- usually 20-30 residues long
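Worked example (a minimal sketch, not part of the original card): the helix length follows from dividing the thickness to be spanned by the rise per residue. The sketch assumes an ideal alpha helix with a rise of 1.5 Å per residue, crossing perpendicular to the membrane; the function name `residues_to_span` is hypothetical.

```python
import math

# Assumption: ideal alpha helix, 1.5 Å rise per residue along the helix axis.
RISE_PER_RESIDUE = 1.5


def residues_to_span(thickness_angstrom, rise=RISE_PER_RESIDUE):
    """Smallest number of helical residues whose total rise covers the thickness."""
    return math.ceil(thickness_angstrom / rise)


# Thicknesses taken from the card (35-50 Å); 30 Å is an extra illustrative value.
for thickness in (30, 35, 50):
    print(thickness, "Å ->", residues_to_span(thickness), "residues")
```

A tilted helix would need more residues to cross the same thickness, so these are lower bounds under the stated assumptions.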
3
Q
methods to identify TM regions
A
- early methods
- search for runs of hydrophobic residues
- scale of hydrophobicity
- Hopp & Woods
- Kyte & Doolittle
- hydrophobic plots
- current methods
- machine learning based
4
Q
hydrophobic plots
A
- moving window of 11 residues
- count hydrophobicity according to scale
- above a cutoff indicates TM region
- originally few sequences known
- improved by sequence availability
- larger sequence sets later used to build HMMs for machine learning methods
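Worked example (a minimal sketch of the moving-window idea above, not a production predictor): each window of 11 residues is scored with the Kyte & Doolittle hydropathy scale, and overlapping windows above a cutoff are merged into candidate TM regions. The cutoff of 1.6, the scoring of unknown residues as 0, and the function names are illustrative assumptions.

```python
# Kyte & Doolittle hydropathy scale (one-letter amino acid codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=11):
    """Mean hydropathy of every full window; unknown residues score 0 (assumption)."""
    seq = seq.upper()
    return [
        sum(KD.get(aa, 0.0) for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def tm_candidates(seq, window=11, cutoff=1.6):
    """Merge windows scoring above the cutoff into 0-based, half-open (start, end) regions."""
    regions = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score > cutoff:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy sequence: polar flanks around a 20-residue leucine stretch.
toy = "DDDDDKKKKK" + "L" * 20 + "DDDDDKKKKK"
print(tm_candidates(toy))
```

Because windows that only partly overlap the hydrophobic stretch can still pass the cutoff, the reported region is slightly wider than the stretch itself; changing the window size or cutoff changes how sharp the boundaries are.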
5
Q
machine learning methods of TM identification
A
- HMMs, neural networks or SVMs
- e.g. OCTOPUS, Phobius
6
Q
homology searching with TM regions
A
- many TM regions have similar hydrophobic residues
- easy to find false matches for your protein that seem significant
- conservative cutoffs needed
- can be better to submit only the intracellular/extracellular parts of the sequence to e.g. PSI-BLAST
- reduce false matches
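Worked example (a minimal sketch, assuming TM regions have already been predicted by some other tool): the sequence is cut at the TM regions and only the remaining segments are kept as separate search queries. The function name `non_tm_segments`, the coordinate convention (0-based, half-open), and the `min_len` threshold are hypothetical choices for illustration.

```python
def non_tm_segments(seq, tm_regions, min_len=30):
    """Return (start, end, subsequence) for each stretch outside the TM regions.

    tm_regions: iterable of 0-based, half-open (start, end) pairs.
    min_len: assumed minimum length worth searching with; shorter loops are dropped.
    """
    segments = []
    pos = 0
    for start, end in sorted(tm_regions):
        if start - pos >= min_len:
            segments.append((pos, start, seq[pos:start]))
        pos = max(pos, end)
    if len(seq) - pos >= min_len:
        segments.append((pos, len(seq), seq[pos:]))
    return segments


# Toy protein: 40-residue N-terminal part, two TM helices joined by a short loop,
# then a 50-residue C-terminal part.
toy = "A" * 40 + "L" * 20 + "G" * 10 + "L" * 20 + "S" * 50
for start, end, sub in non_tm_segments(toy, [(40, 60), (70, 90)]):
    print(start, end, len(sub))
```

Each returned segment could then be submitted on its own, so that hits are driven by the soluble parts rather than by generic hydrophobic stretches.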
7
Q
signal peptides
A
- 15-60 residues at start of protein
- directs to correct cellular location
- generally cleaved
- often hydrophobic region followed by pattern (cleavage site)
- SignalP:
- HMMs and neural networks to predict signal peptides and the location of their cleavage sites
8
Q
low complexity regions
A
- composition biased strongly to a small number of amino acids
- in many proteins
- distorts statistical significance of alignments
- SEG
- masks low-complexity regions in BLAST (e.g. by replacing them with lower-case letters)
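Worked example (a simplified stand-in, not the SEG algorithm itself): a window is treated as low complexity when the Shannon entropy of its amino acid composition falls below a threshold, and flagged residues are written in lower case. The window length of 12, the threshold of 2.2 bits, and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def window_entropy(window):
    """Shannon entropy (bits) of the residue composition of one window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def soft_mask_low_complexity(seq, window=12, max_entropy=2.2):
    """Lower-case every residue that lies in at least one low-entropy window."""
    seq = seq.upper()
    flagged = [False] * len(seq)
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) <= max_entropy:
            for j in range(i, i + window):
                flagged[j] = True
    return "".join(aa.lower() if low else aa for aa, low in zip(seq, flagged))


# Toy sequence: a serine run between two maximally diverse flanks.
diverse = "ACDEFGHIKLMNPQRSTVWY"
print(soft_mask_low_complexity(diverse + "S" * 15 + diverse))
```

Residues next to the biased run may also be lower-cased, because any window dominated by the run has low entropy; a stricter threshold narrows the masked region.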
9
Q
coiled coils
A
- 2 or 3 intertwined alpha helices
- often hydrophobic where they pack together
- short (~20 residues) or much longer
- identify with COILS
10
Q
disordered proteins
A
- small disordered regions that can’t be resolved experimentally with protein crystallography (PX) or NMR
- often C/N term, or long loop
- or flexible proteins that adopt multiple structures
- may be structured upon binding
- often lower fraction of hydrophobic residues than folded protein with hydrophobic core
- predicted using neural networks or SVMs
- DISOPRED2
- IUPRED
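Worked example (a minimal sketch of the composition point above, not a disorder predictor): the fraction of hydrophobic residues in a sequence or region can be computed directly. The set of residues counted as hydrophobic is an assumption made for illustration, and tools such as DISOPRED2 and IUPRED use far richer models than this single number.

```python
# Assumption: residues treated as hydrophobic for this illustration.
HYDROPHOBIC = set("AVILMFWYC")


def hydrophobic_fraction(seq):
    """Fraction of residues in seq that belong to the assumed hydrophobic set."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


print(hydrophobic_fraction("ALSG"))
```

Comparing this value between a suspected disordered region and the rest of the protein gives a rough first impression only.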