Module 1 Flashcards
António
What are paralogous genes?
Genes present in a particular organism that are related to each other through a gene duplication event (they come from the same ancestral gene but have different functions)
What is MSA?
Multiple Sequence Alignment. A computational method that highlights the longest possible stretch of sequence that is similar across multiple proteins.
Multiple alignment is the main tool for finding similarities among members of the same family.
What are conserved patterns?
Sequences that emerge from the alignment and can be used to define signatures that characterize a family or domain.
What does the following string represent?
ATOM 400 N ALA A 53 36.594 24.706 31.023 1.00 15.56 N
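An ATOM record in Protein Data Bank (PDB) text format
400 Atom serial number
N Atom name (here the backbone nitrogen)
ALA Residue name (alanine)
A Chain identifier
53 Residue (amino acid) number
36.594 X coordinate
24.706 Y coordinate
31.023 Z coordinate
1.00 Occupancy
15.56 B-factor (temperature factor; reflects the uncertainty in the position of the atom)
N Element symbol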
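The field-by-field reading above can be checked with a short Python sketch. It assumes the record is whitespace-separated exactly as printed on this card. Real PDB files use fixed column positions, so splitting on whitespace is a simplification that can fail on other lines. The names `AtomRecord` and `parse_atom_line` are made up for this illustration.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    serial: int          # atom serial number
    atom_name: str       # e.g. "N" for the backbone nitrogen
    residue_name: str    # three-letter residue code
    chain_id: str        # chain identifier
    residue_number: int  # position of the residue in the chain
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float      # temperature factor
    element: str         # element symbol


def parse_atom_line(line: str) -> AtomRecord:
    """Parse a whitespace-separated ATOM line (simplified, not fixed-column)."""
    fields = line.split()
    if len(fields) != 12 or fields[0] != "ATOM":
        raise ValueError("expected an ATOM record with 12 fields")
    return AtomRecord(
        serial=int(fields[1]),
        atom_name=fields[2],
        residue_name=fields[3],
        chain_id=fields[4],
        residue_number=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        occupancy=float(fields[9]),
        b_factor=float(fields[10]),
        element=fields[11],
    )


record = parse_atom_line("ATOM 400 N ALA A 53 36.594 24.706 31.023 1.00 15.56 N")
print(record.residue_name, record.chain_id, record.b_factor)
```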
What is Alternative Splicing?
Alternative splicing is a cellular process in which exons of the same gene are joined in different combinations, yielding different but related mRNA transcripts. These mRNAs can be translated into different proteins with distinct structures and functions, all from a single gene
What can Alternative Splicing be used for?
It’s a way to use the same gene to encode proteins that are more or less related.
Why are there more proteins in the cell than there are genes for said proteins? (Example, why does a cell have 100 proteins but only 10 genes?)
Through Alternative Splicing, a single gene can encode several proteins: different combinations of its exons are joined in the mRNA, so each transcript is translated into a different protein.
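To see how a few genes can yield many proteins, here is a toy Python sketch that counts exon combinations. It assumes a deliberately simplified model: the first and last exon are always kept and every internal exon is independently kept or skipped. Real splicing is regulated, and not every combination occurs. The exon labels and the function name `exon_skipping_isoforms` are hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """List transcripts that keep the first and last exon and any subset
    of the internal exons, always in their original order (toy model)."""
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        # combinations() preserves the original exon order.
        for kept in combinations(internal, kept_count):
            isoforms.append((first, *kept, last))
    return isoforms


isoforms = exon_skipping_isoforms(["E1", "E2", "E3", "E4"])
for isoform in isoforms:
    print("-".join(isoform))
```

Under this model a gene with n exons gives 2^(n-2) possible transcripts, so four exons already give four.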
What is a domain?
Regions of a protein that are more or less distinct or independent from each other and sufficiently stable to fold independently.
What is the difference between a domain and a subunit?
A domain is a discrete functional and/or structural section of a polypeptide. A subunit, by contrast, is a single polypeptide within a protein that is composed of multiple polypeptides. It is important to note that subunits can have domains
How is it possible for a protein to have 2 subunits but one domain?
For example, in a homodimer the two subunits are identical copies of the same single-domain polypeptide with the same function, so the protein can be considered to have two subunits but only one kind of domain.
Why does automatic genome annotation work?
Because similar sequences imply similar functions.
What is the basis for our understanding of how structural domains evolve in proteins?
Similar sequences imply similar structures.
What happens when we have a lower sequence similarity?
Higher structure dissimilarity
Why is the function of an enzyme easy to establish?
Because homologous enzymes tend to catalyze the same reaction
What does %ID measure? What about RMSD?
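%ID: Sequence similarity (percentage of identical residues at aligned positions)
RMSD: Structure similarity (root-mean-square deviation between equivalent atoms)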
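As a rough illustration of %ID, the sketch below counts identical residues in two sequences that are already aligned. It assumes gaps are written as "-" and that columns containing a gap are left out of the count. Other conventions for the denominator exist, so treat this as one possible definition. The sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" or residue_b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


print(percent_identity("AC-EF", "ACDEY"))  # 3 identical out of 4 compared columns
```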
What happens to the RMSD when proteins are similar?
Tends to be lower
What are some measurements of RMSD?
The most similar amino acid positions
The distance between alpha carbons (Cα) in the peptide chains: the square root of the mean squared distance
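The "square root of the mean squared distance" can be written out in a few lines of Python. This sketch assumes the two structures are already superposed and that the Cα coordinates are given in matching order. The coordinates are invented for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(structure_a, structure_b))
```

Identical structures give an RMSD of 0, and the value grows as the equivalent atoms move apart, which is why similar proteins tend to have lower RMSD.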
What is the seed alignment?
The alignment we use first. A hand-curated alignment of known members of the family
What is the structural unit of the protein?
The domain.
The number of possible domain folds known so far is theoretically finite but in reality it’s practically infinite. True or false?
False. Very large but seems to be finite
What are domain folds?
The different ways of packing the secondary structure elements within domains.
Sequence-based predictions predict structural features based on the secondary structure, the amino acid sequence. True or false?
False. The amino acid sequence is the primary structure, not the secondary structure.
Give an example of a false positive in a deterministic pattern
The sequence matches the pattern even though, at every position, it has the rarest of the allowed amino acids
Give an example of a false negative in a deterministic pattern
The sequence matches everything in the pattern except for one completely unexpected amino acid.
What are some downsides of deterministic patterns (DP)?
Extremely high rigidity, which means they do not accept any deviation from the pattern.
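The rigidity of a deterministic pattern can be seen with a regular expression. The motif below is hypothetical (C, any two residues, D or E, then H), and so are the two sequences. It is only a sketch of the idea, not a real family signature.

```python
import re

# Hypothetical motif: C, any two residues, D or E, then H.
PATTERN = re.compile(r"C.{2}[DE]H")


def matches_pattern(sequence: str) -> bool:
    """True if the motif occurs anywhere in the sequence."""
    return PATTERN.search(sequence) is not None


print(matches_pattern("MKCAAEHLL"))  # contains CAAEH, so the pattern matches
print(matches_pattern("MKCAAQHLL"))  # Q instead of D/E, so the pattern fails
```

If the second sequence were a genuine family member, the single unexpected Q would make it a false negative, because the pattern either matches completely or not at all.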
What is a Confusion Matrix?
A Table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives
In a binary classifier, sensitivity is given by the true positive rate and specificity is given by the false negative rate. True or false?
FALSE. Specificity is given by the true negative rate.
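A small Python sketch makes the two definitions concrete. The labels are invented: True means "belongs to the family". The function names are chosen for this example only.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for truth, guess in zip(actual, predicted):
        if truth and guess:
            counts["TP"] += 1
        elif truth and not guess:
            counts["FN"] += 1
        elif not truth and guess:
            counts["FP"] += 1
        else:
            counts["TN"] += 1
    return counts


def sensitivity(counts):
    """True positive rate: TP / (TP + FN)."""
    return counts["TP"] / (counts["TP"] + counts["FN"])


def specificity(counts):
    """True negative rate: TN / (TN + FP)."""
    return counts["TN"] / (counts["TN"] + counts["FP"])


actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
counts = confusion_counts(actual, predicted)
print(counts, sensitivity(counts), specificity(counts))
```

Here 2 of the 3 real members are found (sensitivity 2/3) and 4 of the 5 non-members are rejected (specificity 4/5).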
A consensus sequence is likely a true positive. True or false?
True.
Name one big advantage of PSSM.
It produces far fewer false negatives because it can score any amino acid at any position
What is a PSSM and what does it do?
A position-specific scoring matrix (PSSM) assigns a score to each amino acid at each position. The total score a sequence receives determines whether it is judged to belong to a certain family or domain.
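One common way to build such a matrix is with log-odds scores, sketched below in Python. This is a minimal version under stated assumptions: a made-up ungapped alignment, a uniform background frequency of 1/20, and a pseudocount of 1 so that unseen amino acids still get a finite score. Real tools use more careful weighting.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build per-position log-odds scores from an ungapped alignment."""
    length = len(alignment[0])
    if any(len(sequence) != length for sequence in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for position in range(length):
        column = [sequence[position] for sequence in alignment]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for amino_acid in AMINO_ACIDS:
            frequency = (column.count(amino_acid) + pseudocount) / total
            scores[amino_acid] = math.log2(frequency / background)
        pssm.append(scores)
    return pssm


def score_sequence(pssm, sequence):
    """Sum the position-specific scores of a sequence of matching length."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length must match the matrix")
    return sum(column[residue] for column, residue in zip(pssm, sequence))


pssm = build_pssm(["CAEH", "CGEH", "CADH", "CSEH"])  # hypothetical family
for candidate in ("CAEH", "CADH", "WWWW"):
    print(candidate, round(score_sequence(pssm, candidate), 2))
```

Every amino acid gets a score at every position, so a sequence with one unusual residue is penalized rather than rejected outright. Where to set the cut-off score for family membership is a separate choice not shown here.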
PSSM attributes a score of 0 to any gap/transition in the sequence. True or false?
False. It does not attribute scores to gaps/transitions.
PSSM doesn’t attribute a score to any gaps in the sequence, but it does attribute weight. True or false?
True
In a weight matrix, the higher the score between two amino acids, the higher the chances of them being interchangeable. True or false?
True
What is homology modelling?
The use of information from homologous proteins from databases and other sequences to model a protein.
What are the three main steps of homology modelling?
Alignment and superposition with the reference structure, improvement of the preliminary structure, and energy optimization.
SWISS-MODEL is unreliable for the positions of loops and amino acid side chains, and is reliable only for the position of the backbone (alpha carbons). True or false?
True
What is the Rosetta Method?
Splitting the sequence into several short fragments and trying randomly sampled combinations of candidate conformations for those fragments until something fits.
AlphaFold is so good that its predictions are similar to structures provided by…
X-Ray crystallography
In a binary classifier, sensitivity is given by the true positive rate and specificity is given by the true negative rate. True or false?
True
In a binary classifier, specificity is given by the true positive rate and sensitivity is given by the true negative rate. True or false?
False.
Sensitivity is given by the true positive rate
Specificity is given by the true negative rate
What does the (local) QMEAN4 score mean?
QMEAN4 is a scoring function used to assess the quality of predicted protein structure models. It combines four metrics: the interaction between Cβ atoms, the interaction between all atoms, solvent exposure, and the torsion across three consecutive amino acids. A local score indicates the predicted quality of a specific amino acid or region, according to these metrics.