Lecture 2 Flashcards

1
Q

What is protein bioinformatics

A

Analysis of protein sequences and structures to gain insight into the properties and function of the protein

2
Q

What do we use to compare sequences

A

BLAST (from the NCBI website)

It searches the database for other sequences that match the one you submit

3
Q

What can you put in the query of a BLAST search

A

The accession number

The GI number

The bare sequence

Or the FASTA formatted sequence

4
Q

What is a GI number (gi)

A

A simple series of digits assigned to each sequence record processed by NCBI

5
Q

What is FASTA format

How long are the lines

A

Starts with > followed by a single-line description of the sequence

All lines of sequence are shorter than 80 characters

No blank lines
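
The layout described on this card can be read with a few lines of Python. This is a minimal sketch, not an official parser: the function name `parse_fasta` and the example record are made up for illustration, and blank lines are simply skipped rather than rejected.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) tuples from FASTA text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines even though FASTA should have none
        if line.startswith(">"):
            # A new ">" line closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined into one string
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, invented for this example.
example = """>demo_protein hypothetical example record
MKTAYIAKQRQISFVKSHFSRQ
LEERLGLIEVQ
"""
records = parse_fasta(example)
```

Each record comes back as one description string plus one joined sequence, however many lines the sequence was wrapped over.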

6
Q

What do B U X Z * - stand for in FASTA

A

B: aspartate/asparagine

U: selenocysteine

X: any amino acid residue

Z: glutamate/glutamine

*: translation stop

-: gap of any length, used to align the sequence better
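
The six symbols on this card can be kept as a small lookup table. A sketch only: the names `FASTA_SPECIAL_CODES` and `describe_special` and the test string are invented here.

```python
# Meanings of the special FASTA symbols listed on the card.
FASTA_SPECIAL_CODES = {
    "B": "aspartate or asparagine",
    "U": "selenocysteine",
    "X": "any amino acid residue",
    "Z": "glutamate or glutamine",
    "*": "translation stop",
    "-": "gap of any length",
}


def describe_special(sequence):
    """Return (position, symbol, meaning) for each special symbol; positions are 1-based."""
    return [
        (i, ch, FASTA_SPECIAL_CODES[ch])
        for i, ch in enumerate(sequence.upper(), start=1)
        if ch in FASTA_SPECIAL_CODES
    ]


# Made-up sequence containing five of the six symbols.
hits = describe_special("MKB-XU*")
```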

7
Q

What is selenocysteine

A

An additional (21st) amino acid: some organisms recode one of the three stop codons so that it inserts selenocysteine (UGA) or pyrrolysine (UAG) instead of stopping translation

8
Q

What are metagenomic proteins

A

Extract RNA/DNA from a bulk sample (like ocean water)

Take the sequences obtained and run blastp on them

9
Q

When does Quick BLASTP (accelerated protein-protein BLAST) work best

A

When the target is more than 50% identical to the query

10
Q

What are the other types of BLAST

A

PSI-BLAST (builds a position-specific scoring matrix from the results of the first run)

PHI-BLAST (alignments are limited to those that match a pattern in the query)

DELTA-BLAST (builds a position-specific scoring matrix from the results of a conserved domain database search)

11
Q

What is BLOSUM62

A

A substitution matrix that assigns a score to each aligned pair of residues

12
Q

Negatively charged amino acids

A

Aspartate, glutamate

13
Q

Positively charged amino acids

A

Lysine, histidine, arginine

14
Q

Polar uncharged amino acids

A

Serine, threonine, asparagine, glutamine

15
Q

Amino acids with hydrophobic side chains

A

Leucine, valine, isoleucine, alanine, methionine, phenylalanine, tyrosine, tryptophan

16
Q

Which amino acids are special cases

A

Cysteine, selenocysteine (U), glycine, proline (helix breaker)

17
Q

Why do the unique amino acids get a higher score in BLOSUM62

A

Because they are so unique, they are probably in that position for a reason, so matching them gets a higher score

18
Q

When scoring, what is the effect of putting gaps in the sequence to match the amino acids

A

That match gets a -1 score
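
As a rough illustration of how substitution scores and the -1 gap score from this card add up, the sketch below sums a score over two aligned rows. The five pair scores are copied from the standard BLOSUM62 table, not from the cards. The flat -1 per gap follows the card; real BLAST searches charge separate gap opening and extension costs. The names `BLOSUM62_SUBSET`, `pair_score` and `score_alignment` are invented for the example.

```python
# A handful of BLOSUM62 entries, enough for the example alignment only.
BLOSUM62_SUBSET = {
    ("W", "W"): 11,
    ("L", "I"): 2,
    ("F", "Y"): 3,
    ("A", "A"): 4,
    ("K", "K"): 5,
}
GAP_SCORE = -1  # simplification taken from the flashcard, not BLAST's default


def pair_score(a, b):
    """Look up a residue pair in either order; fail loudly if it is not in the subset."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    if (b, a) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(b, a)]
    raise KeyError(f"no score stored for {a}/{b}")


def score_alignment(row1, row2):
    """Sum the column scores of two equal-length aligned rows ('-' marks a gap)."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            total += GAP_SCORE
        else:
            total += pair_score(a, b)
    return total


# Columns: W/W 11, L/I 2, F/Y 3, A/A 4, gap -1, K/K 5
total = score_alignment("WLFA-K", "WIYAGK")
```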

19
Q

In scoring, cysteine with any other amino acid gets what score

A

A negative score (in BLOSUM62 the one exception is cysteine with alanine, which scores 0)

20
Q

Why does the algorithm strongly favor aligning tryptophan with tryptophan (giving it a very high score)?

A

Because it’s such an unusual amino acid
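
To see this in numbers, the sketch below ranks the identity (self-match) scores of BLOSUM62. The values are the diagonal of the standard BLOSUM62 table and are not taken from the cards. The variable names are invented for the example.

```python
# Diagonal of BLOSUM62: the score for aligning each residue with itself.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9,
    "Q": 5, "E": 5, "G": 6, "H": 8, "I": 4,
    "L": 4, "K": 5, "M": 5, "F": 6, "P": 7,
    "S": 4, "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Sort residues from highest to lowest self-match score.
ranked = sorted(BLOSUM62_DIAGONAL.items(), key=lambda item: -item[1])
top_three = ranked[:3]
```

Tryptophan, cysteine and histidine come out on top, while common residues such as alanine, leucine and valine score only 4 for an identical match.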

21
Q

In the graphic summary of a blastp search, the first line is

If one line starts later than the other lines, what does this mean

A

The top hit

The first part of that sequence doesn't match the query, so it shows as a gap

22
Q

What is the expect value (E value) of a blastp search

A

Tells the number of hits (matches) expected to occur by chance

It's used to set a threshold of significance (how likely is it that the alignment arose by chance)

If low that means that the sequence is a significant match and should be
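
A commonly quoted form of the underlying statistic is E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the database length. The sketch below applies that formula directly. It assumes raw lengths (BLAST itself uses corrected "effective" lengths), and the function name `e_value` and the numbers are made up for illustration.

```python
def e_value(bit_score, query_length, database_length):
    """Expected number of chance hits with at least this bit score (simplified)."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical numbers: a 100-residue query against a 1024-residue database.
weak_hit = e_value(10, 100, 1024)
strong_hit = e_value(50, 100, 1024)
```

The higher the bit score, the smaller the E value, which is why a low E value marks a significant match.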

23
Q

What do the positives mean in a blastp sequence alignment

A

A + between the two sequences means a conservative substitution was aligned

Ex. F to Y: these are matched with a plus because they are different amino acids with similar properties
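
The middle line of a BLAST alignment can be imitated with a short function: print the letter for an identity, + for a substitution with a positive matrix score, and a space otherwise. This is a sketch under assumptions: the three scores are BLOSUM62 values, not taken from the cards, and `midline` and the sequences are invented for the example.

```python
# BLOSUM62 scores for the non-identical pairs used in the example.
SCORES = {("F", "Y"): 3, ("L", "I"): 2, ("A", "W"): -3}


def pair_score(a, b):
    """Look up a pair in either order; raise if the subset has no entry."""
    if (a, b) in SCORES:
        return SCORES[(a, b)]
    if (b, a) in SCORES:
        return SCORES[(b, a)]
    raise KeyError(f"no score stored for {a}/{b}")


def midline(query, subject):
    """Build the line shown between query and subject in a pairwise alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    out = []
    for q, s in zip(query, subject):
        if "-" in (q, s):
            out.append(" ")   # gap column
        elif q == s:
            out.append(q)     # identity: repeat the letter
        elif pair_score(q, s) > 0:
            out.append("+")   # positive: conservative substitution
        else:
            out.append(" ")   # mismatch with zero or negative score
    return "".join(out)


line = midline("FLAK", "YIWK")
```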

24
Q

If in a sequence there are AV

How many gaps is it

A

2

25
Q

What is a blastp clustal alignment

A

Shows all the different sequence alignments of all matches

26
Q

What is similarity between sequences quantified by

A

% identity
% similarity (similar amino acids, e.g. leucine and isoleucine)
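
Both percentages can be computed from an aligned pair of rows. In this sketch "similar" means "in the same property group", using the groups from the earlier cards. That is a simplification chosen for the example, since BLAST counts positives from the scoring matrix instead. All names and sequences are invented.

```python
# Property groups as listed on the amino acid cards (one-letter codes).
GROUPS = [
    set("DE"),        # negatively charged
    set("KHR"),       # positively charged
    set("STNQ"),      # polar uncharged
    set("LVIAMFYW"),  # hydrophobic side chains
]


def same_group(a, b):
    """True if both residues fall in one of the property groups above."""
    return any(a in g and b in g for g in GROUPS)


def percent_identity(row1, row2):
    """Percentage of alignment columns holding the same residue."""
    matches = sum(1 for a, b in zip(row1, row2) if a == b and a != "-")
    return 100 * matches / len(row1)


def percent_similarity(row1, row2):
    """Percentage of columns that are identical or in the same property group."""
    similar = sum(
        1 for a, b in zip(row1, row2)
        if a != "-" and b != "-" and (a == b or same_group(a, b))
    )
    return 100 * similar / len(row1)


# L/I similar, K/K identical, D/E similar, S/T similar, A/G neither
identity = percent_identity("LKDSA", "IKETG")
similarity = percent_similarity("LKDSA", "IKETG")
```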

27
Q

What does homologous mean when matching sequences

A

The products of 2 genes have a shared ancestry

Meaning it matches sequences that may have come from a common ancestor

28
Q

In a table with amino acids and their preference to adopt a specific secondary structure, what does a value greater than one mean

A

It shows that the amino acid has a tendency to adopt that secondary structure
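
One common way such propensity values are defined is as a ratio of frequencies. This is an assumption, since the cards do not say how the table was calculated. Here n_{a,s} is the number of times amino acid a occurs in structure s, n_a its total count, N_s the number of residues in structure s, and N the total number of residues:

```latex
P_{a,s} = \frac{n_{a,s} / n_{a}}{N_{s} / N}
```

With this definition, P = 1 means the amino acid is found in that structure exactly as often as the average residue, and P > 1 means more often than average.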

29
Q

What are the helix breakers

A

Glycine and proline

30
Q

What are IDRs

A

Intrinsically disordered regions

31
Q

Why would something want intrinsically disordered regions

A

Exposes short linear motifs that mediate protein-protein interactions

Allows for regulation of the protein's function through PTMs in the IDR

Regulates the protein's half-life by engaging proteins that have been targeted for degradation by the proteasome (so ubiquitin is added to the IDR)

Adopts different conformations when binding to different interaction partners

32
Q

What are traits of intrinsically disordered proteins (IDPs)

A

They are fully disordered

Can be boiled and stay soluble (instead of precipitating)

33
Q

IDRs are ____ than loops and turns

A

Longer

34
Q

Example of a protein with IDR

A

PP2B/calcineurin

35
Q

What are sequence signatures

A

A sequence that has certain key amino acids in specific positions, where they serve a specific role (like folding in a specific way or having a specific property)

36
Q

[LMFY]

{EF}

x

In a sequence signature means what

A

Any amino acid in the brackets

Any amino acid except the ones in the brackets

Any amino acid

37
Q

What are motifs

How long are they

A

Short sequence pattern that has a specific function

Usually 3-8 aa, max is 20 aa

38
Q

Give example of motifs

A

Transit peptides (N-terminal sequence that takes the protein to a specific area in the cell)

Binding sequence (the sequence makes the protein form a specific complex with another protein)

A motif that is recognized for covalent modification

39
Q

What are domains

A

A region of the protein's polypeptide chain that folds independently and has a specific function

Like a parts list for proteins

Ex. SH2/SH3 domains

40
Q

What does the website PROSITE tell us

A

About the protein's signatures, domains, and motifs

41
Q

What do < and > mean in PROSITE

A

Amino terminal element

Carboxy terminal element

42
Q

What does x(2,4) mean in PROSITE

A

x-x

Or x-x-x

Or x-x-x-x

So any number from 2 to 4 of any amino acid
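
The notation on these cards ([..], {..}, x, x(2,4), < and >) maps almost one-to-one onto regular expressions. The converter below is a minimal sketch that handles only those elements. The function name `prosite_to_regex` and the test patterns are invented for illustration and are not real PROSITE entries.

```python
import re

# One pattern element: [ABC], {ABC}, a single residue, or x, with optional (n) or (n,m).
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Z]|x)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression string."""
    pattern = pattern.strip().rstrip(".")
    anchored_start = pattern.startswith("<")  # < : amino-terminal anchor
    anchored_end = pattern.endswith(">")      # > : carboxy-terminal anchor
    pattern = pattern.strip("<>")
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            regex = "."                        # any amino acid
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # any amino acid except these
        else:
            regex = core                       # [ABC] or a single residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return ("^" if anchored_start else "") + "".join(parts) + ("$" if anchored_end else "")


# Hypothetical pattern built from the elements on the cards.
regex = prosite_to_regex("[LMFY]-{EF}-x(2,4)-C")
hit = re.search(regex, "AALQGGGCA")
```

Here `regex` is `[LMFY][^EF].{2,4}C`, and the search finds `LQGGGC` in the made-up sequence.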

43
Q

What is the rule for x(2,4)

A

Ranges are allowed only for x, and not at the amino or carboxy terminus of the pattern unless it is anchored to that terminus

44
Q

What website lets us see transmembrane regions/prediction of a protein

A

DeepTMHMM

45
Q

What are SLiMs

A

Short linear interaction motifs

They drive specific protein-protein interactions

46
Q

Give 2 examples of what a SLiM does

A

The motif RVxF on one protein docks PP1 (protein phosphatase 1) onto that protein

It's a 5 residue motif

Peroxisome targeting: signals are located at the C termini of the protein (ex. SKL-COO-)

This makes it go to the peroxisome
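
Both examples can be searched for with plain regular expressions. This sketch reads the motifs literally as written on the card (R-V-any-F anywhere in the chain, S-K-L at the very C terminus). Real motif definitions are usually looser. The function names and the test sequence are invented.

```python
import re

RVXF = re.compile(r"RV.F")         # x = any amino acid
C_TERM_SKL = re.compile(r"SKL$")   # must be the last three residues


def find_rvxf(sequence):
    """Return (1-based position, matched residues) for each RVxF occurrence."""
    return [(m.start() + 1, m.group()) for m in RVXF.finditer(sequence)]


def has_c_terminal_skl(sequence):
    """True if the sequence ends in SKL, the peroxisome signal on the card."""
    return C_TERM_SKL.search(sequence) is not None


# Made-up sequence carrying both motifs.
demo = "MAGRVSFDKSKL"
docking_sites = find_rvxf(demo)
peroxisomal = has_c_terminal_skl(demo)
```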

47
Q

What is pY

A

Phosphotyrosine

48
Q

Once a transit peptide takes its protein to a certain area in the cell what happens

A

A protease cleaves the transit peptide

49
Q

What are transit peptides used for

A

To target proteins to the chloroplast or mitochondria, or for secretion from the cell