L4 - Bioinformatics 1 Flashcards

1
Q

what is bioinformatics

A

bridges the gap between big data sets and actual biological understanding

2
Q

overview of the bioinformatics process

A

have a query/question

probe database

evaluate results

3
Q

2 types of nucleotide databases

A

primary:
experimental data deposited directly by scientists
= GenBank

secondary:
information from a primary database that has been processed
= RefSeq

4
Q

name a primary and secondary database

A

GenBank:
international database shared between EMBL, NCBI and Japan (DDBJ).
can contain multiple copies of the same nucleotide sequence, each with its own unique accession number

RefSeq:
manually curated database derived from GenBank

5
Q

what is UniProtKB comprised of

A

TrEMBL:
protein sequences automatically annotated by computer from nucleotide sequences = unreviewed and redundant

Swiss-Prot:
manual, high-quality annotation; reviewed, non-redundant gold standard.

non-redundant = 1 record per molecule per species for fully sequenced organisms

6
Q

what else does UniProt do

A

cross-references and links to other resources

useful entry point to start investigating a protein

7
Q

what is FASTA format

A

displays both DNA and protein sequences as plain text without spaces; each record starts with a header line beginning with ">"

commonly used as input for analysis programmes
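
As an illustration of the format, here is a minimal Python sketch of reading FASTA text. The function name `parse_fasta` and the two example records are hypothetical, made up for this card; real projects would more likely use an established parser.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is the header.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect the pieces.
            records[header].append(line)
    # Join the wrapped lines of each record into one continuous sequence.
    return {h: "".join(parts) for h, parts in records.items()}


# Made-up example records (not real database entries).
example = """>seq1 hypothetical DNA record
ATGGCGTACG
TTAGC
>seq2 hypothetical protein record
MKTAYIAKQR
"""

records = parse_fasta(example)
for header, sequence in records.items():
    print(header, len(sequence))
```

The same parser handles DNA and protein records because FASTA does not mark the sequence type; only the header line and the letters differ.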

8
Q

the linking of sequence, structure and function

A

amino acid sequence determines the structure, which in turn dictates a protein's function

homologous proteins share conserved amino acid patterns, adopting similar folds with related functions

= we can use this to predict the function of a protein if we know its amino acid sequence
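
One simple way to put a number on "shared amino acid patterns" is percent identity between two sequences that have already been aligned. The sketch below assumes the alignment is given (gaps written as "-"); the function name `percent_identity` and both sequences are hypothetical, and real homology searches use far more sensitive scoring than this.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned (non-gap) positions where the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # ignore positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments: 6 of the 8 positions are identical.
identity = percent_identity("MKTAYIAK", "MKSAYLAK")
print(identity)
```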

9
Q

what are protein domains

A

structural units of about 50 amino acids

proteins can contain multiple domains

10
Q

what is InterProScan

A

compares the query sequence against domain signatures built from known sequences

assigns a probability to each amino acid at each position within a domain, whether that position is identical (conserved) or non-conserved

a threshold score determines if a certain domain is likely to be present

= InterProScan does all this and tells you which part of your protein is likely to contain which domains
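
The idea of per-position probabilities plus a threshold can be sketched with a toy profile. This is not InterProScan's actual implementation; the three-position `PROFILE`, the background frequency, the pseudo-probability and the threshold values are all made-up assumptions chosen only to show the principle.

```python
import math

# Hypothetical toy profile for a 3-residue "domain": the probability of
# seeing each amino acid at each position (made-up numbers).
PROFILE = [
    {"C": 0.8, "S": 0.1},
    {"H": 0.6, "Y": 0.2},
    {"G": 0.9},
]
BACKGROUND = 0.05  # assumed uniform background frequency (1 in 20)
PSEUDO = 0.01      # assumed probability for residues not listed in a column


def window_score(window):
    """Sum of log-odds scores for one window against the profile."""
    return sum(
        math.log2(column.get(residue, PSEUDO) / BACKGROUND)
        for column, residue in zip(PROFILE, window)
    )


def find_domain(sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(PROFILE)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = window_score(sequence[start:start + width])
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits


# Made-up query sequence; start positions are 0-based.
hits = find_domain("MKCHGAASYG", threshold=5.0)
print(hits)
```

Raising the threshold keeps only the windows that fit the profile best, which mirrors how a cut-off score decides whether a domain is reported.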

11
Q

what is PROSITE

A

identifies likely post-translational modification sites

finds patterns using short motifs linked to modifications

generates hypotheses about your protein that you can then go and test
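
Short motifs of this kind can be written as patterns and matched against a sequence. The sketch below uses the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} (entry PS00001) and translates it into a Python regular expression. The helper names `prosite_to_regex` and `find_sites` and the query sequence are hypothetical, and the converter only covers the simple syntax shown in its comments, not the full PROSITE pattern language.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("x", ".")                      # x   -> any residue
        element = element.replace("(", "{").replace(")", "}")    # (2) -> {2}
        parts.append(element)
    return "".join(parts)


def find_sites(sequence, pattern):
    """Return (1-based position, matched motif) for every site, overlaps included."""
    # The lookahead lets overlapping motifs be reported separately.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


N_GLYCOSYLATION = "N-{P}-[ST]-{P}"  # PROSITE PS00001

# Made-up query sequence; the N at position 8 is followed by P, so it is skipped.
sites = find_sites("MKNGSAANPTQNVTL", N_GLYCOSYLATION)
print(sites)
```

A match only says the motif is present; whether the site is actually modified is the hypothesis to test experimentally.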
