Bioinformatics Flashcards

1
Q

When was the structure of DNA determined?

A

1953

2
Q

Since the determination of the structure of DNA in 1953 and the realisation that this molecule is the carrier of genetic information, it became a scientific priority to …

A

…determine the precise sequence of nucleotides within chromosomes and find out the relationship between this sequence and the workings of the cell.

3
Q

In 1977 Fred Sanger published…

A

… the first “rapid” DNA sequencing method

4
Q

In 1977 Fred Sanger published the first “rapid” DNA sequencing method

The same year, he published the first …

A

…complete DNA genome sequence, that of the bacteriophage Phi X 174 (ΦX174)

5
Q

In 1995 the first complete genome of a free-living organism was sequenced and published, that of the …

A

…bacterium Haemophilus influenzae.

6
Q

Describe the genome of the bacterium Haemophilus influenzae.

A

A circular DNA genome of 1,830,140 base pairs.

7
Q

How many protein-coding genes does the Haemophilus influenzae genome contain?

A

1,740 protein-coding genes

8
Q

In 1996, the first complete eukaryotic genome was sequenced and published, that of …

A

…the yeast Saccharomyces cerevisiae

9
Q

How is DNA organised in the yeast Saccharomyces cerevisiae?

A

The DNA is organised into 16 chromosomes totalling 12,156,677 base pairs.

10
Q

How many potential genes does the yeast Saccharomyces cerevisiae encode?

A

6275

11
Q

In 2003 the complete human genome was …

A

… sequenced and published.

12
Q

Since the completion of the human genome there has been an explosion in the amount of …

A

…DNA sequence data available due to advances in DNA sequencing techniques

13
Q

Since 1995 the number of DNA sequences deposited into DNA databases has been …

A

…growing exponentially

14
Q

As of February 2021, the GenBank sequence database contained:

A

776,291,211,106 bases in 226,241,476 sequence records

15
Q

The three principal comprehensive databases of nucleic acid sequences in the world are:

A

1) EMBL – European Molecular Biology Laboratory
2) GenBank – National Center for Biotechnology Information (NCBI)
3) DDBJ – DNA Data Bank of Japan

16
Q

The 3 principal comprehensive databases share…

A

…information

17
Q

The 3 principal comprehensive databases share information and hence…

A

…contain almost identical sequences, and store sequence information that is publicly and freely accessible

18
Q

Define bioinformatics.

A

The use of computational methods to study biological data.

19
Q

What is the first definition of bioinformatics?

A

1) The development of computational methods for studying the structure, function and evolution of genes, proteins and whole genomes.

20
Q

What is the second definition of bioinformatics?

A

2) The development of methods for the management and analysis of biological information arising from genomics and high-throughput experiments

21
Q

What is genomics?

A

the study of whole sets of genes rather than a single gene

22
Q

What are high-throughput experiments?

A

Experimental techniques that allow thousands of genes to be studied simultaneously, e.g. microarray technology or proteomics.

23
Q

By understanding the processes of mutation and selection that act on DNA sequences, molecular biologists can …

A

…compare the DNA and protein sequences of common genes between different species to develop molecular phylogenetic trees

24
Q

in fact, evolutionary ideas underlie…

A

…many of the methods used in bioinformatics

25
Q

In fact, evolutionary ideas underlie many of the methods used in bioinformatics. We use them to …

A

…compare sequences, identify families of genes and proteins and establish homology between genes in different organisms.

26
Q

What is meant by saying that the genetic code is degenerate?

A

Several codons may code for the same amino acid, so a nucleotide change does not always change the amino acid.
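
A minimal sketch of degeneracy, assuming only a partial copy of the standard genetic code (the codons for three amino acids). The table is deliberately incomplete and the helper name is_synonymous is hypothetical.

```python
# Partial standard genetic code (RNA codons); only three amino acids are listed.
CODON_TABLE = {
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu", "UUA": "Leu", "UUG": "Leu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AUG": "Met",
}


def is_synonymous(codon_a, codon_b):
    """Return True when both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


# A third-base change that leaves glycine unchanged.
print(is_synonymous("GGU", "GGC"))
# A first-base change that turns leucine into methionine.
print(is_synonymous("CUG", "AUG"))
```

Six different codons map to leucine in this table, which is why many single-nucleotide changes are silent at the protein level.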

27
Q

Since the genetic code is degenerate, sometimes it is more informative to …

A

…examine the amino acid sequence of the protein gene product.

28
Q

Computer software can convert …

A

… a DNA sequence into an amino acid sequence.

29
Q

How many possible reading frames are there on one strand of DNA?

A

Three possible reading frames per strand.

30
Q

Three possible reading frames × two strands of DNA = ?

A

Six possible translations.

31
Q

As well as reading the code, the software looks for …

A

… start signals (AUG) and stop signals (UGA, UAG, UAA), to find the open reading frames (ORFs).
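
The six-frame translation and ORF search described in the cards above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of any particular program: the function names and the example sequence are hypothetical, and the sketch works on the DNA coding strand, so the signals appear as ATG, TGA, TAG and TAA rather than the RNA forms.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def six_frame_translations(dna):
    """Translate three offsets on each of the two strands."""
    frames = {}
    for label, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate(strand[offset:])
    return frames


def find_orfs(protein):
    """Return peptides running from the first M to the next stop codon."""
    orfs = []
    for segment in protein.split("*")[:-1]:  # the last piece has no stop codon
        start = segment.find("M")
        if start != -1:
            orfs.append(segment[start:])
    return orfs


frames = six_frame_translations("CCATGGCTTGTTAAGG")  # made-up sequence
for name, protein in frames.items():
    print(name, protein, find_orfs(protein))
```

In this toy input only frame +3 contains a start codon followed by an in-frame stop, so it is the only frame that yields an ORF.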

32
Q

One of the most fundamental and frequent bioinformatic analyses is the …

A

…sequence alignment

33
Q

How do sequence alignments work?

A

We take two (or more) DNA or protein sequences and compare them using a scoring system to determine the degree of homology (identity and similarity).

34
Q

What is the first step of sequence alignments?

A

The first step is to compare the sequences to find out how alike they are.

35
Q

we are often interested in parts of the sequence which are …

A

… well conserved for a particular type of protein - these are called motifs.

36
Q

We are often interested in parts of the sequence which are well conserved for a particular type of protein - these are called motifs.

For example …

A

… members of the thioredoxin protein family have a C-X-X-C motif (cysteine, any amino acid, any amino acid, cysteine).
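
A motif such as C-X-X-C can be searched for with a regular expression. The sketch below is only an illustration: the function name find_cxxc and the short peptide are hypothetical.

```python
import re

# C, any two residues, then C. The lookahead lets overlapping motifs be reported.
CXXC = re.compile(r"(?=(C..C))")


def find_cxxc(protein):
    """Return (1-based position, matched residues) for every C-X-X-C motif."""
    return [(match.start() + 1, match.group(1)) for match in CXXC.finditer(protein)]


print(find_cxxc("AWCGHCKEF"))  # made-up peptide with one motif starting at residue 3
```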

37
Q

within the above sequence, there is some…

A

…homology between the two amino acid sequences

38
Q

within the above sequence, there is some homology between the two amino acid sequences - the test sequence contains a …

A

…C-X-X-C motif and shares some identity (exact matches) between itself and the hPDI sequence.

39
Q

What is identity?

A

Exact matches between two sequences.

40
Q

within the above sequence, there is some homology between the two amino acid sequences - the test sequence contains a C-X-X-C motif and shares some identity (exact matches) between itself and the hPDI sequence

10 (red letters) out of the 20 amino acids match giving a score of …

A

…50 % identity
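
The arithmetic behind an identity score can be sketched as follows. The two 20-residue sequences are invented for illustration and are not the test and hPDI sequences from the original alignment; the function name is also hypothetical, and the sequences are assumed to be already aligned without gaps.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions that carry exactly the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


query = "ADWCGHCKQLAPEYEKAAKE"    # hypothetical sequence
subject = "SEWCGPCKNLAPKFERTADQ"  # hypothetical sequence
print(percent_identity(query, subject))  # 10 of 20 positions match: 50.0
```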

41
Q

Amino acids differ in their …

A

… R groups

42
Q

Amino acids differ in their R groups, which can be classified as …

A

…hydrophobic, polar, positively charged or negatively charged

43
Q

Amino acids differ in their R groups, which can be classified as hydrophobic, polar, positively charged or negatively charged. Therefore, some amino acid changes are …

A

…more severe than others

44
Q

What happens if we change one amino acid for another with similar properties?

A

It may not affect protein function.

45
Q

Amino acids differ in their R groups, and changing one amino acid for another may not be so detrimental to the protein if the new amino acid is …

A

… similar in chemical/physical character

46
Q

in this case, the green coloured amino acids are…

A

…similar in chemical character

47
Q

in this case, the green coloured amino acids are similar in chemical character.

we include these as …

A

…‘positives’ along with the identical (red) matches

48
Q

so 13 (red and green letters) out of the 20 amino acids within the sequence match, giving a ‘positives’ or ‘similarity’ score of …

A

…65%
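
A 'positives' score can be sketched by counting positions where the two residues fall in the same chemical class. The grouping below is one common textbook classification and is an assumption, as are the function name and the invented sequences. Alignment programs such as BLAST instead count residue pairs that score positively in a substitution matrix.

```python
# One common four-class grouping of the 20 amino acids (an assumption).
GROUPS = {
    "hydrophobic": "GAVLIMFWP",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def percent_positives(seq_a, seq_b):
    """Percentage of positions that are identical or share a chemical class."""
    positives = sum(GROUP_OF[a] == GROUP_OF[b] for a, b in zip(seq_a, seq_b))
    return 100 * positives / len(seq_a)


query = "ADWCGHCKQLAPEYEKAAKE"    # hypothetical sequence
subject = "SEWCGPCKNLAPKFERTADQ"  # hypothetical sequence
print(percent_positives(query, subject))  # 10 identical + 3 similar of 20: 65.0
```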

49
Q

What does a positives or similarity score of 65% mean?

A

A value this high implies that the two proteins belong to the same protein family, are likely to share some functionality, and were derived from the same evolutionary ancestor.

50
Q

we may want to compare DNA/amino acid sequences from many different species to…

A

…see how homologous they are

51
Q

we may want to compare DNA/amino acid sequences from many different species to see how homologous they are

we can do this using a web-based program called …

A

…ClustalW2 at the EBI

52
Q

ClustalW2 at the EBI performs…

A

…multiple sequence alignments

53
Q

once the alignment is performed you can use…

A

…various tools within the program to highlight areas of percent identity/similarity
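
Clustal output marks fully conserved alignment columns with an asterisk. The sketch below mimics only that one mark and is not how the program itself is implemented; the function name and the three short aligned sequences are hypothetical.

```python
def conservation_line(aligned):
    """Return '*' under every column where all sequences share one residue."""
    marks = []
    for column in zip(*aligned):
        fully_conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if fully_conserved else " ")
    return "".join(marks)


alignment = ["WCGHCK", "WCGPCK", "FCGHCR"]  # made-up, already aligned
for row in alignment:
    print(row)
print(conservation_line(alignment))
```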

54
Q

Scientists have found patterns in amino acid sequences which recur in …

A

…proteins with the same function.

55
Q

Small sequences of conserved amino acids are called …

A

…motifs

56
Q

Define motif

A

Small sequences of conserved amino acids.

57
Q

What is the purpose of the C-X-X-C motif?

A

This motif often takes part in oxidation and reduction (redox) reactions. We can therefore deduce that the test gene we have just identified may have a role in redox reactions.

58
Q

we can use programs to compare a DNA/protein sequence to find …

A

…others that are similar within and between different species

59
Q

What does BLAST stand for?

A

Basic Local Alignment Search Tool

60
Q

What is BLAST (Basic Local Alignment Search Tool)?

A

A statistically driven searching and alignment tool that searches ALL available sequence databases for similarity to the input sequence

61
Q

Function of Translate?

A

This program translates a DNA sequence into the corresponding amino acid sequence in all six possible reading frames.

62
Q

Function of ProtParam?

A

Analyses the primary amino acid sequence to give useful data such as the size of the protein, its isoelectric point, its extinction coefficient, the number of hydrophobic/hydrophilic residues and how stable it may be.

63
Q

Function of PSORT?

A

Using the amino acid sequence, this program looks for known signals within the sequence and predicts where your protein will end up in the cell, e.g. nucleus, ER, plasma membrane or secreted.

64
Q

Function of TMpred or TMHMM?

A

searches for regions of hydrophobic amino acids to predict if the resulting protein is likely to be integral within a membrane
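
The idea of scanning for hydrophobic stretches can be sketched with a Kyte-Doolittle sliding-window hydropathy profile. This is a simpler, swapped-in technique, not the algorithm used by TMpred or TMHMM. The 19-residue window and the 1.6 cut-off are the values conventionally used with this scale; the function names and test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the sequence."""
    return [
        sum(KD[aa] for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def has_candidate_tm_segment(protein, window=19, threshold=1.6):
    """True if any window is hydrophobic enough to suggest a membrane span."""
    return any(score > threshold for score in hydropathy_profile(protein, window))


print(has_candidate_tm_segment("L" * 25))   # artificial, very hydrophobic peptide
print(has_candidate_tm_segment("DEKR" * 6))  # artificial, highly charged peptide
```

As with the real programs, a positive result is only a prediction that the protein may be membrane-embedded.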

65
Q

Function of PSIPRED or PredictProtein?

A

Predicts which regions of the primary sequence fold into secondary structures (α-helices and β-sheets) or even tertiary/quaternary structures.

66
Q

Function of Protein Data Bank (PDB)?

A

stores data about the structure of proteins that come from either X-ray crystallography or NMR experiments

67
Q

The protein data bank stores…

A

…the coordinates of every atom within a protein and allows you to build 3D models of proteins that you can examine using a Jmol viewer

68
Q

structural data is very important in trying to understand…

A

…how proteins work as ‘machines’ at the molecular level especially when considering inhibitors or mutations that may alter the structure and therefore activity

69
Q

we can use the structures of known proteins to build …

A

…structural models for new amino acid sequences to get an idea of what the eventual protein could look like and thus what function it may perform

70
Q

despite becoming more sophisticated and reliable, many of these bioinformatic programs are based on …

A

…statistical packages and can only PREDICT the structure/function/localisation of a protein - laboratory experiments are STILL required to confirm the predictions made by the programs

71
Q

The activities of the cell are determined by …

A

…when the genes are expressed or stop being expressed.

72
Q

DNA microarray technology enables us to …

A

…examine gene expression in different circumstances.

73
Q

How does DNA microarray technology work?

A

Oligonucleotides specific to each gene in the genome are fixed onto a ‘chip’ and then probed with free, fluorescently labelled nucleic acids (typically cDNA) derived from the mRNA of a control cell and a test cell.

74
Q

In microarrays, what does green represent?

A

gene expressed only under test conditions

75
Q

In microarrays, what does red represent?

A

gene expressed only under control conditions

76
Q

In microarrays, what does yellow represent?

A

genes expressed under control and test conditions
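
The colour rules in the three cards above can be sketched as a small classifier. The colour mapping follows these cards; which dye labels which sample is chosen per experiment, so treat the mapping as an assumption. The function name, the detection threshold and the 'black' label for no signal are hypothetical.

```python
def spot_colour(control_signal, test_signal, detection_threshold=100.0):
    """Classify one microarray spot from its two channel intensities."""
    control_on = control_signal >= detection_threshold
    test_on = test_signal >= detection_threshold
    if control_on and test_on:
        return "yellow"  # expressed under both conditions
    if test_on:
        return "green"   # expressed only under test conditions
    if control_on:
        return "red"     # expressed only under control conditions
    return "black"       # no detectable signal in either channel


print(spot_colour(control_signal=20.0, test_signal=850.0))
print(spot_colour(control_signal=900.0, test_signal=760.0))
```

Real analyses usually work with the ratio of the two intensities rather than a simple on/off call, so this sketch captures only the qualitative reading of the colours.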

77
Q

What is RNA-seq?

A

RNA-Seq is a sequencing technique which uses next-generation sequencing to reveal the presence and quantity of RNA in a biological sample at a given moment, analyzing the continuously changing cellular transcriptome.
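
Quantities from RNA-seq are usually compared after normalising for sequencing depth. A minimal sketch of one simple normalisation, counts per million (CPM), is given here; the gene names, read counts and function name are invented for illustration.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}


raw_counts = {"geneA": 150, "geneB": 50, "geneC": 800}  # hypothetical sample
print(counts_per_million(raw_counts))
```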

78
Q

A number of databases provide…

A

…open access RNA-seq data for analysis.