Bioinformatics. Flashcards

1
Q

Define the DNA reading frame.

A

A way of dividing a DNA sequence into consecutive, non-overlapping triplets (codons). Each strand of DNA has 3 possible reading frames.

2
Q

How many reading frames does a DNA molecule have?

A

6.

3
Q

Define genome annotation.

A

The process of obtaining biological information from unprocessed, sequenced genetic data.

4
Q

Define the MASCOT search engine.

A

A search engine that identifies proteins by matching mass spectrometry data against protein sequence databases.

5
Q

What is bioinformatics?

A

The process of solving biological problems by utilising information stored on computer databases.

6
Q

What is used in bioinformatics to increase biological understanding?

A

Biological databases.

7
Q

Who coined the term bioinformatics in 1970?

A

Paulien Hogeweg and Ben Hesper.

8
Q

How has bioinformatics helped the field of medicine?

A

By allowing us to compare biological processes in healthy and diseased bodies.

9
Q

How is the information in bioinformatic databases used to help advance medicine?

A

The information has been collected from past patients and can be used to diagnose the same disease in others.

10
Q

What kind of maps has the information from bioinformatic databases helped to create?

A

Genetic maps that show heritable traits.

11
Q

How can bioinformatic databases help taxonomists classify species?

A

They can store the genome sequences of different organisms.

This allows for comparisons to be made between different organisms.

12
Q

How has bioinformatics helped law enforcement agencies?

A

Police forces use databases to store DNA profiles of convicted offenders, making it easier to catch repeat offenders.

13
Q

How can bioinformatics help molecular biologists conduct their experiments?

A

Primer designs are stored in databases, which allows scientists to design primers easily.

14
Q

How has bioinformatics helped pharmacologists?

A

They can use bioinformatics to design new drugs that are personalised for a person's genome.

15
Q

How has bioinformatics helped farmers?

A

It has helped farmers develop new strains of crops which are disease or pest resistant.

16
Q

From what biological sources will bioinformatics use data?

A

DNA.

RNA.

Protein.

17
Q

How can bioinformatics help scientists sequence DNA strands?

A

By storing the sequence information from past DNA experiments, which allows scientists to compare new sequences against known ones.

18
Q

How has bioinformatics helped with the study of proteins?

A

Storing information relating to proteins allows other researchers to identify the same protein quickly.

19
Q

What does the storage of information relating to proteins allow scientists to study about how proteins are changing?

A

It allows them to study evolution of proteins and also the mutations that can arise within their structure.

20
Q

What are 5 ways that bioinformatics can help to study DNA?

A

Analysis of a DNA sequence.

The discovery of new genes.

The discovery of regulatory regions within the DNA strand.

The ability to annotate whole genomes.

To carry out comparative genomics.

21
Q

The storage of DNA sequences in bioinformatic databases allows scientists to make what comparisons?

A

It allows scientists to compare genome sequences between different people.

22
Q

What does the comparison of DNA sequences from different people allow for?

A

The detection of regions of the genome that are associated with genetic diseases such as sickle cell disease.

23
Q

How does the study of different DNA strands help pharmacologists?

A

It allows them to develop new drugs that are likely to be absorbed and metabolised by the patient.

24
Q

How does the storage of different genomes help us to study the physical genome?

A

It helps us find new genes or regulatory regions such as a TATA box or a binding domain for a regulatory protein.

25
Q

What are 6 factors about RNA that bioinformatics can help study?

A

The different products of RNA splicing.

The expression of different RNAs in different tissues.

The structure of different RNAs.

The types of RNA that are produced by a single gene.

The RNAs that are produced by thousands of genes at the same time.

The creation of specific DNA chips and microarrays for RNA analysis.

26
Q

Why is it important to store the information that relates to the products of RNA splicing?

A

To know all of the products that can be produced by a single gene.

To know which RNAs are produced in response to an external stimulus.

To build genetic probes and microarrays.

27
Q

What are 4 factors about proteins that bioinformatics can help with?

A

The identification of protein families.

The identification of various protein domains and regions.

The identification of various protein structures.

The identification of various protein functions.

28
Q

How is bioinformatics used in the identification of proteins?

A

The results from electrophoresis and mass spectrometry are fed into a database for protein identification.

29
Q

What are the 2 categories of bioinformatic databanks that store information relating to DNA, RNA and proteins?

A

Databases that store information relating to nucleic acids.

Databases that store information relating to proteins.

30
Q

What is GenBank?

A

The NIH genetic sequence database, which is part of the International Nucleotide Sequence Database Collaboration (INSDC).

31
Q

What 3 databanks help to make up the INSDC?

A

The DNA Data Bank of Japan (DDBJ).

The European Nucleotide Archive (ENA) at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI).

GenBank at NCBI in the USA.

32
Q

What does the INSDC allow researchers to do?

A

To identify a particular nucleotide sequence by searching through millions of stored nucleotide sequences.

33
Q

In what format is genetic information at the NCBI stored?

A

In an annotated format that displays information about the nucleic acid, such as its function.

34
Q

What is UniProt?

A

A protein sequence and function database that combines information from major international databases.

35
Q

What 5 major databanks does UniProt obtain information from?

A

European Bioinformatics Institute (EBI).

Protein Information Resource (PIR).

Georgetown University Medical Centre (GUMC).

National Biomedical Research Foundation (NBRF).

The Swiss Institute of Bioinformatics (SIB).

36
Q

What is ExPASy?

A

A bioinformatics resource portal from the SIB that provides access to databases and software tools covering many areas of the life sciences.

37
Q

What does ExPASy stand for?

A

The Expert Protein Analysis System.

38
Q

What is a query sequence?

A

A particular sequence that is chosen by the experimenter and entered into a BLAST search.

39
Q

What can a query sequence be made up of?

A

Of amino acids or nucleotides.

40
Q

When is a BLAST search performed?

A

When a researcher wants to discover more information that relates to their query sequence.

41
Q

BLAST searches require query sequences to be of what length?

A

At least 15 nucleotides or amino acids.

42
Q

How do scientists perform a BLAST search?

A

They submit the query sequence to a BLAST search against a chosen database.

This allows them to compare their sequence to all of the known sequences within the database.

43
Q

In what 2 formats can a query sequence be entered into a BLAST search?

A

The FASTA format.

The identifier format.

44
Q

What does the FASTA format of a BLAST search consist of?

A

A single description line beginning with “>”, followed by lines containing the nucleotide or amino acid sequence.

45
Q

How does the identifier format differ from the FASTA format?

A

It consists of an accession number or gene ID that points to a sequence already stored in the database, rather than the sequence itself.

46
Q

What is alignment?

A

The arrangement of 2 or more sequences so that their regions of similarity line up and can be compared.

47
Q

How is alignment performed?

A

By comparing the query sequence with a known sequence.

48
Q

What is the score value?

A

A term that is used to measure the quality of alignment between a query sequence and the search results.

49
Q

What is the score value usually based on?

A

On the number of nucleotide or amino acid matches between the query sequence and the search results, with penalties for mismatches and gaps.

50
Q

Does a high score value mean good or bad alignment?

A

The higher the score the better the alignment.
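
As a rough illustration of how such a score might be computed, the Python sketch below scores two sequences that have already been aligned. The function name and the match, mismatch and gap values are illustrative assumptions; real BLAST searches use substitution matrices (such as BLOSUM62 for proteins) and more elaborate gap penalties.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Score two already-aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap          # a position aligned against a gap
        elif x == y:
            score += match        # identical residues
        else:
            score += mismatch     # different residues
    return score


print(alignment_score("GATTACA", "GATTACA"))   # identical sequences
print(alignment_score("GATTACA", "GACTACA"))   # one mismatch
print(alignment_score("GAT-ACA", "GATTACA"))   # one gap
```

With these illustrative values the identical pair scores highest, which matches the rule that a higher score means a better alignment.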

51
Q

What does the score value allow for when selecting matches between search results and the query sequence?

A

For the selection of the best match between the query sequence and the search results.

52
Q

What is the E-value?

A

The expectation value.

It estimates the number of alignments with an equal or better score that would be expected to occur by chance in a database of that size.

53
Q

What does the expectation value highlight?

A

The statistical significance of an alignment.

A high E-value means that many alignments with that score would be expected by chance.

An E-value close to 0 means that the match is unlikely to have occurred by chance.

54
Q

Does a low or high E-value indicate a good match between the query sequence and the search results?

A

The lower the E-value the better the match.
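
The relationship between score and E-value can be sketched with the Karlin-Altschul estimate E = K * m * n * exp(-lambda * S), where m is the query length, n is the database length and S is the raw score. In the Python sketch below the function name and the default values of K and lambda are placeholder assumptions chosen only for illustration; real values depend on the scoring system used.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 k: float = 0.1, lam: float = 0.3) -> float:
    """Karlin-Altschul estimate E = K * m * n * exp(-lambda * S).

    k and lam are placeholder values, not parameters of any real
    scoring matrix.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# A higher score gives a lower E-value for the same query and database.
for score in (20, 40, 60):
    print(score, expect_value(score, query_len=100, db_len=1_000_000))
```

The sketch also shows why the E-value depends on database size: the same score in a larger database is more likely to arise by chance.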

55
Q

What is a genome annotation?

A

The process of obtaining biological information from unprocessed sequence data.

56
Q

What is the ultimate goal of genome annotation?

A

To create a labelled genome, where biological information is linked to a particular genetic sequence.

57
Q

What will a genome annotated map tell us?

A

Where each gene is located and what it is known or predicted to do.

58
Q

What 2 categories can genome annotation be divided into?

A

Structural and functional annotations.

59
Q

What do structural genome annotations attempt to identify?

A

Genomic elements such as the promoter sequence or the TATA box.

60
Q

What 3 things does structural genome annotation involve?

A

Researching the structure of genes.

Researching the areas of the genome that code for certain products.

Researching the location of regulatory motifs such as the TATA box.

61
Q

What do the results from structural genome annotations allow for?

A

The identification of the cis-acting elements that are used in transcription and translation.

62
Q

What do functional genome annotations allow for?

A

The identification of the biological functions of gene products such as proteins.

63
Q

What is the basic tool of genome annotations?

A

The BLAST program which allows us to identify similarities between genes and proteins.

64
Q

What is gene prediction software used for?

A

To identify genes in long DNA sequences by finding stretches that code for amino acids and are not interrupted by in-frame STOP codons.

65
Q

What factor inherent to DNA is used by gene prediction software when identifying genes on an unknown strand?

A

The DNA reading frame which allows for the strand to be read in triplets or codons.

66
Q

How many reading frames does each DNA strand have?

A

3.

67
Q

What does the correct reading frame from a DNA strand tell you about the strand?

A

The amino acid sequence that is encoded by the DNA strand.

68
Q

How many reading frames must gene prediction software evaluate if it is analysing an entire DNA molecule?

A

6 possible reading frames for a DNA molecule as it consists of 2 strands.
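
The 6 frames can be listed with a short Python sketch: 3 offsets on the given strand and 3 on its reverse complement. The function names and the frame labels (+1 to +3, -1 to -3) are illustrative choices, and the sketch assumes the sequence contains only A, C, G and T.

```python
# Complement of each DNA base, used to build the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def codons(seq: str, offset: int) -> list[str]:
    """Split seq into complete codons, starting at offset 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def six_frames(seq: str) -> dict[str, list[str]]:
    """Return the codons of all 6 reading frames of a DNA molecule."""
    forward = seq.upper()
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(forward, offset)
        frames[f"-{offset + 1}"] = codons(reverse, offset)
    return frames


for name, frame in six_frames("ATGGCCTAA").items():
    print(name, frame)
```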

69
Q

What happens once the correct reading frame has been interpreted by gene prediction software?

A

We can identify the protein encoded by a gene from its predicted sequence of amino acids.

70
Q

What are 2 tools that can help identify the open reading frame?

A

ORFfinder (the NCBI Open Reading Frame Finder) and GENSCAN.

71
Q

What complicates gene prediction software?

A

The fact that most of a genome is made from non-coding DNA.

This means gene prediction software must distinguish coding DNA from non-coding DNA.

72
Q

What are 10 common features found in coding DNA?

A

The open reading frame.

A start codon.

A stop codon.

A terminator sequence (prokaryotes).

A TATA box (eukaryotes).

A Shine-Dalgarno sequence (prokaryotes).

A Kozak sequence (eukaryotes).

A poly-A addition sequence (eukaryotes).

Intron and exon boundaries.

CpG islands.
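
As one example of how such a feature can be measured, the Python sketch below computes the two statistics usually used to flag CpG islands: GC content and the observed/expected CpG ratio. The function name is an illustrative assumption; commonly cited thresholds are a GC content above 50% and a ratio above 0.6 over a window of at least 200 bp, but the sketch only calculates the numbers and does not apply any cut-off.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")                 # CpG = a C directly followed by a G
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


print(cpg_stats("CGCGCGATCGCG"))
print(cpg_stats("ATATATATGCAT"))
```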

73
Q

What kind of genome is an ORF finder very good at analysing?

A

The bacterial genome.

74
Q

Why is an ORF finder not good for analysing the eukaryotic genome?

A

Because eukaryotic genes contain introns, which interrupt the coding sequence in genomic DNA and are spliced out of the mRNAs that code for proteins.

75
Q

What makes a good ORF?

A

It should begin with a START codon (ATG, which codes for methionine) and end with an in-frame STOP codon.
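
A minimal Python sketch of this rule, assuming ATG as the start codon and the standard stop codons TAA, TAG and TGA. The function name and the min_codons parameter are illustrative and are not taken from any ORF-finding tool. Only the given strand is scanned, so the reverse complement would need to be searched separately.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) for ORFs on the given strand.

    Each ORF runs from an ATG to the first in-frame stop codon (inclusive).
    start and end are 0-based positions, with end exclusive.
    """
    seq = seq.upper()
    orfs = []
    for offset in range(3):              # the 3 reading frames of this strand
        start = None                     # position of the currently open ATG
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


print(find_orfs("AAATGGCCTTTTAGCC"))
```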

76
Q

How does the presence of many STOP codons that are located close together on an unknown strand affect the ORF?

A

It suggests that an ORF is not present.

77
Q

Why are longer ORFs better than short ORFs?

A

Because the longer the ORF, the less likely it is to occur by chance.
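
A rough calculation shows why. Assuming, as a simplification, that all 64 codons are equally likely in random DNA, 61 of them are not stop codons, so the chance that a run of n codons contains no stop codon is (61/64)^n. The function name below is an illustrative choice.

```python
def prob_no_stop(num_codons: int) -> float:
    """Chance that num_codons random codons contain no stop codon.

    Assumes all 64 codons are equally likely, so 61 of 64 are non-stop.
    """
    return (61 / 64) ** num_codons


for n in (10, 50, 100, 300):
    print(f"{n:>3} codons: {prob_no_stop(n):.6f}")
```

Under this assumption the probability falls below 1% by 100 codons, so a long stretch without a stop codon is unlikely to be a chance event.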

78
Q

What does sequence alignment software allow for?

A

For a newly sequenced gene to be analysed to see if it is already known and stored in a database.

79
Q

What are the 2 most popular tools that are used for sequence alignment?

A

BLAST (Basic Local Alignment Search Tool).

FASTA (Fast All).

80
Q

What is the most widely used program in bioinformatics?

A

The BLAST program.

81
Q

What does the BLAST program do?

A

It searches through a database to find sequences that match or are similar to the one that is being tested.

82
Q

How do results from the BLAST program appear?

A

As high scoring segment pairs (HSPs).

The score reflects the number of matches between the query sequence and the database sequence.

83
Q

What can we do after the matches have been produced by BLAST?

A

We can evaluate the matches.

84
Q

What is the main idea behind the BLAST tool?

A

To find regions of similarity between the sample sequence and the known sequences from the database.

85
Q

What are local similarities that have been detected by BLAST?

A

Where both sequences have a region of similarity that is based in a single location.

86
Q

What are global similarities that have been detected by BLAST?

A

Where the 2 sequences have regions of similarity all over the sequence.
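
The difference between local and global similarity can be sketched with the two textbook dynamic-programming algorithms: Smith-Waterman (local) and Needleman-Wunsch (global). These are not the heuristic that BLAST itself uses, and the function name and the match, mismatch and gap values are illustrative assumptions.

```python
def best_alignment_score(a: str, b: str, local: bool, match: int = 1,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Dynamic-programming alignment score.

    local=True  -> Smith-Waterman (best-matching region anywhere in a and b)
    local=False -> Needleman-Wunsch (both sequences aligned end to end)
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    if not local:                         # global: leading gaps are penalised
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = table[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, table[i - 1][j] + gap, table[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)       # local: a poor prefix can be dropped
                best = max(best, cell)
            table[i][j] = cell
    return best if local else table[-1][-1]


a, b = "CCCCGATTACA", "GATTACAGGGG"
print("local :", best_alignment_score(a, b, local=True))
print("global:", best_alignment_score(a, b, local=False))
```

In this example the two sequences share one region (GATTACA) but differ elsewhere, so the local score is high while the global score is pulled down by the unmatched ends.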

87
Q

What kind of database does BLAST-P search, and what kind of query does it use?

A

A protein database.

A protein query.

88
Q

What kind of database does BLAST-X search, and what kind of query does it use?

A

A protein database.

A translated nucleotide query.

89
Q

What is compared via the use of BLAST-X?

A

This method compares the 6-frame translations of DNA to a protein database.
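
A 6-frame translation can be sketched in Python as follows, assuming the standard genetic code. The function names are illustrative; the codon table is built from the conventional compact listing of amino acids in T, C, A, G order, and codons containing anything other than A, C, G or T are shown as X.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons; '*' marks a stop, 'X' an unknown codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate all 6 reading frames of a DNA sequence."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]   # reverse complement
    out = {}
    for offset in range(3):
        out[f"+{offset + 1}"] = translate(forward[offset:])
        out[f"-{offset + 1}"] = translate(reverse[offset:])
    return out


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the 6 translated strings could then be compared against a protein database, which is the idea behind a translated nucleotide query.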

90
Q

What kind of database does tblastn search, and what kind of query does it use?

A

A translated nucleotide database.

A protein query.

91
Q

What kind of database does tblastx search, and what kind of query does it use?

A

A translated nucleotide database.

A translated nucleotide query.

92
Q

What is compared via the use of tblastx?

A

This method compares the 6 frame translations of a DNA query to the 6 frame translations of a DNA database.

93
Q

Each sequence of tblastx is comparable to sequences from which other BLAST technique?

A

To BLAST-P sequences.

94
Q

What kind of database does FASTA search, and what kind of query does it use?

A

DNA or protein database.

A DNA or protein query.

95
Q

What kind of database does FAST-X search, and what kind of query does it use?

A

A protein database.

A translated DNA sequence.

96
Q

What kind of database does TFASTA search, and what kind of query does it use?

A

A translated DNA database.

A protein query.

97
Q

What marks the beginning of a query sequence that uses the FASTA format?

A

A single line description that is followed by the lines of sequence data.

98
Q

How is the description line distinguished from the sequence data line in a query sequence in FASTA format?

A

Because the description line has a greater-than (“>”) symbol in the first column.

99
Q

How many characters should each line of a FASTA input have?

A

Each line should not exceed 80 characters.

100
Q

Can blank lines be entered into the FASTA format?

A

No.
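
The FASTA rules above can be sketched as a small Python parser. The function name is an illustrative assumption; as a leniency of this sketch, stray blank lines are skipped rather than rejected.

```python
def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (description, sequence) pairs."""
    records = []
    description, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # this sketch tolerates blank lines
        if line.startswith(">"):          # ">" marks a description line
            if description is not None:
                records.append((description, "".join(chunks)))
            description, chunks = line[1:].strip(), []
        elif description is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            chunks.append(line)           # sequence data may span many lines
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


example = ">seq1 example record\nATGC\nGGTA\n>seq2\nMKV\n"
print(parse_fasta(example))
```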

101
Q

What is the input sequence for the BARE format of a BLAST search?

A

Lines of sequence data without the FASTA definition line.

102
Q

What is the input sequence for the IDENTIFIER format of a BLAST search?

A

They are accession numbers, accession.version identifiers or GI numbers.

These are sequence ID tags that the database has attached to a particular gene or protein.

103
Q

What is sequence homology?

A

Similarity between sequences from different organisms that results from shared ancestry; analysing it helps to determine evolutionary relationships.

104
Q

What is one limitation of sequence homology?

A

Bacteria can exchange DNA sequences via horizontal gene transfer, which can obscure evolutionary relationships.

105
Q

What software is often used by scientists to investigate phylogenetic relationships?

A

Multiple sequence alignment software such as CLUSTAL and COBALT.

106
Q

What do taxonomists create to show how closely different organisms are related?

A

Phylogenetic trees.
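
Tree-building methods often start from pairwise distances between aligned sequences. The Python sketch below computes one simple distance, the proportion of differing sites (p-distance), for every pair in an alignment. The function names are illustrative, the sequences are made-up examples rather than real data, and building the tree itself is left to dedicated software.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions that differ (gap columns are skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Return the p-distance for every pair of named sequences."""
    names = sorted(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n])
            for i, m in enumerate(names) for n in names[i + 1:]}


# Made-up aligned sequences, used only to show the calculation.
alignment = {"species_a": "ATGCATGC",
             "species_b": "ATGCATGA",
             "species_c": "ATGAATTA"}
for pair, dist in distance_matrix(alignment).items():
    print(pair, dist)
```

Pairs with smaller distances would be placed closer together on a tree.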