Bioinformatics Flashcards

1
Q

An interdisciplinary field that combines biology, computer science,
statistics, mathematics, and engineering to analyze
and interpret biological data, particularly
large datasets such as genomes or protein sequences

A

Bioinformatics

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

It is a widely-used format for
representing nucleotide or protein sequences.

A

FASTA

3
Q

It consists of a header line starting with ‘>’, followed by the sequence data on subsequent lines.

A

FASTA
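
The FASTA layout described in this card (a header line starting with '>', then sequence lines) can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_fasta` and the example records are made up for illustration, and real projects often use a library parser instead.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} for FASTA-formatted lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; drop the '>' itself.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence data may span several lines; join them.
            records[header] += line
    return records


# Hypothetical two-record example (one nucleotide, one protein).
example = """>seq1 hypothetical example record
ATGGCGTAA
GGC
>seq2
MKV
"""
records = parse_fasta(example.splitlines())
print(records)
```

Each header maps to one joined sequence string, even when the sequence is wrapped over several lines.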

4
Q

In sequence alignment, a ________ represents a position where one sequence has an insertion or
deletion relative to another sequence.

A

Gap

5
Q

____________ are
introduced to optimize alignment and account for
evolutionary changes.

A

Gaps

7
Q

It is the
sequence for which you are searching for similarities
or matches within a database

A

Query sequence

8
Q

It is the sequence you
submit as the input to the search

A

Query sequence

9
Q

It is the
sequence(s) in a database against which the query
sequence is compared during sequence alignment or
similarity searches

A

Subject sequence

10
Q

It is a branching
diagram that depicts the evolutionary relationships
among a set of organisms, genes, or species

A

Phylogenetic tree

11
Q

It
shows the inferred evolutionary history and
relatedness based on genetic or sequence data

A

Phylogenetic tree

12
Q

It is a
unique numerical identifier assigned to each
sequence entry in the NCBI (National Center for
Biotechnology Information) databases.

A

GI number

13
Q

It provides a
stable and unique way to refer to a specific sequence
entry.

A

GI number

14
Q

It is a
unique identifier assigned to a sequence record in a
public sequence database (like GenBank, EMBL, or
DDBJ)

A

Accession number

15
Q

They typically consist of letters
and numbers and are used to reference specific
sequence entries.

A

Accession number

16
Q

Involves
identifying and labeling the features of a genome such as genes, regulatory sequences, and other
functional elements.

A

Genome annotation

17
Q

This process helps in
understanding the biological significance of the DNA
sequence.

A

Genome annotation

18
Q

In sequence alignment or similarity searches, it is a numerical value that quantifies the level
of similarity or quality of alignment between two
sequences.

A

Score

19
Q

Higher scores generally indicate more
significant similarity. (T or F)

A

TRUE

20
Q

It is a statistical
measure that estimates the number of different
alignments with scores equivalent to or better than a
given score that would occur by chance in a database
search.

A

Expect value (E-value)

21
Q

A ___________ indicates a more significant
match or similarity.

A

lower E-value
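
To see why a higher score gives a lower E-value, here is a rough sketch. It assumes the standard bit-score relationship E = m × n × 2^(-S'), where m is the query length, n is the total database length, and S' is the bit score. The function name `expect_value` and the example lengths are made up, and real BLAST output uses adjusted (effective) lengths, so its reported values will differ.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score (assumed formula: m * n * 2**-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search: a 100-residue query against a 1,000,000-residue database.
weak_hit = expect_value(bit_score=30, query_len=100, db_len=1_000_000)
strong_hit = expect_value(bit_score=60, query_len=100, db_len=1_000_000)

# The higher-scoring hit has the lower (more significant) E-value.
print(weak_hit, strong_hit)
```

Each extra bit of score halves the number of chance alignments expected, which is why small E-values go with strong matches.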

22
Q

A field which uses computers to store and analyze
molecular biological information

A

BIOINFORMATICS

23
Q

It is about finding and interpreting biological data
online

A

BIOINFORMATICS

24
Q

It is a field in which biology, mathematics, statistics, computer
science, information technology, and other health sciences are
merged into a single discipline to process biological data

A

BIOINFORMATICS

25
Q

It uses complex machines to read biological data at a much
faster rate than before.

A

BIOINFORMATICS

26
Q

There is a marriage between biology and informatics. (T or F)

A

TRUE

27
Q

The science of collecting and analyzing complex
biological data

A

BIOINFORMATICS

28
Q

Allows the storage and management of large biological data sets

A

THE CREATION OF DATABASES

29
Q

Data is being generated at a much greater pace than
its analysis (e.g. Human Genome Project)

A

THE CREATION OF DATABASES

30
Q

These are repositories so it’s like a bank of biologic
information and are designed to collect, archive, visualize, and
organize biologic data.

A

Databases

31
Q

This is to enable scientists to have an
intelligent data description, interpretation, or retrieval.

A

Databases

32
Q

There is
much data that has been generated especially since the
completion of the

A

Human Genome Project

33
Q

When was the Human Genome Project launched?

A

1990

34
Q

Objective of the Human Genome Project

A

To sequence
the entire human genome which consists of about 3.2 billion
base pairs.

35
Q

It was completed in 2003; as a result, there is a
large amount of data that has to be interpreted or analyzed.

A

Human Genome Project

36
Q

Aside from the human genome, many other organisms were
completely sequenced. So there is again an enormous amount
of data that has to be understood that is why databases have
been created. (T or F)

A

TRUE

37
Q

PRINCIPAL COMPONENTS OF BIOINFORMATICS

A

*THE CREATION OF DATABASES
*THE DEVELOPMENT OF ALGORITHMS AND STATISTICS
*THE USE OF THESE TOOLS FOR THE ANALYSIS AND
INTERPRETATION OF VARIOUS TYPES OF
BIOLOGICAL DATA

38
Q

Determine relationships among members of large
data sets

A

THE DEVELOPMENT OF ALGORITHMS AND
STATISTICS

39
Q

The procedure by which large data sets are organized so that relationships can
be determined is called a(n) ________

A

Algorithm

40
Q

Algorithm is applied in ________

A

Statistics

41
Q

including DNA, RNA and protein sequences, protein
structures, gene expression profiles, and biochemical
pathways

A

THE USE OF THESE TOOLS FOR THE ANALYSIS AND
INTERPRETATION OF VARIOUS TYPES OF
BIOLOGICAL DATA

42
Q

Sciences that attempt to describe a living organism
in terms of ‘omics’

A

BRANCHES OF BIOINFORMATICS

43
Q

BRANCHES OF BIOINFORMATICS

A

Genomics
Transcriptomics
Proteomics
Microbiomics
Metabolomics

44
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

involves the description of the sequences of
the entire genome of an organism

A

Genomics

45
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

study of all RNA molecules in a
living organism

A

Transcriptomics

46
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

the description of the entire
complement of proteins in a living organism.

A

Proteomics

47
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

They
study the sequence, 3D structures, and other
properties of proteins.

A

Proteomics

48
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

It covers the entire set of proteins found in a living organism.

A

Proteomics

49
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

Pertains to microbes, viruses, fungi,
parasites, bacteria.

A

Microbiomics

50
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

The genomes of these
microorganisms are described within a specific environmental niche

A

Microbiomics

51
Q

IDENTIFY THE BRANCH OF BIOINFORMATICS

involves description of the chemical
processes involving metabolites.

A

Metabolomics

52
Q

DNA/RNA BIOINFORMATICS APPLICATIONS

A

● Retrieving DNA sequences from databases
● Computing nucleotide compositions
● Identifying restriction sites
● Designing polymerase chain-reaction (PCR) primers
● Identifying open reading frames (ORFs).
● Predicting elements of DNA/RNA secondary structure
● Finding repeats
● Computing the optimal alignment between two or
more DNA sequences
● Finding polymorphic sites in genes (single nucleotide
polymorphisms, SNPs)
● Assembling sequence fragments
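
As a small illustration of one item in this list, computing nucleotide compositions, the sketch below counts the four bases and derives the GC fraction. The function names and the example sequence are made up for illustration.

```python
from collections import Counter
from typing import Dict


def nucleotide_composition(seq: str) -> Dict[str, int]:
    """Count A, C, G and T in a DNA sequence (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of counted bases that are G or C; 0.0 if none are counted."""
    comp = nucleotide_composition(seq)
    total = sum(comp.values())
    return (comp["G"] + comp["C"]) / total if total else 0.0


# Hypothetical short sequence.
dna = "ATGCGC"
print(nucleotide_composition(dna), gc_content(dna))
```

Characters other than A, C, G and T (for example N) are ignored in this sketch rather than reported.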

53
Q

Identifying open reading frames (ORFs): an open reading frame is a sequence
that runs from a

A

start codon to a stop codon
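
The start-to-stop idea can be sketched in Python as below. This is a simplified illustration, not a full ORF finder: it assumes ATG as the only start codon, scans only the three forward reading frames (not the reverse strand), and the name `find_orfs` and the example sequence are made up.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG...stop stretch; end is exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon in the same frame until a stop codon.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        # (j - i) // 3 = number of codons before the stop.
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return orfs


# Hypothetical sequence containing one ORF: ATG AAA TGA.
orfs = find_orfs("CCATGAAATGACC")
print(orfs)
```

An ATG with no in-frame stop codon after it is skipped, since the card defines an ORF as running from a start codon to a stop codon.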

54
Q

WHY DO BIOINFORMATICS?

A

● It serves to save time when doing real experiments
(e.g., designing primers).
● You might want to do a simulated experiment on a
computer (‘in silico’) instead of a real environment.

55
Q

Bioinformatics is very convenient for a scientist because it
serves to

A

Save time when they want to do a real
experiment, since the experiment or research study may start by
simulating it on a computer first.

56
Q

When you do simulated
experiments in a computer, that is described as “in silico” so it
is done in a computer rather than a real environment. For
example, when you do PCR and you want to amplify a
particular DNA fragment, you design primers using
bioinformatic tools or software. (T or F)

A

TRUE

57
Q

Once you have designed a
primer, then you can do your actual laboratory experiment, we
call it the ____________

A

Wet lab

58
Q

Where the primer would be optimized and
eventually used in the amplification reaction.

A

Wet lab

59
Q

APPLICATIONS OF BIOINFORMATICS

A

● Sequence alignment and analysis
● Mapping and analyzing DNA, RNA, Protein, Amino
Acid, and Lipid sequences
● Creation and visualization of 3-D structure models for
biological molecules of significance, e.g., proteins
● Genome annotation
● Genetic diseases
● Designer Medicine

60
Q

APPLICATIONS IN VARIOUS FIELDS

A

● Microbial genome applications
● Molecular medicine
● Personalized medicine
● Gene therapy
● Drug development
● Antibiotic resistance
● Evolutionary studies
● Waste cleanup
● Biotechnology
● Climate change studies
● Alternative energy sources
● Crop improvement
● Forensic analysis
● Bio-weapon creation
● Insect resistance
● Improve nutritional quality
● Veterinary science

61
Q

The earliest databases
for DNA sequences and proteins were developed by three
groups of scientists from different parts of the world:

A

● Nucleic Acids (International Nucleotide Sequence
Database)
● Protein (Worldwide Protein Data Bank)

62
Q

IDENTIFY THE DATABASE

DDBJ (DNA Data Bank of Japan)

A

Nucleic Acids (International Nucleotide Sequence
Database)

63
Q

IDENTIFY THE DATABASE

EMBL (European Molecular Biology Lab)

A

Nucleic Acids (International Nucleotide Sequence
Database)

65
Q

IDENTIFY THE DATABASE

GenBank (USA)

A

Nucleic Acids (International Nucleotide Sequence Database)

66
Q

IDENTIFY THE DATABASE

PDBj (Japan)

A

Protein (Worldwide Protein Data Bank)

67
Q

IDENTIFY THE DATABASE

RCSB PDB (USA)

A

Protein (Worldwide Protein Data Bank)

68
Q

DNA Data Bank of Japan

A

DDBJ

69
Q

Other databases

A

● Ensembl
● Human Metabolome Database (HMDB)
● Gene Expression Databases - Mostly Microarray data
● Phenotypic Databases
● RNA Databases
● Amino Acid/Protein Databases
● Protein-Protein and other Molecular interactions
● Signal Transduction Pathway Databases
● Metabolic Pathway and Protein Function Databases
● Bacterial DNA Databases

70
Q

Database that provides data on the genome of
characteristic organisms

A

Ensembl

71
Q

Very useful particularly if you want to determine the
boundary of exons and introns in a eukaryotic gene.

A

Ensembl

72
Q

GENETIC ANALYSIS APPLICATION

A

● A disease may arise due to changes in the sequence of
the gene being expressed
● Single Nucleotide Mutation: Sickle Cell Anemia

73
Q

A consequence of a change that has
occurred in the gene of hemoglobin particularly the beta
portion of hemoglobin.

A

Sickle cell anemia

74
Q

Mutations occurred in some individuals such that A is substituted by U so that the codon became GUG which codes for Vaseline. (T or F)

A

FALSE (Valine NOT VASELINE)

75
Q

In sickle cell anemia there was a point
mutation that occurred involving the codon GAG, which codes for ________

A

Glutamic acid
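
The point mutation in these cards (GAG becoming GUG) can be traced with a tiny sketch. Only the two codons involved are included in the table, and the helper name `point_mutate` is made up for illustration.

```python
# Subset of the standard genetic code (mRNA codons), just the two needed here.
CODON_TABLE = {"GAG": "Glutamic acid", "GUG": "Valine"}


def point_mutate(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


normal = "GAG"
mutant = point_mutate(normal, 1, "U")  # the middle A is substituted by U

print(normal, "->", CODON_TABLE[normal])
print(mutant, "->", CODON_TABLE[mutant])
```

A single base change in the middle of the codon is enough to swap the amino acid that is encoded.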

76
Q

Genetic characteristic

A

Genotype

77
Q

Physical characteristic

A

Phenotype

78
Q

Recessive trait

A

Sickle-Cell Anemia

79
Q

REVIEW THE FINDING THE DNA SEQUENCE OF A GENE, OWKI??

A

OWKI

80
Q

A way of rearranging sequences of DNA, RNA or
protein to identify regions of similarity

A

SEQUENCE ALIGNMENT

81
Q

Sequence alignment is made between

A

a known sequence (reference sequence)
and unknown sequence (query sequence)

82
Q

Reference sequence

A

Known sequence

83
Q

Query sequence

A

Unknown sequence

84
Q

TYPES OF SEQUENCE ALIGNMENT

A

Pairwise
Multiple

85
Q

Compare two sequences

A

Pairwise

86
Q

Compare more than two sequences

A

Multiple

87
Q

Pairwise

A

○ EMBOSS WATER
○ BLAST

88
Q

Multiple

A

○ MUSCLE
○ MAFFT
○ CLUSTAL Omega

89
Q

TYPES OF PAIRWISE SEQUENCE ALIGNMENT

A

Global alignment
Local alignment

90
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Matching the residues (bases or
amino acids) of two sequences across their entire length.

A

Global alignment
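
A global alignment over the entire length of both sequences can be sketched with the classic Needleman-Wunsch dynamic programming method. That method is not named in these cards and is used here only as an illustration; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[str, str, int]:
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Leading gaps are penalized, so both sequences are aligned end to end.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Hypothetical example: the shorter sequence needs one gap.
aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a, aligned_b, total)
```

Because every residue of both sequences must appear in the result, any length difference has to be paid for with gaps.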

91
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

matches the identical sequences

A

Global alignment

92
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

The two sequences are treated as potentially
equivalent

A

Global alignment

93
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Comparing two genes with the
same function (in human vs.
mouse)

A

Global alignment

94
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Comparing two proteins with similar
functions

A

Global alignment

95
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Matching of two sequences from
regions which have more similarity with each other

A

Local alignment
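
A local alignment, which keeps only the best-matching region, can be sketched with the Smith-Waterman method. That method is not named in these cards and is shown only as an illustration; the scoring values (match +2, mismatch -1, gap -1) and the function name are assumptions.

```python
from typing import Tuple


def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -1) -> Tuple[str, str, int]:
    """Return the best local alignment as (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores never drop below zero, so an alignment can start anywhere.
            h[i][j] = max(0,
                          h[i - 1][j - 1] + sub,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), best


# Hypothetical example: two otherwise unrelated sequences sharing "ACG".
local_a, local_b, local_score = smith_waterman("TTTACGTTT", "GGGACGGGG")
print(local_a, local_b, local_score)
```

Only the shared region is reported; the unrelated flanking residues are left out instead of being forced into the alignment.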

96
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

○ The two sequences may or may not be
related

A

Local alignment

97
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

to see whether a substring (a part)
in one sequence aligns well with a substring
(a part) in the other sequence

A

Local alignment

98
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Searching for local similarities in
large sequences (e.g., newly
sequenced genomes)

A

Local alignment

99
Q

IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT

Looking for conserved domains or
motifs in two proteins

A

Local alignment

100
Q

The residues are colored so that you can
easily see whether there is any variation among
the sequences.

A

Clustal omega

101
Q

When you have a multiple sequence
alignment, you will be able to determine if all of the sequences
are identical by the presence of an __________

A

Asterisk

102
Q

if there is a variation, there is no asterisk. (T or F)

A

TRUE
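
The asterisk rule in these cards can be mimicked with a short sketch that marks fully identical columns of an alignment. The function name and the three example sequences are made up, and the sketch only produces asterisks (Clustal output also uses other symbols for similar, non-identical residues).

```python
from typing import List


def conservation_line(aligned: List[str]) -> str:
    """Return '*' under columns where all sequences agree, ' ' elsewhere."""
    if len(set(map(len, aligned))) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        # A column is conserved when it holds one distinct residue, not a gap.
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# Hypothetical alignment: the third column varies, so it gets no asterisk.
alignment = ["ACGT", "ACCT", "ACGT"]
line = conservation_line(alignment)
for seq in alignment:
    print(seq)
print(line)
```

Printing the mark line under the sequences gives the familiar view where a missing asterisk flags a variable position.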

103
Q

MULTIPLE ALIGNMENT TOOLS: Analysis of more than 2 sequences

A

MUSCLE
MAFFT
Clustal Omega

104
Q

MUSCLE

A

Multiple Sequence Comparison by Log
Expectation

105
Q

MAFFT

A

Multiple Alignment using Fast Fourier
Transform

106
Q

It is a multiple sequence alignment tool that
arranges the sequences of DNA, RNA or protein to
identify regions of similarity

A

MUSCLE (Multiple Sequence Comparison by Log Expectation)

107
Q

Finds regions of local similarity between sequences just like MUSCLE and MAFFT

A

NCBI: Basic Local Alignment Search Tool (BLAST)

108
Q

Compares the amino acid sequences of proteins or the nucleotides of DNA sequences.

A

NCBI: Basic Local Alignment Search Tool (BLAST)

109
Q

Compare a query sequence with a library or database
of sequences, and identify library sequences that
resemble the query sequence above a certain
threshold

A

NCBI: Basic Local Alignment Search Tool (BLAST)

110
Q

Can be used to infer functional and evolutionary
relationships between sequences as well as help
identify members of gene families

A

NCBI: Basic Local Alignment Search Tool (BLAST)

111
Q

Read additional notes about NCBI: Basic Local Alignment Search Tool (BLAST), owki??

A

OWKIII

112
Q

Used to infer functional and evolutionary
relationships between sequences as well as help identify members of gene families

A

BLAST

113
Q

You supply multiple sequences to be aligned to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences

A

MULTIPLE ALIGNMENT

114
Q

Here you supply all the
sequences to tools such as MUSCLE.

A

MULTIPLE ALIGNMENT

115
Q

it will align the sequences that you
uploaded and it does not necessarily look for
sequences in the database

A

MULTIPLE ALIGNMENT

116
Q

Read and analyze the difference between multiple sequence alignment and BLAST, and the summary. OWKI??

A

OWKIII