2: Introduction to Bioinformatics (FINALS) Flashcards by Ma. Clariza Luna

→ A field which uses computers to store and analyze molecular biological information

→ It is about finding and interpreting biological data online

→ Marriage between biology and informatics

→ Science of collecting and analyzing complex biological data

Bioinformatics

→ A field where biology, mathematics, statistics, computer science, information technology, and other health sciences are merged into a single discipline to process biological data

→ Uses complex machines to read biological data at a much faster rate than before

Bioinformatics

What are the 3 principal components of bioinformatics?

Creation of Databases
Development of Algorithms and Statistics
The use of these tools for Analysis and Interpretation of various types of biological data

3 Principal Components of Bioinformatics:

→ Act as repositories or banks of biological information, designed to collect, archive, visualize, and arrange biological data

→ Allow the storage and management of large biological data sets

→ Enable scientists to describe, interpret, and retrieve data intelligently

Databases

3 Principal Components of Bioinformatics:

T or F

Data is being generated at a much greater pace than its analysis

Example of bioinformatics:

→ Launched in 1990

→ Objective: to sequence the entire human genome

→ The human genome consists of about 3.2 billion base pairs

→ Completed in 2003

Human Genome Project

3 Principal Components of Bioinformatics:

Determine relationships among members of large data sets

Development of algorithms and statistics

3 Principal Components of Bioinformatics: Development of algorithms and statistics

Large sets of data are organized so that relationships can be determined

Algorithm

3 Principal Components of Bioinformatics:

A concept under this is biological data

The use of these tools for analysis and interpretation of various types of biological data

3 Principal Components of Bioinformatics: The use of these tools for analysis and interpretation of various types of biological data

→ Including DNA, RNA, and protein sequences, protein structures, gene expression profiles, and biochemical pathways

Biological data

Sciences that attempt to describe a living organism in terms of “omics”

Branches of Bioinformatics

Branches of Bioinformatics:

Involves describing the sequence of an entire genome

Genomics

Branches of Bioinformatics:

Study of all RNA molecules in a living organism

Transcriptomics

Branches of Bioinformatics:

→ Description of the entire complement of proteins in a living organism

→ Entire proteins found in a living organism

→ Study of the Sequence, 3D Structures, and other Properties of all Proteins

Proteomics

Branches of Bioinformatics:

→ Pertains to microbes such as viruses, fungi, parasites, and bacteria

→ Describes the genomes of microorganisms within a specific environmental niche

Microbiomics

Branches of Bioinformatics:

→ Involves the description of chemical processes involving metabolites

Metabolomics

Familiarize yourself with the DNA/RNA Bioinformatics Applications

Retrieving DNA sequences from databases
Computing nucleotide compositions
Identifying restriction sites
Designing polymerase chain reaction (PCR) primers
Identifying open reading frames (ORFs)
Finding repeats
Computing the optimal alignment between two or more DNA sequences
Finding polymorphic sites in genes (SNPs)
Assembling sequence fragments
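
As a rough illustration of "computing nucleotide compositions", here is a minimal Python sketch. The function names and the example sequence are made up for this card and are not taken from the lecture.

```python
from collections import Counter


def nucleotide_composition(seq: str) -> dict:
    """Count each of the four DNA bases in a sequence (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    comp = nucleotide_composition(seq)
    return (comp["G"] + comp["C"]) / len(seq)


example = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical sequence for illustration
print(nucleotide_composition(example))
print(round(gc_content(example), 3))
```

GC content is one common summary of composition; other tools may report different statistics.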

Familiarize yourself with the other applications of bioinformatics

Sequence alignment and analysis
Mapping and analyzing DNA, RNA, Protein, Amino acid, and Lipid sequences
Creation and Visualization of 3D structure models for biological molecules of significance
Genome annotation
Genetic diseases
Designer medicine

Familiarize yourself with the applications in various fields

(Refer to the lecture slides for this one.)

Why do we use Bioinformatics?

It saves time when doing real experiments

Importance of Bioinformatics:

T or F

A study should end with a simulated experiment on a computer instead of in a real environment

F (A study might START with a simulated experiment on a computer instead of in a real environment)

Importance of Bioinformatics: Identify the process

Simulated experiment on computer = ?

Primer optimized and used in amplification reaction =

Simulated experiment on computer = In Silico

Primer optimized and used in amplification reaction = Wet Lab

Importance of Bioinformatics: Identify whether “In Silico” or “Wet Lab”

Target Identification

In Silico

Importance of Bioinformatics: Identify whether "In Silico" or "Wet Lab" Primer Characterization

Wet lab

Importance of Bioinformatics: Identify whether "In Silico" or "Wet Lab" Assay Optimization

Wet lab

DNA/RNA Bioinformatics Applications:

→ A sequence that runs from a start codon (AUG) to a stop codon (UAG, UGA, or UAA)

→ Predicting elements of DNA/RNA secondary structure

open reading frames (ORF)
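
The start-to-stop definition above can be sketched in a few lines of Python. This is only an illustration: `find_orfs` is a hypothetical helper, it scans the forward strand only, and real ORF finders also check the reverse complement and apply minimum-length filters.

```python
STOP_CODONS = {"UAG", "UGA", "UAA"}


def find_orfs(rna: str, min_codons: int = 1) -> list:
    """Return forward-strand ORFs, each running from AUG through its stop codon."""
    rna = rna.upper().replace("T", "U")  # accept DNA letters as well
    orfs = []
    for start in range(len(rna) - 2):
        if rna[start:start + 3] != "AUG":
            continue
        # Read codon by codon in the frame set by this AUG.
        for pos in range(start + 3, len(rna) - 2, 3):
            if rna[pos:pos + 3] in STOP_CODONS:
                # Count the codons before the stop, including the AUG itself.
                if (pos - start) // 3 >= min_codons:
                    orfs.append(rna[start:pos + 3])
                break
    return orfs


print(find_orfs("CCAUGGCCUAAGG"))
```

An AUG with no in-frame stop codon after it yields no ORF in this sketch, and nested AUGs in the same frame are reported as separate, overlapping ORFs.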

Three earliest DNA Sequences and Protein Databases?

1. Nucleic acids
2. Protein
3. Other databases

Three earliest DNA Sequences and Protein Databases: What is the database for nucleic acids?

International Nucleotide Sequence Database Collaboration (INSDC)

Three earliest DNA Sequences and Protein Databases: Composition of International Nucleotide Sequence Database

1. DDBJ (DNA Data Bank of Japan)
2. EMBL (European Molecular Biology Laboratory)
3. GenBank (USA)

Three earliest DNA Sequences and Protein Databases: What is the database for Protein?

Worldwide Protein Data Bank

Three earliest DNA Sequences and Protein Databases: Familiarize the other databases

1. Ensembl
2. Human Metabolome Database
3. Gene Expression Databases
4. Phenotypic Database
5. RNA Databases
6. Amino acid/protein Databases
7. RNA Databases
8. Protein-Protein and other Molecular Interactions
9. Signal Transduction Pathway Databases
10. Bacterial DNA Databases

T or F In the Gene Analysis Application, changes in the sequence of the gene being expressed always result in a normal and healthy person

F (A DISEASE MAY ARISE from changes in the sequence of the gene being expressed)

Gene Analysis Application T or F Sickle cell anemia results from a point mutation that changes tyrosine to valine in the beta-globin chain

F (substitution of GLUTAMIC ACID with VALINE)

This refers to genetic characteristics

Genotype

This refers to Physical Characteristics

Phenotype

Gene Analysis Application:

→ Leads to sickle cell anemia

→ A recessive trait

Single Nucleotide mutation

Gene Analysis Application: Single Nucleotide mutation

Normal sequence: G-A-G (Glutamic Acid)
Mutated sequence: G-U-G (Valine)

Which amino acid was changed by the mutation?

Glutamic acid (replaced by valine)
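
To make the GAG to GUG change concrete, the short Python sketch below compares the two codons from the card. The lookup table contains only these two codons (a full genetic code table has 64 entries), and `describe_point_mutation` is a hypothetical helper name.

```python
# Only the two codons from the card; a complete codon table has 64 entries.
CODON_TO_AMINO_ACID = {"GAG": "Glutamic acid", "GUG": "Valine"}


def describe_point_mutation(normal: str, mutated: str) -> str:
    """Report which base changed and the resulting amino acid substitution."""
    if len(normal) != 3 or len(mutated) != 3:
        raise ValueError("codons must be exactly three bases long")
    changed = [i for i in range(3) if normal[i] != mutated[i]]
    if len(changed) != 1:
        raise ValueError("expected exactly one changed base")
    i = changed[0]
    before = CODON_TO_AMINO_ACID[normal]
    after = CODON_TO_AMINO_ACID[mutated]
    return f"position {i + 1}: {normal[i]}>{mutated[i]}, {before} -> {after}"


print(describe_point_mutation("GAG", "GUG"))
```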

Gene Analysis Application: Single Nucleotide mutation

If the father and mother are both heterozygous for the sickle cell gene, what fraction of their children:
a. will manifest the disease
b. will be normal
c. will be carriers

a. ¼ of children develop sickle cell disease
b. ¼ are normal
c. ½ are carriers
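
The ¼, ¼, ½ split comes from a Punnett square, which can be enumerated in Python. In this sketch the allele letters are an assumption made for illustration: "A" stands for the normal allele and "S" for the sickle cell allele.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Genotype probabilities for a one-gene cross, e.g. cross("AS", "AS")."""
    # Each child receives one allele from each parent; sorting makes "SA" == "AS".
    offspring = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(offspring.values())
    return {genotype: Fraction(n, total) for genotype, n in offspring.items()}


result = cross("AS", "AS")
print(result["SS"])  # children with sickle cell disease
print(result["AA"])  # normal children
print(result["AS"])  # carriers
```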

What are the 2 Bioinformatic Activities?

1. Finding DNA/Protein Sequence 2. Sequence Alignment

To find gene or protein sequences online, what websites should be used?

1. GenBank 2. Protein Data Bank

To find gene sequences online, what website should be used?

GenBank

To find protein sequences online, what website should be used?

Protein Data Bank

→ A way of arranging sequences of DNA, RNA, or protein to identify regions of similarity

Sequence Alignment

Sequence Alignment: What are the 2 sequences involved when a sequence alignment is made?

Reference and Unknown Sequence

Sequence Alignment: Reference sequence is also known as what?

Known, Subject sequence

Sequence Alignment: Unknown sequence is also known as?

Query sequence

Sequence Alignment: Familiarize yourself with the importance of identifying regions of similarity

1. To understand functional, structural, or evolutionary relationships between the sequences
2. To help identify dissimilar regions of the DNA sequence, which are useful for designing primers

Sequence Alignment: Importance of identifying regions of similarity

If sequences are similar, what does it mean?

They have similar functions or structures

Sequence Alignment: Importance of identifying regions of similarity

Can mean either belonging to the same group or having a distant relationship

Evolutionary relationship

Sequence Alignment: Importance of identifying regions of similarity

T or F

Identifying similar regions of a DNA sequence is useful for designing primers

F (Identifying DISSIMILAR REGIONS of a DNA sequence is useful for designing primers)

2 Types of Sequence Alignment

1. Pairwise 2. Multiple

Types of Sequence Alignment: Compare two sequences

Pairwise

Types of Sequence Alignment: Compare more than two sequences

Multiple

What are the websites used in pairwise sequence alignment?

1. EMBOSS WATER 2. BLAST

What are the websites used in multiple sequence alignment?

1. MUSCLE 2. MAFFT 3. CLUSTAL Omega

What are the Types of Pairwise Sequence Alignments?

1. Global alignment 2. Local alignment

Types of Pairwise Sequence Alignments

→ Matching the residues (bases or amino acids) of two sequences across their entire length

→ The whole of each sequence is aligned

Global Alignment
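
For intuition, a global alignment can be computed with the classic Needleman-Wunsch dynamic programming method. The Python sketch below is a simplified teaching version with made-up scoring values (match +1, mismatch -1, gap -2). It does not claim to reproduce the parameters or implementation of any particular website.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch: return (score, aligned_a, aligned_b) over full lengths."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a prefix aligned against nothing but gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


print(global_align("ACGT", "AGT"))
```

Because every residue of both sequences must appear in the result, end gaps are penalized like any other gap.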

Types of Pairwise Sequence Alignments

→ Matching of two sequences in the regions where they are most similar to each other

→ The two sequences may or may not be related

→ Used to see whether a substring (a part) of one sequence aligns well with a substring (a part) of the other sequence

Local alignment
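
Local alignment is commonly computed with the Smith-Waterman method, which is the algorithm EMBOSS Water is based on. The Python sketch below is a simplified version with made-up scoring values (match +2, mismatch -1, gap -2), not the tool's actual parameters.

```python
def local_align(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Smith-Waterman: return (score, aligned_a, aligned_b) for the best local match."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores never drop below zero, so an alignment can restart anywhere.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


print(local_align("TTTGATTACATTT", "CCCGATTACACCC"))
```

Only the best-matching substrings are returned; the unrelated flanking regions are left out, which is the key difference from a global alignment.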

Sequence Alignments

→ There are multiple sequences being aligned

→ The residues are colored so that differences can easily be seen

Multiple Sequence alignment: Clustal Omega

What type of Pairwise Sequence Alignments is appropriate for the given application: Comparing two genes or proteins with the same function

Global Alignment

What type of Pairwise Sequence Alignments is appropriate for the given application: Searching for local similarities in large sequences

Local alignment

What type of Pairwise Sequence Alignments is appropriate for the given application: Looking for conserved domains or motifs in two proteins

Local Alignment

What type of Sequence Alignment is appropriate for the given application: Shows whether all of the sequences are identical at a position by the presence of an ASTERISK

Multiple Sequence alignment: Clustal Omega

T or F In Multiple Sequence alignment: Clustal Omega, the absence of an asterisk means the sequences are similar

F (absence of asterisk means sequence is DISSIMILAR/VARIATION)
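
The asterisk rule can be mimicked with a few lines of Python. This sketch covers only the "*" marker for fully identical columns. Clustal-style output for proteins also uses ":" and "." for degrees of similarity, which are not reproduced here. `conservation_line` is a hypothetical helper, and the input sequences are assumed to be already aligned to equal length.

```python
def conservation_line(aligned: list) -> str:
    """Return '*' under columns where every sequence has the same residue."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


rows = ["ACGT", "ACCT", "ACGT"]  # made-up aligned sequences
for row in rows:
    print(row)
print(conservation_line(rows))
```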

Pairwise Sequence Alignment: Emboss Water Straight line indicates that the sequences are?

Similar

Pairwise Sequence Alignment: Emboss Water Dot/period indicates that the sequences are?

Dissimilar

Pairwise Sequence Alignment: Emboss Water Meaning of Y in the sequence?

Any pyrimidine

Pairwise Sequence Alignment: Emboss Water Meaning of R in the sequence?

Any purine

Pairwise Sequence Alignment: Emboss Water Meaning of N in the sequence?

Any base
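
The ambiguity codes on these cards (Y, R, N) can be expressed as a small lookup table. The Python sketch below includes only those three codes plus the four plain bases. The full IUPAC nucleotide alphabet has more symbols, and the helper names are made up for illustration.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes: only the ones covered by these cards.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # any purine
    "Y": "CT",    # any pyrimidine
    "N": "ACGT",  # any base
}


def matches(code: str, base: str) -> bool:
    """True if a concrete base is allowed by the given ambiguity code."""
    return base.upper() in IUPAC[code.upper()]


def expand(pattern: str) -> list:
    """List every concrete sequence that an ambiguous pattern can stand for."""
    choices = (IUPAC[symbol] for symbol in pattern.upper())
    return ["".join(combo) for combo in product(*choices)]


print(matches("R", "G"))
print(expand("RY"))
```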

Pairwise Sequence Alignment: Website

→ Finds regions of local similarity between sequences

→ Works on the amino acid sequences of proteins or the nucleotide sequences of DNA

→ Compares a query sequence with a library or database of sequences

→ Identifies library sequences that resemble the query sequence above a certain threshold

→ Used to identify uncharacterized genes

Basic Local Alignment Search Tool (BLAST)

Query, Database, Remarks of the program BLASTn?

Query: Nucleotide Database: Nucleotide Remarks: for high scoring matches

Query, Database, Remarks of the program BLASTp?

Query: Protein Database: Protein Remarks: uses substitution matrices

Query, Database, Remarks of the program BLASTx?

Query: Nucleotide (trans) Database: Protein Remarks: for novel DNA seqs and EST analysis

Query, Database, Remarks of the program TBLASTn?

Query: Protein Database: Nucleotide (trans) Remarks: for STS and EST assignments in databases
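
One way to remember which BLAST program goes with which query and database type is a simple lookup table. The Python sketch below follows the standard NCBI program definitions, where "translated nucleotide" means the nucleotide sequence is translated into protein before comparison. The dictionary and function names are made up for illustration and are not part of any BLAST software.

```python
# (query type, database type) -> BLAST program, per the standard NCBI definitions.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}


def choose_program(query: str, database: str) -> str:
    """Return the BLAST program name for a query/database combination."""
    try:
        return BLAST_PROGRAMS[(query.lower(), database.lower())]
    except KeyError:
        raise ValueError(f"no BLAST program for {query!r} vs {database!r}") from None


print(choose_program("protein", "translated nucleotide"))
```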

What are the 2 possible results of BLAST?

1. Graphic Summary 2. Sequences producing significant alignment

You supply multiple sequences to be aligned to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships. What program is this?

Multiple Sequence Comparison by Log Expectation (MUSCLE)

Multiple Sequence Comparison by Log Expectation (MUSCLE): T or F Each sequence should have a definition line preceded by the ">" (greater-than) symbol

Multiple Sequence Comparison by Log Expectation (MUSCLE): T or F Choose a part of the sequences that is similar for the primer

F (DO NOT choose a part of the sequences that is similar; the regions should be dissimilar)

Which type of sequence alignment, and which website, should be used? Aligning the envelope genes of the 4 dengue viruses

Multiple Sequence Alignment: MUSCLE

BLAST or Multiple alignment You supply one or more query sequences

BLAST

BLAST or Multiple alignment Compares nucleotide or protein sequences to sequence databases

BLAST

BLAST or Multiple alignment Used to infer functional and evolutionary relationships between sequences

BLAST (though the answer should arguably be both)

BLAST or Multiple alignment Help identify members of gene families

BLAST

BLAST or Multiple alignment You supply multiple sequences to be aligned to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences

Multiple alignment

DEFINITION OF TERMS IN BIO INFORMATICS:

→ A text-based bioinformatics data format used to store nucleotide or amino acid sequences (e.g. Deoxyribonucleic Acid [DNA] or Ribonucleic Acid [RNA])

→ Pronounced "Fast A" ("fast-aye") because the name is a shortening of "FAST-All"

FASTA
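
A FASTA file is simple enough to read by hand: each record starts with a ">" definition line, followed by one or more lines of sequence. The Python sketch below is a minimal reader written for illustration, with made-up example records. Established libraries such as Biopython handle more edge cases.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {definition line: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            records[header].append(line)
    # Sequence lines belonging to one record are joined into a single string.
    return {name: "".join(parts) for name, parts in records.items()}


example = """>seq1 demo record
ACGT
ACGT
>seq2
GG
"""
print(parse_fasta(example))
```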

DEFINITION OF TERMS IN BIO INFORMATICS: A position in one of the aligned sequences where one or more residues have been deleted from that sequence (or inserted into the other)

Gap

DEFINITION OF TERMS IN BIO INFORMATICS: The input sequence that is being compared to others in the database aka sequence of interest

Query Sequence

DEFINITION OF TERMS IN BIO INFORMATICS: The sequence you are comparing to

Subject Sequence

DEFINITION OF TERMS IN BIO INFORMATICS: A diagram that depicts the lines of evolutionary descent of different species, organisms, or genes from a common ancestor

Phylogenetic tree

DEFINITION OF TERMS IN BIO INFORMATICS: A series of digits that are assigned consecutively to each sequence record processed by NCBI

GI number

DEFINITION OF TERMS IN BIO INFORMATICS: A unique identifier assigned to a record in sequence databases such as GenBank

Accession number

DEFINITION OF TERMS IN BIO INFORMATICS: The process of deriving the structural and functional information of a protein or gene from a raw data set using different analysis, comparison, estimation, precision, and other mining techniques

Genome annotation

DEFINITION OF TERMS IN BIO INFORMATICS:

→ A set of values for quantifying the likelihood of one residue being substituted by another in an alignment

→ Calculated by adding substitution scores, defined for each aligned pair of letters, and gap scores for each run of letters in one segment aligned with null characters inserted into the other

Score
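
The "substitution scores plus gap scores" calculation can be shown with a toy scorer. In the Python sketch below all values are made up (match +1, mismatch -1, -2 for the first position of a gap run, -1 for each further position). Real tools such as BLASTp use substitution matrices and their own gap penalties instead.

```python
def alignment_score(top: str, bottom: str, match: int = 1, mismatch: int = -1,
                    gap_open: int = -2, gap_extend: int = -1) -> int:
    """Sum substitution scores for aligned letters and gap scores for gap runs."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    previous = None  # which sequence, if any, had a gap in the previous column
    for x, y in zip(top, bottom):
        current = "top" if x == "-" else "bottom" if y == "-" else None
        if current is None:
            total += match if x == y else mismatch
        elif current == previous:
            total += gap_extend  # continuing the same run of gaps
        else:
            total += gap_open    # first position of a new run of gaps
        previous = current
    return total


print(alignment_score("ACGGT", "A--GT"))
```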

DEFINITION OF TERMS IN BIO INFORMATICS: A parameter that describes the number of hits one can “expect” to see by chance when searching a database of a particular size

Expect Value
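
A commonly cited form of the expect value is the Karlin-Altschul relationship E = K × m × n × e^(−λS), where S is the raw alignment score, m the query length, and n the database length. The Python sketch below only illustrates how E responds to score and database size. The K and λ values are placeholders chosen for illustration, not parameters from any real scoring system.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 k: float = 0.1, lam: float = 0.3) -> float:
    """E = K * m * n * exp(-lambda * S); k and lam here are placeholder values."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# A higher score gives a smaller E (fewer hits expected by chance),
# and a larger database gives a proportionally larger E.
print(expect_value(40, 100, 1_000_000))
print(expect_value(60, 100, 1_000_000))
print(expect_value(60, 100, 2_000_000))
```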

Full name of BLAST?

Basic Local Alignment Search Tool

Full name of MUSCLE?

Multiple Sequence Comparison by Log Expectation

(100 cards)