Bioinformatics Flashcards
What is bioinformatics?
A scientific field that uses computer science and biology to study biological data
What is bioinformatics simply used for?
To store, analyse, and share information about DNA, amino acids, and other biological sequences
The 3 components of bioinformatics
- The development of new algorithms + statistics for assessing the relationship among large sets of biological data
- The application of these tools to the analysis and interpretation of various biological data
- The development of databases for the efficient storage, access and management of various biological information
How does bioinformatics work?
It derives knowledge from computer analysis of biological data, which consists of the information stored in the genetic code, experimental results, and scientific literature
The 3 branches of bioinformatics
- Genomics
- Transcriptomics
- Proteomics
Proteomics
The sequencing of amino acids in a protein, determining its 3D structure and relating it to the function of the protein
Transcriptomics
The study of the transcriptome, which includes the whole set of RNA molecules in one cell or a population of biological cells for a given set of environmental circumstances
Genomics
Extensive analysis of nucleic acids through molecular biology techniques before the data is ready for processing by computers
cDNA
Obtained by reverse transcription of an RNA molecule
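As a rough illustration of this card, first-strand cDNA can be modelled as the reverse complement of the RNA template, written with DNA bases. The function name `first_strand_cdna` is hypothetical, and the sketch ignores primers, enzymes and second-strand synthesis.

```python
# Toy model of reverse transcription: RNA template -> first-strand cDNA.
# Assumption: the RNA contains only the upper-case bases A, C, G and U.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the cDNA strand (5'->3') complementary to an RNA given 5'->3'."""
    # Complement each base, reading the RNA backwards so the result is 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


print(first_strand_cdna("AUGGCC"))  # prints GGCCAT
```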
Sequence alignment
A way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
E.g., the identification of SARS-CoV-2 (COVID-19) variants
Pairwise sequence alignment
Comparing 2 sequences to identify regions of similarity
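One common way to compare two sequences is a global alignment scored by dynamic programming (the Needleman-Wunsch approach). The sketch below only computes the best score, not the alignment itself; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Score of the best global alignment of a and b (Needleman-Wunsch)."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))  # prints 2 (three matches, one gap)
```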
Multiple sequence alignment
Aligning multiple sequences to study the evolutionary relationships between them
Tools and Algorithms for Sequence Analysis
- BLAST (Basic Local Alignment Search Tool) → Rapid sequence comparison
- Clustal Omega → Multiple sequence alignment
- Hidden Markov Models (HMM) → Pattern recognition in sequences
- FASTQ & FASTA Formats → File formats for storing sequence data
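To make the FASTA format from the list above concrete: each record is a header line starting with `>` followed by one or more sequence lines. The parser below is a minimal sketch with a hypothetical function name and made-up example records; real projects would more likely use an established library.

```python
from typing import Dict, Optional


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {record identifier: sequence}."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: take the first word after '>' as the identifier.
            words = line[1:].split()
            current_id = words[0] if words else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[current_id] += line
    return records


# Made-up records, for illustration only.
example = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2
GGCC
"""
print(parse_fasta(example))
```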
Genomic database
An online collection that stores and allows access to large amounts of genetic data, including DNA sequences, gene annotations, and variations, enabling researchers to compare and analyse genomic information across different studies and organisms.
E.g., researchers using GenBank to track influenza virus mutations for vaccine development
Protein function prediction
The computational method of determining the biological role of a protein from its amino acid sequence, by comparison with known proteins that have similar sequences or structural features
Gene Ontology (GO) analysis
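A method used in genome annotation and protein function prediction that describes gene products with a standard vocabulary of molecular functions, biological processes, and cellular components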
Genome annotation
Identifying gene locations and functional elements within a genome
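One very simplified way to picture locating candidate genes is an open reading frame (ORF) scan: look for a start codon followed by an in-frame stop codon. The sketch below assumes the standard start and stop codons, scans the forward strand only, and uses a hypothetical function name; real annotation pipelines use far richer evidence.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) 0-based half-open coordinates of forward-strand ORFs.

    min_codons counts every codon in the ORF, including start and stop.
    """
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon to the first in-frame stop.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs


print(find_orfs("CCATGAAATAGGG"))  # prints [(2, 11)]
```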
Protein Function Prediction
Using computational models to determine unknown protein functions
Methods for Protein Function Prediction
- Sequence similarity search
- Domain analysis
- Machine learning approaches
- Structural prediction
Sequence similarity search
The comparison of a protein sequence to a database of known proteins with similar sequences to identify homologous proteins with known functions
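The idea of searching a database for similar sequences can be sketched with a toy ranking by shared k-mers (Jaccard similarity). This is not how BLAST works internally; it is a deliberately simple stand-in, and the function names and the tiny made-up database are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_similarity(query: str, database: Dict[str, str],
                       k: int = 3) -> List[Tuple[str, float]]:
    """Rank database entries by Jaccard similarity of their k-mer sets."""
    query_kmers = kmers(query, k)
    scores = []
    for name, seq in database.items():
        entry_kmers = kmers(seq, k)
        union = query_kmers | entry_kmers
        score = len(query_kmers & entry_kmers) / len(union) if union else 0.0
        scores.append((name, score))
    # Highest similarity first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up sequences, for illustration only.
toy_database = {"protein_a": "MKTAYIAK", "protein_b": "GGGGGGGG"}
print(rank_by_similarity("MKTAYIAK", toy_database))
```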
Domain analysis
The identification of conserved protein domains within a sequence that can indicate specific functions
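Real domain analysis typically relies on profile-based models, but a short pattern scan gives the flavour of finding conserved features in a sequence. The sketch below searches for the PROSITE-style N-glycosylation pattern N-{P}-[ST]-{P}, which is a short motif rather than a full domain; the function name and example sequence are made up for illustration.

```python
import re
from typing import List, Tuple

# N, then any residue except P, then S or T, then any residue except P.
# The lookahead lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based position, matched residues) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in N_GLYCOSYLATION.finditer(protein)]


print(find_motif("MNGTAANPSA"))  # prints [(1, 'NGTA')]
```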
Machine learning approaches
Using algorithms trained on large datasets of protein sequences and functional annotations to predict functions for new proteins
Structural prediction
The prediction of the 3D structure of a protein based on its sequence, which can provide insights into its function
Understanding biological pathways
Identifying genes involved in specific cellular processes by analysing their predicted functions
Drug discovery
Identifying potential drug targets based on protein function predictions
- Identifying drug targets through genomics & virtual screening of drug compounds
Personalised medicine
Analysing individual genomes to predict disease susceptibility based on genetic variations
- Tailoring treatments based on an individual’s genetic profile
Evolutionary studies
Studying the evolution of protein functions by comparing sequences across different species
Data privacy
Protecting genetic information from misuse
Genetic discrimination
Ethical concerns about the unfair treatment of people in employment and insurance based on their genetic information