Bioinformatics Lab Flashcards
BLAST (Basic Local Alignment Search Tool)
Finds regions of local similarity between sequences by comparing a query sequence against a database of sequences. BLAST can be used to identify homologous sequences, predict protein function, and infer evolutionary relationships.
ClustalW
A multiple sequence alignment program used to align three or more sequences and identify regions of similarity, which may be conserved regions important for structure or function.
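As a rough illustration of what "conserved regions" means in an alignment, the sketch below lists the columns where every sequence carries the same residue. The function name `conserved_columns` and the toy alignment are made up for this example; ClustalW itself marks such fully conserved columns with `*` in its output.

```python
def conserved_columns(aligned_seqs):
    """Return 1-based alignment columns where all sequences share one residue (gaps excluded)."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i + 1
        for i, column in enumerate(zip(*aligned_seqs))
        if len(set(column)) == 1 and column[0] != "-"
    ]

# Toy alignment (invented, not real proteins); "-" is a gap
toy_alignment = [
    "MK-LSGGQA",
    "MRTLSGGQ-",
    "MKTLSGGQV",
]
cols = conserved_columns(toy_alignment)
print(cols)
```

This only detects identical columns; columns with similar (not identical) residues would need a substitution matrix or residue grouping.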
PDB (Protein Data Bank)
A database of 3-D structural information of proteins, nucleic acids, and complex assemblies. It allows for visualization and analysis of protein structures.
Jmol
An open-source Java viewer for chemical structures in 3D. It can be used to visualize and manipulate molecular structures obtained from the PDB.
InterProScan
A tool that searches a protein sequence against the InterPro database, a comprehensive resource for protein families, domains, and functional sites.
What are the two main types of BLAST searches, and what are the differences between them?
(1) Nucleotide BLAST (blastn) compares a nucleotide query sequence against a nucleotide database.
(2) Protein BLAST (blastp) compares a protein query sequence against a protein database.
Explain the concept of an E-value in BLAST searches. What does a lower E-value indicate?
The E-value (Expect value) in BLAST represents the number of hits one can “expect” to see by chance when searching a database of a particular size. A lower E-value indicates a more significant match between the query sequence and the database sequence. In other words, it is less likely that the match occurred due to random chance.
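The dependence on database size can be made concrete with the Karlin-Altschul estimate E = K * m * n * exp(-lambda * S), where m is the query length, n the database length, and S the raw alignment score. The sketch below is illustrative only: the function name `expected_hits` and the values of lambda and K are assumptions for the example, and real BLAST uses effective lengths and parameters fitted to the scoring system.

```python
import math

def expected_hits(raw_score, query_len, db_len, lam, k):
    """Karlin-Altschul estimate of chance hits: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)

# Assumed, illustrative statistical parameters (not taken from a real search)
LAMBDA, K = 0.267, 0.041

e_small_db = expected_hits(100, 300, 1_000_000, LAMBDA, K)
e_large_db = expected_hits(100, 300, 10_000_000, LAMBDA, K)      # 10x larger database
e_higher_score = expected_hits(150, 300, 1_000_000, LAMBDA, K)   # better alignment

print(e_small_db, e_large_db, e_higher_score)
```

The same score yields a tenfold larger E-value in a tenfold larger database, while a higher score lowers the E-value exponentially.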
What are Walker A and Walker B motifs, and why are they important in the study of ABC transporters?
Walker A and Walker B motifs are conserved amino acid sequences found in the nucleotide-binding domains (NBDs) of ABC transporters. These motifs are involved in ATP binding and hydrolysis, which is essential for the function of these transporters.
What is the significance of the LSGGQ motif in ABC transporters?
The LSGGQ motif (also called the ABC signature motif) is another conserved sequence found in the NBDs of ABC transporters. It is located between the Walker A and Walker B motifs and is thought to interact with the bound ATP molecule and to take part in the conformational changes associated with ATP binding and hydrolysis.
You perform a ClustalW alignment of three ABC transporter protein sequences. Describe how you would identify the Walker A, Walker B, and LSGGQ motifs from the alignment output.
(1) Examine the alignment for conserved regions: These regions will appear as columns with identical or similar amino acids across the sequences.
(2) Look for the specific amino acid sequences of the motifs:
- Walker A: GXXXXGK(S/T), where X represents any amino acid.
- Walker B: hhhhD, where h represents a hydrophobic amino acid.
- LSGGQ: This motif has the specific sequence LSGGQ.
(3) Note the start and end positions of the motifs in each sequence: This will help you understand the relative locations of the motifs and their potential functional significance.
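The motif patterns above translate directly into regular expressions. The following is a minimal sketch, not part of any ClustalW tooling: the helper `find_motifs`, the choice of residues counted as hydrophobic, and the toy sequence are all assumptions made for illustration. Gaps are stripped first, so positions refer to the ungapped sequence.

```python
import re

# Residues treated as hydrophobic here are an assumption; definitions vary.
HYDROPHOBIC = "AVILMFWC"

MOTIFS = {
    "Walker A": re.compile(r"G.{4}GK[ST]"),             # GXXXXGK(S/T)
    "Walker B": re.compile(rf"[{HYDROPHOBIC}]{{4}}D"),  # hhhhD
    "LSGGQ": re.compile(r"LSGGQ"),
}

def find_motifs(sequence):
    """Return {motif: [(start, end, text), ...]} using 1-based inclusive positions."""
    ungapped = sequence.replace("-", "").upper()
    return {
        name: [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(ungapped)]
        for name, pattern in MOTIFS.items()
    }

# Invented toy sequence, not a real ABC transporter
toy = "MKTGPSGSGKSTLLRAAALSGGQKQRVAAILVLDEPT"
result = find_motifs(toy)
for name, hits in result.items():
    print(name, hits)
```

Because the Walker B pattern is short and loosely defined, it can match by chance; a hit is more convincing when it falls in a conserved column and in the expected order (Walker A, then LSGGQ, then Walker B).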
What kind of information can you obtain from InterProScan about a protein sequence?
InterProScan can provide information about the conserved domains within a protein sequence, as well as the potential function of these domains. It can also identify important sites within the protein, such as active sites or binding sites. The results from InterProScan can be used to predict the overall function of the protein and to gain insights into its evolutionary relationships.
Why might there be an overlap in the domains identified by InterProScan for a particular protein sequence?
Overlaps in domains identified by InterProScan may occur because:
(1) Nested domains: Some protein domains are contained within larger domains.
(2) Functional similarity: Different member databases or prediction methods used by InterProScan may identify domains with similar functions but slightly different boundaries.
(3) Modular evolution: Proteins often evolve through the combination of pre-existing domains, leading to overlapping or partially shared domains.
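The cases above can be told apart by comparing the start and end coordinates of each pair of hits. The sketch below assumes a simplified hit record of (source, name, start, end) with 1-based inclusive coordinates; the records, names, and the `classify` helper are hypothetical and do not reflect InterProScan's actual output format.

```python
from itertools import combinations

# Hypothetical hits: (source database, signature name, start, end)
hits = [
    ("member_db_1", "family_hit", 5, 240),
    ("member_db_2", "domain_hit", 30, 260),
    ("member_db_3", "site_hit", 140, 154),
    ("member_db_1", "second_domain", 300, 560),
]

def classify(a, b):
    """Classify two hits as 'disjoint', 'nested', or 'partial overlap'."""
    (_, _, s1, e1), (_, _, s2, e2) = a, b
    if e1 < s2 or e2 < s1:
        return "disjoint"
    if (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2):
        return "nested"
    return "partial overlap"

relations = {(a[1], b[1]): classify(a, b) for a, b in combinations(hits, 2)}
for pair, relation in relations.items():
    print(pair, relation)
```

In this toy data, the short site lies inside both larger hits (nested), while the two larger hits from different sources share most of their range but disagree on boundaries (partial overlap).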