NCBI and BLAST Flashcards
What is bioinformatics
collection, classification, storage and analysis of biochemical and biological information
Who controls the nucleotide databases
DNA Data Bank of Japan (DDBJ)
National Center for Biotechnology Information
European Nucleotide Archive
What is the Structure of FASTA Format
First line = header starting with ">" followed by the identifier, e.g. >Accession.1
Following line(s) = the sequence itself (nucleotide or protein)
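The two-part layout above can be sketched as a tiny parser. This is a minimal illustration, not a full FASTA reader; the function name `parse_fasta` and the example record are made up for this card.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        else:
            # Sequence may be wrapped over several lines; join them later.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record: the header text and sequence are invented.
example = ">Accession.1 hypothetical example record\nATTGCTA\nGGCC\n"
records = parse_fasta(example)
print(records)
```

The sequence lines are joined because FASTA files commonly wrap long sequences across several lines.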
What is FASTQ used for
Storing sequencing reads together with a quality score for each base
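A rough sketch of how the per-base quality scores are read, assuming the common four-line record layout (@id, sequence, +, quality string) and the Phred+33 encoding, where score = ASCII code minus 33. The function name `parse_fastq` and the read below are invented for illustration.

```python
def parse_fastq(text, offset=33):
    """Return (read_id, sequence, quality_scores) tuples from FASTQ text.

    Assumes four lines per record and Phred+33 quality encoding.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("not a four-line FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Each quality character encodes one Phred score.
        scores = [ord(ch) - offset for ch in qual]
        records.append((header[1:], seq, scores))
    return records


# Hypothetical read: "I" encodes 40, "!" encodes 0, "5" encodes 20.
reads = parse_fastq("@read1\nACGT\n+\nII!5\n")
print(reads)
```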
What are the 3 parts of GenBank Format in NCBI
Header, feature table, sequence
Break this down into its components
SCU49845 5028bp DNA PLN 21-JUN-2000
Locus name (aka accession)
length of sequence
molecule type
GenBank division
Date of last modification
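The breakdown above can be sketched in code. This handles only the simplified five-field line shown on this card; real GenBank LOCUS lines start with the keyword LOCUS and can carry extra fields, so treat `parse_locus_fields` as a hypothetical helper, not a GenBank parser.

```python
def parse_locus_fields(line):
    """Split the card's simplified LOCUS line into its five named parts."""
    name, length, molecule, division, date = line.split()
    # The card writes the length as "5028bp"; drop the unit if present.
    if length.endswith("bp"):
        length = length[:-2]
    return {
        "locus_name": name,
        "length_bp": int(length),
        "molecule_type": molecule,
        "genbank_division": division,
        "last_modified": date,  # kept as text, e.g. "21-JUN-2000"
    }


fields = parse_locus_fields("SCU49845 5028bp DNA PLN 21-JUN-2000")
for key, value in fields.items():
    print(key, "=", value)
```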
In a RefSeq accession, what do the prefixes NM, NP, NC, and NZ mean
NM = mRNA
NP = protein
NC = genomic, complete (chromosome or whole genome)
NZ = genomic, unfinished (e.g. whole-genome shotgun contigs or scaffolds)
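A small lookup sketch for the prefixes on this card. Only the four prefixes listed here are covered (RefSeq defines others), and the accession numbers in the example are placeholders, not real records.

```python
# Meaning of the RefSeq prefixes covered on this card.
REFSEQ_PREFIXES = {
    "NM": "mRNA",
    "NP": "protein",
    "NC": "complete genomic molecule (e.g. chromosome)",
    "NZ": "unfinished genomic sequence (e.g. WGS contig or scaffold)",
}


def refseq_molecule_type(accession):
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix = accession.split("_", 1)[0].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown prefix")


# Placeholder accessions, for illustration only.
for acc in ["NM_000000.1", "NP_000000.1", "NC_000000.1", "NZ_ABCD00000000.1"]:
    print(acc, "->", refseq_molecule_type(acc))
```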
Where can you find the submitters identity and submission date
Last reference listed as “Direct Submission”
What does the ^ symbol indicate
A site between two adjoining nucleotides
e.g. 123^124, such as a restriction enzyme cut site
How to calculate the number of nucleotides in a range
larger position - smaller position + 1
e.g. 11..20
20 - 11 + 1 = 10 nucleotides
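The same arithmetic as a short sketch. The +1 is needed because both end positions are counted. The function names are invented, and `span_from_location` assumes the two-dot range notation GenBank uses for spans such as 11..20.

```python
def span_length(start, end):
    """Number of nucleotides in an inclusive range of positions."""
    return abs(end - start) + 1


def span_from_location(location):
    """Length of a simple 'start..end' location string."""
    start, end = location.split("..")
    return span_length(int(start), int(end))


print(span_length(11, 20))            # 10
print(span_from_location("11..20"))   # 10
```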
What is the complement of ATT GCT A
TAA CGA T
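A quick way to check a complement by machine, using the base-pairing rules A with T and G with C. This sketch covers only uppercase A, C, G, T; anything else, including spaces, is passed through unchanged. The reverse complement is shown as well, since that is the form needed to read the opposite strand 5' to 3'.

```python
# Translation table for the four standard DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Complement each base, keeping the original order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complement each base and reverse the order."""
    return complement(seq)[::-1]


print(complement("ATT GCT A"))         # TAA CGA T
print(reverse_complement("ATTGCTA"))   # TAGCAAT
```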
explain substitution, transition, transversion, insertion, duplication, and deletion
substitution = one nt replaced by another
transition = purine to purine (A↔G) or pyrimidine to pyrimidine (C↔T)
transversion = purine to pyrimidine, or pyrimidine to purine
insertion = extra nt(s) added
duplication = copy of nt(s) inserted directly 3’ of the original
deletion = nt(s) removed
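The transition/transversion distinction can be sketched as a classifier for a single-base substitution. A and G are purines; C and T are pyrimidines. The function name is hypothetical and only DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Label a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        raise ValueError("no substitution: bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("C", "T"))  # transition
print(classify_substitution("A", "C"))  # transversion
```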
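how to calculate DNA and amino acid identity
DNA identity = identical nucleotide positions / total aligned positions x 100
amino acid identity = same calculation on the translated protein sequences (identical amino acid positions / total aligned positions x 100)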
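A minimal sketch of the identity calculation, assuming the two sequences are already aligned and the same length. Gap columns ("-") are counted as mismatches here, which is one convention among several. The same function works for nucleotides or amino acids; the example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions where both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences are empty")
    matches = sum(
        1
        for x, y in zip(aligned_a, aligned_b)
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


dna_identity = percent_identity("ATTGCTA", "ATTGCTT")   # 6 of 7 positions match
protein_identity = percent_identity("MKV", "MKL")       # 2 of 3 positions match
print(round(dna_identity, 1), round(protein_identity, 1))
```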
homologs vs orthologs vs paralogs
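homologs = genes that share a common ancestor; encompasses both orthologs and paralogs
orthologs = the same gene in different species, separated by speciation (e.g. mouse alpha vs human alpha)
paralogs = genes in the same genome related by duplication (e.g. mouse alpha and mouse beta)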
what is the definition of alignment
arranging two or more biological sequences so that identical or similar residues (nucleotides or amino acids) line up at the same positions
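To picture what "line up at the same positions" means, here is a small sketch that takes two sequences that are already aligned (with "-" marking a gap) and draws a bar under each column where the residues match. It does not compute an alignment; the helper name and the sequences are made up.

```python
def match_line(aligned_a, aligned_b):
    """Return '|' for each column with identical residues, ' ' otherwise."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    return "".join(
        "|" if x == y and x != "-" else " "
        for x, y in zip(aligned_a, aligned_b)
    )


top = "ATTGCTA"
bottom = "ATT-CTA"  # "-" is a gap introduced so the remaining bases line up
middle = match_line(top, bottom)
print(top)
print(middle)
print(bottom)
```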