Bioinformatics Flashcards
what does translational bioinformatics use as data
data on different biological aspects, as both inputs and outputs
what is a key element
the analysis of sequence data
why different formats?
different uses and applications of sequences
different databases store different information associated with the sequences
some formats more common than others
what is FASTA format used for
proteins and nucleic acids
describe FASTA
most common, standard format for sequence data
structured in 2 (sometimes 3) parts
comment line identified by >
sequence
optional * marking the end of the sequence
supports multi-sequence files
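A minimal sketch of reading the FASTA layout described above, assuming the whole file fits in memory as a string. The function name parse_fasta and the example records are hypothetical, made up for illustration; real work would normally use an established parser.

```python
def parse_fasta(text):
    """Yield (comment, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new comment line closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks).rstrip("*")
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        # rstrip("*") drops only the optional terminating asterisk.
        yield header, "".join(chunks).rstrip("*")


# Hypothetical multi-sequence example: one protein, one nucleic acid.
example = """>seq1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS*
>seq2 hypothetical DNA fragment
ACGTACGTAC
"""

records = list(parse_fasta(example))
for comment, sequence in records:
    print(comment, len(sequence))
```

Each record is returned as a (comment, sequence) pair, so a multi-sequence file simply yields several pairs.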
what is FASTQ
derived from FASTA and used for sequences generated by massively parallel sequencing
describe FASTQ
includes information about the quality of each base in the sequence
4 lines
@identifier and description
raw sequence letters
+ separator, optionally followed by the identifier again
encodes quality values for each letter in line 2, therefore the same number of characters
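A minimal sketch of reading the 4-line FASTQ layout described above. It assumes each record occupies exactly four lines and that qualities use the Phred+33 encoding (ASCII code minus 33), which is common but not the only encoding in use. The function name parse_fastq and the example read are hypothetical.

```python
def parse_fastq(text, offset=33):
    """Yield (identifier, sequence, quality_scores) from FASTQ text.

    offset=33 assumes Phred+33 quality encoding; other encodings
    would need a different offset.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per record")
    for i in range(0, len(lines), 4):
        head, seq, plus, qual = lines[i:i + 4]
        if not head.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        if len(seq) != len(qual):
            # Line 4 must have one quality character per base in line 2.
            raise ValueError("sequence and quality lengths differ")
        scores = [ord(ch) - offset for ch in qual]
        yield head[1:], seq, scores


# Hypothetical single-read example.
example = "@read1 hypothetical read\nACGT\n+\nII5!\n"

reads = list(parse_fastq(example))
for identifier, sequence, scores in reads:
    print(identifier, sequence, scores)
```

The length check mirrors the rule that the quality line has exactly as many characters as the sequence line.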
what is NAR
nucleic acids research
what are genome browsers
system to navigate and visualize genomes and their annotations
complex views using great amounts of info
integrates phenotypical and molecular info
allows customised views of different tracks
what kind of questions we can try to answer
how similar are these 2 sequences?
are these sequences related?
for a new infectious disease, can we identify the causative organism from a sequence?
what does BLAST stand for
Basic Local Alignment Search Tool